TEXTractor extracts text and/or metadata from 76 file types (PDF, Office, images, email, and more).
Please find the full list of supported file types here.
It is built on a modified version of the Toxy library (https://github.com/bmlpg/toxy).