TEXTractor extracts text and/or metadata from 76 file types (PDF, Office, images, email, and more).
Please find the full list of supported file types here.
Built using a modified version of the Toxy library (https://github.com/bmlpg/toxy).
Try it now: link (or install "TEXTractor Demo" from Forge).
Document Structured Extraction Enhancements:
License: BSD 3-Clause (https://opensource.org/licenses/BSD-3-Clause)