TEXTractor

Stable version 2.6.0 (Compatible with OutSystems 11)
Uploaded on 16 May (3 days ago)
Rating: 5.0 (1 rating)

Details

TEXTractor extracts text and/or metadata from 76 file types (PDF, Office, images, email, and more).

Please find the full list of supported file types here.

Built using a modified version of the Toxy library (https://github.com/bmlpg/toxy).

Release notes (2.6.0)
  • Improved PDF extraction to capture and return dominant paragraph font styles and sizes (e.g., SourceSansPro-Bold_14). This structural metadata lays the groundwork for advanced, layout-aware semantic chunking.
  • Updated NuGet package dependencies to latest versions.
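
To illustrate how style labels such as SourceSansPro-Bold_14 could feed layout-aware chunking, here is a minimal Python sketch. It is not TEXTractor's API: the input shape (a list of style label and paragraph text pairs), the helper names parse_style and chunk_by_heading, the body_size threshold, and the SourceSansPro-Regular_10 label are all assumptions made for illustration. Only the label format "FontName_Size" is taken from the release note above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One heading plus the body paragraphs that follow it."""
    heading: str
    body: list[str] = field(default_factory=list)


def parse_style(label: str) -> tuple[str, float]:
    """Split a label like 'SourceSansPro-Bold_14' into (font name, size).

    Assumes the size follows the last underscore; raises ValueError otherwise.
    """
    font, _, size = label.rpartition("_")
    return font, float(size)


def chunk_by_heading(
    paragraphs: list[tuple[str, str]], body_size: float
) -> list[Chunk]:
    """Start a new chunk whenever a paragraph is set larger than body_size."""
    chunks: list[Chunk] = []
    for label, text in paragraphs:
        _, size = parse_style(label)
        if size > body_size:
            # Larger type is treated as a heading (a heuristic, not a rule).
            chunks.append(Chunk(heading=text))
            continue
        if not chunks:
            # Body text that appears before any heading gets an empty heading.
            chunks.append(Chunk(heading=""))
        chunks[-1].body.append(text)
    return chunks


# Hypothetical input: (style label, paragraph text) pairs.
sample = [
    ("SourceSansPro-Bold_14", "Introduction"),
    ("SourceSansPro-Regular_10", "First paragraph."),
    ("SourceSansPro-Regular_10", "Second paragraph."),
    ("SourceSansPro-Bold_14", "Methods"),
    ("SourceSansPro-Regular_10", "Third paragraph."),
]
chunks = chunk_by_heading(sample, body_size=10)
for chunk in chunks:
    print(chunk.heading, len(chunk.body))
```

Font size alone is a rough heading signal; a real pipeline would probably also consider font weight and the actual structure of the extraction output.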
Reviews (1)
2025-11-03, in version 1.0.0
Useful tool for getting metadata for a file, making the job easy. This came just in time.