Available Actions
OCR Capabilities (Tesseract 5)
TEXTractor supports text extraction from scanned PDFs and from the following image formats: bmp, gif, jpeg, pbm, png, tiff, webp.
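As an illustration only, a caller could pre-check a file against the formats listed above before submitting it. The sketch below is not part of TEXTractor: the names SUPPORTED_EXTENSIONS and is_supported are hypothetical, and matching on the file extension (rather than inspecting file contents) is an assumption made for brevity.

```python
from pathlib import Path

# Extensions taken from the list above, plus "pdf" for scanned PDFs.
# Hypothetical helper: TEXTractor's own validation may work differently.
SUPPORTED_EXTENSIONS = {"pdf", "bmp", "gif", "jpeg", "pbm", "png", "tiff", "webp"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension matches one of the listed formats."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS
```

Common aliases such as "jpg" or "tif" are deliberately left out because the list above does not mention them.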
English is the default language, but you can choose any of the languages supported by Tesseract (https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html).
Trained language data files are automatically fetched from the GitHub tessdata_fast repository (https://github.com/tesseract-ocr/tessdata_fast) and cached in the front-end temp directory.
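The fetch-and-cache behavior described above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not TEXTractor's actual implementation: the raw-download URL pattern (including the "main" branch name), the cache directory name "tessdata_cache", and the function names are all hypothetical.

```python
import tempfile
import urllib.request
from pathlib import Path

# Assumed URL pattern for raw files in the tessdata_fast repository.
TESSDATA_FAST_URL = (
    "https://github.com/tesseract-ocr/tessdata_fast/raw/main/{lang}.traineddata"
)


def traineddata_url(lang: str) -> str:
    """Build the assumed download URL for a language code such as 'eng'."""
    return TESSDATA_FAST_URL.format(lang=lang)


def http_download(url: str) -> bytes:
    """Default downloader: fetch the file contents over HTTPS."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def cached_traineddata(lang: str, cache_dir=None, download=http_download) -> Path:
    """Return the local path of a language file, downloading it only once."""
    # Hypothetical cache location inside the system temp directory.
    base = Path(cache_dir) if cache_dir else Path(tempfile.gettempdir()) / "tessdata_cache"
    base.mkdir(parents=True, exist_ok=True)
    target = base / f"{lang}.traineddata"
    if not target.exists():
        # Cache miss: download and store the file for later calls.
        target.write_bytes(download(traineddata_url(lang)))
    return target
```

Calling cached_traineddata("deu") twice would hit the network only on the first call. The download parameter is injectable so the caching logic can be exercised without network access.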
Security & Privacy
All processing is performed entirely in-memory within the server context. No data is persisted, and no data ever leaves your environment.
Available actions: