Stable version 1.0.0 (Compatible with OutSystems 11)

Uploaded on 24 April 2019

5.0 (2 ratings)

Details

Optical character recognition or optical character reader, often abbreviated as OCR, is the mechanical or electronic conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards in a landscape photo) or subtitle text superimposed on an image (for example, from a television broadcast).

Optical character recognition is now the preferred way to digitize documents, replacing manual entry of document metadata. The OCR engine identifies the text in the documents fed into a document management system and lets you work with the plain text without reading the documents yourself.

For JavaScript, a popular solution is the Tesseract.js project, a pure JavaScript port of the popular Tesseract OCR engine. The library supports over 60 languages, automatic text orientation and script detection, and a simple interface for reading paragraph, word, and character bounding boxes. Tesseract.js runs both in a browser and on a server with Node.js, which makes it available on many platforms.
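As a rough illustration of what word-level bounding boxes make possible, the sketch below filters recognized words by confidence and measures their boxes. The `OcrWord` and `BoundingBox` shapes, the helper names, and the sample data are all assumptions made up for this example; they are only loosely modeled on the kind of output described above and are not taken from the component or from the Tesseract.js API.

```typescript
// Hypothetical shapes for illustration only; real field names depend on
// the OCR library and version in use.
interface BoundingBox {
  x0: number; // left edge, in pixels
  y0: number; // top edge, in pixels
  x1: number; // right edge, in pixels
  y1: number; // bottom edge, in pixels
}

interface OcrWord {
  text: string;
  confidence: number; // assumed to be on a 0-100 scale
  bbox: BoundingBox;
}

// Keep only the words whose confidence meets the given threshold.
function confidentWords(words: OcrWord[], minConfidence: number): OcrWord[] {
  return words.filter((w) => w.confidence >= minConfidence);
}

// Width and height of a bounding box, in pixels.
function boxSize(b: BoundingBox): { width: number; height: number } {
  return { width: b.x1 - b.x0, height: b.y1 - b.y0 };
}

// Made-up sample data standing in for a recognition result.
const sampleWords: OcrWord[] = [
  { text: "Invoice", confidence: 96, bbox: { x0: 10, y0: 20, x1: 110, y1: 50 } },
  { text: "N0.", confidence: 41, bbox: { x0: 120, y0: 20, x1: 160, y1: 50 } },
  { text: "1234", confidence: 90, bbox: { x0: 170, y0: 20, x1: 230, y1: 50 } },
];

// Print each sufficiently confident word together with its box size.
for (const word of confidentWords(sampleWords, 80)) {
  const size = boxSize(word.bbox);
  console.log(`${word.text}: ${size.width}x${size.height}`);
}
```

In practice the word list would come from the recognition result rather than from hand-written data, and the threshold of 80 is an arbitrary choice for this sketch.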
