[Simple OCR] How to fix the error "Could not initialize tesseract" in Simple OCR
Question
simple-ocr
Forge component by Takasi Moriya

Hi, 

I'm working with the Forge component "Simple OCR": https://www.outsystems.com/forge/component-overview/3086/simple-ocr

I am having problems converting the text in an image.

Note: In my personal environment it works perfectly, but in another environment I get this error message: "Could not initialize tesseract."

Any idea what I can do to fix it?


It seems that the trained data file could not be read.

Does the following sample component work in your environment?
https://www.outsystems.com/forge/component-overview/3500/simple-ocr-sample

Specifying the DataPath parameter with the exact directory path may solve your problem.
You might need Forge's FileSystem component to build the exact directory path.
https://www.outsystems.com/forge/component-overview/68/filesystem

I hope it helps you.

Hi, @Takasi Moriya 

Thanks for your attention.

1. Yes, I followed the Simple OCR Sample in both environments where I used the component.

2. You mentioned that specifying the DataPath parameter with the exact directory path may solve my problem. Could you clarify which directory you mean? And when you say the trained data file could not be read, does that mean I need some configuration on these resources?

Regards, Jessica Marques.

I have no experience with the OCR component, but why does your resource have a \ in its name, while the path uses / (as it should)?

Hi, these screenshots are from the sample built by Takashi, and I added a new language in the same way as the resources that were already there (jpn and eng). The resource has the same name as the file I downloaded from this page: https://tesseract-ocr.github.io/tessdoc/Data-Files

Regards, 

Jessica. 

The ExtractTextFromMemImage action of the SimpleOCR extension takes three parameters.
DataPath is the third one.
You can specify the path with an expression like the one below.

Path_GetApplicationDirectory.ApplicationDirectory + "\tessdata" 

Path_GetApplicationDirectory is an action of FileSystem.
You may find the cause by checking whether the data file exists, using the File_Exists action of FileSystem.
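
For anyone who wants to test the same idea outside OutSystems, here is a minimal Python sketch of that check. It is not part of Simple OCR or FileSystem; the function name and the example directory are made up for illustration. The only thing it assumes is that Tesseract looks for files named <lang>.traineddata inside the tessdata folder.

```python
from pathlib import Path


def find_missing_traineddata(app_dir, languages, subdir="tessdata"):
    """Return the expected .traineddata paths that do not exist on disk.

    app_dir plays the role of the application directory, and subdir is the
    folder appended to it (as in the "\\tessdata" expression above).
    """
    data_dir = Path(app_dir) / subdir
    missing = []
    for lang in languages:
        candidate = data_dir / f"{lang}.traineddata"
        if not candidate.is_file():
            missing.append(candidate)
    return missing


if __name__ == "__main__":
    # "." is a placeholder; use the real application directory here.
    for path in find_missing_traineddata(".", ["eng", "jpn"]):
        print(f"missing: {path}")
```

If the list comes back empty, every language file was found at the path you pass as DataPath, and the cause is probably somewhere else.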

When I download the jpn file, I get 43 MB, but your file is only 2.4...

Could it be a corrupt file?
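
A quick way to compare sizes is sketched below in Python. The helper name and the tolerance are hypothetical, and the expected size is something you supply yourself (for example, the size you see when downloading the file again). Keep in mind that a size mismatch alone does not prove corruption, because the different Tesseract data sets ship files of different sizes for the same language.

```python
from pathlib import Path


def looks_truncated(path, expected_bytes, tolerance=0.05):
    """Return True if the file is noticeably smaller than expected.

    expected_bytes is supplied by the caller; tolerance is the fraction
    by which the file may be smaller before it is flagged.
    """
    actual = Path(path).stat().st_size
    return actual < expected_bytes * (1 - tolerance)


if __name__ == "__main__":
    # Placeholder path and size; replace both with real values.
    target = Path("tessdata") / "jpn.traineddata"
    if target.is_file():
        print(looks_truncated(target, expected_bytes=43 * 1024 * 1024))
```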

Hi @Takasi Moriya 

Now it works!

Thanks a lot for the help.

Best Regards, 

Jessica. 
