(Default) OCR in SDK, .dat files in Tesseract folder -> .traineddata SOLVED
Hi Support,
It seems that at some point in the last few releases, the extension of the language files for the Default OCR engine (the one available in the SDK) changed from ".dat" to ".traineddata".
Can you confirm that this is the only change (related to using the default OCR from the SDK)?
Are the same files used for both x86 and x64 (previously, with .dat, this was the case)?
P.S.
For new readers, this is a kind of add-on to this topic: viewtopic.php?t=33535
-žarko
Forum rules
DO NOT post your license/serial key, or your activation code - these forums, and all posts within, are public and we will be forced to immediately deactivate your license.
When experiencing some errors, use the IAUX_Inst::FormatHRESULT method to see their description and include it in your post along with the error code.
Re: (Default) OCR in SDK, .dat files in Tesseract folder -> .traineddata
Hello zarkogajic,
I've passed your enquiry to our devs working on the OCR engines, and we will post a further update here as soon as it's available!
Kind regards,
Stefan
Re: (Default) OCR in SDK, .dat files in Tesseract folder -> .traineddata
It occurred because we updated the Tesseract engine. The new Tesseract uses *.traineddata files instead of the older *.dat files. And it seems the two formats are incompatible, unfortunately. And yes, both x86 and x64 use the same language files.
Vasyl Yaremyn
Tracker Software Products
Project Developer
Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.
Re: (Default) OCR in SDK, .dat files in Tesseract folder -> .traineddata
Hi Vasyl,
Thanks.
Btw, what do you mean by this?
"And seems both formats are incompatible, unfortunately."
I've simply renamed the file extension and all seems to work.
-žarko
Re: (Default) OCR in SDK, .dat files in Tesseract folder -> .traineddata
Hello zarkogajic,
I will ask Vasyl to clarify; however, if it works for you with just renaming the files, that's great!
Kind regards,
Stefan
Re: (Default) OCR in SDK, .dat files in Tesseract folder -> .traineddata
I will ask my colleagues about your trick. I'm still not sure it is correct to just rename the old language files...
Vasyl Yaremyn
Tracker Software Products
Project Developer
Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.
Re: (Default) OCR in SDK, .dat files in Tesseract folder -> .traineddata
Our dev said that the Tesseract modules were upgraded from v4 to the newer v5, and the newer Tesseract uses a different format for language files. Technically, the container is the same, but the data inside might be different. So it looks like Tesseract v5 is able to open and read those files, but there is a chance that some necessary data might be absent...
Vasyl Yaremyn
Tracker Software Products
Project Developer
Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.
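For readers who want to check which language files a deployment still carries, here is a minimal Python sketch. It is not part of the SDK and is not an officially recommended procedure: the function names are hypothetical, the folder path is whatever your own Tesseract language folder happens to be, and the copy step only mirrors the extension rename described in this thread. As noted above, files renamed this way may open in Tesseract v5 but could still lack data it expects, so the originals are kept and existing .traineddata files are never overwritten.

```python
from pathlib import Path
import shutil


def inventory_lang_files(folder):
    """Group language files in `folder` by extension (.dat vs .traineddata)."""
    folder = Path(folder)
    found = {".dat": [], ".traineddata": []}
    for path in sorted(folder.iterdir()):
        suffix = path.suffix.lower()
        if path.is_file() and suffix in found:
            found[suffix].append(path.name)
    return found


def copy_dat_as_traineddata(folder):
    """Copy each *.dat file to a *.traineddata sibling, keeping the original.

    Existing *.traineddata files are never overwritten.
    Returns the names of the copies that were created.
    """
    folder = Path(folder)
    created = []
    for src in sorted(folder.glob("*.dat")):
        dst = src.with_suffix(".traineddata")
        if not dst.exists():
            shutil.copy2(src, dst)  # copy, so the old file stays available
            created.append(dst.name)
    return created
```

Calling `inventory_lang_files` first shows whether any old .dat files remain. Copying rather than renaming keeps a fallback in case OCR results with the converted files turn out to be worse than with language files built for the newer engine.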
Re: (Default) OCR in SDK, .dat files in Tesseract folder -> .traineddata SOLVED
Hi Vasyl,
Clear, thanks. Case closed.
-žarko