PDF-X OCR SDK is a new product from us, intended to complement our existing PDF and imaging tools and to provide developers with an expanding set of professional tools for Optical Character Recognition tasks.
datapath = Application.StartupPath & "\ocrdats"
Dim region As PXO_InputField
region.blackList = VarPtrAny(wlist)
region.whiteList = VarPtrAny(blist)
region.top = 110
region.bottom = 200
region.nPage = 1
region.left = 106
region.right = 209
Dim options As new PXO_Options
options.blackList = VarPtrAny(wlist)
options.whiteList = VarPtrAny(blist)
options.ImageFlags = OcrCommon.OCR_ImageProcessingFlags.OCR_Image_NoRotate '& OcrCommon.OCR_ImageProcessingFlags.OCR_Image_SuppressOutput 'SuppressOutput only saves a hidden text pdf (will combine with original image pdf). Also turned off rotate (is actually deskew) so text positions match up with image when it is merged back together in a separate function.
options.lang = 0 '0 = defaults to english, not currently supporting other languages
options.raster_dpi = 300 'less than 300 and OCR suffers
options.RegionMode = OcrCommon.OCR_RegionMode.OCR_Auto
options.DataPath = VarPtrAny(datapath)
options.accMode = 0
'9/21/12 allows specifying a page number to ocr. Is 0 based
Dim pg As Integer 'pg will either be a pointer to newpagelist or 0
pg = 0
'9/21/12 end page number
Dim textout As String = ""
textout = VarPtrAny(textout)
'res = OCR_MakeSearchable(doc, options, pg) 'runs the actual OCR whole page ocr working
res = OCR_GetField(doc, options, textout, region, 1)
'res = OCR_GetText(doc, options, textout, pg, )
'SysFreeString(textout)
OCR_SaveA(doc, "OCRFinal.pdf")
As you can see, I've made two or three different attempts; see the commented-out "res =" lines.
It is unnecessary to append '\ocrdats' to the data directory; this is hardcoded into the library, so that if you specify "c:\myproject", OCR will look for language files in "c:\myproject\ocrdats". In your case, it will be looking for "(Applicationpath)\ocrdats\ocrdats\".
I realize this is an unnecessary restriction, but since we released with this behaviour in place we haven't seen fit to remove it.
It'll be changed in the new SDK versions coming out in 2013, so that the path provided will be exactly the path to search for language files.
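To illustrate the path behaviour described above, here is a minimal VB.NET sketch. It assumes the library simply appends "ocrdats" to whatever directory it is given; ResolveLanguageDir is a hypothetical helper written for this illustration and is not part of the SDK.

```vbnet
Imports System

Module OcrPathSketch
    ' Hypothetical helper (not an SDK function): mimics the behaviour described
    ' above, where "ocrdats" is appended to the directory the caller supplies.
    Function ResolveLanguageDir(dataPath As String) As String
        Return dataPath.TrimEnd("\"c) & "\ocrdats"
    End Function

    Sub Main()
        ' Passing the project directory: the language files are found.
        Console.WriteLine(ResolveLanguageDir("c:\myproject"))
        ' Passing a path that already ends in \ocrdats: the folder is doubled.
        Console.WriteLine(ResolveLanguageDir("c:\myproject\ocrdats"))
    End Sub
End Module
```

Under that assumption, setting datapath to Application.StartupPath alone (without the "\ocrdats" suffix) should make the library look in the intended folder.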
Here's a quick and dirty C# example which does include both a call to OCR_MakeSearchable() and OCR_GetFields().
I realize you posted VB code, but you mentioned both C# and VB and at the moment (under a big deadline) it is most expedient to give this particular example.
It might help if you could provide the PDF you are trying to OCR, or one that at least reproduces the problems you are having. Is the error code returned from OCR_GetFields() or is it from OCR_MakeSearchable()? Remember coordinates are in PDF points, and specifying "1" for the flags (last argument) means the top left corner is the origin, with the positive Y axis pointing down.
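If the region values were measured some other way, they may need converting before being passed in. The sketch below is only an illustration with hypothetical helper names (PixelsToPoints and FlipY are not SDK functions). It assumes the values were either measured in pixels on an image rasterized at a known DPI, or measured from the bottom edge of the page as in the default PDF coordinate system.

```vbnet
Imports System

Module OcrRegionSketch
    ' A PDF point is 1/72 inch, so a distance of N pixels on an image
    ' rasterized at the given DPI is N * 72 / dpi points.
    Function PixelsToPoints(pixels As Double, dpi As Double) As Double
        Return pixels * 72.0 / dpi
    End Function

    ' Converts a Y value measured up from the bottom edge of the page into one
    ' measured down from the top edge (top-left origin, Y axis pointing down).
    Function FlipY(yFromBottom As Double, pageHeightPts As Double) As Double
        Return pageHeightPts - yFromBottom
    End Function

    Sub Main()
        ' 300 pixels at 300 DPI is one inch, i.e. 72 points.
        Console.WriteLine(PixelsToPoints(300, 300))
        ' On a US Letter page (792 points tall), a Y of 682 measured from the
        ' bottom is 110 points down from the top.
        Console.WriteLine(FlipY(682, 792))
    End Sub
End Module
```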
If you still have problems, perhaps strip your code down to what is relevant and send it to us along with a specific PDF, and I am sure Walter can apply a little 'magic'.
cheers