Hello,
We have some questions about the OCR functions in C#. Are there functions to explore a PDF, in particular to:
• Identify the objects in the PDF
• Identify their type (here, an image). What other types exist?
• Get information about an image: its size? Its coordinates? In which unit? What other information is available?
Thank you,
Best regards.
OCR function
-
- Site Admin
- Posts: 17960
- Joined: Mon Jan 12, 2009 8:07 am
- Location: London
Re: OCR function
Hello AUdros,
Please take a look at what our OCR module has to offer here:
https://help.pdf-xchange.com/pdfxp ... odule.html
It will allow you to perform the actual OCR process on the pages. If you want to gather information about the elements and structure of the pages, please use the Core API SDK (also part of the PRO SDK bundle, which you need for the OCR SDK).
Regards,
Stefan
-
- User
- Posts: 77
- Joined: Fri Jun 08, 2018 1:39 pm
Re: OCR function
Hello,
I was able to use the OCR API to retrieve a symbol from the text. My goal is to draw a frame around a chosen symbol, but I cannot find the symbol's coordinates. My code is quoted below.
Is there also a way to compare images?
Thank you
/////////////////////////////////////////////////////////////////////
IntPtr pg;
PDFXOCR.PDFXOCR_Funcs.OCR_RasterPageSettings pRasterSettings;
PDFXOCR.PDFXOCR_Funcs.OCRp_Page(pdf, 0, ref options, out pg, out pRasterSettings);
uint pRegionCount;
PDFXOCR.PDFXOCR_Funcs.OCRp_RegionCountFromPage(pg, out pRegionCount);
// '<' rather than '<=': region indices are assumed to be zero-based
for (uint i = 0; i < pRegionCount; i++) {
    IntPtr pRegionResults;
    PDFXOCR.PDFXOCR_Funcs.OCRp_GetRegionFromPage(pg, i, out pRegionResults);
    uint pSymbolCount;
    PDFXOCR.PDFXOCR_Funcs.OCRp_SymbolCountFromRegion(pRegionResults, out pSymbolCount);
    if (pSymbolCount > 0) {
        // Same here: symbol indices are assumed to be zero-based
        for (uint j = 0; j < pSymbolCount; j++) {
            PDFXOCR.PDFXOCR_Funcs.OCR_SymbolBox pSymbolBox;
            PDFXOCR.PDFXOCR_Funcs.OCRp_GetSymbolFromRegion(pRegionResults, j, out pSymbolBox);
            Console.Write(pSymbolBox.wcSymbol);
            if (pSymbolBox.wcSymbol == "z")
            {
                uint nFreeText = ((PDFXEdit.IPXS_Inst)pdfCtl.Inst.GetExtension("PXS")).StrToAtom("FreeText");
                PDFXEdit.PXC_Rect rc ;
                rc.left =pdfCtl.Width - pSymbolBox.rcBound.left ;//I can't get exact coordinate!!
                rc.right = rc.left + (-pSymbolBox.rcBound.right + pSymbolBox.rcBound.left);
                rc.top = 800;// pSymbolBox.rcBound.top;
                rc.bottom = rc.top - (pSymbolBox.rcBound.bottom - pSymbolBox.rcBound.top);// pSymbolBox.rcBound.bottom;
                IPXC_Annotation pAnnotF = pdfCtl.Doc.CoreDoc.Pages[0].InsertNewAnnot(nFreeText, ref rc, 0);
                PDFXEdit.IPXC_AnnotData_FreeText SQDataF = (PDFXEdit.IPXC_AnnotData_FreeText)pAnnotF.Data;
                SQDataF.Opacity = 0.7;
                SQDataF.DefaultFontSize = pSymbolBox.pointsize;
                SQDataF.DefaultTextAlign = (int)PDFXEdit.UIX_AlignFlags.UIX_Align_Center;
                var borderF = new PDFXEdit.PXC_AnnotBorder();
                borderF.nWidth = 2f;
                borderF.nStyle = PDFXEdit.PXC_AnnotBorderStyle.ABS_Solid;
                SQDataF.set_Border(borderF);
                pAnnotF.Data = SQDataF;
                // Add the annotation through the Editor's "op.annots.addNew" operation
                int nID = pdfCtl.Inst.Str2ID("op.annots.addNew", false);
                PDFXEdit.IOperation pOp = pdfCtl.Inst.CreateOp(nID);
                PDFXEdit.ICabNode input = pOp.Params.Root["Input"];
                input.Add().v = pAnnotF;
                pOp.Do();
            }
        }
    }
}
-
- User
- Posts: 5522
- Joined: Fri Nov 21, 2014 8:27 am
Re: OCR function
Hello Audros,
Are you using the Editor SDK? Because this code is not correct at all:
Code:
PDFXEdit.PXC_Rect rc ;
rc.left =pdfCtl.Width - pSymbolBox.rcBound.left ;//I can't get exact coordinate!!
rc.right = rc.left + (-pSymbolBox.rcBound.right + pSymbolBox.rcBound.left);
rc.top = 800;// pSymbolBox.rcBound.top;
rc.bottom = rc.top - (pSymbolBox.rcBound.bottom - pSymbolBox.rcBound.top);// pSymbolBox.rcBound.bottom;
Cheers,
Alex
Subscribe at:
https://www.youtube.com/channel/UC-TwAMNi1haxJ1FX3LvB4CQ
-
- User
- Posts: 77
- Joined: Fri Jun 08, 2018 1:39 pm
Re: OCR function
Hello,
Yes, I am using the Editor SDK. How can I get the coordinates? We have the OCR SDK too. Thanks.
Best regards
-
- User
- Posts: 5522
- Joined: Fri Nov 21, 2014 8:27 am
Re: OCR function
Hello Audros,
Then why don't you use the https://sdkhelp.pdf-xchange.com/vi ... t_OCRPages operation to OCR the document, instead of the entire OCR SDK?
Then you will have the text content items, which you can modify as you wish. Basically, you get the IPXC_PageText from the IPXC_Page interface, and that gives you access to all of the symbol coordinates that you need.
Cheers,
Alex
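For readers who still want to place OCR SDK symbol boxes on the page themselves, the usual stumbling block is the change of coordinate system. What follows is only a minimal sketch under stated assumptions, not the SDK's own method: it assumes the symbol box is given in raster pixels with the origin at the top-left corner, that the raster resolution (DPI) is known, and that the page is unrotated with its origin at the bottom-left corner and measured in points (1/72 inch). PixelBox, PageRect and OcrCoords are hypothetical helper names and are not part of the PDF-XChange SDKs.

```csharp
using System;

// Hypothetical helper types for illustration; not part of any PDF-XChange SDK.
public struct PixelBox
{
    public int Left, Top, Right, Bottom;   // raster pixels, origin assumed top-left
    public PixelBox(int left, int top, int right, int bottom)
    {
        Left = left; Top = top; Right = right; Bottom = bottom;
    }
}

public struct PageRect
{
    public double Left, Top, Right, Bottom; // points, origin assumed bottom-left
    public PageRect(double left, double top, double right, double bottom)
    {
        Left = left; Top = top; Right = right; Bottom = bottom;
    }
}

public static class OcrCoords
{
    public const double PointsPerInch = 72.0;

    // Converts a pixel box to page points and flips the Y axis.
    // Assumes an unrotated page whose lower-left corner is (0, 0).
    public static PageRect PixelsToPoints(PixelBox box, double dpi, double pageHeightPt)
    {
        if (dpi <= 0)
            throw new ArgumentOutOfRangeException(nameof(dpi));

        double left = box.Left * PointsPerInch / dpi;
        double right = box.Right * PointsPerInch / dpi;
        // Pixel Y grows downwards, page Y grows upwards, so subtract from the page height.
        double top = pageHeightPt - box.Top * PointsPerInch / dpi;
        double bottom = pageHeightPt - box.Bottom * PointsPerInch / dpi;
        return new PageRect(left, top, right, bottom);
    }
}

public static class Program
{
    public static void Main()
    {
        // Example values: a page 842 pt high, rasterized at 300 DPI.
        PageRect rc = OcrCoords.PixelsToPoints(new PixelBox(300, 150, 600, 450), 300.0, 842.0);
        Console.WriteLine($"{rc.Left} {rc.Top} {rc.Right} {rc.Bottom}");
    }
}
```

If the page is rotated or its visible box does not start at (0, 0), the offsets and rotation would have to be applied as well. Getting the coordinates from IPXC_PageText, as suggested above, avoids this conversion altogether.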