Walter,
I'm using PXC_TextOutA in the DLL Lib to place the OCR text using the OCR word coordinates I receive. However, my text placement doesn't turn out as smooth as yours. Is it possible to share your algorithm for placing the OCR words into a PDF document so that they line up behind the image in a smooth way? That is, so the selected text appears to be all in one smooth line, instead of a jagged line like my placement gives.
--Jeff
Text Placement Algorithm
-
- User
- Posts: 381
- Joined: Mon Jun 13, 2011 5:10 pm
Re: Text Placement Algorithm
We vertically align text to a common baseline.
Hope that helps.
-Walter
Re: Text Placement Algorithm
How do you establish your common baseline among all words in a line?
Re: Text Placement Algorithm
Hi Jeff,
The baseline can be taken straight from the OCR layout analysis, or you can calculate it yourself from the symbol or word coordinates using a simple average.
Using the baseline calculated by OCR, you can place symbols the same way we do, with the PXC library function:
Code:
PDFXCLIB_API HRESULT PXC_API PXC_TextOutExW(_PXCContent* content, LPCPXC_PointF origin, LPCWSTR lpwszText, LONG cbLen, const double* lpDX);
Omitting OCR details, error checking, etc., the code looks something like:
Code:
WCHAR* symbolstring;
double* symbolDXList; // array of X-axis offsets, in pts, of each character from the first character.
int NumSymbols;
PXC_PointF ptOrigin; // this is the origin of the first character; the Y coordinate is obviously the baseline, X is the start of the string of characters
GetResults(...); // fill symbolstring, symbolDXList, ptOrigin, and NumSymbols - ie, place OCR results in a couple of arrays
_PXCContent* pContent = (_PXCContent*)pWritePage; // pWritePage is _PXCPage type
PXC_TextOptions topt;
memset(&topt, 0, sizeof(topt));
topt.fontSize = xx; // size in pts
topt.fontID = output_font_id; // output font ID added to pdf already
topt.nTextPosition = TextPosition_Baseline;
PXC_TextRenderingMode rmode, oldmode; // oldmode receives the previous rendering mode
rmode = TextRenderingMode_None;
PXC_SetTextOptions(pContent, &topt);
PXC_SetTextRMode(pContent, rmode, &oldmode);
oldcolour = PXC_SetFillColor(pContent, RGB(0,0,0)); // oldcolour (declaration omitted) keeps the previous fill colour
PXC_TextOutExW(pContent, &ptOrigin, symbolstring, NumSymbols, symbolDXList);
PXC_SetTextRMode(pContent, oldmode, NULL); // revert to old mode - optional
PXC_SetFillColor(pContent, oldcolour); // revert to old colour - optional
PXC_SetTextOptions(pContent, &told); // revert to old options - optional
Re: Text Placement Algorithm
As always, thanks much!
- Will - Tracker Supp
- Site Admin
- Posts: 6815
- Joined: Mon Oct 15, 2012 9:21 pm
- Location: London, UK
Re: Text Placement Algorithm
Hi Jeff,
I'll pass the message along to Walter.
Cheers,
Will
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.
Best regards
Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com