Hi
I have a portable version of PDF-XChange Viewer (latest) running under Windows 8. Using the OCR function I am able to make a PDF searchable. Is there a way, however, of accessing the text overlay, e.g. as a .txt document or in any other format?
Thanks
OCR - any way of accessing the text overlay as a .txt doc?
-
- User
- Posts: 5
- Joined: Sun Aug 22, 2010 7:35 am
-
- User
- Posts: 381
- Joined: Mon Jun 13, 2011 5:10 pm
Re: OCR - any way of accessing the text overlay as a .txt do
OCR text is essentially the same as visible text, except that it is not rendered. You can extract text by selecting it with the mouse and copying and pasting, or you can use the Viewer's JavaScript support. I have attached a simple script that extracts the text from the current page and writes it to a text file.
Simply press "Ctrl-J" within the Viewer to bring up the JavaScript console, and paste in the contents of the attached script (a JavaScript file compressed with 7-Zip). Press the run button and it will prompt you for an output filename to save the plain-text results to. You can modify the script as you see fit, for example to save to a text file without user intervention.
Our Viewer replicates much of the functionality of the Adobe JavaScript API, so you can check their reference manual for information on usage:
http://www.adobe.com/devnet/acrobat/pdf ... erence.pdf
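For anyone who cannot open the attachment, here is a minimal sketch of what a text-extraction script along these lines might look like. It is not the attached script. It assumes the Viewer supports `getPageNumWords`, `getPageNthWord`, `numPages` and `pageNum` as described in the Adobe JavaScript API reference, which has not been checked against the Viewer here. The function names `extractPageText` and `extractAllText` are made up for illustration, and writing the result to a file is left out because the method the attached script uses is not shown in this thread.

```javascript
// Sketch only: collect the text of one page, word by word.
// `doc` is the document object (`this` in the console); `pageNum` is zero-based.
function extractPageText(doc, pageNum) {
  var count = doc.getPageNumWords(pageNum);
  var parts = [];
  for (var i = 0; i < count; i++) {
    // Third argument (bStrip) = false keeps the punctuation and
    // whitespace that follow each word, so the pieces can be joined as-is.
    parts.push(doc.getPageNthWord(pageNum, i, false));
  }
  return parts.join("");
}

// Sketch only: collect the text of every page, one page per line.
function extractAllText(doc) {
  var pages = [];
  for (var p = 0; p < doc.numPages; p++) {
    pages.push(extractPageText(doc, p));
  }
  return pages.join("\n");
}

// In the Viewer's JavaScript console (Ctrl-J) you would then run, for example:
//   console.println(extractPageText(this, this.pageNum));
```

Passing the document in as a parameter, rather than using `this` inside the functions, makes it possible to try the logic outside the Viewer with a stand-in object.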
-
- User
- Posts: 5
- Joined: Sun Aug 22, 2010 7:35 am
Re: OCR - any way of accessing the text overlay as a .txt do
Great, thanks Walter! I appreciate the quick feedback.
occam
-
- Site Admin
- Posts: 6815
- Joined: Mon Oct 15, 2012 9:21 pm
- Location: London, UK
Re: OCR - any way of accessing the text overlay as a .txt do
If posting files to this forum, you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded.
Thank you.
Best regards
Will Travaglini
Tracker Support (Europe)
Tracker Software Products Ltd.
http://www.tracker-software.com