Greek text not recognized in search box

The PDF-XChange Viewer for End Users
+++ FREE +++


nicospanas
User
Posts: 2
Joined: Mon Nov 20, 2017 2:08 pm

Post by nicospanas »

Hi, I have a PDF file that contains both Greek and English text, but mostly Greek. It displays correctly, but when I search from the search box, Greek characters are not recognized. If I copy and paste some of the Greek text into the search box, it appears as gibberish (e.g. Eíäåßîåéò/Áíôåíäåßîåéò:). How can I tackle this problem?

PDF can be found here: http://www.isathens.gr/images/documents/nat_form.pdf
Tracker Supp-Stefan
Site Admin
Posts: 17892
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Greek text not recognized in search box

Post by Tracker Supp-Stefan »

Hello nicospanas,

Welcome to our forums and thanks for the enquiry.

Text in a PDF file is a fairly complex structure. Some of its components ensure that the text is displayed correctly, while a separate piece of information allows the text to be extracted correctly. That extraction information can be omitted when a minimal file size is required, or when text extraction is to be prevented. Unfortunately, there is little we can do, as the problem is with the file and not at our end. Adobe is also unable to copy text from this file correctly:
[Attached screenshot: text_extraction.png]
You can try to OCR those pages and see if you can select the text afterwards, or you can contact the document producer, describe the trouble you are experiencing, and ask whether they can provide another version with all text information included.
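
For what it is worth, the sample quoted in the first post (Eíäåßîåéò/Áíôåíäåßîåéò:) looks like Greek bytes from the Windows-1253 code page being read as Latin-1, which is what one would expect when the extraction information is missing and the viewer falls back to a Western encoding. That is only a guess based on that one sample, not something confirmed for the whole file. A minimal Python sketch of the round trip, with hypothetical helper names recover_greek and to_search_form:

```python
import unicodedata


def recover_greek(garbled: str) -> str:
    """Turn text copied from the PDF back into Greek.

    Assumption: the bytes are Windows-1253 (Greek) but were decoded as Latin-1.
    """
    # Normalise first so every accented letter is a single code point
    # that Latin-1 can encode.
    composed = unicodedata.normalize("NFC", garbled)
    # Recover the raw bytes, then read them with the Greek code page.
    return composed.encode("latin-1").decode("cp1253")


def to_search_form(greek: str) -> str:
    """Inverse direction: turn a Greek search term into the form the viewer sees."""
    composed = unicodedata.normalize("NFC", greek)
    return composed.encode("cp1253").decode("latin-1")


sample = "Eíäåßîåéò/Áíôåíäåßîåéò:"
recovered = recover_greek(sample)
print(recovered)
```

With the sample above this prints "Eνδείξεις/Αντενδείξεις:" (the leading "E" is a Latin letter in the copied text, so it stays Latin). If the assumption holds for the rest of the document, pasting the output of to_search_form into the search box may find matches, but that is untested against this file and no substitute for a corrected PDF.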

Regards,
Stefan
nicospanas
User
Posts: 2
Joined: Mon Nov 20, 2017 2:08 pm

Re: Greek text not recognized in search box

Post by nicospanas »

Thank you for your reply. I have tried OCR, but it didn't work. I will see if I can find the producer. Cheers
Tracker Supp-Stefan
Site Admin
Posts: 17892
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Greek text not recognized in search box

Post by Tracker Supp-Stefan »

Hello nicospanas,

Try selecting the option to rasterize the current content of the file before OCR-ing. That way, once the OCR process completes, the only text layer will be the invisible one added by the OCR tool, and the one you are currently having problems with will not interfere with your selection.
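
If you would rather script the same rasterize-then-OCR idea outside the Viewer, here is a rough sketch that uses the open-source OCRmyPDF command-line tool instead of our OCR plugin. It assumes OCRmyPDF and the Tesseract Greek language data ("ell") are installed. The helper names and the output file name are hypothetical, and the result depends on the OCR quality for this particular scan.

```python
import shutil
import subprocess


def build_force_ocr_command(src, dst, languages=("ell", "eng")):
    """Build an OCRmyPDF command line that rasterizes each page before OCR.

    --force-ocr discards the existing (broken) text layer by rasterizing the
    page content; -l selects the Tesseract languages, joined with "+".
    """
    return ["ocrmypdf", "--force-ocr", "-l", "+".join(languages), src, dst]


def run_force_ocr(src, dst):
    """Run the command, failing early if OCRmyPDF is not on the PATH."""
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf is not installed or not on the PATH")
    subprocess.run(build_force_ocr_command(src, dst), check=True)


# Example call (not executed here):
# run_force_ocr("nat_form.pdf", "nat_form_ocr.pdf")
```

Rasterizing makes the output larger and turns vector text into an image, so keep the original file as well.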

Regards,
Stefan