Ligatures

The PDF-XChange Viewer for End Users
wmm
User
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Ligatures

Post by wmm »

A ligature is a typographical entity where two or three separate characters are connected into one. The most common English example is the sequence "fi", where the top of the "f" is extended to merge with the dot of the "i".

I regularly work with a large document produced by LaTeX with numerous ligatures. Adobe reader handles them transparently. For example, if I search for "first", Adobe reader finds all the occurrences, even if the "fi" is a ligature, and if I copy and paste text containing a ligature, the pasted text has the separate characters.

PDF-Xchange Viewer, on the other hand, treats the ligatures as distinct single characters. If I search for "first", I find only instances where the ligature does not occur (in text set in a monospaced font, for instance, or with an initial capital), and copying and pasting such text inserts either the ligature or an obscure escape sequence, depending on the application into which I'm pasting.

The copy/paste issue isn't very significant; I can work around that fairly easily. The failure of search, however, is a major concern and could prevent me from using PDF-Xchange Viewer as my principal PDF reader. Can this be fixed in a forthcoming release?

(If you need an example, here's a version of the document:
http://www.open-std.org/jtc1/sc22/wg21/ ... /n2606.pdf
)
Bhikkhu Pesala
User
Posts: 1776
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Post by Bhikkhu Pesala »

Foxit Reader suffers from the same problem, but there is a simple workaround: search for "ﬁrst".
Windows 10 Home 64-bit • AMD Ryzen 5 3400G, 8 Gb
Review: http://www.softerviews.org/PDF-XChange.html
wmm
User
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Post by wmm »

Thanks for the suggestion, but unless I misunderstood what you were saying it doesn't work. I assume that what you meant was that I should enclose the string in quotes. I did that; the "find" command finds nothing, and the "search" command finds only the occurrences without the "fi" ligature.
Bhikkhu Pesala
User
Posts: 1776
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Post by Bhikkhu Pesala »

Yes. I mean use the string in quotes, which uses the fi ligature from Alphabetic Presentation forms.

Find next from the toolbar finds the next occurrence. The Search command finds 382 entries. I am using the latest version. Check for updates if you're not.
wmm
User
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Post by wmm »

Bhikkhu Pesala wrote:Yes. I mean use the string in quotes, which uses the fi ligature from Alphabetic Presentation forms.
I'm not sure exactly what you're saying here. What I did was to open the full search pane, type a double-quote followed by the five characters f-i-r-s-t followed by another double-quote, and then hit Search Now. That resulted in 1015 entries, all of them in monospaced font or with initial capital. (The "find" command, with that string, finds nothing.) What did you mean by "Alphabetic Presentation" forms?
Bhikkhu Pesala wrote:Find next from the toolbar finds the next occurrence. The Search command finds 382 entries. I am using the latest version. Check for updates if you're not.
If I copy and paste the word "first" from an occurrence containing the ligature and search for that (with or without quotes), I get 382 entries -- only occurrences with the ligature, none of the ones in the list of monospaced or initial-capital forms.

Adobe reader, when I search for first (typing all five characters), finds (very slowly!) 1385 entries, including both with and without the ligature. (I don't know why that's 12 fewer than the union of the results from the PDF-Xchange Viewer searches, 1015 + 382 = 1397.)

(I'm using the latest version, too (2.0 build 38).)
Bhikkhu Pesala
User
Posts: 1776
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Post by Bhikkhu Pesala »

Build 38 is not even announced yet — it must be very new. I just updated.

The ff, fi, fl, ffi, and ffl ligatures are in the Unicode block called Alphabetic Presentation Forms.
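
To make the point about Alphabetic Presentation Forms concrete, here is a minimal Python sketch (an illustration only, not how any PDF viewer is implemented). It uses the standard `unicodedata` module to show that each of these ligatures is a single code point whose compatibility (NFKC) normalization is the plain letter sequence. The helper name `fold_ligatures` is made up for this example.

```python
import unicodedata

# The five Latin f-ligatures in the Alphabetic Presentation Forms block.
LIGATURES = ["\ufb00", "\ufb01", "\ufb02", "\ufb03", "\ufb04"]


def fold_ligatures(text: str) -> str:
    """Hypothetical helper: expand compatibility characters such as
    ligatures into their plain-letter equivalents via NFKC."""
    return unicodedata.normalize("NFKC", text)


for lig in LIGATURES:
    # Each ligature is one code point, but folds to two or three letters.
    print(f"U+{ord(lig):04X} {unicodedata.name(lig)} -> {fold_ligatures(lig)}")

# A word stored with the fi ligature is four code points, not five.
word = "\ufb01rst"
print(len(word), len(fold_ligatures(word)))
```

Note that NFKC also folds other compatibility characters (full-width letters, for instance), so a real search feature would have to decide how much folding it wants.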

Your document contains a mixture of "fi" ligatures (382) and "f i" as two separate characters (1015 occurrences). The monospaced font uses "f i" while the proportional font uses the ligatures. However, no ligatures are used in, for example, the word "effect", which occurs many times.

Adobe Reader 7.1 finds the total less 12 (1385 occurrences) almost instantly here. I'm not sure why it is missing those 12. It also finds 1385 occurrences if I search for "first" with the ligature, i.e. it ignores the distinction, which is less useful in my opinion, though I can see the other POV too. Joe Bloggs doesn't care whether ligatures are used or not; he just wants to find what he's looking for.

If you use an OpenType font, the text string will be separate letters f·i·r·s·t but any f·i pairs will be replaced with ligatures. That's why spell-check doesn't fail when using OpenType fonts, but it does if you're inserting the Alphabetic Presentation Forms into your document.
wmm
User
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Post by wmm »

Bhikkhu Pesala wrote:Build 38 is not even announced yet — it must be very new. I just updated.
Yes. I just downloaded and started using PDF-Xchange Viewer yesterday for the first time, so when you mentioned that you were using the latest version, I figured that I had it, too. Just to make sure, though, I checked for updates, and surprisingly there was a new one. It was built last night, according to the "about" blurb.
Bhikkhu Pesala wrote:The ff, fi, fl, ffi, and ffl ligatures are in the Unicode block called Alphabetic Presentation Forms.
Ah, thanks.
Bhikkhu Pesala wrote:Your document contains a mixture of "fi" ligatures (382) and "f i" as two separate characters (1015 occurrences). The monospaced font uses "f i" while the proportional font uses the ligatures. However, no ligatures are used in, for example, the word "effect", which occurs many times.

Adobe Reader 7.1 finds the total less 12 (1385 occurrences) almost instantly here. I'm not sure why it is missing those 12. It also finds 1385 occurrences if I search for "first" with the ligature, i.e. it ignores the distinction, which is less useful in my opinion, though I can see the other POV too. Joe Bloggs doesn't care whether ligatures are used or not; he just wants to find what he's looking for.
Just call me "Joe," then :wink: -- I need to reliably find all the places a given term is used (not "first," obviously -- that was just an easy example), from a string I type in.

(Adobe Reader is finding the results "almost instantly" now here, too; I guess it built an index the first time. That time, though, it took several times longer than PDF-Xchange Viewer's search, maybe 10-12 seconds.)
Bhikkhu Pesala wrote:If you use an OpenType font, the text string will be separate letters f·i·r·s·t but any f·i pairs will be replaced with ligatures. That's why spell-check doesn't fail when using OpenType fonts, but it does if you're inserting the Alphabetic Presentation Forms into your document.
Unfortunately, I'm not the author of the document, just a consumer, so I don't control how it's produced. I don't know why "fi" ligatures are used but "ff" ones are not, for instance, and I can't choose which fonts are used.

Thanks for the background. Hopefully the developers will be able to do something relatively soon to handle ligatures more usefully. It's possible to work around the problems using the "advanced search" capabilities, but it's a pain, and I'm afraid I'm going to be misled if I don't happen to notice that what I'm searching for contains a ligature. (I've been working with versions of this document for years with Adobe Reader and never had to worry about ligatures before.)
Ivan - Tracker Software
Site Admin
Posts: 3549
Joined: Thu Jul 08, 2004 10:36 pm
Location: Vancouver Island - Canada

Re: Ligatures

Post by Ivan - Tracker Software »

Support for ligatures will be added in an upcoming build together with improving the text editor for supporting east asian languages, etc.

HTH
Tracker Software (Project Director)

When attaching files to any message - please ensure they are archived and posted as a .ZIP, .RAR or .7z format - or they will not be posted - thanks.
wmm
User
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Post by wmm »

Ivan - Tracker Software wrote:Supporting of ligatures will be added into one of the next build together with improving text editor for supporting east asian languages, etc.
That's great! Thanks so much.
quant
User
Posts: 151
Joined: Fri Jan 18, 2008 2:48 pm

Re: Ligatures

Post by quant »

Hi,

I just want to add my support for this; I didn't know before what was going on.
I often search for "finance" or something like that and then realized that many of the occurrences were missed, so instead I had to search for "nance".

Thanks
Bhikkhu Pesala
User
Posts: 1776
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Post by Bhikkhu Pesala »

I'm glad to hear that this will be fixed sometime.

I suggest that a search for a string containing regular text should find words with regular text and words with ligatures, but a search for a string containing ligatures should find only ligatures.
wmm
User
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Post by wmm »

Bhikkhu Pesala wrote:I'm glad to hear that this will be fixed sometime.

I suggest that a search for a string containing regular text should find words with regular text and words with ligatures, but a search for a string containing ligatures should find only ligatures.
I understand the functionality reason for wanting search to work that way, but I think it would be very confusing to people if searching for a string they copied and pasted found fewer occurrences than if they typed the same string. If they weren't aware of the existence of a ligature in the copied string, it would just seem like a bug.

If this functionality is to be provided, I don't think it should be by default; it should either have its own option or at least be tied to another "exact search" option (like "match case" or "whole words only").
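
The behaviour discussed above could be sketched roughly as follows in Python. This is only an illustration under assumptions: the function name `count_matches`, the `exact` flag, and the small ligature table are invented for the example and say nothing about how PDF-Xchange Viewer or Adobe Reader actually search.

```python
# Map each Latin f-ligature code point to its plain-letter spelling.
LIGATURE_TABLE = {
    0xFB00: "ff",
    0xFB01: "fi",
    0xFB02: "fl",
    0xFB03: "ffi",
    0xFB04: "ffl",
}


def count_matches(text: str, query: str, exact: bool = False) -> int:
    """Count non-overlapping occurrences of query in text.

    By default ligatures are expanded on both sides, so typed and
    copied-and-pasted queries give the same result. With exact=True
    the strings are compared as stored (a hypothetical "exact" option).
    """
    if not exact:
        text = text.translate(LIGATURE_TABLE)
        query = query.translate(LIGATURE_TABLE)
    return text.count(query)


page = "The \ufb01rst item. First things first."
print(count_matches(page, "first"))                  # typed query
print(count_matches(page, "\ufb01rst"))              # pasted query with ligature
print(count_matches(page, "\ufb01rst", exact=True))  # ligature occurrences only
```

With the default setting the typed and pasted queries return the same count, which avoids the "looks like a bug" surprise. The stricter comparison is available only when the user asks for it.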
Bhikkhu Pesala
User
Posts: 1776
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Post by Bhikkhu Pesala »

I think you're right. Adobe's way of doing it is probably best for most users.
Bhikkhu Pesala
User
Posts: 1776
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Post by Bhikkhu Pesala »

This issue still affects the latest build. I think it is quite a significant issue that needs to be fixed sooner rather than later. I have no problem finding words like "effort" that include ligatures if I use Adobe Reader 7, but I cannot find them using PDF-XChange Viewer unless I type "eﬀort" in the Find toolbar (using the Alphabetic Presentation Forms ligature).
wmm
User
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Post by wmm »

Yes, it's been well over a year now since Ivan assured us that support would be "added into one of the next build." This is a really significant handicap, and a fix would be very much appreciated.
Chris - Tracker Supp
Site Admin
Posts: 795
Joined: Tue Apr 14, 2009 11:33 pm

Re: Ligatures

Post by Chris - Tracker Supp »

Hi wmm,

I just wanted to comment that our development team is hard at work every day, usually putting in more hours than the average bear, and new feature requests are being heard, worked on, and added all the time. Often they have to be prioritized or worked on in groups of related functionality, especially from a programming perspective. Ivan has stated above that ligature support will be included when they get to address more advanced core text-editing functionality, right-to-left language support, and the like. I assure you that it will be addressed, and I think a little patience and appreciation for the work these guys actually do is important. Please understand that it will be realized as Ivan has stated.

Regards,
Chris
If posting files to this forum - you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded - thank you.


Chris Attrell
Tracker Sales & Support North America
http://www.tracker-software.com
wmm
User
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Post by wmm »

Yes, I wasn't intending to cast aspersions -- I'm in software development myself, and I understand that things have to be done in priority order and that what's important to me individually may not be what's needed by the user community at large. I was just expressing some disappointment that the feature turned out not to be as imminent as the earlier, very encouraging response had led me to believe. I'm pleased to hear that it's still on the roadmap. Thanks for the clarification.
Chris - Tracker Supp
Site Admin
Posts: 795
Joined: Tue Apr 14, 2009 11:33 pm

Re: Ligatures

Post by Chris - Tracker Supp »

Not a problem, wmm,

We understand your comment as well; it's always a juggling game. We appreciate your patience and will do our best to incorporate this feature as soon as we can.

Regards,

Chris
Bhikkhu Pesala
User
Posts: 1776
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Post by Bhikkhu Pesala »

Though this may not be a bug, users accustomed to the behaviour in Adobe Reader will regard it as a bug since searching for words like "first" or "difficult" won't find them if Alphabetic Presentation Forms were used.
Tracker Supp-Stefan
Site Admin
Posts: 17817
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Ligatures

Post by Tracker Supp-Stefan »

Agreed, Bhikkhu:
someone used to those Alphabetic Presentation Forms will count this as a bug :)
I will check with Ivan for more precise plans on when this might get implemented.

Best regards,
Stefan
Bhikkhu Pesala
User
Posts: 1776
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Post by Bhikkhu Pesala »

Fixed in build 2.0.0043.0 :)
Cadillakin
User
Posts: 110
Joined: Thu Apr 02, 2009 12:21 am

Re: Ligatures

Post by Cadillakin »

Bhikkhu Pesala wrote:Fixed in build 2.0.0043.0 :)
Not fixed.

The file I included as an attachment in this thread: https://forum.pdf-xchange.com/ ... =35&t=7614 still cannot be searched properly for "traffic."
Bhikkhu Pesala
User
Posts: 1776
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Post by Bhikkhu Pesala »

The bug is fixed. There is something else wrong with that document. Try searching for "Trafic" using Adobe Reader.
Vasyl-Tracker Dev Team
Site Admin
Posts: 2352
Joined: Thu Jun 30, 2005 4:11 pm
Location: Canada

Re: Ligatures

Post by Vasyl-Tracker Dev Team »

Hi guys,

There is a misunderstanding here: search in the attached document really does not work properly.
I tried searching for "traffic." (with a period at the end):
in Adobe Reader: 4 instances found; in PDF-XChange Viewer: 3 instances.

We will investigate this problem.

Thanks.
Vasyl Yaremyn
Tracker Software Products
Project Developer

Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.
wmm
User
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Post by wmm »

Wonderful! Thank you so much! Now PDF-Xchange Viewer is perfect! :D
halabund
User
Posts: 44
Joined: Mon Aug 27, 2007 7:13 am

Re: Ligatures

Post by halabund »

There's still room for improvement in this area though. Adobe Reader can ignore accents on letters (e.g. matches á when searching for a, or ά for α, etc.), even when the accent is a separate glyph in the document, can do stemming to a certain level, etc. Stemming is not a big deal for me, but the ability to ignore accents is quite useful.
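
As a rough illustration of accent-insensitive matching (a sketch only; `strip_accents` and `matches_ignoring_accents` are made-up helpers, and this is not a claim about how Adobe Reader does it), one common approach is to decompose the text with Unicode NFD normalization and drop the combining marks before comparing:

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical helper: remove combining marks after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches_ignoring_accents(text: str, query: str) -> bool:
    """True if query occurs in text once accents are removed from both."""
    return strip_accents(query) in strip_accents(text)


# Precomposed accented letter (a single code point for e-acute).
print(matches_ignoring_accents("caf\u00e9", "cafe"))
# The accent stored as a separate combining character after the letter.
print(matches_ignoring_accents("cafe\u0301", "cafe"))
# Greek alpha with tonos versus plain alpha.
print(matches_ignoring_accents("\u03ac\u03bb\u03c6\u03b1", "\u03b1\u03bb\u03c6\u03b1"))
```

Because the accent is removed after decomposition, the same code handles both precomposed letters and accents stored as separate characters.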

On the other hand, XChange viewer searches noticeably faster.
Vasyl-Tracker Dev Team
Site Admin
Posts: 2352
Joined: Thu Jun 30, 2005 4:11 pm
Location: Canada

Re: Ligatures

Post by Vasyl-Tracker Dev Team »

Hi,

We will try to add an option for ignoring accents on letters in the new version (V3).

Best regards,
Vasyl Yaremyn
Tracker Software Products
Project Developer
