Ligatures

The PDF-XChange Viewer for End Users

Postby wmm on Sat Jun 21, 2008 1:07 pm

A ligature is a typographical entity where two or three separate characters are connected into one. The most common English example is the sequence "fi", where the top of the "f" is extended to merge with the dot of the "i".

I regularly work with a large document produced by LaTeX that contains numerous ligatures. Adobe Reader handles them transparently. For example, if I search for "first", Adobe Reader finds all the occurrences, even where the "fi" is a ligature, and if I copy and paste text containing a ligature, the pasted text has the separate characters.

PDF-XChange Viewer, on the other hand, treats the ligatures as distinct single characters. If I search for "first", I find only instances where the ligature does not occur (in text set in a monospaced font, for instance, or with an initial capital), and copying and pasting such text inserts either the ligature or an obscure escape sequence, depending on the application into which I'm pasting.
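To make the difference concrete, here is a minimal Python sketch of the two behaviours. It is only an illustration, not how either viewer is actually implemented: the sample string and the helper names `fold_ligatures` and `count_matches` are made up, and NFKC normalization is just one possible way to expand a ligature character into its letters (it also rewrites other compatibility characters, which a real viewer might not want).

```python
import unicodedata


def fold_ligatures(text: str) -> str:
    """Expand compatibility characters such as U+FB01 into plain letters."""
    return unicodedata.normalize("NFKC", text)


def count_matches(haystack: str, needle: str) -> int:
    """Count occurrences of needle after folding ligatures on both sides."""
    return fold_ligatures(haystack).count(fold_ligatures(needle))


# Hypothetical extracted page text: two words use the single-character
# ligature U+FB01, one uses plain "f" + "i", one has an initial capital.
page_text = "The \ufb01rst rule. The first rule. First things \ufb01rst."

naive = page_text.count("first")             # ligature forms are missed
folded = count_matches(page_text, "first")   # ligature forms are found
print(naive, folded)
```

The naive count sees only the plain-letter occurrence, while the folded count also picks up the two ligature occurrences; the capitalised "First" is missed by both because the comparison is case-sensitive.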

The copy/paste issue isn't very significant; I can work around that fairly easily. The failure of search, however, is a major concern and could prevent me from using PDF-Xchange Viewer as my principal PDF reader. Can this be fixed in a forthcoming release?

(If you need an example, here's a version of the document:
http://www.open-std.org/jtc1/sc22/wg21/ ... /n2606.pdf
)
wmm
User
 
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Postby Bhikkhu Pesala on Sat Jun 21, 2008 3:11 pm

Foxit Reader suffers from the same problem, but there is a simple workaround: search for "ﬁrst".
Windows 7 64-bit • AMD A10-6800K, 8 Gbyte RAM
Review: http://www.softerviews.org/PDF-XChange.html
Bhikkhu Pesala
User
 
Posts: 1567
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Postby wmm on Sat Jun 21, 2008 3:52 pm

Thanks for the suggestion, but unless I misunderstood what you were saying it doesn't work. I assume that what you meant was that I should enclose the string in quotes. I did that; the "find" command finds nothing, and the "search" command finds only the occurrences without the "fi" ligature.
wmm
User
 
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Postby Bhikkhu Pesala on Sat Jun 21, 2008 4:21 pm

Yes. I mean use the string in quotes, which uses the fi ligature from Alphabetic Presentation forms.

Find next from the toolbar finds the next occurrence. The Search command finds 382 entries. I am using the latest version. Check for updates if you're not.
Bhikkhu Pesala
User
 
Posts: 1567
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Postby wmm on Sat Jun 21, 2008 6:04 pm

Bhikkhu Pesala wrote:Yes. I mean use the string in quotes, which uses the fi ligature from Alphabetic Presentation forms.


I'm not sure exactly what you're saying here. What I did was to open the full search pane, type a double-quote followed by the five characters f-i-r-s-t followed by another double-quote, and then hit Search Now. That resulted in 1015 entries, all of them in monospaced font or with initial capital. (The "find" command, with that string, finds nothing.) What did you mean by "Alphabetic Presentation" forms?

Bhikkhu Pesala wrote:Find next from the toolbar finds the next occurrence. The Search command finds 382 entries. I am using the latest version. Check for updates if you're not.


If I copy and paste the word "first" from an occurrence containing the ligature and search for that (with or without quotes), I get 382 entries -- only occurrences with the ligature, none of the ones in the list of monospaced or initial-capital forms.

Adobe reader, when I search for first (typing all five characters) finds (very slowly!) 1385 entries, including both with and without the ligature. (I don't know why that's 12 fewer than the union of the results from the PDF-Xchange Viewer searches.)

(I'm using the latest version, too (2.0 build 38).)
wmm
User
 
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Postby Bhikkhu Pesala on Sat Jun 21, 2008 7:03 pm

Build 38 is not even announced yet — it must be very new. I just updated.

The ff, fi, fl, ffi, and ffl ligatures are in the Unicode block called Alphabetic Presentation Forms.
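For anyone who wants to look at these characters directly, the short Python sketch below prints the five f-ligatures at U+FB00 to U+FB04 with their official names and their compatibility decompositions. The variable name `ligatures` is just for illustration.

```python
import unicodedata

# U+FB00..U+FB04 are the five f-ligatures in Alphabetic Presentation Forms.
ligatures = {
    chr(cp): unicodedata.normalize("NFKD", chr(cp))
    for cp in range(0xFB00, 0xFB05)
}

for ch, expanded in ligatures.items():
    # Print the code point, its Unicode name, and the letters it expands to.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {expanded}")
```

Each ligature is a single code point, which is why a viewer that compares characters one by one will not match it against the separately typed letters.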

Your document contains a mixture of "fi" ligatures (382) and "f i" as two separate characters (1015 occurrences). The monospaced font uses "f i" while the proportional font uses the ligatures. However, no ligatures are used, for example, in the word "effect", which occurs many times.

Adobe Reader 7.1 finds the total less 12 (1385 occurrences) almost instantly here. I'm not sure why it is missing those 12. It also finds 1385 occurrences if I search for "first" with the ligature, i.e. it ignores the distinction, which is less useful in my opinion, though I can see the other POV too. Joe Bloggs doesn't care whether ligatures are used or not; he just wants to find what he's looking for.

If you use an OpenType font, the text string will be separate letters f·i·r·s·t but any f·i pairs will be replaced with ligatures. That's why spell-check doesn't fail when using OpenType fonts, but it does if you're inserting the Alphabetic Presentation Forms into your document.
Bhikkhu Pesala
User
 
Posts: 1567
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Postby wmm on Sat Jun 21, 2008 8:03 pm

Bhikkhu Pesala wrote:Build 38 is not even announced yet — it must be very new. I just updated.


Yes. I just downloaded and started using PDF-Xchange Viewer yesterday for the first time, so when you mentioned that you were using the latest version, I figured that I had it, too. Just to make sure, though, I checked for updates, and surprisingly there was a new one. It was built last night, according to the "about" blurb.

Bhikkhu Pesala wrote:The ff, fi, fl, ffi, and ffl ligatures are in the Unicode block called Alphabetic Presentation Forms.


Ah, thanks.

Bhikkhu Pesala wrote:Your document contains a mixture of "fi" ligatures (382) and "f i" as two separate characters (1015 occurrences). The monospaced font uses "f i" while the proportional font uses the ligatures. However, no ligatures are used, for example, in the word "effect", which occurs many times.

Adobe Reader 7.1 finds the total less 12 (1385 occurrences) almost instantly here. I'm not sure why it is missing those 12. It also finds 1385 occurrences if I search for "first" with the ligature, i.e. it ignores the distinction, which is less useful in my opinion, though I can see the other POV too. Joe Bloggs doesn't care whether ligatures are used or not; he just wants to find what he's looking for.


Just call me "Joe," then :wink: -- I need to reliably find all the places a given term is used (not "first," obviously -- that was just an easy example), from a string I type in.

(Adobe Reader is finding the results "almost instantly" now here, too; I guess it built an index the first time. That time, though, it took several times longer than PDF-Xchange Viewer's search, maybe 10-12 seconds.)

Bhikkhu Pesala wrote:If you use an OpenType font, the text string will be separate letters f·i·r·s·t but any f·i pairs will be replaced with ligatures. That's why spell-check doesn't fail when using OpenType fonts, but it does if you're inserting the Alphabetic Presentation Forms into your document.


Unfortunately, I'm not the author of the document, just a consumer, so I don't control how it's produced. I don't know why "fi" ligatures are used but "ff" ones are not, for instance, and I can't choose which fonts are used.

Thanks for the background. Hopefully the developers will be able to do something relatively soon to handle ligatures more usefully. It's possible to work around the problems using the "advanced search" capabilities, but it's a pain, and I'm afraid I'm going to be misled if I don't happen to notice that what I'm searching for contains a ligature. (I've been working with versions of this document for years with Adobe Reader and never had to worry about ligatures before.)
wmm
User
 
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Postby Ivan - Tracker Software on Sun Jun 22, 2008 12:07 pm

Support for ligatures will be added in an upcoming build together with improving the text editor for supporting east asian languages, etc.

HTH
Tracker Software (Project Director)

When attaching files to any message - please ensure they are archived and posted as a .ZIP, .RAR or .7z format - or they will not be posted - thanks.
Ivan - Tracker Software
Site Admin
 
Posts: 3029
Joined: Thu Jul 08, 2004 10:36 pm
Location: Vancouver Island - Canada

Re: Ligatures

Postby wmm on Sun Jun 22, 2008 12:56 pm

Ivan - Tracker Software wrote:Supporting of ligatures will be added into one of the next build together with improving text editor for supporting east asian languages, etc.


That's great! Thanks so much.
wmm
User
 
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Postby quant on Sun Jun 22, 2008 1:27 pm

Hi,

I just want to add my support for this; I didn't know before what was going on.
I often search for "finance" or something like that, and then realized that many occurrences were missed, so instead I had to search for "nance".

Thanks
quant
User
 
Posts: 152
Joined: Fri Jan 18, 2008 2:48 pm

Re: Ligatures

Postby Bhikkhu Pesala on Tue Jun 24, 2008 4:20 am

I'm glad to hear that this will be fixed sometime.

I suggest that a search for a string containing regular text should find words with regular text and words with ligatures, but a search for a string containing ligatures should find only ligatures.
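The asymmetric rule suggested above could be sketched roughly as follows in Python. This is only an illustration of the proposal, not anything the viewer does: `LIGATURES`, `FOLD`, and `count_hits` are hypothetical names, and the sketch handles only the five f-ligatures.

```python
# Map each single-character ligature to the letters it stands for.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}
FOLD = str.maketrans(LIGATURES)


def count_hits(text: str, query: str) -> int:
    """Apply the asymmetric rule: plain queries match both forms,
    queries that contain a ligature character match only that form."""
    if any(ch in LIGATURES for ch in query):
        # The user typed or pasted a ligature: compare the raw text.
        return text.count(query)
    # Plain query: expand ligatures in the text before comparing.
    return text.translate(FOLD).count(query)


sample = "first \ufb01rst first"
print(count_hits(sample, "first"))       # plain query
print(count_hits(sample, "\ufb01rst"))   # ligature query
```

With this rule the plain query finds all three words in the sample, while the ligature query finds only the one word that actually contains U+FB01.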
Bhikkhu Pesala
User
 
Posts: 1567
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Postby wmm on Tue Jun 24, 2008 10:12 am

Bhikkhu Pesala wrote:I'm glad to hear that this will be fixed sometime.

I suggest that a search for a string containing regular text should find words with regular text and words with ligatures, but a search for a string containing ligatures should find only ligatures.


I understand the functionality reason for wanting search to work that way, but I think it would be very confusing to people if searching for a string they copied and pasted found fewer occurrences than if they typed the same string. If they weren't aware of the existence of a ligature in the copied string, it would just seem like a bug.

If this functionality is to be provided, I don't think it should be by default; it should either have its own option or at least be tied to another "exact search" option (like "match case" or "whole words only").
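As a rough sketch of what such an opt-in setting might look like, the Python function below folds ligatures by default and only distinguishes them when an explicit flag is set, alongside a "match case" flag. The function name and the `exact_ligatures` parameter are invented for illustration and do not correspond to any real PDF-XChange option.

```python
import unicodedata


def search(text: str, query: str, *, match_case: bool = False,
           exact_ligatures: bool = False) -> int:
    """Count hits; ligatures are folded unless exact_ligatures is set."""
    if not exact_ligatures:
        # Default: expand ligatures in both the text and the query.
        text = unicodedata.normalize("NFKC", text)
        query = unicodedata.normalize("NFKC", query)
    if not match_case:
        # lower() is used rather than casefold(), because casefold()
        # would itself expand U+FB01 into "fi".
        text, query = text.lower(), query.lower()
    return text.count(query)


doc = "First \ufb01rst first"
typed = search(doc, "first")                               # typed query
pasted = search(doc, "\ufb01rst")                          # pasted query
strict = search(doc, "\ufb01rst", exact_ligatures=True)    # opt-in exact search
print(typed, pasted, strict)
```

By default a typed query and a pasted query containing the ligature return the same number of hits, which avoids the confusing case described above; only the explicit exact search narrows the result to the ligature form.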
wmm
User
 
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Postby Bhikkhu Pesala on Tue Jun 24, 2008 4:44 pm

I think you're right. Adobe's way of doing it is probably best for most users.
Bhikkhu Pesala
User
 
Posts: 1567
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Postby Bhikkhu Pesala on Tue Aug 04, 2009 6:37 pm

This issue still affects the latest build. I think it is quite a significant issue that needs to be fixed sooner rather than later. I have no problem finding words like "effort" that include ligatures if I use Adobe Reader 7, but I cannot find them using PDF-XChange Viewer unless I type "eﬀort" in the Find toolbar (using the Alphabetic Presentation Form, that is, the ligature character).
Bhikkhu Pesala
User
 
Posts: 1567
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Postby wmm on Tue Aug 04, 2009 9:03 pm

Yes, it's been well over a year now since Ivan assured us that support would be "added into one of the next build." This is a really significant handicap, and a fix would be very much appreciated.
wmm
User
 
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Postby Chris - Tracker Supp on Tue Aug 04, 2009 9:12 pm

Hi wmm,

I just wanted to comment that our development team is hard at work every day, usually putting in more hours than the average bear. New feature requests are being heard, worked on, and added all the time, but they often have to be prioritized or worked on in groups of related functionality, especially from a programming perspective. Ivan has stated above that ligature support will be included when the team gets to address more advanced core text-editing functionality, right-to-left language support, and the like. I assure you that it will be addressed, and I think a little patience and appreciation for the work these guys actually do is important. Please understand that it will be realized as Ivan has stated.

Regards,
Chris
If posting files to this forum - you must archive the files to a ZIP, RAR or 7z file or they will not be uploaded - thank you.


Chris Attrell
Tracker Sales & Support North America
http://www.tracker-software.com
Chris - Tracker Supp
User
 
Posts: 786
Joined: Tue Apr 14, 2009 11:33 pm

Re: Ligatures

Postby wmm on Tue Aug 04, 2009 9:29 pm

Yes, I wasn't intending to cast aspersions -- I'm in software development myself, and I understand that things have to be done in priority order and that what's important to me individually may not be what's needed by the user community at large. I was just expressing some disappointment that the feature turned out not to be as imminent as the earlier, very encouraging response had led me to believe. I'm pleased to hear that it's still on the roadmap. Thanks for the clarification.
wmm
User
 
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Postby Chris - Tracker Supp on Tue Aug 04, 2009 10:57 pm

Not a problem wmm,

We understand your comment as well; it's always a juggling game. We appreciate your patience and will do our best to incorporate this feature as soon as we can.

Regards,

Chris


Chris Attrell
Tracker Sales & Support North America
http://www.tracker-software.com
Chris - Tracker Supp
User
 
Posts: 786
Joined: Tue Apr 14, 2009 11:33 pm

Re: Ligatures

Postby Bhikkhu Pesala on Wed Sep 30, 2009 7:24 am

Though this may not be a bug, users accustomed to the behaviour in Adobe Reader will regard it as a bug since searching for words like "first" or "difficult" won't find them if Alphabetic Presentation Forms were used.
Bhikkhu Pesala
User
 
Posts: 1567
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Postby Tracker Supp-Stefan on Wed Sep 30, 2009 1:56 pm

Agreed, Bhikkhu,
someone used to those Alphabetic Presentation Forms will count this as a bug :)
I will check with Ivan for more precise plans on when this might get implemented.

Best regards,
Stefan
Tracker Supp-Stefan
Site Admin
 
Posts: 7929
Joined: Mon Jan 12, 2009 8:07 am
Location: London

Re: Ligatures

Postby Bhikkhu Pesala on Wed Dec 30, 2009 6:30 am

Fixed in build 2.0.0043.0 :)
Bhikkhu Pesala
User
 
Posts: 1567
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Postby Cadillakin on Wed Dec 30, 2009 8:12 am

Bhikkhu Pesala wrote:Fixed in build 2.0.0043.0 :)

Not fixed.

The file I included as an attachment in this thread (viewtopic.php?f=35&t=7614) still cannot be searched properly for "traffic."
Cadillakin
User
 
Posts: 110
Joined: Thu Apr 02, 2009 12:21 am

Re: Ligatures

Postby Bhikkhu Pesala on Wed Dec 30, 2009 8:56 am

The bug is fixed. There is something else wrong with that document. Try searching for "Trafic" using Adobe Reader.
Bhikkhu Pesala
User
 
Posts: 1567
Joined: Tue May 29, 2007 9:29 am
Location: East London

Re: Ligatures

Postby Vasyl-Tracker Dev Team on Wed Dec 30, 2009 12:31 pm

Hi guys,

There is a misunderstanding here: search in the attached document really does not work properly.
I tried searching for "traffic." (with a period at the end):
in Adobe: 4 instances found, in PDF-XChange: 3 instances.

We will investigate this problem.

Thanks.
Vasyl Yaremyn
Tracker Software Products
Project Developer

Please archive any files posted to a ZIP, 7z or RAR file or they will be removed and not posted.
Vasyl-Tracker Dev Team
Site Admin
 
Posts: 1436
Joined: Thu Jun 30, 2005 4:11 pm
Location: Ukraine

Re: Ligatures

Postby wmm on Wed Dec 30, 2009 12:33 pm

Wonderful! Thank you so much! Now PDF-Xchange Viewer is perfect! :D
wmm
User
 
Posts: 14
Joined: Sat Jun 21, 2008 12:25 pm

Re: Ligatures

Postby halabund on Mon Jan 04, 2010 1:57 pm

There's still room for improvement in this area though. Adobe Reader can ignore accents on letters (e.g. matches á when searching for a, or ά for α, etc.), even when the accent is a separate glyph in the document, can do stemming to a certain level, etc. Stemming is not a big deal for me, but the ability to ignore accents is quite useful.

On the other hand, XChange viewer searches noticeably faster.
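For reference, accent-insensitive matching of the kind described above can be sketched in Python by decomposing each character and dropping the combining marks. This is an assumption about one common technique, not a description of how Adobe Reader does it; `strip_accents` and `contains_ignoring_accents` are hypothetical helper names.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Decompose characters (NFD) and drop the combining accent marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def contains_ignoring_accents(text: str, query: str) -> bool:
    """Return True if query occurs in text once accents are ignored."""
    return strip_accents(query) in strip_accents(text)


# Works for precomposed letters and for an accent stored as a separate
# combining character following the base letter.
print(strip_accents("\u00e1"), strip_accents("a\u0301"), strip_accents("\u03ac"))
print(contains_ignoring_accents("caf\u00e9 cr\u00e8me", "creme"))
```

Because the text is decomposed first, it does not matter whether the document stores the accent as part of the letter or as a separate combining character after it.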
halabund
User
 
Posts: 44
Joined: Mon Aug 27, 2007 7:13 am

Re: Ligatures

Postby Vasyl-Tracker Dev Team on Tue Jan 05, 2010 2:27 pm

Hi,

We will try to add an option for ignoring accents on letters in the new version (V3).

Best
Regards.
Vasyl Yaremyn
Tracker Software Products
Project Developer

Vasyl-Tracker Dev Team
Site Admin
 
Posts: 1436
Joined: Thu Jun 30, 2005 4:11 pm
Location: Ukraine

