Summary: | RTL languages searches text backwards | ||
---|---|---|---|
Product: | [Applications] okular | Reporter: | Dotan Cohen <kde-2011.08> |
Component: | general | Assignee: | Okular developers <okular-devel> |
Status: | CONFIRMED --- | ||
Severity: | wishlist | CC: | aacid, adam.golanski, dragon, eladhen2, Fahad.alsaidi, matitiahu.allouche, med.medin.2014, mh.firouzjah, munzirtaha, nadavkav, nate, ohadcn, olivier, overman.supermundane, postix, shimi.chen, simonandric5, syn_org939, tsm.7 |
Priority: | NOR | Keywords: | rtl, usability |
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Ubuntu | ||
OS: | Unspecified | ||
See Also: |
https://bugs.kde.org/show_bug.cgi?id=407133 https://bugs.kde.org/show_bug.cgi?id=439791 |
||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: |
Hebrew-language PDF document.
arabic text |
Description
Dotan Cohen
2009-09-18 02:38:00 UTC
Can you attach a sample document showing the issue?

Created attachment 38661
Hebrew-language PDF document.
All Hebrew PDF documents display the issue. Here is one.
I can confirm this. I don't think the fix is that complicated, either. I'm not really familiar with the libraries (and I'm a GNOME user), but a quick search reveals that KDE has had reliable BiDi support since 3.0.1. In particular, I found this function:

QCString QHebrewCodec::fromUnicode ( const QString & uc, int & lenInOut ) const [virtual]

(from http://doc.trolltech.com/3.3/qhebrewcodec.html). My guess is that piping the search string through it before the search takes place would fix the problem. (There might be a more suitable Qt BiDi function that handles Hebrew, Arabic and all other RTL languages in one go.)

Gadi

It seems that bug #128609 is for the same issue, but for KPDF instead of Okular. One of these bugs should be marked as a duplicate of the other. I will leave it to the devs to decide which. Thanks.

PDF's objective is to reflect the exact appearance of text. For Hebrew, this means that the glyphs are stored in visual order. If your PDF viewer accepts user input in logical order (which is the case in Windows and Linux), it should transform search arguments (captured from a user dialog) from logical to visual order before performing the search. For Arabic, there is the additional issue that the glyphs represent letter shapes, so you must perform "shaping", in addition to reordering, on the search arguments to choose the proper glyph for each Arabic letter.

It might be related to https://bugs.kde.org/show_bug.cgi?id=184399

*** Bug 282849 has been marked as a duplicate of this bug. ***

*** Bug 331785 has been marked as a duplicate of this bug. ***

Here is a quick patch to fix this problem: https://git.reviewboard.kde.org/r/125442/ Thanks

This bug needs a retest against Poppler >= 0.40 because of this: https://bugs.freedesktop.org/show_bug.cgi?id=55977

Using Poppler 0.42.0, typing Hebrew puts the search box in right-to-left mode, but I must type the word left to right (that is, backwards) for it to match.

Created attachment 100228
arabic text
You can search for the word "بسم" in the attached Arabic text PDF. If it is found, the problem is fixed upstream; otherwise the problem is in Okular.

See my comment about Hebrew: it didn't work, for the reasons given there.

This bug is still present in Mint Cinnamon 18 (and presumably in all of the Ubuntu 16.04 family). It should be noted that the similar bugs in Evince, Atril and some others, which stemmed from Poppler, are fixed as of Ubuntu 16.04 / Mint 18.

*** Bug 386468 has been marked as a duplicate of this bug. ***

This bug also affects copying RTL text: the copied text is reversed.

Fahad submitted a patch for this, which I've migrated to Phabricator: https://phabricator.kde.org/D10298

I think the problem comes from the Qt interface for Poppler. Please see this bug: https://bugs.freedesktop.org/show_bug.cgi?id=105015

I think I've found where the problem is. It is in TextPagePrivate::correctTextOrder(), which sorts words and characters into LTR order using compareTinyTextEntityY and compareTinyTextEntityX. This approach doesn't work for RTL text. I proposed another patch to fix this bug here: https://phabricator.kde.org/D10455

*** Bug 429869 has been marked as a duplicate of this bug. ***

Same problem for another RTL language, Persian.
Linux/KDE Plasma: 5.15.38-1-Manjaro (64-bit)
KDE Plasma Version: 5.24.5
KDE Frameworks Version: 5.93
Qt Version: 5.15.3

*** Bug 442046 has been marked as a duplicate of this bug. ***

*** Bug 457448 has been marked as a duplicate of this bug. *** |
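
To illustrate the comment about TextPagePrivate::correctTextOrder(), here is a minimal C++ sketch of why sorting glyphs by ascending x gives reading order only for LTR text. This is not Okular's code: the Glyph struct, readingOrder() and sampleLine() are hypothetical names made up for this example, and it assumes a single line whose direction is already known. Real documents with mixed-direction runs would need the full Unicode BiDi algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a positioned glyph (not Okular's TinyTextEntity).
struct Glyph {
    char32_t ch;
    double x;
    double y;
};

// Orders glyphs top to bottom, then along the line. For an RTL line the x
// comparison is flipped so the rightmost glyph (the first one read) comes first.
std::u32string readingOrder(std::vector<Glyph> glyphs, bool rtl) {
    std::stable_sort(glyphs.begin(), glyphs.end(),
                     [rtl](const Glyph &a, const Glyph &b) {
                         if (a.y != b.y) return a.y < b.y;
                         return rtl ? a.x > b.x : a.x < b.x;
                     });
    std::u32string out;
    for (const Glyph &g : glyphs) out.push_back(g.ch);
    return out;
}

// One line holding a four-letter Hebrew word (shin, lamed, vav, final mem).
// The first letter is drawn at the largest x because the line runs right to left.
std::vector<Glyph> sampleLine() {
    return {{U'\u05E9', 30.0, 0.0},
            {U'\u05DC', 20.0, 0.0},
            {U'\u05D5', 10.0, 0.0},
            {U'\u05DD', 0.0, 0.0}};
}
```

With this sample line, readingOrder(sampleLine(), false) returns the letters reversed, so a search for the word as typed (logical order) cannot match, which is the behaviour reported in this bug. Passing true returns the word as typed.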