Created attachment 172420
PDF document (zipped) with rotated page

Highlighter and Text extractor fail on rotated pdf-pages

SUMMARY
Some PDF documents include rotated pages (see the attached PDF file). The Highlighter tool fails to highlight the corresponding text properly. The Text selection tool and the Area selection tool fail to select the corresponding text properly. Rotating the page clockwise does not correct the errors.

STEPS TO REPRODUCE
1. Open the attached PDF file.
2. Go to page 7, which is rotated.
3.a. Select the Highlighter and try to highlight e.g. "Homo sapiens".
3.b. Select "Text selection" and try to copy the first peptide sequence ("KEFSEV...").
3.c. Select "Area selection" and try to copy the same first sequence.
4. Repeat the steps with the page rotated clockwise.

OBSERVED RESULT
3.a. The highlighting is very wrong: it does not cover the intended text.
3.b. Nothing useful is selected.
3.c. If the area is narrow enough, something usable can be selected. However, the text is reversed: "V Y K E L D Q I R G E V E S F E K" should be "KEFSEV...".
Note: MS Edge is able to select the proper text, "KEFSEVEGRIQDLEKYV", even without rotating the page.

EXPECTED RESULT
The Highlighter should highlight the proper line.
Text selection should select line by line.

SOFTWARE/OS VERSIONS
Windows: 10

ADDITIONAL INFORMATION
This is partly related to https://bugs.kde.org/show_bug.cgi?id=334297, but:
- that bug does not mention Text selection;
- the page in this bug is also slightly more complex.
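To make the observed result in 3.c concrete, here is a minimal Python sketch showing that the Area selection output is the expected sequence with the character order reversed and a space between each letter. The function name `normalize_reversed_selection` is hypothetical and only illustrates the observation; it is not part of Okular or Poppler.

```python
def normalize_reversed_selection(selected: str) -> str:
    """Remove the whitespace between letters, then reverse the character order."""
    return "".join(selected.split())[::-1]


# Text copied with the Area selection tool, as quoted in the report
observed = "V Y K E L D Q I R G E V E S F E K"
# Text that MS Edge selects from the same page, as quoted in the report
expected = "KEFSEVEGRIQDLEKYV"

print(normalize_reversed_selection(observed) == expected)
```

This suggests the individual letters are extracted correctly and only their ordering (and spacing) is wrong on the rotated page.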
I found the same bug today. I wonder whether this issue is related to bug 407133.

SOFTWARE/OS VERSIONS
Windows 11 Pro
Okular version: 23.801.1522.0
There is a Poppler bug that may affect this issue as well (although the corresponding document is not rotated):
poppler/poppler/Issues/1547 "Text selection doesn't follow the document's structure"

Note: The Poppler site will transition to new infrastructure in March/April 2025, and I do not know whether the following link will remain valid:
https://gitlab.freedesktop.org/poppler/poppler/-/issues/1547

The following information may be useful:
"-colspacing -- how much spacing we allow after a word before considering adjacent text to be a new column, as a fraction of the font size (default is 0.7, old releases had a 0.3 default)."

But, as noted above, that document does not have rotated text in one of the columns.
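For anyone who wants to check whether the column-spacing threshold matters for their document, here is a small Python sketch that builds a command line for comparison. It assumes the quoted option is the `-colspacing` flag of Poppler's `pdftotext` tool and that the installed version supports it. The helper name `pdftotext_command` and the file name `document.pdf` are placeholders, not taken from this report.

```python
import shlex


def pdftotext_command(pdf_path: str, colspacing: float) -> list[str]:
    """Build a pdftotext invocation with an explicit column-spacing threshold."""
    # A trailing "-" asks pdftotext to write the extracted text to stdout.
    return ["pdftotext", "-colspacing", str(colspacing), pdf_path, "-"]


# 0.3 is the older default mentioned in the quoted help text.
cmd = pdftotext_command("document.pdf", 0.3)
print(shlex.join(cmd))
```

Running the printed command once with 0.3 and once with 0.7 and comparing the output would show whether the changed default affects the extraction order. Whether Okular exposes this setting is not stated here.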
I saw this today; it did not work for me on normally oriented pages either. I believe this might not be related to rotation.