Bug 455857 - Can't search for text in PDF
Summary: Can't search for text in PDF
Status: REPORTED
Alias: None
Product: okular
Classification: Applications
Component: general
Version: 22.04.2
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Okular developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-06-23 18:16 UTC by makosol
Modified: 2022-06-23 22:50 UTC

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
PDF (3.70 MB, application/pdf)
2022-06-23 18:16 UTC, makosol

Description makosol 2022-06-23 18:16:52 UTC
Created attachment 150103
PDF

SUMMARY
Can't search for text in PDF (see the attached PDF)


STEPS TO REPRODUCE
1. Open the attached PDF
2. Search for a word, for example "cuisine"

OBSERVED RESULT
Okular's search can't find the word

EXPECTED RESULT
the word is found

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 5.25
KDE Plasma Version: 5.25
KDE Frameworks Version: 5.95
Qt Version: 5.15.4

ADDITIONAL INFORMATION
KDE NEON
Comment 1 Albert Astals Cid 2022-06-23 22:50:43 UTC
That's because we think the PDF says "CU ISINE"; Firefox thinks the same. Recreating text from a PDF is a dark art, because all a PDF gives you is each character and its position, so it is sometimes hard to tell whether two adjacent characters belong to the same word. In this case we get it wrong, but fixing it could make other places, where we currently detect two words correctly, think there is only one. Needs investigation.
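
To illustrate the trade-off described in this comment, here is a minimal sketch of gap-based word grouping. It is not Okular's or Poppler's actual algorithm. The `Glyph` type, the `group_into_words` function, the `gap_ratio` parameter and all coordinates are hypothetical, chosen only to show how one threshold can split a word while a looser one merges two words.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x: float      # left edge of the glyph on the line (hypothetical units)
    width: float  # advance width of the glyph


def group_into_words(glyphs, gap_ratio):
    """Split one line of glyphs into words wherever the horizontal gap
    to the previous glyph exceeds gap_ratio * previous glyph width."""
    words = []
    current = ""
    prev = None
    for g in sorted(glyphs, key=lambda g: g.x):
        if prev is not None:
            gap = g.x - (prev.x + prev.width)
            if gap > gap_ratio * prev.width:
                words.append(current)
                current = ""
        current += g.char
        prev = g
    if current:
        words.append(current)
    return words


# "CUISINE" drawn with a slightly wide gap (4 units) between U and I.
cuisine = [
    Glyph("C", 0.0, 10.0), Glyph("U", 10.0, 10.0), Glyph("I", 24.0, 10.0),
    Glyph("S", 34.0, 10.0), Glyph("I", 44.0, 10.0), Glyph("N", 54.0, 10.0),
    Glyph("E", 64.0, 10.0),
]

# Two separate words "A" and "B" drawn with a narrow space (4.5 units).
a_b = [Glyph("A", 0.0, 10.0), Glyph("B", 14.5, 10.0)]

strict = 0.3  # splits on gaps wider than 3 units
loose = 0.5   # splits on gaps wider than 5 units

print(group_into_words(cuisine, strict))  # word is wrongly split
print(group_into_words(cuisine, loose))   # word is kept whole
print(group_into_words(a_b, strict))      # two words detected
print(group_into_words(a_b, loose))       # two words wrongly merged
```

With these made-up numbers, loosening the threshold enough to keep "CUISINE" together also merges "A" and "B", which is the kind of regression the comment warns about.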