Bug 455857

Summary: Can't search for text in PDF
Product: [Applications] okular Reporter: makosol
Component: general Assignee: Okular developers <okular-devel>
Status: REPORTED ---    
Severity: normal CC: aacid
Priority: NOR    
Version First Reported In: 22.04.2   
Target Milestone: ---   
Platform: Other   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: PDF

Description makosol 2022-06-23 18:16:52 UTC
Created attachment 150103
PDF

SUMMARY
Can't search for text in PDF (see the attached PDF)


STEPS TO REPRODUCE
1. Open the attached PDF
2. Search for a word, for example "cuisine"

OBSERVED RESULT
Okular's search can't find the word

EXPECTED RESULT
the word is found

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 5.25
(available in About System)
KDE Plasma Version: 5.25
KDE Frameworks Version: 5.95
Qt Version: 5.15.4

ADDITIONAL INFORMATION
KDE NEON
Comment 1 Albert Astals Cid 2022-06-23 22:50:43 UTC
That's because we think the PDF says CU ISINE; Firefox thinks the same. Recreating text from a PDF is a dark art because all a PDF gives you is each character and its position, so it's sometimes not easy to figure out whether two characters belong to the same word. In this case we're failing, but fixing it could potentially make other places, where we currently detect two words correctly, think it's only one. Needs investigation.
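
To illustrate the trade-off described above, here is a minimal sketch of a gap-based word grouping heuristic. It is not Okular's or Poppler's actual code; the `Glyph` type, the `place` helper, the `gap_ratio` parameter and all coordinates are made up for illustration. It assumes each glyph arrives only as a character plus a horizontal position and width, and starts a new word whenever the gap to the previous glyph exceeds a fraction of that glyph's width.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x: float      # left edge of the glyph box (hypothetical units)
    width: float  # width of the glyph box


def place(chars, gaps, width=10.0):
    """Hypothetical helper: lay out chars left to right.

    gaps[i] is the extra space inserted between chars[i] and chars[i + 1].
    """
    glyphs, x = [], 0.0
    for i, ch in enumerate(chars):
        if i > 0:
            x += gaps[i - 1]
        glyphs.append(Glyph(ch, x, width))
        x += width
    return glyphs


def group_words(glyphs, gap_ratio):
    """Split one line of glyphs into words.

    A new word starts when the gap to the previous glyph is larger
    than gap_ratio * (previous glyph width).
    """
    words, current, prev = [], "", None
    for g in sorted(glyphs, key=lambda g: g.x):
        if prev is not None:
            gap = g.x - (prev.x + prev.width)
            if gap > gap_ratio * prev.width:
                words.append(current)
                current = ""
        current += g.char
        prev = g
    if current:
        words.append(current)
    return words


# One word drawn with a wide gap (3 units) between "U" and "I".
loose_word = place("CUISINE", [0, 3, 0, 0, 0, 0])
# Two words, "to" and "be", drawn with a narrow space (4 units).
tight_words = place("tobe", [0, 4, 0])

for ratio in (0.25, 0.35, 0.45):
    print(ratio, group_words(loose_word, ratio), group_words(tight_words, ratio))
```

With these invented numbers, a ratio of 0.25 splits the single word into "CU" and "ISINE", 0.35 happens to get both cases right, and 0.45 merges "to be" into "tobe". In a real document the gaps of loosely set words and tightly set spaces can overlap, so no single threshold is guaranteed to work everywhere, which is why changing the heuristic needs investigation.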