Bug 321869 - Can not find words or copy text in some PDF files.
Status: RESOLVED UPSTREAM
Alias: None
Product: okular
Classification: Applications
Component: PDF backend
Version: 0.16.4
Platform: Fedora RPMs Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Okular developers
URL:
Keywords: investigated, triaged
Depends on:
Blocks:
 
Reported: 2013-07-02 18:33 UTC by Mark van Rossum
Modified: 2018-09-19 14:23 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
pdf affected by the bug (113.17 KB, application/pdf)
2014-01-06 14:57 UTC, zaharid
Tex source of file affected by bug (5.02 KB, text/x-tex)
2014-01-06 18:41 UTC, zaharid
A file with T1 fonts, compiled from attachment 84485 (175.16 KB, application/octet-stream)
2014-01-06 18:57 UTC, Yuri Chornoivan

Description Mark van Rossum 2013-07-02 18:33:15 UTC
Hi
I've started using Okular instead of acroread, but noticed the following:

I now have a PDF file that displays correctly, but I cannot search it. The search simply returns nothing.

Copying normal text and pasting it into an editor gave "❈ ❇♦✉❝s❡✐♥" (i.e. symbol-font characters).
The PDF was created with pdfsam-console (Ver. 2.4.0e) / iText 2.1.7 by 1T3XT.

Acroread can both find and copy text from the file.
Unfortunately the file is somewhat confidential, so I can't attach it here, but I'm happy
to send it to a developer.


Reproducible: Always
Comment 1 Albert Astals Cid 2013-07-02 18:45:55 UTC
You can send it to me if you want.
Comment 2 zaharid 2014-01-06 14:57:00 UTC
Created attachment 84480
pdf affected by the bug
Comment 3 zaharid 2014-01-06 14:58:42 UTC
This bug affects me as well, on some PDFs generated by pdflatex (but not all), as well as others. May have to do with the fonts.
Comment 4 Yuri Chornoivan 2014-01-06 15:23:34 UTC
(In reply to comment #3)
> This bug affects me as well, on some PDFs generated by pdflatex (but not
> all), as well as others. May have to do with the fonts.

The document is unsearchable even in commercial software like Foxit PDF Reader.

Can you try to use

\usepackage[T1]{fontenc}

in the preamble?
Comment 5 zaharid 2014-01-06 18:40:35 UTC
\usepackage[T1]{fontenc}
is already in the preamble.
The file is searchable with other viewers, such as the Firefox viewer, so it's a bug in Okular _and_ Foxit.
Comment 6 zaharid 2014-01-06 18:41:29 UTC
Created attachment 84485
Tex source of file affected by bug

This is exported by LyX.
Comment 7 Yuri Chornoivan 2014-01-06 18:57:24 UTC
Created attachment 84486
A file with T1 fonts, compiled from attachment 84485

The attached TeX source compiles into a perfectly searchable PDF if and only if the TeX distribution has Type 1 fonts (not Type 3, as in attachment 84480). Any modern TeX distribution has these fonts.

The file was compiled with the default TeX Live installation, without any changes (Mageia 3, TeX Live 2012).
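
To check which font types a given PDF actually embeds, one option is the pdffonts tool from poppler-utils, which prints a table with one row per font. The sketch below is only an illustration and not part of the original report: the helper names (parse_pdffonts, type3_fonts, run_pdffonts) are hypothetical, and it assumes the usual pdffonts layout of a header line, a line of dashes marking the column widths, and then the font rows. Very long font names can shift the columns, which this sketch does not handle.

```python
import re
import subprocess


def parse_pdffonts(output):
    """Return (name, type) pairs from the text table printed by pdffonts.

    Assumes line 1 is the header and line 2 is a row of dashes whose
    runs mark the column widths (name first, type second).
    """
    lines = output.splitlines()
    if len(lines) < 2:
        return []
    # Column boundaries are taken from the runs of dashes on line 2.
    spans = [m.span() for m in re.finditer(r"-+", lines[1])]
    if len(spans) < 2:
        return []
    fonts = []
    for row in lines[2:]:
        if not row.strip():
            continue
        name = row[spans[0][0]:spans[0][1]].strip()
        font_type = row[spans[1][0]:spans[1][1]].strip()
        fonts.append((name, font_type))
    return fonts


def type3_fonts(output):
    """Names of fonts that pdffonts reports as Type 3."""
    return [name for name, font_type in parse_pdffonts(output)
            if font_type == "Type 3"]


def run_pdffonts(path):
    """Run pdffonts on a file (requires poppler-utils to be installed)."""
    result = subprocess.run(["pdffonts", path], capture_output=True,
                            text=True, check=True)
    return result.stdout
```

For example, type3_fonts(run_pdffonts("article.pdf")) would list the Type 3 fonts in that file. A non-empty result suggests the document was built with Type 3 fonts, as described above for attachment 84480.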
Comment 8 Luigi Toscano 2014-05-08 16:42:47 UTC
So, it seems that when you install cm-super as Yuri suggested, something is fixed in the file and it works, but still the old file should work somehow. Could you please open an upstream poppler bug (with article.pdf at least)?
Comment 9 genet 2015-10-27 09:57:12 UTC
I had the same issue, and installing cm-super fixed it. Was this bug reported upstream? Thanks anyway for the workaround.
Comment 10 Andrew Crouthamel 2018-09-19 14:23:42 UTC
This bug has had its resolution changed, but accidentally has been left in NEEDSINFO status. I am thus closing this bug and setting the status as RESOLVED to reflect the resolution change.