Bug 417274 - Add OCR to PDF
Summary: Add OCR to PDF
Status: REPORTED
Alias: None
Product: okular
Classification: Applications
Component: New backend wishes
Version: unspecified
Platform: Other Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: Okular developers
 
Reported: 2020-02-07 15:42 UTC by sandu7ip
Modified: 2024-06-04 09:27 UTC
CC List: 6 users

Attachments
example pdf (1.74 MB, application/pdf)
2020-02-07 15:42 UTC, sandu7ip

Description sandu7ip 2020-02-07 15:42:31 UTC
Created attachment 125743
example pdf

This is not a bug report but a feature request:

- Make PDFs searchable and annotatable from Okular.

Many PDFs are scanned versions of old manuscripts. For instance, I cannot annotate the attached PDF.
Comment 1 Yuri Chornoivan 2020-02-07 15:58:17 UTC
Hi,

The attached file can certainly be annotated easily with the current version of Okular. Just press F6 and go.

As for the search capabilities, can you try adding an OCR layer to your PDF using gscan2pdf first?

Thanks in advance for your answer.
Comment 2 sandu7ip 2020-02-07 16:26:13 UTC
(In reply to Yuri Chornoivan from comment #1)
> The attached file can certainly be annotated easily with the current version
> of Okular. Just press F6 and go.

Perhaps you can add a note, or draw something, but you can't underline or highlight a line.

> As for the search capabilities, can you try adding an OCR layer to your
> PDF using gscan2pdf first?

I have actually used `ocrmypdf`.

I would prefer it to be incorporated in Okular. Adobe Acrobat does this: it allows one to make a PDF searchable if it isn't.
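
For reference, the external workaround discussed above (adding a text layer with `ocrmypdf` before opening the file in Okular) could be scripted roughly as follows. This is only a sketch, not something Okular does: the function names and the file names `scan.pdf` and `scan-ocr.pdf` are made up for illustration, and it assumes `ocrmypdf` and a Tesseract language pack are installed.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src, dst, language="eng"):
    """Return the argv list for an ocrmypdf run that adds a text layer.

    The helper name and defaults are illustrative assumptions.
    """
    return [
        "ocrmypdf",
        "--skip-text",   # leave pages that already contain text untouched
        "-l", language,  # Tesseract language code, e.g. "eng" or "deu"
        str(Path(src)),
        str(Path(dst)),
    ]


def add_text_layer(src, dst, language="eng"):
    """Run ocrmypdf if it is on PATH; return True only if it succeeded."""
    if shutil.which("ocrmypdf") is None:
        return False  # tool not installed, nothing was done
    result = subprocess.run(build_ocr_command(src, dst, language))
    return result.returncode == 0
```

Calling `add_text_layer("scan.pdf", "scan-ocr.pdf")` would then produce a copy whose text can be searched, and highlighted or underlined, in Okular, assuming the OCR run succeeds.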
Comment 3 Yuri Chornoivan 2020-02-07 16:31:15 UTC
A useful thread in our mailing list:

https://okular-devel.kde.narkive.com/5GHpBFqS/ocr-tool-for-okular
Comment 4 silopolis 2022-07-23 11:01:48 UTC
That'd be awesome!
Comment 5 Manuel López-Ibáñez 2024-06-04 09:27:17 UTC
Apart from searching and performing OCR on a whole document, another use case is using the Area Selection tool to select (parts of) an image and copy the OCRed text within it. Currently the only two options are "Copy to Clipboard" (as an image) and "Save to file..." (as an image).