417274 – Add OCR to PDF

Summary: Add OCR to PDF

Status:	REPORTED

Alias:	None

Product:	okular
Classification:	Applications
Component:	New backend wishes
Version:	unspecified
Platform:	Other Linux

Importance:	NOR wishlist
Target Milestone:	---
Assignee:	Okular developers

Reported:	2020-02-07 15:42 UTC by sandu7ip
Modified:	2024-06-04 09:27 UTC
CC List:	6 users

Attachments
example pdf (1.74 MB, application/pdf) 2020-02-07 15:42 UTC, sandu7ip

Description sandu7ip 2020-02-07 15:42:31 UTC

Created attachment 125743
example pdf

This is not a bug report but a feature request:

- Make PDFs searchable and annotatable from within Okular.

Many PDFs are scanned versions of old manuscripts. For instance, I cannot annotate the attached PDF.

Comment 1 Yuri Chornoivan 2020-02-07 15:58:17 UTC

Hi,

The file attached for sure can be easily annotated with the current version of Okular. Just press F6 and go.

As for the search capabilities, can you try adding an OCR layer to your PDF using gscan2pdf first?

Thanks in advance for your answer.

Comment 2 sandu7ip 2020-02-07 16:26:13 UTC

(In reply to Yuri Chornoivan from comment #1)
> The file attached for sure can be easily annotated with the current version
> of Okular. Just press F6 and go.

Perhaps you can add a note, or draw something, but you can't underline or highlight a line.

> As for the search capabilities, can you try adding an OCR layer to your
> PDF using gscan2pdf first?

I have actually used `ocrmypdf`.

I would prefer this to be built into Okular. Adobe Acrobat does it: it allows one to make a PDF searchable if it isn't already.
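
For reference, the `ocrmypdf` workaround mentioned above can be scripted so that a scanned file gets a text layer before it is opened in Okular. This is only a minimal sketch, assuming `ocrmypdf` is installed and on the PATH; the function names and file names are hypothetical and nothing here is part of Okular.

```python
import shutil
import subprocess


def build_ocr_command(src, dst, language="eng"):
    """Return the argv list that adds a text layer to src and writes dst."""
    # -l selects the OCR language; --skip-text leaves pages that
    # already contain text untouched instead of aborting on them.
    return ["ocrmypdf", "-l", language, "--skip-text", src, dst]


def add_text_layer(src, dst, language="eng"):
    """Run ocrmypdf on src, producing a searchable copy at dst."""
    if shutil.which("ocrmypdf") is None:
        raise FileNotFoundError("ocrmypdf not found on PATH")
    # check=True raises CalledProcessError if ocrmypdf exits non-zero.
    subprocess.run(build_ocr_command(src, dst, language), check=True)
```

Calling `add_text_layer("scan.pdf", "scan_ocr.pdf")` and then opening `scan_ocr.pdf` in Okular should make search and text highlighting available, as long as the OCR recognizes the text.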

Comment 3 Yuri Chornoivan 2020-02-07 16:31:15 UTC

A useful thread in our mailing list:

https://okular-devel.kde.narkive.com/5GHpBFqS/ocr-tool-for-okular

Comment 4 silopolis 2022-07-23 11:01:48 UTC

That'd be awesome!

Comment 5 Manuel López-Ibáñez 2024-06-04 09:27:17 UTC

Apart from search and performing OCR of a whole document, another use case is using Area Selection to select (part of) an image and copy the OCRed text within it. Currently the only two options are "Copy to Clipboard" (as an image) and "Save to file..." (as an image).