Bug 117629

Summary:	New tool to perform OCR of an image and store result in metadata
Product:	[Applications] digikam	Reporter:	arthur.lutz
Component:	Plugin-Generic-OcrTextConverter	Assignee:	Digikam Developers <digikam-bugs-null>
Status:	RESOLVED FIXED
Severity:	wishlist	CC:	as9902613, caulier.gilles, harshilpatel1973, jose_oliver, Paul.Daily.001
Priority:	NOR
Version:	6.4.0
Target Milestone:	---
Platform:	Manjaro
OS:	Linux
Latest Commit:		Version Fixed In:	8.0.0
Sentry Crash Report:

Description arthur.lutz 2005-12-03 23:58:37 UTC

Version:            (using KDE 3.4.3)
Installed from:    Debian testing/unstable Packages

It would be neat to have a plugin that takes an image, runs OCR on it and,
if any reasonable words come out of it, records them in the EXIF or JFIF metadata.
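
A minimal sketch of the "reasonable words" step, in Python purely for illustration; it is not digiKam code. It assumes an OCR engine has already returned plain text. The function names, the thresholds, and the idea of joining the words into a caption string are all hypothetical.

```python
def reasonable_words(ocr_text, min_len=3, min_alpha_ratio=0.8):
    """Keep tokens that look like real words and drop likely OCR noise.

    min_len and min_alpha_ratio are illustrative thresholds, not tuned values.
    """
    words = []
    for token in ocr_text.split():
        # Trim surrounding punctuation so "Total:" is treated as "Total".
        cleaned = token.strip(".,;:!?()[]\"'")
        if len(cleaned) < min_len:
            continue
        alpha = sum(ch.isalpha() for ch in cleaned)
        if alpha / len(cleaned) >= min_alpha_ratio:
            words.append(cleaned.lower())
    return words


def caption_from_ocr(ocr_text, min_words=2):
    """Return a searchable caption string, or None if too little text survived."""
    words = reasonable_words(ocr_text)
    if len(words) < min_words:
        return None
    # Remove duplicates while keeping first-seen order.
    return " ".join(dict.fromkeys(words))


if __name__ == "__main__":
    print(caption_from_ocr("Invoice #2041 ~~|| Total: 35.00 EUR"))
    print(caption_from_ocr("|| ~~ 1l"))
```

Returning None when too few words survive mirrors the "if any reasonable words come out of it" condition: an image with no usable text would get no metadata written at all.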

It would be perfect for an archive of all the boring documents you receive and then
spend ages looking for: you could keep just a raw scan repository and find things in it
with beagle, kat, or another search program.

Comment 1 caulier.gilles 2008-12-06 19:14:17 UTC

What about this report? I think it is out of scope for digiKam.

Gilles

Comment 2 Andi Clemens 2008-12-06 19:26:25 UTC

I can't think of any practical use. digiKam is mainly a photo management program; at least my photos do not contain text that needs to be extracted :-)

+1 WONTFIX

Andi

Comment 3 Mikolaj Machowski 2008-12-06 21:20:24 UTC

+1 for WONTFIX

Comment 4 Agron 2020-04-13 22:49:56 UTC

Please reopen this bug / WishForANewTool.

Many people take pictures of products, signs, notes, screenshots, etc., basically images with text in them. With OCR we can generate tags or add a description or caption automatically. This would make searching much, much better, just like what Google does with PDF documents.

Even better, it would be nice if OCR could tell a document apart from a photograph. I would like to put documents in a different album.

Comment 5 caulier.gilles 2020-06-06 15:36:41 UTC

I plan to write an external dplugin later to run OCR using the neural-network-based Tesseract engine:

https://github.com/tesseract-ocr/tesseract

Gilles Caulier

Comment 6 ThisIsPaulDaily 2020-11-22 01:18:10 UTC

I have a significant number of "Snapchat" screenshots that it would be great to filter by their text.

I also screenshot many important things, and it would be good to have searchable OCR.

Is there a bounty for this? I recognize that you don't feel the need for it, but as people migrate from Google Photos to local solutions now that it is a paid service, I think you'll find more people looking for this kind of feature.

Comment 7 caulier.gilles 2021-01-23 06:47:54 UTC

The old KDE scanning application Kooka has a Tesseract plugin:

 https://invent.kde.org/graphics/kooka/-/tree/master/plugins/ocr

Gilles Caulier

Comment 8 José Oliver-Didier 2021-03-10 18:47:07 UTC

I noticed that this bug was included in the Summer of Code event, and I believe this would be an extremely helpful tool. Services such as Google Photos and OneDrive Photos leverage such OCR functionality. Usefully, OneDrive displays the recognized text in the info pane. Here are some questions meant to aid the development of such functionality for digiKam:
-	How would the text be stored: using existing metadata fields, or would new fields be created under the digiKam XMP schema namespace?
-	Would the location of the words in the image also be stored so that they can be highlighted like face regions?
-	Would the user be able to correct any OCR errors? 
-	Would it be capable of recognizing a type of document and adding a keyword tag (Example: receipt, screenshot, business card, invoice, bank check, blueprint, etc.)?
-	Beyond printed material (letters, receipts), would it be able to read text in photos in which a sign appears (Example: a street sign)?
-	I am aware this may be beyond the scope of OCR, but it would be interesting if barcodes/QR codes could be read and that information could also be stored within the file. The Metadata Working Group Spec provides for storing barcode regions with type=BarCode (ref page 54, MWG Working group spec 2010)

Comment 9 caulier.gilles 2022-09-08 07:50:29 UTC

Implemented during Google Summer of Code 2022:

https://community.kde.org/GSoC/2022/StatusReports/QuocHungTran

Gilles Caulier