Bug 135253

Summary: add support for Tesseract-OCR
Product: [Applications] kooka Reporter: Marc Collin <marc.collin>
Component: general Assignee: Klaas Freitag <freitag>
Status: CONFIRMED ---    
Severity: wishlist CC: esigra
Priority: NOR    
Version: 0.44   
Target Milestone: ---   
Platform: unspecified   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:

Description Marc Collin 2006-10-07 16:03:21 UTC
Version:           0.44 (using KDE 3.5.4 "release 88.1" , openSUSE )
Compiler:          Target: x86_64-suse-linux
OS:                Linux (x86_64) release 2.6.18-5-default

HP has released a very good OCR engine as open source:

Tesseract OCR

It would be very nice if Kooka supported it.

http://sourceforge.net/projects/tesseract-ocr/
Comment 1 ralf@skolelinux.de 2008-02-19 20:46:49 UTC
I agree. tesseract-ocr is now a Debian package.
It is the recognition engine used by Google's OCRopus project.

try: apt-get install tesseract-ocr
then run tesseract with no arguments to see its usage (the binary is named tesseract, not tesseract-ocr)

call: tesseract inputfilename outputbase (the recognized text is written to outputbase.txt)
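
For anyone who wants to try the call above from a script before Kooka supports it, here is a minimal sketch in Python. It assumes the tesseract binary is on PATH and relies on the fact that Tesseract appends .txt to the output base name. The function names (build_tesseract_command, output_text_path, run_ocr) are made up for this example and are not part of Kooka or Tesseract.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base):
    """Return the argument list for: tesseract <image> <outputbase>."""
    return ["tesseract", str(image_path), str(output_base)]


def output_text_path(output_base):
    """Tesseract appends '.txt' to the output base, so mirror that here."""
    return Path(str(output_base) + ".txt")


def run_ocr(image_path, output_base):
    """Run tesseract on one image and return the recognized text."""
    # Fail early with a clear message if the binary is not installed.
    if shutil.which("tesseract") is None:
        raise FileNotFoundError("tesseract executable not found on PATH")
    # check=True raises CalledProcessError if tesseract exits with an error.
    subprocess.run(build_tesseract_command(image_path, output_base), check=True)
    return output_text_path(output_base).read_text(encoding="utf-8")
```

Calling run_ocr("scan.tif", "result") would run tesseract scan.tif result and then read result.txt. Which image formats are accepted depends on the installed Tesseract version.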
Comment 2 Mike Anderton 2008-02-24 07:17:29 UTC
I came here to file this wish as well. We definitely need Tesseract OCR. It was developed by HP, and Google later picked it up.
Comment 3 Viesturs Zarins 2008-02-24 14:54:52 UTC
*** This bug has been confirmed by popular vote. ***
Comment 4 Mike Anderton 2008-03-03 06:10:41 UTC
OCRopus is a document analysis and OCR program that uses Tesseract as a block-wise recognizer.

http://code.google.com/p/ocropus/

Since Kooka does not do document analysis, it should piggyback on OCRopus and let that handle character recognition with Tesseract.
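
To make the "block-wise recognizer" idea concrete, here is a rough sketch in Python of how a layout-analysis step and a per-block recognizer could fit together. Everything in it is hypothetical: the Block type, the recognize_page function, and the naive top-to-bottom, left-to-right reading order are assumptions for illustration, not how OCRopus or Kooka actually work.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Block:
    """A text region found by layout analysis (hypothetical structure)."""
    top: int     # y coordinate of the block's upper edge, in pixels
    left: int    # x coordinate of the block's left edge, in pixels
    image: str   # path to the cropped image of this block


def recognize_page(blocks: List[Block], recognize: Callable[[Block], str]) -> str:
    """Recognize each block separately and join the results in reading order.

    `recognize` is whatever engine handles a single block, for example a
    wrapper around the tesseract command line.
    """
    # Naive reading order: top to bottom, then left to right.
    ordered = sorted(blocks, key=lambda b: (b.top, b.left))
    return "\n\n".join(recognize(b).strip() for b in ordered)
```

The point of the split is that the recognizer only ever sees one block at a time, so the layout analysis and the recognition engine can be swapped independently.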