| Summary: | Exporting PDF with OCR text recognition based on tesseract doesn't work anymore | ||
|---|---|---|---|
| Product: | [Applications] Skanpage | Reporter: | Nicola Jelmorini <jelmorini> |
| Component: | general | Assignee: | Alexander Stippich <a.stippich> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | ||
| Priority: | NOR | ||
| Version First Reported In: | 24.05.0 | ||
| Target Milestone: | --- | ||
| Platform: | Neon | ||
| OS: | Linux | ||
| Latest Commit: | | Version Fixed/Implemented In: | |
| Sentry Crash Report: | |||
| Attachments: | The window of the functionality "Export PDF" | ||
Description
Nicola Jelmorini
2024-06-15 11:36:07 UTC
Comment 1
Alexander Stippich

Have you checked that you still have the corresponding tesseract language files installed?

Comment 2
Nicola Jelmorini

(In reply to Alexander Stippich from comment #1)
> Have you checked that you still have the corresponding tesseract language
> files installed?

Hi, I have made no changes to my system. The following tesseract packages have been installed on my system since the beginning:

=====================================================================================================
nicola@nicola-XPS-13-9360:~ ➤ apt list --installed | grep tesseract
WARNING: apt does not have a stable CLI interface. Use with caution in scripts.
libtesseract4/jammy,now 4.1.1-2.1build1 amd64 [installato, automatico]
tesseract-ocr-eng/jammy,jammy,now 1:4.00~git30-7274cfa-1.1 all [installato, automatico]
tesseract-ocr-ita/jammy,jammy,now 1:4.00~git30-7274cfa-1.1 all [installato]
tesseract-ocr-osd/jammy,jammy,now 1:4.00~git30-7274cfa-1.1 all [installato, automatico]
tesseract-ocr/jammy,now 4.1.1-2.1build1 amd64 [installato]
nicola@nicola-XPS-13-9360:~
=====================================================================================================

Comment 3
Alexander Stippich

The tesseract dependency was bumped to 5 for 24.05. Is tesseract5 available in Ubuntu 22.04?

Comment 4
Nicola Jelmorini

(In reply to Alexander Stippich from comment #3)
> The tesseract dependency was bumped to 5 for 24.05. Is tesseract5 available
> in Ubuntu 22.04?

Unfortunately no. In Ubuntu 22.04 there is the package "libtesseract4". On the "Ubuntu packages" website I see that the package "libtesseract5" is included starting with Ubuntu 23.10 (mantic). I'm using KDE Neon, which upgrades between LTS editions only, so I suppose that tesseract5 will be available when KDE Neon is upgraded to Ubuntu 24.04 (noble). Or do you know of other viable options?

Comment 5
Alexander Stippich

There is a PPA that should work, but I have not tested it:
https://launchpad.net/~alex-p/+archive/ubuntu/tesseract-ocr5?field.series_filter=jammy

Comment 6
Nicola Jelmorini

Created attachment 171010 [details]
The window of the functionality "Export PDF"
The screenshot shows that there is no language selection in the window for the OCR text recognition.

Comment 7
Nicola Jelmorini

(In reply to Alexander Stippich from comment #5)
> There is a PPA that should work, but I have not tested it:
> https://launchpad.net/~alex-p/+archive/ubuntu/tesseract-ocr5?field.series_filter=jammy

The PPA does indeed install version 5 of tesseract and the languages I need, as you can see here:

=====================================================================================================
nicola@nicola-XPS-13-9360:~ ➤ apt list --installed | grep tesseract
libtesseract5/jammy,now 5.4.1-1ppa1~jammy1 amd64 [installato, automatico]
tesseract-ocr-eng/jammy,jammy,now 1:5.0.0~git39-6572757-2ppa1~jammy1 all [installato, automatico]
tesseract-ocr-ita/jammy,jammy,now 1:5.0.0~git39-6572757-2ppa1~jammy1 all [installato]
tesseract-ocr-osd/jammy,jammy,now 1:5.0.0~git39-6572757-2ppa1~jammy1 all [installato, automatico]
tesseract-ocr/jammy,now 5.4.1-1ppa1~jammy1 amd64 [installato]
nicola@nicola-XPS-13-9360:~ ➤ tesseract --list-langs
List of available languages in "/usr/share/tesseract-ocr/5/tessdata/" (3):
eng
ita
osd
nicola@nicola-XPS-13-9360:~
=====================================================================================================

But unfortunately the issue is still present: no language selection is available for the OCR text recognition. The screenshot "The window of the functionality Export PDF" that I have uploaded shows that the "Export PDF" window is missing the language selection list.

Comment 8
Alexander Stippich

I'm afraid that you have to wait for KDE Neon being rebased to 24.04.

Comment 9
Nicola Jelmorini

(In reply to Alexander Stippich from comment #8)
> I'm afraid that you have to wait for KDE Neon being rebased to 24.04

OK, I understand and I can live with it. The wait shouldn't be too long. Thank you anyway for your support.

Comment 10
Alexander Stippich

Is this still an issue with KDE neon based on 24.04?

Comment 11
Nicola Jelmorini

(In reply to Alexander Stippich from comment #10)
> Is this still an issue with KDE neon based on 24.04?

I'm sorry, I completely forgot to give you feedback after my upgrade to 24.04.
Anyway, I have good news: after the upgrade, this issue is gone 👍. For me, this bug report can be closed now. Thank you.

Comment 12
Alexander Stippich

Thanks for the feedback!
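
For anyone landing here with the same symptom, the version check discussed in this thread can be sketched in shell. This is a minimal sketch, not something taken from Skanpage: the helper names `tesseract_major` and `meets_skanpage_requirement` are hypothetical, and it assumes the first line of `tesseract --version` looks like `tesseract 5.4.1`. The threshold of 5 comes from comment 3.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of
# Skanpage or Tesseract.

# Read `tesseract --version` output on stdin and print the major version.
# Assumes the first line looks like "tesseract 5.4.1" (an optional "v"
# before the number is tolerated).
tesseract_major() {
    sed -n '1s/^tesseract v\{0,1\}\([0-9][0-9]*\).*/\1/p'
}

# Succeed if the given major version is at least 5, the requirement
# stated in this thread for Skanpage 24.05. An empty argument counts as 0.
meets_skanpage_requirement() {
    [ "${1:-0}" -ge 5 ]
}

if command -v tesseract >/dev/null 2>&1; then
    major=$(tesseract --version 2>&1 | tesseract_major)
    if meets_skanpage_requirement "$major"; then
        echo "tesseract major version $major: new enough"
    else
        echo "tesseract major version ${major:-unknown}: older than 5"
    fi
    # Show which language data files the command-line tool can see.
    tesseract --list-langs || echo "could not list languages"
else
    echo "tesseract command not found"
fi
```

Note that, as comment 7 shows, a new enough command-line Tesseract was not sufficient on KDE neon based on 22.04. The language list only came back after the upgrade to the 24.04 base, so treat this check as a first step rather than a full diagnosis.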