SUMMARY
Skanpage crashes if OCR is selected when saving the file but tesseract-ocr is not actually installed.

STEPS TO REPRODUCE
1. Start Skanpage
2. Scan something with text
3. When saving the PDF, either leave the option to add OCR checked or check it

OBSERVED RESULT
Skanpage crashes.

EXPECTED RESULT
Skanpage should test whether Tesseract is actually usable. If it is not, it should grey out the OCR option and inform the user that Tesseract is not installed on the system.

SOFTWARE/OS VERSIONS
Operating System: openSUSE Tumbleweed 20230501
KDE Plasma Version: 5.27.4
KDE Frameworks Version: 5.105.0
Qt Version: 5.15.9
Kernel Version: 6.2.12-1-default (64-bit)
Graphics Platform: Wayland
Processors: 6 × Intel® Core™ i5-9600K CPU @ 3.70GHz
Memory: 31,1 GiB of RAM
Graphics Processor: AMD Radeon RX 6600 XT
Manufacturer: Gigabyte Technology Co., Ltd.
Product Name: Z390 GAMING X

ADDITIONAL INFORMATION
Tesseract is required for OCR; if it's not installed, OCR won't work. But if the OCR controls appear anyway despite Tesseract not being installed, that sounds like a bug in the app. I just tested this by removing the `tesseract-devel` package and rebuilding the app, and I correctly don't see the OCR controls.

Maybe the reverse is not working: the app fails to hide the controls at runtime if it was compiled with Tesseract support but the package isn't actually installed on the user's machine. Unfortunately I can't easily test this, as my distro (Fedora KDE) makes Tesseract a mandatory package and it can't be removed at runtime.
(In reply to Nate Graham from comment #1)
> Tesseract is required for OCR; if it's not installed, it won't work. But if
> the OCR controls are appearing anyway despite Tesseract not being installed,
> that sounds like a bug in the app, as it shouldn't be happening. I just
> tested this by removing the `tesseract-devel package` and rebuilding the
> app, and I correctly don't see the OCR controls.
>
> Maybe the reverse is not working, and it fails to hide them at runtime if
> compiled with Tesseract support included but the package isn't actually
> installed on the user's machine. Unfortunately I can't easily test this as
> my distro (Fedora KDE) makes Tesseract a mandatory package and it can't be
> removed at runtime.

I believe this is what is happening. Skanpage is compiled with tesseract support but tesseract is not installed.
This also crashes Skanpage on Debian 12 Bookworm when using "Export PDF." The "Enable optical character recognition (OCR)" option was checked by default (I did not check it) and no languages were listed (not even English). I had installed Skanpage via Discover from the Debian repo. I had not installed tesseract-ocr.

Next I apt installed tesseract-ocr, exited and restarted Skanpage. When I clicked Export PDF the Enable OCR option was checked like before but "American English [eng]" also appeared with an unchecked checkbox beside it. I left "American English [eng]" unchecked and "Enable optical character recognition (OCR)" checked, and was able to generate a PDF without the crash.

I also tried "Save All" instead of "Export PDF" before installing tesseract-ocr, and that worked without crashing. It presumably makes no attempt at OCR.

Operating System: Debian GNU/Linux 12
KDE Plasma Version: 5.27.5
KDE Frameworks Version: 5.103.0
Qt Version: 5.15.8
Kernel Version: 6.1.0-13-amd64 (64-bit)
Graphics Platform: Wayland
Sorry for the late response. You are right, the availability of Tesseract is currently only checked at compile time. When Skanpage is compiled with Tesseract present but the package does not declare Tesseract as a dependency, the result is the observed behavior.
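To illustrate the kind of runtime check discussed above, here is a minimal sketch in Python. It is not Skanpage's code: Skanpage is a C++/Qt application that links against the Tesseract library, whereas this sketch probes the `tesseract` command-line tool as a stand-in. The names `list_ocr_languages`, `ocr_option_state`, and `OcrOption` are hypothetical, and the parsing assumes that `tesseract --list-langs` prints a single header line followed by one language code per line.

```python
import shutil
import subprocess
from dataclasses import dataclass
from typing import List


def list_ocr_languages(executable: str = "tesseract") -> List[str]:
    """Return installed Tesseract language codes, or [] if Tesseract is unusable."""
    path = shutil.which(executable)
    if path is None:
        # The executable is not on PATH, so OCR cannot be offered.
        return []
    try:
        result = subprocess.run(
            [path, "--list-langs"],
            capture_output=True,
            text=True,
            timeout=10,
            check=False,
        )
    except (OSError, subprocess.SubprocessError):
        return []
    if result.returncode != 0:
        return []
    # Assumption: some versions write the list to stderr instead of stdout,
    # and the first line is a header rather than a language code.
    lines = (result.stdout or result.stderr).splitlines()
    return [line.strip() for line in lines[1:] if line.strip()]


@dataclass
class OcrOption:
    enabled: bool  # whether the OCR checkbox can be toggled at all
    checked: bool  # whether the OCR checkbox starts out checked
    hint: str      # message shown to the user when OCR is unavailable


def ocr_option_state(languages: List[str], user_wants_ocr: bool = True) -> OcrOption:
    """Decide how an export dialog could present the OCR option."""
    if not languages:
        # Grey out the option instead of letting the export path reach OCR.
        return OcrOption(
            enabled=False,
            checked=False,
            hint="OCR is unavailable: no Tesseract language data was found.",
        )
    return OcrOption(enabled=True, checked=user_wants_ocr, hint="")
```

A dialog could call `ocr_option_state(list_ocr_languages())` once when it opens, so that the OCR option is never checked by default when no languages are available.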
Tesseract is now mandatory for Skanpage. While Skanpage could still be packaged incorrectly, this crash should no longer happen.