Summary: | Baloo file indexer crashes on addition of new file | ||
---|---|---|---|
Product: | [Frameworks and Libraries] frameworks-baloo | Reporter: | Peter Kreussel <privat> |
Component: | Baloo File Daemon | Assignee: | baloo-bugs-null |
Status: | RESOLVED FIXED | ||
Severity: | crash | CC: | kde, stefan.bruens, tagwerk19, vbhunt |
Priority: | VHI | Keywords: | drkonqi |
Version: | 6.2.0 | ||
Target Milestone: | --- | ||
Platform: | Arch Linux | ||
OS: | Linux | ||
Latest Commit: | https://invent.kde.org/frameworks/baloo/-/commit/ff9d8d66d24c382f34dbc1d38c36519ad0ae1db5 | Version Fixed In: | |
Sentry Crash Report: |
Description
Peter Kreussel
2024-05-27 14:47:32 UTC
(In reply to Peter Kreussel from comment #0)
> #25 0x00007c88e493b5b5 in createReadHandlerHelper (device=device@entry=0x55b64f6434d0, format=..., autoDetectImageFormat=true, ignoresFormatAndExtension=false) at /usr/src/debug/qt6-base/qtbase/src/gui/image/qimagereader.cpp:230
> #26 0x00007c88e493e1f0 in QImageReaderPrivate::initHandler (this=0x55b64f774830) at /usr/src/debug/qt6-base/qtbase/src/gui/image/qimagereader.cpp:548
> #27 0x00007c88e493fb58 in QImageReader::canRead (this=this@entry=0x7ffe64038368) at /usr/src/debug/qt6-base/qtbase/src/gui/image/qimagereader.cpp:1123
> #28 0x00007c88e11743ce in KFileMetaData::PngExtractor::extract (this=<optimized out>, result=0x7ffe640386d0) at /usr/src/debug/kfilemetadata/kfilemetadata-6.2.0/src/extractors/pngextractor.cpp:57
Seems likely it is a "strange" .png that is tripping up something down in the Qt internals...

It would make sense to look for a PNG validator. Googling finds me
http://www.libpng.org/pub/png/apps/pngcheck.html
which seems to be available on Neon (at least)

According to pngcheck, all pngs are "OK" *after* the scan.
This has definitely not happened earlier. I changed nothing on my system except updating qt6 in Arch Linux. ( https://gitlab.archlinux.org/archlinux/packaging/packages/qt6-base/-/commits/main ).
So probably not much KDE devs can do... Thanks for looking at this, though.

(In reply to Peter Kreussel from comment #2)
> According to pngcheck, all pngs are "OK" *after* the scan.
Do you mean that the file might have been "half written" (or something) and Baloo scanned it and crashed - but when the scan was finished, it was OK, according to pngcheck.

Does Baloo fail when you restart it and "touch" the file to trigger a reindex?

If you have a scan (without anything sensitive) that fails, could you attach it to the Bug and I can see if it fails for me?

> Do you mean that the file might have been "half written" (or something) and
> Baloo scanned it and crashed - but when the scan was finished, it was OK,
> according to pngcheck.
That's what is suspected.
But in fact, baloo crashes when I touch *one* of the two PNGs.
They are called "paper.1.png" (the one causing crash) and "paper.1.edited.png" (no crash) and are identical in md5 sum, strangely enough.
Full path: /home/peter/papers/20240527_1641_38/paper.1.png
It must be a problem with the filename then?
I have no problem attaching a file, except that they are 13 MB in size and I cannot attach them here.
I tried an image hoster, but the files are not identical after download, so that is no use.
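
For anyone repeating the comparison described above, here is a minimal sketch of checking whether two files are byte-identical by digest. It is not part of the original report: the function names `file_digest` and `same_content` are hypothetical, and SHA-256 is used here in place of the md5 sum mentioned in the comment.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True if both files hash to the same digest (names are not compared)."""
    return file_digest(path_a) == file_digest(path_b)
```

Calling `same_content("paper.1.png", "paper.1.edited.png")` would confirm the observation that the two scans differ only in filename; running it on a copy downloaded from an image hoster would show whether the hoster altered the file.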
(In reply to Peter Kreussel from comment #4)
> It must be a problem with the filename then?
I've not encountered issues with filenames like "paper.1.png". Not to say that it is not possible, all sorts of things are possible 8-/

Are you indexing hidden files? (and content indexing?). Could it be that you are getting a thumbnail generated and Baloo is crashing when trying to index that...

You can check whether Baloo has indexed the originals with
    balooshow -x paper.png
    balooshow -x paper.1.png
although maybe "balooshow6" rather than "balooshow" and also watch the indexing process with "balooctl monitor"

If you are indexing hidden files/folders, it is sensible to exclude the .cache and .local/share/Trash folders as a follow-on step...

I'll have a go at installing paperwork and exploring. Were you scanning or importing something?

(In reply to tagwerk19 from comment #5)
> (In reply to Peter Kreussel from comment #4)
> > It must be a problem with the filename then?
> I've not encountered issues with filenames like "paper.1.png". Not to say
> that it is not possible, all sorts of things are possible 8-/
Such things are possible, especially for me. :-)
> Are you indexing hidden files? (and content indexing?). Could it be that you
> are getting a thumbnail generated and Baloo is crashing when trying to index
> that...
I do not have the "index hidden files" option checked.
> You can check whether Baloo has indexed the originals with
> balooshow -x paper.png
> balooshow -x paper.1.png
> although maybe "balooshow6" rather than "balooshow" and also watch the
> indexing process with "balooctl monitor"
balooshow says the file is not indexed:
balooshow6 /home/peter/papers/20240603_1006_16/paper.1.png
2e18b8b2f2b7d27 791379239 48335755 /home/peter/papers/20240603_1006_16/paper.1.png: No index information found
> If you are indexing hidden files/folders, it is sensible to exclude the
> .cache and .local/share/Trash folders as a follow-on step...
I have not enabled indexing of hidden files and folders.
I have now purged my Baloo index and added "/home/peter/papers" as the only indexed folder, which caused zillions of crashes...
> I'll have a go at installing paperwork and exploring. Were you scanning or
> importing something?
I was scanning an A4 page.
Strange enough, all that..
Peter

(In reply to tagwerk19 from comment #5)
> ... I'll have a go at installing paperwork and exploring ...
I've done that although I'm not sure how clean an installation I managed. I failed at the pacstrap step on a fresh install.

All the same, pulling an old installation off a backup and bringing it up-to-date with Plasma 6, installing paperwork, sane, avahi and tesseract, works (for me...)

I'd say, not with a large collection of test documents and with a fair number of crashes caught and reported by paperwork but Baloo indexing seems OK.

Not sure what to suggest.

(In reply to tagwerk19 from comment #7)
> (In reply to tagwerk19 from comment #5)
> > ... I'll have a go at installing paperwork and exploring ...
> I've done that although I'm not sure how clean an installation I managed. I
> failed at the pacstrap step on a fresh install.
>
> All the same, pulling an old installation off a backup and bringing it
> up-to-date with Plasma 6, installing paperwork, sane, avahi and tesseract,
> works (for me...)
>
> I'd say, not with a large collection of test documents and with a fair
> number of crashes caught and reported by paperwork but Baloo indexing seems
> OK.
>
> Not sure what to suggest.
Thanks for all your trouble.
What we found out is that baloo, perhaps only with Qt 6.7.1, crashes on *some* png files. On my Arch Linux system...
I have to disable indexing the paperwork directory. Would be nice to have the OCR'd text, which lies there in plain text files, in my KDE filesearch results, but that obviously does not work at the moment.
I am not enough into C++/Qt/KDE programming to go deeper into this.
Apparently the KFileMetaData PNG extractor crashes when trying to extract metadata from a file. The problem is the Qt PBF plugin, notable lines from the backtrace:
----
#12 0x00007c88e588ce5b in QMessageLogger::fatal (this=this@entry=0x7ffe640369b0, msg=msg@entry=0x7c88e4f00890 "QFontDatabase: Must construct a QGuiApplication before accessing QFontDatabase") at /usr/src/debug/qt6-base/qtbase/src/corelib/global/qlogging.cpp:889
#13 0x00007c88e48e2220 in QFontDatabasePrivate::ensureFontDatabase () at /usr/src/debug/qt6-base/qtbase/src/gui/text/qfontdatabase.cpp:1333
#20 0x00007c48cf7b599a in Style::load (this=<optimized out>, fileName=...) at ../QtPBFImagePlugin-3.0/src/style.cpp:598
#21 0x00007c48cf7aaf66 in PBFPlugin::PBFPlugin (this=0x55b64f663120, this=<optimized out>) at ../QtPBFImagePlugin-3.0/src/pbfplugin.cpp:16
#23 0x00007c88e5b9f6c3 in QLibraryPrivate::pluginInstance (this=0x55b64f6bdb00) at /usr/src/debug/qt6-base/qtbase/src/corelib/plugin/qlibrary.cpp:516
#27 0x00007c88e493fb58 in QImageReader::canRead (this=this@entry=0x7ffe64038368) at /usr/src/debug/qt6-base/qtbase/src/gui/image/qimagereader.cpp:1123
---
IMHO, the Qt PBF plugin is misbehaving here, as it violates the contract from QImageReader: https://doc.qt.io/qt-6/qimagereader.html#canRead
> canRead() is a lightweight function that only does a quick test to see if the image data is valid.

This is a known problem of the PBF plugin:
https://github.com/tumic0/QtPBFImagePlugin/issues/7

baloo_file_extractor was changed from QGuiApplication to QCoreApplication recently:
https://invent.kde.org/frameworks/baloo/-/merge_requests/192

(In reply to Peter Kreussel from comment #8)
> ... I have to disable indexing the paperwork directory ...
Or perhaps disable indexing of .png's

Edit the ~/.config/baloofilerc file and add a "*.png" to the list of "exclude filters"

(In reply to Stefan Brüns from comment #9)
> This is a known problem of the PBF plugin:
> https://github.com/tumic0/QtPBFImagePlugin/issues/7
It looks as if a fix there is not going to happen :-/

Seems possible that this will catch people in other areas, not just people using Paperwork. Something that fails for "paper.1.png" and works for an identical (according to the file hash) "paper.1.edited.png" is just a bit too slippery...

Git commit ff9d8d66d24c382f34dbc1d38c36519ad0ae1db5 by David Edmundson.
Committed on 06/06/2024 at 14:19.
Pushed by davidedmundson into branch 'master'.

Revert "[Extractor] Change to QCoreApplication"

This reverts commit e8cf89c912c97d6affb3b3242958747664968226.

M +1 -0 src/file/extractor/CMakeLists.txt
M +12 -2 src/file/extractor/main.cpp

https://invent.kde.org/frameworks/baloo/-/commit/ff9d8d66d24c382f34dbc1d38c36519ad0ae1db5

*** Bug 488759 has been marked as a duplicate of this bug. ***
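
For the workaround suggested earlier in the thread (adding "*.png" to the "exclude filters" list in ~/.config/baloofilerc), the list edit itself can be sketched as below. This is an illustration only: it assumes the value is a comma-separated list of glob patterns, the function name `add_exclude_filter` is hypothetical, and it works on a string rather than touching the config file.

```python
def add_exclude_filter(current_value, pattern):
    """Return the filter list with `pattern` appended, unless already present.

    Assumes `current_value` is a comma-separated list of glob patterns.
    """
    # Drop empty entries so an empty or trailing-comma value is handled cleanly.
    filters = [item.strip() for item in current_value.split(",") if item.strip()]
    if pattern not in filters:
        filters.append(pattern)
    return ",".join(filters)
```

For example, `add_exclude_filter("*~,*.part", "*.png")` gives `"*~,*.part,*.png"`, and applying it a second time leaves the value unchanged.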