Bug 487628 - Baloo file indexer crashes on addition of new file
Summary: Baloo file indexer crashes on addition of new file
Status: RESOLVED FIXED
Alias: None
Product: frameworks-baloo
Classification: Frameworks and Libraries
Component: Baloo File Daemon
Version: 6.2.0
Platform: Arch Linux Linux
Importance: VHI crash
Target Milestone: ---
Assignee: baloo-bugs-null
URL:
Keywords: drkonqi
Duplicates: 488759
Depends on:
Blocks:
 
Reported: 2024-05-27 14:47 UTC by Peter Kreussel
Modified: 2024-06-19 20:59 UTC
CC List: 4 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Description Peter Kreussel 2024-05-27 14:47:32 UTC
Application: baloo_file_extractor (6.2.0)

Qt Version: 6.7.1
Frameworks Version: 6.2.0
Operating System: Linux 6.9.2-arch1-1 x86_64
Windowing System: X11
Distribution: "Arch Linux"
DrKonqi: 6.0.5 [CoredumpBackend]

-- Information about the crash:
Baloo crashes every time I scan a document with Paperwork, which adds a PNG, a JPEG and a text file.

The crash can be reproduced every time.

-- Backtrace:
Application: Baloo-Dateiinfosammler (baloo_file_extractor), signal: Aborted
Content of s_kcrashErrorMessage: std::unique_ptr<char []> = {get() = <optimized out>}
Downloading separate debug info for /usr/lib/kf6/baloo_file_extractor...
[New LWP 238987]
[New LWP 238988]
Downloading separate debug info for /usr/lib/libKF6FileMetaData.so.3...
Downloading separate debug info for /usr/lib/libKF6BalooEngine.so.6...
Downloading separate debug info for /usr/lib/liblmdb.so...
Downloading separate debug info for /usr/lib/libb2.so.1...
Downloading separate debug info for /usr/lib/qt6/plugins/kf6/kfilemetadata/kfilemetadata_pngextractor.so...
Downloading separate debug info for /usr/lib/qt6/plugins/kf6/kfilemetadata/kfilemetadata_exiv2extractor.so...
Downloading separate debug info for /usr/lib/libx265.so.199...
Downloading separate debug info for /usr/lib/qt6/plugins/imageformats/libqpdf.so...
Downloading separate debug info for /usr/lib/qt6/plugins/imageformats/../../../libQt6Pdf.so.6...
Downloading separate debug info for /usr/lib/qt6/plugins/imageformats/../../../libmng.so.2...
Downloading separate debug info for system-supplied DSO at 0x7c88e61b9000...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `/usr/lib/kf6/baloo_file_extractor'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44	      return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
[Current thread is 1 (Thread 0x7c88e12ac980 (LWP 238987))]

Cannot QML trace cores :(
Downloading source file /usr/src/debug/baloo/baloo-6.2.0/src/file/extractor/main.cpp...
Downloading source file /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qtimerinfo_unix.cpp...
Downloading source file /usr/src/debug/qt6-base/build/src/corelib/Core_autogen/7GB2EGQPHR/../../../../../qtbase/src/corelib/kernel/qsingleshottimer_p.h...
Downloading source file /usr/src/debug/qt6-base/build/src/corelib/Core_autogen/7GB2EGQPHR/moc_qsingleshottimer_p.cpp...
Downloading source file /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qobjectdefs_impl.h...
Downloading source file /usr/src/debug/baloo/baloo-6.2.0/src/file/extractor/app.cpp...
Downloading source file /usr/src/debug/kfilemetadata/kfilemetadata-6.2.0/src/extractors/pngextractor.cpp...
Downloading source file /usr/src/debug/qt6-base/qtbase/src/gui/image/qimagereader.cpp...
Downloading source file /usr/src/debug/qt6-base/qtbase/src/corelib/plugin/qfactoryloader.cpp...
Downloading source file /usr/src/debug/qt6-base/qtbase/src/corelib/plugin/qlibrary.cpp...
Downloading source file /usr/src/debug/qtpbfimageplugin/build6/moc_pbfplugin.cpp...
Downloading source file /usr/src/debug/qtpbfimageplugin/build6/../QtPBFImagePlugin-3.0/src/pbfplugin.cpp...
Downloading source file /usr/src/debug/qtpbfimageplugin/build6/../QtPBFImagePlugin-3.0/src/style.cpp...
Downloading source file /usr/src/debug/qtpbfimageplugin/build6/../QtPBFImagePlugin-3.0/src/font.cpp...
Downloading source file /usr/src/debug/qt6-base/qtbase/src/gui/text/qfontdatabase.cpp...
Downloading source file /usr/src/debug/qt6-base/qtbase/src/corelib/global/qlogging.cpp...
Downloading source file /usr/src/debug/qt6-base/qtbase/src/corelib/global/qglobal.cpp...
Downloading source file /usr/src/debug/glibc/glibc/stdlib/abort.c...
Downloading source file /usr/src/debug/glibc/glibc/signal/../sysdeps/posix/raise.c...
[Current thread is 1 (Thread 0x7c88e12ac980 (LWP 238987))]

Thread 2 (Thread 0x7c48e08006c0 (LWP 238988)):
#0  0x00007c88e531c39d in __GI___poll (fds=0x55b64f56b260, nfds=2, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x00007c88e3f768fd in g_main_context_poll_unlocked (priority=2147483647, context=0x7c48d8000c60, timeout=<optimized out>, fds=0x55b64f56b260, n_fds=2) at ../glib/glib/gmain.c:4521
#2  g_main_context_iterate_unlocked.isra.0 (context=context@entry=0x7c48d8000c60, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/glib/gmain.c:4212
#3  0x00007c88e3f13f95 in g_main_context_iteration (context=0x7c48d8000c60, may_block=1) at ../glib/glib/gmain.c:4282
#4  0x00007c88e5ba28bd in QEventDispatcherGlib::processEvents (this=0x7c48d8000b70, flags=...) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qeventdispatcher_glib.cpp:394
#5  0x00007c88e594f0de in QEventLoop::processEvents (this=0x7c48e07ffb60, flags=...) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qeventloop.cpp:100
#6  QEventLoop::exec (this=0x7c48e07ffb60, flags=...) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qeventloop.cpp:182
#7  0x00007c88e5a3a4b0 in QThread::exec (this=this@entry=0x7c88e5212b20 <QGlobalStatic<QtGlobalStatic::Holder<(anonymous namespace)::Q_QGS__q_manager> >::instance()::holder>) at /usr/src/debug/qt6-base/qtbase/src/corelib/global/qflags.h:74
#8  0x00007c88e5189dfe in QDBusConnectionManager::run (this=0x7c88e5212b20 <QGlobalStatic<QtGlobalStatic::Holder<(anonymous namespace)::Q_QGS__q_manager> >::instance()::holder>) at /usr/src/debug/qt6-base/qtbase/src/dbus/qdbusconnectionmanager.cpp:144
#9  0x00007c88e5ac96b7 in operator() (__closure=<optimized out>) at /usr/src/debug/qt6-base/qtbase/src/corelib/thread/qthread_unix.cpp:326
#10 (anonymous namespace)::terminate_on_exception<QThreadPrivate::start(void*)::<lambda()> > (t=<optimized out>) at /usr/src/debug/qt6-base/qtbase/src/corelib/thread/qthread_unix.cpp:262
#11 QThreadPrivate::start (arg=0x7c88e5212b20 <QGlobalStatic<QtGlobalStatic::Holder<(anonymous namespace)::Q_QGS__q_manager> >::instance()::holder>) at /usr/src/debug/qt6-base/qtbase/src/corelib/thread/qthread_unix.cpp:285
#12 0x00007c88e52a6ded in start_thread (arg=<optimized out>) at pthread_create.c:447
#13 0x00007c88e532a0dc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

Thread 1 (Thread 0x7c88e12ac980 (LWP 238987)):
[KCrash Handler]
#5  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#6  0x00007c88e52a8eb3 in __pthread_kill_internal (threadid=<optimized out>, signo=6) at pthread_kill.c:78
#7  0x00007c88e5250a30 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#8  0x00007c88e52384c3 in __GI_abort () at abort.c:79
#9  0x00007c88e588c6b7 in qAbort () at /usr/src/debug/qt6-base/qtbase/src/corelib/global/qglobal.cpp:136
#10 qt_message_fatal<QString&> (context=<optimized out>, message=...) at /usr/src/debug/qt6-base/qtbase/src/corelib/global/qlogging.cpp:2052
#11 qt_message(QtMsgType, const QMessageLogContext &, const char *, typedef __va_list_tag __va_list_tag *) (msgType=msgType@entry=QtFatalMsg, context=..., msg=msg@entry=0x7c88e4f00890 "QFontDatabase: Must construct a QGuiApplication before accessing QFontDatabase", ap=ap@entry=0x7ffe640368c0) at /usr/src/debug/qt6-base/qtbase/src/corelib/global/qlogging.cpp:374
#12 0x00007c88e588ce5b in QMessageLogger::fatal (this=this@entry=0x7ffe640369b0, msg=msg@entry=0x7c88e4f00890 "QFontDatabase: Must construct a QGuiApplication before accessing QFontDatabase") at /usr/src/debug/qt6-base/qtbase/src/corelib/global/qlogging.cpp:889
#13 0x00007c88e48e2220 in QFontDatabasePrivate::ensureFontDatabase () at /usr/src/debug/qt6-base/qtbase/src/gui/text/qfontdatabase.cpp:1333
#14 0x00007c88e4bb5abf in QFontDatabase::families (writingSystem=writingSystem@entry=QFontDatabase::Any) at /usr/src/debug/qt6-base/qtbase/src/gui/text/qfontdatabase.cpp:1434
#15 0x00007c48cf7b9596 in fonts () at ../QtPBFImagePlugin-3.0/src/font.cpp:62
#16 0x00007c48cf7c2835 in matchFamily (family=...) at ../QtPBFImagePlugin-3.0/src/font.cpp:101
#17 Font::fromJsonArray (json=...) at ../QtPBFImagePlugin-3.0/src/font.cpp:123
#18 0x00007c48cf7adba7 in Style::Layer::Layout::Layout (this=0x7ffe64036e50, json=..., this=<optimized out>, json=<optimized out>) at ../QtPBFImagePlugin-3.0/src/style.cpp:369
#19 0x00007c48cf7afd97 in Style::Layer::Layer (this=0x7ffe64037440, json=..., this=<optimized out>, json=<optimized out>) at ../QtPBFImagePlugin-3.0/src/style.cpp:500
#20 0x00007c48cf7b599a in Style::load (this=<optimized out>, fileName=...) at ../QtPBFImagePlugin-3.0/src/style.cpp:598
#21 0x00007c48cf7aaf66 in PBFPlugin::PBFPlugin (this=0x55b64f663120, this=<optimized out>) at ../QtPBFImagePlugin-3.0/src/pbfplugin.cpp:16
#22 0x00007c48cf7c3a57 in qt_plugin_instance () at /usr/src/debug/qtpbfimageplugin/build6/moc_pbfplugin.cpp:127
#23 0x00007c88e5b9f6c3 in QLibraryPrivate::pluginInstance (this=0x55b64f6bdb00) at /usr/src/debug/qt6-base/qtbase/src/corelib/plugin/qlibrary.cpp:516
#24 0x00007c88e59bf02a in QFactoryLoader::instance (this=this@entry=0x7c88e5046da0 <QGlobalStatic<QtGlobalStatic::Holder<QImageReaderWriterHelpers::(anonymous namespace)::Q_QGS_irhLoader> >::instance()::holder>, index=index@entry=29) at /usr/src/debug/qt6-base/qtbase/src/corelib/plugin/qfactoryloader.cpp:555
#25 0x00007c88e493b5b5 in createReadHandlerHelper (device=device@entry=0x55b64f6434d0, format=..., autoDetectImageFormat=true, ignoresFormatAndExtension=false) at /usr/src/debug/qt6-base/qtbase/src/gui/image/qimagereader.cpp:230
#26 0x00007c88e493e1f0 in QImageReaderPrivate::initHandler (this=0x55b64f774830) at /usr/src/debug/qt6-base/qtbase/src/gui/image/qimagereader.cpp:548
#27 0x00007c88e493fb58 in QImageReader::canRead (this=this@entry=0x7ffe64038368) at /usr/src/debug/qt6-base/qtbase/src/gui/image/qimagereader.cpp:1123
#28 0x00007c88e11743ce in KFileMetaData::PngExtractor::extract (this=<optimized out>, result=0x7ffe640386d0) at /usr/src/debug/kfilemetadata/kfilemetadata-6.2.0/src/extractors/pngextractor.cpp:57
#29 0x000055b62c368411 in Baloo::App::index (this=this@entry=0x7ffe64038fd0, tr=0x55b64f558340, url=..., id=id@entry=206564554516954407) at /usr/src/debug/baloo/baloo-6.2.0/src/file/extractor/app.cpp:180
#30 0x000055b62c36976d in Baloo::App::processNextFile (this=0x7ffe64038fd0) at /usr/include/c++/14.1.1/bits/unique_ptr.h:193
#31 0x00007c88e59a17b7 in QtPrivate::QSlotObjectBase::call (this=0x55b64f557210, r=0x7ffe64038fd0, a=0x7ffe64038a58, this=<optimized out>, r=<optimized out>, a=<optimized out>) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qobjectdefs_impl.h:469
#32 doActivate<false> (sender=<optimized out>, signal_index=<optimized out>, argv=<optimized out>) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qobject.cpp:4086
#33 0x00007c88e58cf534 in QSingleShotTimer::timeout (this=0x55b64f53d680) at /usr/src/debug/qt6-base/build/src/corelib/Core_autogen/7GB2EGQPHR/moc_qsingleshottimer_p.cpp:139
#34 QSingleShotTimer::timerEvent (this=0x55b64f53d680) at /usr/src/debug/qt6-base/build/src/corelib/Core_autogen/7GB2EGQPHR/../../../../../qtbase/src/corelib/kernel/qsingleshottimer_p.h:116
#35 0x00007c88e598c089 in QObject::event (this=0x55b64f53d680, e=0x7ffe64038c00) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qobject.cpp:1427
#36 0x00007c88e5944de3 in doNotify (receiver=<optimized out>, event=<optimized out>) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qcoreapplication.cpp:1235
#37 QCoreApplication::notify (this=<optimized out>, receiver=<optimized out>, event=<optimized out>) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qcoreapplication.cpp:1218
#38 QCoreApplication::notifyInternal2 (receiver=0x55b64f53d680, event=0x7ffe64038c00) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qcoreapplication.cpp:1134
#39 0x00007c88e5ac3c08 in QCoreApplication::sendEvent (receiver=<optimized out>, event=0x7ffe64038c00) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qcoreapplication.cpp:1575
#40 QTimerInfoList::activateTimers (this=<optimized out>) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qtimerinfo_unix.cpp:434
#41 0x00007c88e5ba4579 in timerSourceDispatch (source=<optimized out>) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qeventdispatcher_glib.cpp:150
#42 0x00007c88e3f14a89 in g_main_dispatch (context=0x55b64f523730) at ../glib/glib/gmain.c:3344
#43 0x00007c88e3f769b7 in g_main_context_dispatch_unlocked (context=0x55b64f523730) at ../glib/glib/gmain.c:4152
#44 g_main_context_iterate_unlocked.isra.0 (context=context@entry=0x55b64f523730, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/glib/gmain.c:4217
#45 0x00007c88e3f13f95 in g_main_context_iteration (context=0x55b64f523730, may_block=1) at ../glib/glib/gmain.c:4282
#46 0x00007c88e5ba28e2 in QEventDispatcherGlib::processEvents (this=0x55b64f5236d0, flags=...) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qeventdispatcher_glib.cpp:396
#47 0x00007c88e594f0de in QEventLoop::processEvents (this=0x7ffe64038ee0, flags=...) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qeventloop.cpp:100
#48 QEventLoop::exec (this=0x7ffe64038ee0, flags=...) at /usr/src/debug/qt6-base/qtbase/src/corelib/kernel/qeventloop.cpp:182
#49 0x00007c88e594942d in QCoreApplication::exec () at /usr/src/debug/qt6-base/qtbase/src/corelib/global/qflags.h:74
#50 0x000055b62c35e374 in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/baloo/baloo-6.2.0/src/file/extractor/main.cpp:33

Reported using DrKonqi
Comment 1 tagwerk19 2024-06-02 09:16:17 UTC
(In reply to Peter Kreussel from comment #0)
> #25 0x00007c88e493b5b5 in createReadHandlerHelper
> (device=device@entry=0x55b64f6434d0, format=..., autoDetectImageFormat=true,
> ignoresFormatAndExtension=false) at
> /usr/src/debug/qt6-base/qtbase/src/gui/image/qimagereader.cpp:230
> #26 0x00007c88e493e1f0 in QImageReaderPrivate::initHandler
> (this=0x55b64f774830) at
> /usr/src/debug/qt6-base/qtbase/src/gui/image/qimagereader.cpp:548
> #27 0x00007c88e493fb58 in QImageReader::canRead
> (this=this@entry=0x7ffe64038368) at
> /usr/src/debug/qt6-base/qtbase/src/gui/image/qimagereader.cpp:1123
> #28 0x00007c88e11743ce in KFileMetaData::PngExtractor::extract
> (this=<optimized out>, result=0x7ffe640386d0) at
> /usr/src/debug/kfilemetadata/kfilemetadata-6.2.0/src/extractors/pngextractor.cpp:57
Seems likely it is a "strange" .png that is tripping up something down in the Qt internals...

It would make sense to look for a PNG validator. Googling finds me
    http://www.libpng.org/pub/png/apps/pngcheck.html
which seems to be available on Neon (at least)
Comment 2 Peter Kreussel 2024-06-02 13:10:52 UTC
According to pngcheck, all PNGs are "OK" *after* the scan.

This has definitely not happened earlier. I changed nothing on my system except updating qt6 in Arch Linux.
( https://gitlab.archlinux.org/archlinux/packaging/packages/qt6-base/-/commits/main ).

So probably not much KDE devs can do...
Thanks for looking at this, though.
Comment 3 tagwerk19 2024-06-02 20:49:59 UTC
(In reply to Peter Kreussel from comment #2)
> According to pngcheck, all pngs  are "OK" *after* the scan.
Do you mean that the file might have been "half written" (or something) when Baloo scanned it and crashed, but that by the time the scan was finished it was OK according to pngcheck?

Does Baloo fail when you restart it and "touch" the file to trigger a reindex?

If you have a scan (without anything sensitive) that fails, could you attach it to the Bug and I can see if it fails for me?
Comment 4 Peter Kreussel 2024-06-03 08:15:47 UTC
> Do you mean that the file might have been "half written" (or something) and
> Baloo scanned it and crashed - but when the scan was finished, it was OK,
> according to pngcheck.
That's what is suspected. 
But in fact, baloo crashes when I touch *one* of the two PNGs. 

They are called "paper.1.png" (the one causing crash) and "paper.1.edited.png" (no crash) and are identical in md5 sum, strangely enough.
Full path: /home/peter/papers/20240527_1641_38/paper.1.png

It must be a problem with the filename then?

I have no problem attaching a file, except that they are 13 MB in size and I cannot attach them here.
I tried an image hoster, but the files are not identical after download, so that is no use.
Comment 5 tagwerk19 2024-06-03 12:32:58 UTC
(In reply to Peter Kreussel from comment #4)
> It must be a problem with the filename then?  
I've not encountered issues with filenames like "paper.1.png". Not to say that it is not possible, all sorts of things are possible 8-/

Are you indexing hidden files (and is content indexing enabled)? Could it be that a thumbnail is being generated and Baloo is crashing when trying to index that?

You can check whether Baloo has indexed the originals with   
        balooshow -x paper.png
        balooshow -x paper.1.png
although it may be "balooshow6" rather than "balooshow". You can also watch the indexing process with "balooctl monitor".

If you are indexing hidden files/folders, it is sensible to exclude the .cache and .local/share/Trash folders as a follow-on step...

I'll have a go at installing paperwork and exploring. Were you scanning or importing something?
Comment 6 Peter Kreussel 2024-06-03 13:09:24 UTC
(In reply to tagwerk19 from comment #5)
> (In reply to Peter Kreussel from comment #4)
> > It must be a problem with the filename then?  
> I've not encountered issues with filenames like "paper.1.png". Not to say
> that it is not possible, all sorts of things are possible 8-/
Such things are possible, especially for me. :-)

> Are you indexing hidden files? (and content indexing?). Could it be that you
> are getting a thumbnail generated and Baloo is crashing when trying to index
> that...
I do not have "index hidden files" option checked.

> You can check whether Baloo has indexed the originals with   
>         balooshow -x paper.png
>         balooshow -x paper.1.png
> although maybe "balooshow6" rather than "balooshow" and also watch the
> indexing process with "balooctl monitor"
balooshow says the file is not indexed:
balooshow6  /home/peter/papers/20240603_1006_16/paper.1.png 
2e18b8b2f2b7d27 791379239 48335755 /home/peter/papers/20240603_1006_16/paper.1.png: No index information found

> If you are indexing hidden files/folders, it is sensible to exclude the
> .cache and .local/shared/Trash folders as a follow-on step...
I have not enabled indexing of hidden files and folders.

I have now purged my Baloo index and added "/home/peter/papers" as the only indexed folder, which caused zillions of crashes...
 
> I'll have a go at installing paperwork and exploring. Were you scanning or
> importing something?
I was scanning an A4 page.

Strangely enough, all that...
Peter
Comment 7 tagwerk19 2024-06-05 07:32:42 UTC
(In reply to tagwerk19 from comment #5)
> ... I'll have a go at installing paperwork and exploring ...
I've done that although I'm not sure how clean an installation I managed. I failed at the pacstrap step on a fresh install.

All the same, pulling an old installation off a backup and bringing it up-to-date with Plasma 6, installing paperwork, sane, avahi and tesseract, works (for me...)
    
Admittedly that was not with a large collection of test documents, and with a fair number of crashes caught and reported by Paperwork itself, but Baloo indexing seems OK.

Not sure what to suggest.
Comment 8 Peter Kreussel 2024-06-05 07:47:14 UTC
(In reply to tagwerk19 from comment #7)
> (In reply to tagwerk19 from comment #5)
> > ... I'll have a go at installing paperwork and exploring ...
> I've done that although I'm not sure how clean an installation I managed. I
> failed at the pacstrap step on a fresh install.
> 
> All the same, pulling an old installation off a backup and bringing it
> up-to-date with Plasma 6, installing paperwork, sane, avahi and tesseract,
> works (for me...)
>     
> I'd say, not with a large collection of test documents and with a fair
> number of crashes caught and reported by paperwork but Baloo indexing seems
> OK.
> 
> Not sure what to suggest.
Thanks for all your trouble.

What we found out is that Baloo, perhaps only with Qt 6.7.1, crashes on *some* PNG files. On my Arch Linux system...

I have to disable indexing of the Paperwork directory. It would be nice to have the OCR'd text, which lies there in plain text files, in my KDE file search results, but that obviously does not work at the moment.

I am not familiar enough with C++/Qt/KDE programming to go deeper into this.
Comment 9 Stefan Brüns 2024-06-06 00:27:48 UTC
Apparently the KFileMetaData PNG extractor crashes when trying to extract metadata from a file.

The problem is the Qt PBF plugin. Notable lines from the backtrace:

----
#12 0x00007c88e588ce5b in QMessageLogger::fatal (this=this@entry=0x7ffe640369b0, msg=msg@entry=0x7c88e4f00890 "QFontDatabase: Must construct a QGuiApplication before accessing QFontDatabase") at /usr/src/debug/qt6-base/qtbase/src/corelib/global/qlogging.cpp:889
#13 0x00007c88e48e2220 in QFontDatabasePrivate::ensureFontDatabase () at /usr/src/debug/qt6-base/qtbase/src/gui/text/qfontdatabase.cpp:1333
#20 0x00007c48cf7b599a in Style::load (this=<optimized out>, fileName=...) at ../QtPBFImagePlugin-3.0/src/style.cpp:598
#21 0x00007c48cf7aaf66 in PBFPlugin::PBFPlugin (this=0x55b64f663120, this=<optimized out>) at ../QtPBFImagePlugin-3.0/src/pbfplugin.cpp:16
#23 0x00007c88e5b9f6c3 in QLibraryPrivate::pluginInstance (this=0x55b64f6bdb00) at /usr/src/debug/qt6-base/qtbase/src/corelib/plugin/qlibrary.cpp:516
#27 0x00007c88e493fb58 in QImageReader::canRead (this=this@entry=0x7ffe64038368) at /usr/src/debug/qt6-base/qtbase/src/gui/image/qimagereader.cpp:1123
---

IMHO, the Qt PBF plugin is misbehaving here, as it violates the contract from QImageReader:

https://doc.qt.io/qt-6/qimagereader.html#canRead
> canRead() is a lightweight function that only does a quick test to see if the image data is valid.

This is a known problem of the PBF plugin:
https://github.com/tumic0/QtPBFImagePlugin/issues/7

baloo_file_extractor was changed from QGuiApplication to QCoreApplication recently:
https://invent.kde.org/frameworks/baloo/-/merge_requests/192
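
As an illustration only, the failure mode described above can be modelled in standalone C++. This is a rough sketch and not Qt, KFileMetaData or QtPBFImagePlugin code: `guiAppConstructed`, `fontFamilies`, `EagerPlugin`, `LazyPlugin` and `probe` are all hypothetical names invented for the model. It also assumes an exception where Qt actually aborts the process with a fatal message, so that the outcome can be observed instead of crashing.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for "has a GUI application object been constructed?"
bool guiAppConstructed = false;

// Hypothetical stand-in for a font database that is only usable once a GUI
// application exists. Qt aborts in this situation; the model throws instead.
std::vector<std::string> fontFamilies() {
    if (!guiAppConstructed)
        throw std::runtime_error("font database needs a GUI application");
    return {"Sans", "Serif"};
}

// Eager plugin: loads its fonts in the constructor, so merely instantiating
// it fails in a process that has no GUI application.
struct EagerPlugin {
    std::vector<std::string> fonts;
    EagerPlugin() : fonts(fontFamilies()) {}
    bool canRead(const std::string& data) const { return data.rfind("PBF", 0) == 0; }
};

// Lazy plugin: construction and canRead() touch nothing GUI-related; fonts
// are only loaded when something is actually rendered.
struct LazyPlugin {
    std::vector<std::string> fonts;
    bool canRead(const std::string& data) const { return data.rfind("PBF", 0) == 0; }
    void render() { if (fonts.empty()) fonts = fontFamilies(); }
};

// Mimics a format probe: instantiate the plugin, then ask whether it can
// read the data. Returns "can read", "cannot read" or "failed: <reason>".
template <class Plugin>
std::string probe(const std::string& data) {
    try {
        Plugin plugin;
        return plugin.canRead(data) ? "can read" : "cannot read";
    } catch (const std::exception& e) {
        return std::string("failed: ") + e.what();
    }
}
```

With `guiAppConstructed` left at `false` (the QCoreApplication case in this model), `probe<EagerPlugin>("PNG data")` fails even though the data is not a PBF file at all, while `probe<LazyPlugin>("PNG data")` simply reports "cannot read". Setting `guiAppConstructed = true` (the QGuiApplication case) makes the eager variant report "cannot read" as well.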
Comment 10 tagwerk19 2024-06-06 05:49:22 UTC
(In reply to Peter Kreussel from comment #8)
> ... I have to disable  indexing the paperwork directory ...
Or perhaps disable indexing of .png files:

Edit the ~/.config/baloofilerc file and add "*.png" to the list of "exclude filters".

(In reply to Stefan Brüns from comment #9)
> This is a known problem of the PBF plugin:
> https://github.com/tumic0/QtPBFImagePlugin/issues/7
It looks as if a fix there is not going to happen :-/

Seems possible that this will catch people in other areas, not just people using Paperwork. Something that fails for "paper.1.png" and works for an identical (according to the file hash) "paper.1.edited.png" is just a bit too slippery...
Comment 11 David Edmundson 2024-06-07 17:15:41 UTC
Git commit ff9d8d66d24c382f34dbc1d38c36519ad0ae1db5 by David Edmundson.
Committed on 06/06/2024 at 14:19.
Pushed by davidedmundson into branch 'master'.

Revert "[Extractor] Change to QCoreApplication"

This reverts commit e8cf89c912c97d6affb3b3242958747664968226.

M  +1    -0    src/file/extractor/CMakeLists.txt
M  +12   -2    src/file/extractor/main.cpp

https://invent.kde.org/frameworks/baloo/-/commit/ff9d8d66d24c382f34dbc1d38c36519ad0ae1db5
Comment 12 tagwerk19 2024-06-19 20:59:22 UTC
*** Bug 488759 has been marked as a duplicate of this bug. ***