Bug 421317 - Baloo is crashing at startup
Summary: Baloo is crashing at startup
Status: CONFIRMED
Alias: None
Product: frameworks-kfilemetadata
Classification: Frameworks and Libraries
Component: general (other bugs)
Version First Reported In: 5.68.0
Platform: Compiled Sources Linux
Importance: NOR crash
Target Milestone: ---
Assignee: Pinak Ahuja
URL:
Keywords: drkonqi
Depends on:
Blocks:
 
Reported: 2020-05-11 07:12 UTC by DaBler
Modified: 2023-11-19 19:23 UTC
CC List: 4 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments

Description DaBler 2020-05-11 07:12:54 UTC
Application: baloo_file_extractor (5.67.0)
 (Compiled from sources)
Qt Version: 5.14.1
Frameworks Version: 5.67.0
Operating System: Linux 5.4.38-gentoo-10 x86_64
Distribution: Gentoo/Linux

-- Information about the crash:
Baloo crashes at startup, every time. I just log in to KDE and it crashes.

The crash can be reproduced every time.

-- Backtrace:
Application: Baloo File Extractor (baloo_file_extractor), signal: Aborted
Using host libthread_db library "/lib64/libthread_db.so.1".
[Current thread is 1 (Thread 0x7f05dd98d7c0 (LWP 5507))]

Thread 2 (Thread 0x7ec5dcc8c700 (LWP 5520)):
#0  __GI___libc_read (nbytes=16, buf=0x7ec5dcc8bb90, fd=13) at ../sysdeps/unix/sysv/linux/read.c:26
#1  __GI___libc_read (fd=13, buf=buf@entry=0x7ec5dcc8bb90, nbytes=nbytes@entry=16) at ../sysdeps/unix/sysv/linux/read.c:24
#2  0x00007f05df78c36f in read (__nbytes=16, __buf=0x7ec5dcc8bb90, __fd=<optimized out>) at /usr/include/bits/unistd.h:44
#3  g_wakeup_acknowledge (wakeup=0x55bec52ddbc0) at ../glib-2.62.6/glib/gwakeup.c:210
#4  0x00007f05df744f4e in g_main_context_check (context=context@entry=0x7ec5d8000c20, max_priority=2147483647, fds=fds@entry=0x7ec5d8005240, n_fds=n_fds@entry=1) at ../glib-2.62.6/glib/gmain.c:3732
#5  0x00007f05df745392 in g_main_context_iterate (context=context@entry=0x7ec5d8000c20, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib-2.62.6/glib/gmain.c:3951
#6  0x00007f05df74550f in g_main_context_iteration (context=0x7ec5d8000c20, may_block=may_block@entry=1) at ../glib-2.62.6/glib/gmain.c:4015
#7  0x00007f05e0b15b13 in QEventDispatcherGlib::processEvents (this=0x7ec5d8000b60, flags=...) at kernel/qeventdispatcher_glib.cpp:425
#8  0x00007f05e0ac2593 in QEventLoop::exec (this=this@entry=0x7ec5dcc8bdb0, flags=..., flags@entry=...) at ../../include/QtCore/../../src/corelib/global/qflags.h:136
#9  0x00007f05e0921c7e in QThread::exec (this=this@entry=0x7f05e1b13da0 <(anonymous namespace)::Q_QGS__q_manager::innerFunction()::holder>) at ../../include/QtCore/../../src/corelib/global/qflags.h:118
#10 0x00007f05e1a93585 in QDBusConnectionManager::run (this=0x7f05e1b13da0 <(anonymous namespace)::Q_QGS__q_manager::innerFunction()::holder>) at qdbusconnection.cpp:179
#11 0x00007f05e0922e1b in QThreadPrivate::start (arg=0x7f05e1b13da0 <(anonymous namespace)::Q_QGS__q_manager::innerFunction()::holder>) at thread/qthread_unix.cpp:342
#12 0x00007f05dffc4fa7 in start_thread (arg=<optimized out>) at pthread_create.c:479
#13 0x00007f05e051edbf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7f05dd98d7c0 (LWP 5507)):
[KCrash Handler]
#7  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#8  0x00007f05e044755b in __GI_abort () at abort.c:79
#9  0x00007f05e0680745 in __gnu_cxx::__verbose_terminate_handler () at /var/tmp/portage/sys-devel/gcc-9.3.0/work/gcc-9.3.0/libstdc++-v3/libsupc++/vterminate.cc:95
#10 0x00007f05e06b0646 in __cxxabiv1::__terminate (handler=<optimized out>) at /var/tmp/portage/sys-devel/gcc-9.3.0/work/gcc-9.3.0/libstdc++-v3/libsupc++/eh_terminate.cc:48
#11 0x00007f05e06b06b1 in std::terminate () at /var/tmp/portage/sys-devel/gcc-9.3.0/work/gcc-9.3.0/libstdc++-v3/libsupc++/eh_terminate.cc:58
#12 0x00007f05e06b0903 in __cxxabiv1::__cxa_throw (obj=<optimized out>, tinfo=0x7f05e0844c28 <typeinfo for std::bad_alloc>, dest=0x7f05e06aec70 <std::bad_alloc::~bad_alloc()>) at /var/tmp/portage/sys-devel/gcc-9.3.0/work/gcc-9.3.0/libstdc++-v3/libsupc++/eh_throw.cc:95
#13 0x00007f05e08e8f16 in qBadAlloc () at /usr/lib/gcc/x86_64-pc-linux-gnu/8.3.0/include/g++-v8/bits/exception.h:63
#14 0x00007f05e08ee699 in QString::QString (this=0x7ffcd932cb88, size=<optimized out>) at text/qstring.cpp:2158
#15 0x00007f05e0b193a6 in QUtf8::convertToUnicode (chars=0x7ec3d81f6010 "", len=-2145246092, state=0x7ffcd932cc80) at codecs/qutfcodec.cpp:561
#16 0x00007f05e0b19b75 in QUtf8Codec::convertToUnicode (this=<optimized out>, chars=<optimized out>, len=<optimized out>, state=<optimized out>) at codecs/qutfcodec.cpp:993
#17 0x00007f05dd28e95c in QTextCodec::toUnicode (state=0x7ffcd932cc80, length=<optimized out>, in=<optimized out>, this=0x55bec52cc4e0) at /usr/include/qt5/QtCore/qtextcodec.h:114
#18 KFileMetaData::PlainTextExtractor::extract (this=<optimized out>, result=0x7ffcd932cdf0) at /var/tmp/portage/kde-frameworks/kfilemetadata-5.67.0/work/kfilemetadata-5.67.0/src/extractors/plaintextextractor.cpp:89
#19 0x000055bec473f96f in Baloo::App::index (this=this@entry=0x7ffcd932d540, tr=0x55bec535ab80, url=..., id=id@entry=239817586395644676) at /var/tmp/portage/kde-frameworks/baloo-5.67.0-r1/work/baloo-5.67.0/src/file/extractor/app.cpp:193
#20 0x000055bec47418cb in Baloo::App::processNextFile (this=0x7ffcd932d540) at /var/tmp/portage/kde-frameworks/baloo-5.67.0-r1/work/baloo-5.67.0/src/file/extractor/app.cpp:112
#21 0x00007f05e0aedee5 in QObject::event (this=0x7ffcd932d540, e=0x55bec52e49d0) at kernel/qobject.cpp:1339
#22 0x00007f05e14705be in QApplicationPrivate::notify_helper (this=this@entry=0x55bec52cb0f0, receiver=receiver@entry=0x7ffcd932d540, e=e@entry=0x55bec52e49d0) at kernel/qapplication.cpp:3684
#23 0x00007f05e1477620 in QApplication::notify (this=0x7ffcd932d530, receiver=0x7ffcd932d540, e=0x55bec52e49d0) at kernel/qapplication.cpp:3430
#24 0x00007f05e0ac37b1 in QCoreApplication::notifyInternal2 (receiver=0x7ffcd932d540, event=0x55bec52e49d0) at kernel/qcoreapplication.cpp:1092
#25 0x00007f05e0ac6399 in QCoreApplicationPrivate::sendPostedEvents (receiver=0x0, event_type=0, data=0x55bec52cb260) at kernel/qcoreapplication.cpp:1832
#26 0x00007f05e0b15d33 in postEventSourceDispatch (s=0x55bec52dd870) at kernel/qeventdispatcher_glib.cpp:277
#27 0x00007f05df7451f7 in g_main_dispatch (context=0x55bec52ddcd0) at ../glib-2.62.6/glib/gmain.c:3216
#28 g_main_context_dispatch (context=context@entry=0x55bec52ddcd0) at ../glib-2.62.6/glib/gmain.c:3881
#29 0x00007f05df745480 in g_main_context_iterate (context=context@entry=0x55bec52ddcd0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib-2.62.6/glib/gmain.c:3954
#30 0x00007f05df74550f in g_main_context_iteration (context=0x55bec52ddcd0, may_block=may_block@entry=1) at ../glib-2.62.6/glib/gmain.c:4015
#31 0x00007f05e0b15afc in QEventDispatcherGlib::processEvents (this=0x55bec52dcf80, flags=...) at kernel/qeventdispatcher_glib.cpp:423
#32 0x00007f05e0ac2593 in QEventLoop::exec (this=this@entry=0x7ffcd932d490, flags=..., flags@entry=...) at ../../include/QtCore/../../src/corelib/global/qflags.h:136
#33 0x00007f05e0aca532 in QCoreApplication::exec () at ../../include/QtCore/../../src/corelib/global/qflags.h:118
#34 0x000055bec473e51c in main (argc=<optimized out>, argv=<optimized out>) at /var/tmp/portage/kde-frameworks/baloo-5.67.0-r1/work/baloo-5.67.0/src/file/extractor/main.cpp:59
[Inferior 1 (process 5507) detached]

Possible duplicates by query: bug 420615, bug 420414, bug 419798, bug 419788, bug 418804.

Reported using DrKonqi
Comment 1 Stefan Brüns 2020-05-11 10:41:11 UTC
The extractor from KFileMetadata is crashing.

'balooctl failed' will tell which file causes the crash.
Comment 2 DaBler 2020-05-11 11:21:37 UTC
(In reply to Stefan Brüns from comment #1)
> The extractor from KFileMetadata is crashing.
> 
> 'balooctl failed' will tell which file causes the crash.

$ balooctl failed
All Files were indexed successfully
Comment 3 Bug Janitor Service 2020-05-26 04:33:11 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 4 Christoph Feck 2020-06-05 00:04:15 UTC
From the backtrace, it looks like there is a text file larger than 2 GiB. Qt only supports up to 2 GiB.

I suggest adding incremental loading to the plaintextextractor.
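
As a rough check of that reading: the negative `len=-2145246092` in frame #15 is what a byte count slightly above 2 GiB would look like after being squeezed into a signed 32-bit integer, which is the type Qt 5 uses for string and byte-array sizes. The Python sketch below is arithmetic only, not code from Qt or KFileMetaData. The helper name `wrap_int32` is made up for illustration, and the calculation assumes the real length wrapped around exactly once.

```python
def wrap_int32(n):
    """Reinterpret an integer as a signed 32-bit value (two's complement)."""
    n &= 0xFFFFFFFF
    return n - (1 << 32) if n >= (1 << 31) else n


# 'len' argument shown in frame #15 of the backtrace.
reported_len = -2145246092

# Assumption: the true byte count overflowed a signed 32-bit int exactly once.
candidate = reported_len + (1 << 32)

print(candidate)               # 2149721204 bytes
print(candidate - (1 << 31))   # 2237556 bytes beyond the 2 GiB mark
print(wrap_int32(candidate))   # -2145246092, the value seen in the backtrace
```

Under that assumption the buffer handed to the codec was about 2.1 MiB larger than 2 GiB, which is consistent with the `std::bad_alloc` raised while constructing the `QString`.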
Comment 5 al F 2021-02-06 12:17:26 UTC
HP ZBook 15 G3 with a fresh install of Kubuntu 20.04.
This bug seems to be affecting me: after logging in to Plasma, baloo_file_extractor crashes repeatedly. The crash notifications disrupt my workflow more than the crash itself.

~$ balooctl failed
All Files were indexed successfully

The Crash Reporting Assistant wants me to create a backtrace, but I do not understand how to do that. I can't find -dbg or -dbgsym packages for any of the libraries listed. The guide at https://community.kde.org/Guidelines_and_HOWTOs/Debugging/How_to_create_useful_crash_reports#Preparing_your_KDE_packages tells me to "Always install kdelibs5-dbgsym" but I can't find that either.

Error message and packages listed are:
"Missing debug information packages - The KDE Crash Handler
The packages containing debug information for the following application and libraries are missing:
/usr/bin/baloo_file_extractor
/lib/x86_64-linux-gnu/libQt5Core.so.5
/lib/x86_64-linux-gnu/liblmdb.so.0
/lib/x86_64-linux-gnu/libQt5Widgets.so.5
/lib/x86_64-linux-gnu/libKF5BalooEngine.so.5"

Since I can only provide the automatic backtrace information, the Crash Reporting Assistant does not let me continue, stating that I have not provided enough information.
Comment 6 Stefan Brüns 2023-11-13 07:50:56 UTC
(In reply to DaBler from comment #2)
> (In reply to Stefan Brüns from comment #1)
> > The extractor from KFileMetadata is crashing.
> > 
> > 'balooctl failed' will tell which file causes the crash.
> 
> $ balooctl failed
> All Files were indexed successfully

This is probably caused by DrKonqi messing up signal delivery.
Comment 7 Bug Janitor Service 2023-11-13 22:13:56 UTC
A possibly relevant merge request was started @ https://invent.kde.org/frameworks/baloo/-/merge_requests/174
Comment 8 tagwerk19 2023-11-13 22:31:48 UTC
That explains a lot!

Is the "... depends on the kernel /proc/sys/kernel/core_pattern setting ..." a distribution thing? I've never, as far as I remember, found files listed as failed.

If this fixes repeat crashes because baloo wants and fails (and wants and fails...) to index a particular file, that's a *big* step....
Comment 9 Stefan Brüns 2023-11-13 23:53:58 UTC
On Tumbleweed, the pattern is set by installing the `systemd-coredump` package:

$> grep core_pattern /usr/lib/sysctl.d/50-coredump.conf
kernel.core_pattern=|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
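
To check which case applies on a given machine, the first character of the pattern is what matters: per core(5), a leading `|` means the kernel pipes the core dump to the named program. The Python sketch below is illustrative only. The function names are made up, and `read_core_pattern` assumes a Linux system where `/proc/sys/kernel/core_pattern` is readable.

```python
def parse_sysctl_line(line):
    """Split a 'key=value' sysctl.d line into (key, value)."""
    key, _, value = line.partition("=")
    return key.strip(), value.strip()


def core_pattern_uses_pipe(pattern):
    """True if the kernel would pipe core dumps to a helper program."""
    return pattern.startswith("|")


def read_core_pattern(path="/proc/sys/kernel/core_pattern"):
    """Return the active pattern, or None if the file cannot be read."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read().strip()
    except OSError:
        return None


# The Tumbleweed line quoted above.
line = "kernel.core_pattern=|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h"
key, value = parse_sysctl_line(line)
print(key, core_pattern_uses_pipe(value))
```

Calling `read_core_pattern()` on the affected system and passing the result to `core_pattern_uses_pipe` shows whether the pipe case applies there.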
Comment 10 Stefan Brüns 2023-11-13 23:55:58 UTC
(In reply to Christoph Feck from comment #4)
> From the backtrace, it looks like there is a text file larger than 2 GiB. Qt
> only supports up to 2 GiB.
> 
> I suggest adding incremental loading to the plaintextextractor.

I suggest opening an MR.
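
For illustration only, incremental loading could follow the pattern below: read fixed-size chunks and feed them to a stateful decoder, so that no single buffer ever has to hold the whole file (or one enormous line). This is a Python sketch of the general technique, not the KFileMetaData code. The name `iter_decoded_chunks` and the 64 KiB default chunk size are assumptions chosen for the example.

```python
import codecs
import io


def iter_decoded_chunks(stream, chunk_size=64 * 1024, encoding="utf-8"):
    """Yield decoded text piece by piece from a binary stream.

    The incremental decoder keeps partial multi-byte sequences between
    calls, so a character split across two chunks is decoded correctly.
    """
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush whatever the decoder is still holding at end of input.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Tiny chunk size on purpose, to force splits inside multi-byte characters.
data = ("héllo wörld € " * 100).encode("utf-8")
pieces = list(iter_decoded_chunks(io.BytesIO(data), chunk_size=7))
print(len(pieces), "".join(pieces) == data.decode("utf-8"))
```

Each yielded piece is bounded by the chunk size, so a consumer can index it and discard it before the next read.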
Comment 11 tagwerk19 2023-11-14 15:28:18 UTC
(In reply to Christoph Feck from comment #4)
> From the backtrace, it looks like there is a text file larger than 2 GiB. Qt only supports up to 2 GiB.
Not sure what Baloo would do with a 2GB file...

I thought there was a "rough limit" of 10 MB, see
    https://bugs.kde.org/show_bug.cgi?id=410680#c7

It doesn't always seem to kick in; maybe it does for some file types and not others. Cf. bug 447681
Comment 12 Stefan Brüns 2023-11-16 03:10:15 UTC
Git commit 819cb757b5742372cf017fba84955178c1f1a7d1 by Stefan Brüns.
Committed on 14/11/2023 at 01:28.
Pushed by bruns into branch 'master'.

[ExtractorProcess] Handle signal mangling by DrKonqi

DrKonqi catches the SIGSEGV or SIGILL from a misbehaving extractor,
and then eventually quits the process. This may happen either by
re-raising the signal, or just by a regular `_exit(253)`.

Which one is used depends on the kernel `/proc/sys/kernel/core_pattern`
setting - if it uses a pipe, the signal will be reraised.

The unexpected exit status was not handled before, and could cause
a blocking indexer, as neither a `done()` nor a `failed()` signal would
ever be emitted.

This also causes the list of failed files to stay empty, i.e.
`balooctl failed` would not return anything.

M  +1    -1    src/file/CMakeLists.txt
M  +12   -4    src/file/extractorprocess.cpp

https://invent.kde.org/frameworks/baloo/-/commit/819cb757b5742372cf017fba84955178c1f1a7d1
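
The idea described in the commit message can be sketched as follows: every way the extractor process can end is mapped to exactly one of "done" or "failed", so the indexer is never left waiting. This is a Python illustration, not the C++ change in `extractorprocess.cpp`. The function names are hypothetical, the return code follows Python's `subprocess` convention (a negative value `-N` means "killed by signal N"), and the value 253 is taken from the commit message.

```python
import signal

# Exit status mentioned in the commit message for the non-reraise case.
CRASH_HANDLER_EXIT_CODE = 253


def describe_signal(signum):
    """Return a readable signal name, falling back to the number."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return f"signal {signum}"


def classify_exit(returncode):
    """Map a child's return code to ('done' | 'failed', reason)."""
    if returncode == 0:
        return ("done", "clean exit")
    if returncode < 0:
        # Signal was re-raised and terminated the process.
        return ("failed", "killed by " + describe_signal(-returncode))
    if returncode == CRASH_HANDLER_EXIT_CODE:
        # Crash handler caught the signal and exited normally.
        return ("failed", "crash handler exit (253)")
    return ("failed", f"unexpected exit status {returncode}")


for code in (0, -signal.SIGSEGV, 253, 1):
    print(code, classify_exit(code))
```

Because no branch falls through without a result, the file being processed can always be recorded as failed when the extractor dies, whichever exit path the crash handler took.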
Comment 13 Stefan Brüns 2023-11-16 03:26:49 UTC
Git commit fd24c90c1eaaaa465324baa81a622f8767b62eef by Stefan Brüns.
Committed on 16/11/2023 at 04:18.
Pushed by bruns into branch 'kf5'.

[ExtractorProcess] Handle signal mangling by DrKonqi

DrKonqi catches the SIGSEGV or SIGILL from a misbehaving extractor,
and then eventually quits the process. This may happen either by
re-raising the signal, or just by a regular `_exit(253)`.

Which one is used depends on the kernel `/proc/sys/kernel/core_pattern`
setting - if it uses a pipe, the signal will be reraised.

The unexpected exit status was not handled before, and could cause
a blocking indexer, as neither a `done()` nor a `failed()` signal would
ever be emitted.

This also causes the list of failed files to stay empty, i.e.
`balooctl failed` would not return anything.
(cherry picked from commit 819cb757b5742372cf017fba84955178c1f1a7d1)

M  +1    -1    autotests/unit/file/CMakeLists.txt
M  +1    -1    src/file/CMakeLists.txt
M  +12   -4    src/file/extractorprocess.cpp

https://invent.kde.org/frameworks/baloo/-/commit/fd24c90c1eaaaa465324baa81a622f8767b62eef
Comment 14 Stefan Brüns 2023-11-16 22:40:29 UTC
(In reply to tagwerk19 from comment #11)
> (In reply to Christoph Feck from comment #4)
> > From the backtrace, it looks like there is a text file larger than 2 GiB. Qt only supports up to 2 GiB.
> Not sure what Baloo would do with a 2GB file...
> 
> Thought there was a "rough limit" of 10 Mbyte, see
>     https://bugs.kde.org/show_bug.cgi?id=410680#c7
> 
> Doesn't always seem to cut in, maybe does for some filetypes and not others.
> Cf Bug 447681

The limit is currently only applied when the detected mimetype itself is a "text/" type, not when it is a specialized (inherited) type of e.g. text/plain, such as "application/json" or "message/rfc822".

For a more complete list:
$> grep -E '<mime|sub-class' /usr/share/mime/packages/freedesktop.org.xml | grep -B1 -E 'sub-class.*text/' | grep -v 'mime-type.*text/' | sed -nE '/mime-type/ { N; s@.*="([^"]*).*\n.*"(.*)".*@\1 \t->\t\2 @ ; p} '  | tee /dev/stderr | wc -l

That lists 64 types.

You can find such files with e.g.
$> baloosearch -t Text mimetype:application or mimetype:message
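
The same listing can be approximated without the grep/sed pipeline by parsing the shared-mime-info XML directly. The Python sketch below is an illustration under stated assumptions: `text_derived_types` is a made-up helper, `SAMPLE` is a small hand-written document in the shared-mime-info format (not an excerpt of the real database), and the result on a real `freedesktop.org.xml` may differ slightly from the count given above because the matching rules are not identical to the shell pipeline.

```python
import xml.etree.ElementTree as ET

# XML namespace used by shared-mime-info package files.
NS = {"mime": "http://www.freedesktop.org/standards/shared-mime-info"}


def text_derived_types(xml_text):
    """List (type, parent) pairs where a non-text/ type subclasses a text/ type."""
    root = ET.fromstring(xml_text)
    pairs = []
    for mime_type in root.findall("mime:mime-type", NS):
        name = mime_type.get("type", "")
        if name.startswith("text/"):
            continue  # these already match the "text/" prefix check
        for parent in mime_type.findall("mime:sub-class-of", NS):
            parent_name = parent.get("type", "")
            if parent_name.startswith("text/"):
                pairs.append((name, parent_name))
    return pairs


# Hand-written sample data for the example, not the real database.
SAMPLE = """
<mime-info xmlns="http://www.freedesktop.org/standards/shared-mime-info">
  <mime-type type="application/json">
    <sub-class-of type="text/plain"/>
  </mime-type>
  <mime-type type="message/rfc822">
    <sub-class-of type="text/plain"/>
  </mime-type>
  <mime-type type="text/x-python">
    <sub-class-of type="text/plain"/>
  </mime-type>
  <mime-type type="image/png"/>
</mime-info>
""".strip()

for name, parent in text_derived_types(SAMPLE):
    print(f"{name}\t->\t{parent}")
```

To run it against the installed database, read `/usr/share/mime/packages/freedesktop.org.xml` and pass its contents to `text_derived_types`.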