I noticed that for a long time baloo has only indexed half of my home directory:

user@debian:~/baloo$ balooctl status
Baloo File Indexer is running
Indexer state: Indexing file content
Indexed 7832 / 15848 files
Current size of index is 122.47 MiB

Resetting baloo (logging out, deleting all baloo-related files) makes it index from scratch, but it stops at around the same file count and does not index any further. Stopping, restarting, or resuming the file indexer does not help at all. Every time baloo_file gets started it immediately segfaults, so I tried to debug the problem:

user@debian:~/baloo$ gdb /usr/bin/baloo_file
GNU gdb (Debian 7.11.1-2) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at: <http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/baloo_file...Reading symbols from /usr/lib/debug/.build-id/cc/6382b1b4146efbd90e5836d11546a4568d809d.debug...done.
done.
(gdb) set follow-fork-mode child
(gdb) run
Starting program: /usr/bin/baloo_file
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffec961700 (LWP 3073)]
org.kde.baloo: "/home/user"
[New Thread 0x7ffea7dfe700 (LWP 3074)]
Power state changed
[New process 3075]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
process 3075 is executing new program: /usr/bin/baloo_file_extractor
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffead9c700 (LWP 3076)]
[New Thread 0x7fffd79f0700 (LWP 3077)]

Thread 2.1 "baloo_file_extr" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7e528c0 (LWP 3075)]
KFileMetaData::OdfExtractor::extract (this=<optimized out>, result=0x7fffffffd870) at /build/kfilemetadata-kf5-l1YepB/kfilemetadata-kf5-5.23.0/src/extractors/odfextractor.cpp:133
133 /build/kfilemetadata-kf5-l1YepB/kfilemetadata-kf5-5.23.0/src/extractors/odfextractor.cpp: No such file or directory.
(gdb) thread apply all bt

Thread 2.3 (Thread 0x7fffd79f0700 (LWP 3077)):
#0 0x00007ffff5625dcd in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff428e39c in g_main_context_poll (priority=2147483647, n_fds=3, fds=0x7fffd0003020, timeout=<optimized out>, context=0x7fffd0000990) at /build/glib2.0-wnDt2X/glib2.0-2.48.1/./glib/gmain.c:4135
#2 g_main_context_iterate (context=context@entry=0x7fffd0000990, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at /build/glib2.0-wnDt2X/glib2.0-2.48.1/./glib/gmain.c:3835
#3 0x00007ffff428e4ac in g_main_context_iteration (context=0x7fffd0000990, may_block=may_block@entry=1) at /build/glib2.0-wnDt2X/glib2.0-2.48.1/./glib/gmain.c:3901
#4 0x00007ffff5f411af in QEventDispatcherGlib::processEvents (this=0x7fffd00008c0, flags=...) at kernel/qeventdispatcher_glib.cpp:417
#5 0x00007ffff5ee9e4a in QEventLoop::exec (this=this@entry=0x7fffd79efcd0, flags=..., flags@entry=...)
at kernel/qeventloop.cpp:204
#6 0x00007ffff5d129e4 in QThread::exec (this=this@entry=0x7ffff7fd8d40 <(anonymous namespace)::Q_QGS__q_manager::innerFunction()::holder>) at thread/qthread.cpp:500
#7 0x00007ffff7f65515 in QDBusConnectionManager::run (this=0x7ffff7fd8d40 <(anonymous namespace)::Q_QGS__q_manager::innerFunction()::holder>) at qdbusconnection.cpp:189
#8 0x00007ffff5d17808 in QThreadPrivate::start (arg=0x7ffff7fd8d40 <(anonymous namespace)::Q_QGS__q_manager::innerFunction()::holder>) at thread/qthread_unix.cpp:341
#9 0x00007ffff511b464 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#10 0x00007ffff562ee5d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 2.2 (Thread 0x7fffead9c700 (LWP 3076)):
#0 0x00007ffff5625dcd in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff20c9382 in ?? () from /usr/lib/x86_64-linux-gnu/libxcb.so.1
#2 0x00007ffff20caff7 in xcb_wait_for_event () from /usr/lib/x86_64-linux-gnu/libxcb.so.1
#3 0x00007fffeccd4a89 in QXcbEventReader::run (this=0x664600) at qxcbconnection.cpp:1325
#4 0x00007ffff5d17808 in QThreadPrivate::start (arg=0x664600) at thread/qthread_unix.cpp:341
#5 0x00007ffff511b464 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#6 0x00007ffff562ee5d in clone () from /lib/x86_64-linux-gnu/libc.so.6

Thread 2.1 (Thread 0x7ffff7e528c0 (LWP 3075)):
#0 KFileMetaData::OdfExtractor::extract (this=<optimized out>, result=0x7fffffffd870) at /build/kfilemetadata-kf5-l1YepB/kfilemetadata-kf5-5.23.0/src/extractors/odfextractor.cpp:133
#1 0x0000000000408750 in Baloo::App::index (this=this@entry=0x7fffffffdfb0, tr=0x6e8030, url=..., id=id@entry=306226873237544) at /build/baloo-kf5-sHKQba/baloo-kf5-5.23.0/src/file/extractor/app.cpp:163
#2 0x0000000000408bfe in Baloo::App::processNextFile (this=0x7fffffffdfb0) at /build/baloo-kf5-sHKQba/baloo-kf5-5.23.0/src/file/extractor/app.cpp:93
#3 0x00007ffff5f24ca6 in QtPrivate::QSlotObjectBase::call (a=0x7fffffffd9e0, r=<optimized out>, this=<optimized
out>) at ../../include/QtCore/../../src/corelib/kernel/qobject_impl.h:124
#4 QSingleShotTimer::timerEvent (this=0x79cf20) at kernel/qtimer.cpp:310
#5 0x00007ffff5f19523 in QObject::event (this=0x79cf20, e=<optimized out>) at kernel/qobject.cpp:1278
#6 0x00007ffff6823afc in QApplicationPrivate::notify_helper (this=<optimized out>, receiver=0x79cf20, e=0x7fffffffdca0) at kernel/qapplication.cpp:3804
#7 0x00007ffff6829036 in QApplication::notify (this=0x7fffffffdf80, receiver=0x79cf20, e=0x7fffffffdca0) at kernel/qapplication.cpp:3561
#8 0x00007ffff5eec0f8 in QCoreApplication::notifyInternal2 (receiver=0x79cf20, event=event@entry=0x7fffffffdca0) at kernel/qcoreapplication.cpp:1015
#9 0x00007ffff5f4009e in QCoreApplication::sendEvent (event=0x7fffffffdca0, receiver=<optimized out>) at ../../include/QtCore/../../src/corelib/kernel/qcoreapplication.h:225
#10 QTimerInfoList::activateTimers (this=0x683e00) at kernel/qtimerinfo_unix.cpp:637
#11 0x00007ffff5f40609 in timerSourceDispatch (source=<optimized out>) at kernel/qeventdispatcher_glib.cpp:176
#12 idleTimerSourceDispatch (source=<optimized out>) at kernel/qeventdispatcher_glib.cpp:223
#13 0x00007ffff428e1a7 in g_main_dispatch (context=0x7fffe40016f0) at /build/glib2.0-wnDt2X/glib2.0-2.48.1/./glib/gmain.c:3154
#14 g_main_context_dispatch (context=context@entry=0x7fffe40016f0) at /build/glib2.0-wnDt2X/glib2.0-2.48.1/./glib/gmain.c:3769
#15 0x00007ffff428e400 in g_main_context_iterate (context=context@entry=0x7fffe40016f0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at /build/glib2.0-wnDt2X/glib2.0-2.48.1/./glib/gmain.c:3840
#16 0x00007ffff428e4ac in g_main_context_iteration (context=0x7fffe40016f0, may_block=may_block@entry=1) at /build/glib2.0-wnDt2X/glib2.0-2.48.1/./glib/gmain.c:3901
#17 0x00007ffff5f411af in QEventDispatcherGlib::processEvents (this=0x688c70, flags=...)
at kernel/qeventdispatcher_glib.cpp:417
#18 0x00007ffff5ee9e4a in QEventLoop::exec (this=this@entry=0x7fffffffdef0, flags=..., flags@entry=...) at kernel/qeventloop.cpp:204
#19 0x00007ffff5ef250c in QCoreApplication::exec () at kernel/qcoreapplication.cpp:1285
#20 0x00007ffff623381c in QGuiApplication::exec () at kernel/qguiapplication.cpp:1607
#21 0x00007ffff6820ac5 in QApplication::exec () at kernel/qapplication.cpp:2979
#22 0x0000000000407bc5 in main (argc=1, argv=0x7fffffffe1a8) at /build/baloo-kf5-sHKQba/baloo-kf5-5.23.0/src/file/extractor/main.cpp:57

It's obvious that baloo_file gets stuck on some file in my home directory, though I did not manage to find out which one. I do not have any ODF files, at least to my knowledge. Is there some way to find out which file is the problematic one? I think baloo_file should be implemented in such a way that a single file cannot stop the whole indexing. Is this planned for the future?

Reproducible: Always

Steps to Reproduce:
1. Run baloo_file
2. Notice the crash in dmesg: baloo_file_extr[3240]: segfault at 0 ip 00007ff512bb73d6 sp 00007ffe5cc07de0 error 4 in kfilemetadata_odfextractor.so[7ff512bb4000+5000]
Last few lines of the "strace -f -e open baloo_file" output:

[pid 5606] open("/home/user/maple/maple2015/data/xml/dtd/mathml2/mathml2-qname-1.mod", O_RDONLY|O_CLOEXEC) = 21
[pid 5606] open("/home/user/maple/maple2015/data/eBookTools/Preface.mw", O_RDONLY|O_CLOEXEC) = 21
[pid 5606] open("/home/user/maple/maple2015/data/eBookTools/Legal.mw", O_RDONLY|O_CLOEXEC) = 21
[pid 5606] open("/home/user/maple/maple2015/data/help/Optimization/afiro.mpl", O_RDONLY|O_CLOEXEC) = 21
[pid 5606] open("/home/user/maple/maple2015/data/help/ImportData/recipe.mps", O_RDONLY|O_CLOEXEC) = 21
[pid 5606] open("/home/user/maple/maple2015/data/xml/template/template.ods", O_RDONLY|O_CLOEXEC) = 21
[pid 5606] open("/home/user/maple/maple2015/data/xml/template/template.ods", O_RDONLY|O_CLOEXEC) = 21
[pid 5606] open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 23
[pid 5606] open("/etc/group", O_RDONLY|O_CLOEXEC) = 23
[pid 5606] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0} ---
[pid 5607] +++ killed by SIGSEGV +++
[pid 5608] +++ killed by SIGSEGV +++
[pid 5606] +++ killed by SIGSEGV +++
[pid 5605] --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=5606, si_uid=1000, si_status=SIGSEGV, si_utime=4, si_stime=5} ---

So I guess the offending file may be "maple/maple2015/data/xml/template/template.ods". How can I verify that? How can I invoke baloo_file_extractor manually?
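For anyone who wants to narrow this down from a longer trace, here is a rough sketch (not part of baloo) that picks out the files each process opened just before it received SIGSEGV. It assumes the trace lines look exactly like the "[pid N] open(...)" lines quoted above; the function name files_before_segv and the keep parameter are made up for this illustration.

```python
import re
from collections import deque

# Matches lines such as: [pid 5606] open("/path", O_RDONLY|O_CLOEXEC) = 21
# (openat(AT_FDCWD, "/path", ...) is accepted as well)
OPEN_RE = re.compile(r'^\[pid\s+(\d+)\]\s+open(?:at)?\((?:[A-Z_]+,\s*)?"([^"]*)"')
# Matches lines such as: [pid 5606] --- SIGSEGV {si_signo=SIGSEGV, ...} ---
SEGV_RE = re.compile(r'^\[pid\s+(\d+)\]\s+--- SIGSEGV')


def files_before_segv(lines, keep=5):
    """Return {pid: [paths]} with the last `keep` files each crashing pid opened.

    `lines` is any iterable of strace output lines. Only lines carrying a
    "[pid N]" prefix are considered (an assumption about the trace format).
    """
    recent = {}   # pid -> bounded history of opened paths
    crashes = {}  # pid -> snapshot of that history when SIGSEGV arrived
    for line in lines:
        line = line.strip()
        opened = OPEN_RE.match(line)
        if opened:
            pid = int(opened.group(1))
            recent.setdefault(pid, deque(maxlen=keep)).append(opened.group(2))
            continue
        crashed = SEGV_RE.match(line)
        if crashed:
            pid = int(crashed.group(1))
            crashes[pid] = list(recent.get(pid, []))
    return crashes
```

Fed with the trace above, the paths reported for pid 5606 end with template.ods followed by /etc/passwd and /etc/group, so the last file under the home directory is the natural suspect. That is only a hint, since the crash could also be caused by a file opened earlier.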
Confirmed the offending file by creating a new user and copying that file into the new user's home directory. Once that happened, baloo_file_extractor segfaulted there too. Since this document is part of a copyrighted proprietary application, I can't share the file publicly.
By adding "*.ods" to the excludes in the config I finally got the following status:

Baloo File Indexer is running
Indexer state: Indexing file content
Indexed 15835 / 15835 files
Current size of index is 264.27 MiB

:-)), so at least in my case no other crash happened.
The ODF indexer has problems if some files are missing from the zip archive; perhaps that is your issue. See: https://git.reviewboard.kde.org/r/128886/
Fixed
https://quickgit.kde.org/?p=kfilemetadata.git&a=commit&h=40730d75397aefb92145f86fc6abc9b303c56cfe

Make odf indexer more error prove, check if the files are there (and are files at all) (meta.xml + content.xml)

REVIEW: 128886
BUG 364748

=> if you download this odt's to indexed directories your baloo will die on each index, be careful

autotests/odfextractortest.cpp
autotests/odfextractortest.h
autotests/samplefiles/test_missing_content.odt [new file with mode 0644]
autotests/samplefiles/test_missing_meta.odt [new file with mode 0644]
src/extractors/odfextractor.cpp
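Until the fixed kfilemetadata reaches your distribution, a quick way to look for files that might trigger this is to check which ODF archives lack meta.xml or content.xml. The following is a minimal sketch, not the code from the commit: it only mirrors the check described in the commit message, and the suffix list and the function names (missing_odf_members, find_suspect_files) are assumptions made for this example.

```python
import zipfile
from pathlib import Path

# The two archive members named in the commit message.
REQUIRED_MEMBERS = ("meta.xml", "content.xml")
# Assumed set of suffixes to scan; extend as needed.
ODF_SUFFIXES = {".odt", ".ods", ".odp"}


def missing_odf_members(path):
    """Return the required members that are absent from the archive at `path`.

    A file that cannot be read as a zip archive is reported as missing both.
    """
    try:
        with zipfile.ZipFile(path) as archive:
            names = set(archive.namelist())
    except (zipfile.BadZipFile, OSError):
        return list(REQUIRED_MEMBERS)
    return [member for member in REQUIRED_MEMBERS if member not in names]


def find_suspect_files(root):
    """Yield (path, missing_members) for ODF-looking files under `root`."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in ODF_SUFFIXES:
            missing = missing_odf_members(path)
            if missing:
                yield path, missing
```

Calling list(find_suspect_files(Path.home())) returns the candidates to exclude or move out of the indexed directories. A file that passes this check is not guaranteed to be safe for older versions; it is only not affected by the missing-member case addressed here.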