SUMMARY
I am indexing from scratch and it is taking ages: over one month already, 14 hours a day. The files being indexed are on EXT4 on LUKS2 on MDRAID on rotating HDDs; the index file is stored on EXT4 on LUKS2 on an SSD. I came home to see a KCrash handler icon. There is still 50 GiB of free space on the SSD storing the index file.

The 'balooctl status' command returns:
Baloo File Indexer is not running
Total files indexed: 716,213
Files waiting for content indexing: 151,027
Files failed to index: 0
Current size of index is 67.41 GiB

OBSERVED RESULT
#6 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#7 0x00007f29a60b3b6f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#8 0x00007f29a6063a02 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#9 0x00007f29a604c22d in __GI_abort () at abort.c:79
#10 0x00007f29a604d29c in __libc_message (fmt=fmt@entry=0x7f29a619d0da "%s\n") at ../sysdeps/posix/libc_fatal.c:150
#11 0x00007f29a60bd975 in malloc_printerr (str=str@entry=0x7f29a61a07f0 "mremap_chunk(): invalid pointer") at malloc.c:5765
#12 0x00007f29a60c2cec in mremap_chunk (new_size=48, p=0x7f29a68663d0 <prime_deltas+16>) at malloc.c:3063
#13 __GI___libc_realloc (oldmem=0x7f29a68663e0 <QListData::shared_null>, bytes=32) at malloc.c:3473
#14 0x00007f29a660c463 in QListData::realloc_grow (this=this@entry=0x55da3b0a3a80, growth=growth@entry=1) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/tools/qlist.cpp:170
#15 0x00007f29a660c50a in QListData::append (this=0x55da3b0a3a80, n=n@entry=1) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/tools/qlist.cpp:196
#16 0x00007f29a660c53a in QListData::append (this=<optimized out>) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/tools/qlist.cpp:206
#17 0x000055da39d34781 in QList<QString>::append (t=..., this=<optimized out>) at /usr/include/qt5/QtCore/qlist.h:643

SOFTWARE/OS VERSIONS
KDE Plasma Version: 5.27.9
KDE Frameworks Version: 5.111.0
Qt Version: 5.15.11
Kernel Version: 6.6.0-gentoo (64-bit)
Graphics Platform: Wayland
Processors: 16 × AMD Ryzen 7 3800X 8-Core Processor
Memory: 62.7 GiB of RAM
Graphics Processor: AMD Radeon RX 580 Series
Manufacturer: Gigabyte Technology Co., Ltd.
Product Name: X570 AORUS ELITE
System Version: -CF
I am sorry, the complete backtrace follows:

Application: baloo_file (baloo_file), signal: Aborted
Content of s_kcrashErrorMessage: std::unique_ptr<char []> = {get() = 0x0}
[KCrash Handler]
#6 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#7 0x00007f29a60b3b6f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#8 0x00007f29a6063a02 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#9 0x00007f29a604c22d in __GI_abort () at abort.c:79
#10 0x00007f29a604d29c in __libc_message (fmt=fmt@entry=0x7f29a619d0da "%s\n") at ../sysdeps/posix/libc_fatal.c:150
#11 0x00007f29a60bd975 in malloc_printerr (str=str@entry=0x7f29a61a07f0 "mremap_chunk(): invalid pointer") at malloc.c:5765
#12 0x00007f29a60c2cec in mremap_chunk (new_size=48, p=0x7f29a68663d0 <prime_deltas+16>) at malloc.c:3063
#13 __GI___libc_realloc (oldmem=0x7f29a68663e0 <QListData::shared_null>, bytes=32) at malloc.c:3473
#14 0x00007f29a660c463 in QListData::realloc_grow (this=this@entry=0x55da3b0a3a80, growth=growth@entry=1) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/tools/qlist.cpp:170
#15 0x00007f29a660c50a in QListData::append (this=0x55da3b0a3a80, n=n@entry=1) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/tools/qlist.cpp:196
#16 0x00007f29a660c53a in QListData::append (this=<optimized out>) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/tools/qlist.cpp:206
#17 0x000055da39d34781 in QList<QString>::append (t=..., this=<optimized out>) at /usr/include/qt5/QtCore/qlist.h:643
#18 QList<QString>::append (this=<optimized out>, t=...) at /usr/include/qt5/QtCore/qlist.h:620
#19 0x000055da39d43ed9 in Baloo::FileContentIndexer::slotFinishedIndexingFile (this=0x55da3b0a3a40, filePath=..., fileUpdated=<optimized out>) at /var/tmp/portage/kde-frameworks/baloo-5.111.0/work/baloo-5.111.0/src/file/filecontentindexer.cpp:125
#20 0x00007f29a67b4024 in QObject::event (this=0x55da3b0a3a40, e=0x7ee99432ba20) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/kernel/qobject.cpp:1347
#21 0x00007f29a6788f25 in doNotify (event=0x7ee99432ba20, receiver=0x55da3b0a3a40) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/kernel/qcoreapplication.cpp:1154
#22 QCoreApplication::notify (event=<optimized out>, receiver=<optimized out>, this=<optimized out>) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/kernel/qcoreapplication.cpp:1140
#23 QCoreApplication::notifyInternal2 (receiver=0x55da3b0a3a40, event=0x7ee99432ba20) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/kernel/qcoreapplication.cpp:1064
#24 0x00007f29a678914e in QCoreApplication::sendEvent (receiver=<optimized out>, event=<optimized out>) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/kernel/qcoreapplication.cpp:1462
#25 0x00007f29a678c4c3 in QCoreApplicationPrivate::sendPostedEvents (receiver=0x0, event_type=0, data=0x55da3b083b70) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/kernel/qcoreapplication.cpp:1821
#26 0x00007f29a678c778 in QCoreApplication::sendPostedEvents (receiver=<optimized out>, event_type=<optimized out>) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/kernel/qcoreapplication.cpp:1680
#27 0x00007f29a67db013 in postEventSourceDispatch (s=0x55da3b0853f0) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/kernel/qeventdispatcher_glib.cpp:277
#28 0x00007f29a4f73d52 in g_main_dispatch (context=context@entry=0x55da3b085180) at ../glib-2.78.1/glib/gmain.c:3476
#29 0x00007f29a4f76f07 in g_main_context_dispatch_unlocked (context=0x55da3b085180) at ../glib-2.78.1/glib/gmain.c:4284
#30 g_main_context_iterate_unlocked (context=context@entry=0x55da3b085180, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib-2.78.1/glib/gmain.c:4349
#31 0x00007f29a4f7752c in g_main_context_iteration (context=0x55da3b085180, may_block=1) at ../glib-2.78.1/glib/gmain.c:4414
#32 0x00007f29a67dab16 in QEventDispatcherGlib::processEvents (this=0x55da3b085080, flags=...) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/src/corelib/kernel/qeventdispatcher_glib.cpp:423
#33 0x00007f29a678797b in QEventLoop::exec (this=this@entry=0x7ffcd9102b80, flags=..., flags@entry=...) at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/include/QtCore/../../src/corelib/global/qflags.h:69
#34 0x00007f29a678fc7d in QCoreApplication::exec () at /var/tmp/portage/dev-qt/qtcore-5.15.11-r1/work/qtbase-everywhere-src-5.15.11/include/QtCore/../../src/corelib/global/qflags.h:121
#35 0x000055da39d33915 in main (argc=<optimized out>, argv=<optimized out>) at /var/tmp/portage/kde-frameworks/baloo-5.111.0/work/baloo-5.111.0/src/file/main.cpp:78
[Inferior 1 (process 5469) detached]
There's been a change to limit the amount of RAM Baloo can use to 512M (assuming you are on a system with systemd). See:
https://bugs.kde.org/show_bug.cgi?id=446071#c9

Without a cap on memory, Baloo can expand and drag down system performance. With the cap, you might find Baloo starting to use swap when doing large write transactions. Also not good. I've been setting MemoryHigh=50% and MemorySwapMax=0B to find a "middle way". Your mileage may vary.

You can watch the files being indexed with "balooctl monitor"; you should see them indexed in batches of 40.

You say "KDE Frameworks Version: 5.111.0". Did you update the system halfway through the indexing?

You also say "Current size of index is 67.41 GiB", which doesn't sound healthy.
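
For the systemd case, the two settings mentioned above could go into a drop-in for the Baloo user unit. This is only a sketch: it assumes the unit is called kde-baloo.service and that user drop-ins live under ~/.config/systemd/user/ (check with "systemctl --user status kde-baloo.service"). The helper name write_baloo_memory_dropin is made up for illustration.

```shell
#!/bin/sh
# Sketch: write a systemd drop-in that applies the memory settings from this
# comment. The unit name and drop-in location are assumptions, not taken
# from the thread.

# write_baloo_memory_dropin DIR
#   Creates DIR/override.conf with the MemoryHigh/MemorySwapMax values.
write_baloo_memory_dropin() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
MemoryHigh=50%
MemorySwapMax=0B
EOF
}

# Typical use (not run here; unit name is an assumption):
#   write_baloo_memory_dropin "$HOME/.config/systemd/user/kde-baloo.service.d"
#   systemctl --user daemon-reload
#   systemctl --user restart kde-baloo.service
```

Writing to a drop-in rather than editing the shipped unit file means the setting should survive package upgrades. This does not help on OpenRC.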
My system uses OpenRC. I started the indexing on KF 5.110; it was paused during the compilation.

I am definitely not going to really use it (though I would like to!) if I am told that it will need re-indexing after each KF5/6 upgrade touching Baloo. That would be a terrible waste of energy and time. It should be programmed so that it makes the needed internal changes to the existing index file after each incompatible upgrade of Baloo internals.

I have plenty of literature in PDF and EPUB formats and wrongly thought the index might be the right size, but the output of the 'balooctl indexSize' command suggests it went crazy again:

File Size: 67,41 GiB
Used: 2,95 GiB
PostingDB: 3,47 GiB 117.860 %
PositionDB: 1,76 GiB 59.766 %
DocTerms: 1,51 GiB 51.211 %
DocFilenameTerms: 48,49 MiB 1.606 %
DocXattrTerms: 4,00 KiB 0.000 %
IdTree: 9,99 MiB 0.331 %
IdFileName: 52,75 MiB 1.747 %
DocTime: 28,57 MiB 0.946 %
DocData: 49,64 MiB 1.645 %
ContentIndexingDB: 4,21 MiB 0.140 %
FailedIdsDB: 0 B 0.000 %
MTimeDB: 13,07 MiB 0.433 %

I will try to reduce its size using the command 'mdb_copy -n -c index index.new'.

Yes, it does it in batches of 40. Thank you.
mdb_copy -n downsized it to 31 GiB.
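
The compaction step can be wrapped so that the old file is kept until the new one has been checked. This is a sketch only. It assumes Baloo is not running while the files are swapped and that the index lives at ~/.local/share/baloo/index (both are my assumptions, not stated in this thread). compact_index is a made-up helper name.

```shell
#!/bin/sh
# Sketch: compact an LMDB index file with mdb_copy and keep the original
# as a backup. Assumes nothing is writing to the index during the swap.

# compact_index INDEX_FILE
compact_index() {
    index=$1
    [ -f "$index" ] || { echo "no such index file: $index" >&2; return 1; }
    [ ! -e "$index.new" ] || { echo "$index.new already exists" >&2; return 1; }

    # -n: the environment is a single file rather than a directory
    # -c: compact while copying (free pages are omitted)
    mdb_copy -n -c "$index" "$index.new" || return 1

    # Keep the original as .old until the compacted copy has been checked.
    mv "$index" "$index.old" && mv "$index.new" "$index"
}

# Typical use (not run here; the path is an assumption):
#   compact_index "$HOME/.local/share/baloo/index"
```

The .old file can be deleted once searches against the compacted index look right. Until then it needs as much free disk space as the original index.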
> My system uses OpenRC.

Don't know whether OpenRC gives you a way of limiting the memory use (with cgroups?). I only know the systemd unit files. Putting some sort of cap on the memory use is sensible.

> ... told that after each KF5/6 upgrade touching Baloo it will need re-indexing ...

Probably more complicated.

Previously it could be that when you mounted disks on a reboot, they got a different device number each time. This was a clear issue with BTRFS if you had multiple subvolumes: there was a race and disks came up with different minor device numbers.

OK, "previously" applies to Baloo. Baloo used to rely on the device number (device number and inode) to build an internal DocID for each file it indexed. If the device number changed on a reboot then Baloo thought it had a whole set of new files and indexed them all again. Bad.

This may also be happening with your Ext4/LUKS2 setup. I'm afraid I don't know how this presents itself to the system.

With Frameworks 5.111 there's been a patch to use an "invariant" filesystem ID (rather than the minor device number). This means there will be "one more" reindexing and then the index should be stable. It shouldn't be every KF5/KF6 change; it should be more stable after this one...
https://invent.kde.org/frameworks/baloo/-/merge_requests/131
https://discuss.kde.org/t/baloo-and-frameworks-5-111/6348

You can keep watch on the device number / inode on disk with "stat filename", see how Baloo has indexed it with "balooshow -x filename" and also check for "multiple hits" for the same file if you do a "baloosearch -i filename".

There's also a possible "gotcha" that happens if you are worried about how the indexing is going and watch with "balooctl status". This counts the files waiting to be indexed - and holds the index "read only" when it's doing it. If baloo_file/baloo_file_extractor wants to write at that moment, the write is an append. Suddenly the index is bigger (Bug 437754).

> ... It should be programed the way that it will make needed internal changes of existing index file after each incompatible upgrade of Baloo internals ...

Not sure there's a watertight way of doing this - beyond keeping a hash of the files and comparing.

> ... I am having plenty of literature in pdf and epub formats ...

These can sometimes be slow to index; each file needs to be read as a stream of text. PDFs can be compressed and things like graphs can take a *load* of CPU to render....

Not sure whether this all helps. Probably the thing to do is to check what "stat" says for your files; change the indexing "includes" so you can see what happens with a small set of folders; pkill baloo_file and purge the index.

Sorry.
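
One way to do the "stat" check across reboots is to log the device number and inode of a sample file each time and see whether the device number ever changes. A rough sketch, assuming GNU stat (the -c option) and a sample file name without spaces. The function names and the log file are made up for illustration.

```shell
#!/bin/sh
# Sketch: record the device number / inode of one sample file over time.
# Requires GNU coreutils stat; %d is the device number, %i the inode.

# dev_inode FILE -> prints "<device number> <inode>"
dev_inode() {
    stat -c '%d %i' -- "$1"
}

# record_dev_inode FILE LOG -> appends "<timestamp> <device> <inode> <file>"
record_dev_inode() {
    ids=$(dev_inode "$1") || return 1
    printf '%s %s %s\n' "$(date +%Y-%m-%dT%H:%M:%S)" "$ids" "$1" >> "$2"
}

# distinct_devices LOG -> number of different device numbers seen so far
distinct_devices() {
    awk '{ print $2 }' "$1" | sort -u | wc -l
}

# Typical use, once after each reboot (not run here; paths are examples):
#   record_dev_inode "$HOME/Documents/sample.pdf" "$HOME/dev-inode.log"
#   distinct_devices "$HOME/dev-inode.log"
```

If distinct_devices stays at 1 over several reboots, the device number is stable for that filesystem. Anything higher would point to the changing-device-number problem described above.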
> ... Crash ...

I think you are off in the wildlands with:

> Current size of index is 67.41 GiB

and

> Memory: 62.7 GiB of RAM

I'd be 95% sure that the root cause is reindexing (possibly historically, anyway with 5.111).
Both the mdb_copy and mdb_stat commands understand the index file structure, and Baloo continues to index the content of my files without complaining, so I think its structure was not damaged by the crash. I will let it run. After it (maybe) finishes, I can run some searches to see whether each document is found only once.

Can I dump/export it to a text file? There are mdb_dump and mdb_load binaries. I will try.

Are there sanity/health checks for the index file BTW?
(In reply to tagwerk19 from comment #6)

You were right, it is doubled. I do have 363696 regular files in home folder.
(cd && find . \( ! -regex '.*/\..*' \) -type f | wc -l)

Stopped, purged, logged off/on, started, and was again surprised that it asks for Computer restart after enabling it (at least in GUI of Settings).
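
The "doubled" check can be spelled out: count the non-hidden regular files the same way as above and divide the "Total files indexed" figure from 'balooctl status' by it. A small sketch. The function names are made up, and file names containing newlines would be miscounted.

```shell
#!/bin/sh
# Sketch: compare the number of files Baloo says it indexed with the number
# of non-hidden regular files actually present.

# count_visible_files DIR
#   Same find expression as in the comment above. The cd matters: the regex
#   is matched against the whole path, so a hidden component in DIR itself
#   would otherwise exclude everything.
count_visible_files() {
    ( cd "$1" && find . \( ! -regex '.*/\..*' \) -type f | wc -l )
}

# index_ratio INDEXED ON_DISK -> indexed / on-disk, two decimals
index_ratio() {
    awk -v indexed="$1" -v ondisk="$2" 'BEGIN { printf "%.2f\n", indexed / ondisk }'
}

# Figures from this report: 716,213 indexed against 363696 files on disk.
index_ratio 716213 363696
```

A ratio close to 2 means that, on average, each file is in the index twice.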
(In reply to David Kredba from comment #8)
> ... it asks for Computer restart after enabling it (at least in GUI of Settings) ...

No, a reboot is not needed; a restart of the Baloo process is sufficient. Sometimes, though, Baloo isn't listening when you "ask it to restart itself". Bug 467531.

> You were right, it is doubled. I do have 363696 regular files in home folder.
> (cd && find . \( ! -regex '.*/\..*' \) -type f | wc -l)

Thank you, it's something of a relief that Ext4 over LUKS2 did give a stable device number. I think it's now a question of patience, watching with "balooctl monitor" (and I've found iotop gives a nice view into the indexing behaviour). Avoid "balooctl status" if you can...
(In reply to David Kredba from comment #7)
> Are there sanity/health checks for the index file BTW?

Sorry, I missed that one. Yes. Igor Poboiko has a baloo-checkdb.py script here:
https://invent.kde.org/frameworks/baloo/uploads/bdc9f5f17fc96490b7bd4a22ac664843/baloo-checkdb.py

With a quick description:
https://invent.kde.org/frameworks/baloo/-/merge_requests/87#note_535270

It works "one level up": on the logical DB structure rather than the physical one. Heads up about the need to load the whole database into memory! I've used it for small indexes.

It probably needs a new release though, see the report here:
https://bugs.kde.org/show_bug.cgi?id=474973#c29
(In reply to tagwerk19 from comment #9)
> ... it's something of a relief that Ext4 over LUKS2 did give a stable
> device number. I think now a question of patience, watching with "balooctl
> monitor" (and I've found iotop gives a nice view into the indexing
> behaviour). Avoid "balooctl status" if you can...

Are you OK now?

I know there have been some fixes to Poppler that have fixed some extreme PDF issues:
https://bugs.kde.org/show_bug.cgi?id=380456#c22
Baloo seems to be fine after I excluded the folder with my many, many EPUB books. PDF books in a different Calibre library are indexed by Baloo fine. (Calibre's own full-text index file for that EPUB library is over 20 GiB in size.)