SUMMARY
The Baloo file extractor frequently hits the assertion txn != nullptr on large(r) files, e.g. PDFs and large text files.

STEPS TO REPRODUCE
1. Have Baloo enabled and log into your system; perhaps clear the index so it runs again.

OBSERVED RESULT
Baloo keeps crashing all the freaking time, spawning millions of drkonqis.
Without asserts enabled it prints "m_writeTrans is null" in the log.

EXPECTED RESULT
Baloo works as it used to.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: git master as of 2026-02-12
Qt Version: 6.10.2

Git bisect suggests e75cdd6016ba5433c05fbb04f4424b630c40dfbf is the first bad commit:

commit e75cdd6016ba5433c05fbb04f4424b630c40dfbf
Author: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Date:   Thu Jan 8 20:18:28 2026 +0100

    [Extractor] Release DB write lock while content is extracted

    The extractor process held the DB write lock during the complete index
    batch, which may last for several seconds, or on rare occasions even
    minutes or hours.

    This had several negative side effects:
    - Any filesystem changes had to be queued in the scheduler, as these can
      not be commited to the DB.
    - Even deleted files may be commited to the DB, to be immediately deleted
      when the pending event queue is processed.
    - Any search may return fairly obsolete results, including deleted files.
      (Searching may still return incorrect results for files still pending,
      but this is out of scope.)
    - When an extractor crashes, the write transaction was still open.
      Although this is detected and handled, but may still cause further
      problems.

    Create a preliminary workload which is processed without holding any
    transactions, and only create the write transaction when the content
    extraction has completed. The completed workload is then checked if it
    matches the original state (url/id), and commited. For the unlikely case
    the state has changed the mismatching document(s) is discarded.

 src/file/extractor/app.cpp | 145 ++++++++++++++++++++++++++++-----------------
 src/file/extractor/app.h   |  27 ++++++---

ADDITIONAL INFORMATION
I suspect it’s got something to do with the changes in early January that split the work into multiple transactions.
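For readers following along, the scheme described in the commit message above (extract with no transaction open, then take the write transaction only to validate and commit) can be sketched roughly as below. This is a minimal illustration and not Baloo's code: WorkItem, Index, extractAll and commitMatching are hypothetical names, and the database is modelled as a plain std::map from document id to url.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are Baloo's real types.
struct WorkItem {
    std::uint64_t id;     // document id captured when the batch was planned
    std::string url;      // url captured at the same time
    std::string content;  // filled in by extraction, with no transaction open
};

// Toy "database": maps a document id to its current url.
using Index = std::map<std::uint64_t, std::string>;

// Phase 1: the slow part. Extraction is only simulated here, and no lock
// or transaction is held while it runs.
void extractAll(std::vector<WorkItem>& batch) {
    for (auto& item : batch) {
        item.content = "text of " + item.url;
    }
}

// Phase 2: the short part. In a real indexer this would run inside the write
// transaction. Only items whose id/url still match the current state are
// kept; anything moved or deleted in the meantime is discarded.
std::vector<WorkItem> commitMatching(const Index& current,
                                     const std::vector<WorkItem>& batch) {
    std::vector<WorkItem> committed;
    for (const auto& item : batch) {
        const auto it = current.find(item.id);
        if (it != current.end() && it->second == item.url) {
            committed.push_back(item);
        }
    }
    return committed;
}
```

With a batch of three files, where one is renamed and one deleted while extraction is running, only the untouched file would be committed. The point of the split is that the write lock is held only for the cheap comparison step, not for the extraction itself.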
I see the same, with a

    kf.filemetadata: Extracting UTF-8 "\n" plain text from "....
    ASSERT: "txn != nullptr" in file /workspace/build/src/engine/documentiddb.cpp, line 17

It does not seem to be tied to a particular file: after the crash and restart, the file is indexed (according to the debug output, anyway). It's also possible to index the files that crash with a "balooctl6 index ..."

This is on Neon Unstable.
How about providing a backtrace?
Git commit 1a80af307dfc8ea07a2a4623a2e6078c90ecdd2b by Stefan Brüns.
Committed on 13/02/2026 at 03:48.
Pushed by bruns into branch 'master'.

[Extractor] Open the DB in ReadWrite mode from the beginning

The open mode can not be changed later, open it read-write.

M  +1    -1    src/engine/transaction.cpp
M  +2    -5    src/file/extractor/app.cpp

https://invent.kde.org/frameworks/baloo/-/commit/1a80af307dfc8ea07a2a4623a2e6078c90ecdd2b
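One way to picture why the open mode matters, offered as an assumption and not as a description of Baloo's actual code: if the mode is fixed when the database is opened, a later request for a write transaction on a read-only handle has nothing to return, which would be consistent with the "m_writeTrans is null" message and the txn != nullptr assert reported above. ToyDb, OpenMode, WriteTxn and beginWrite are hypothetical names for this toy model.

```cpp
#include <cassert>
#include <memory>

// Toy model only; these names are hypothetical and not Baloo's API.
enum class OpenMode { ReadOnly, ReadWrite };

struct WriteTxn {};

class ToyDb {
public:
    explicit ToyDb(OpenMode mode) : m_mode(mode) {}

    // Returns nullptr when the database was opened read-only, because the
    // mode chosen at open time cannot be upgraded afterwards.
    std::unique_ptr<WriteTxn> beginWrite() const {
        if (m_mode != OpenMode::ReadWrite) {
            return nullptr;
        }
        return std::make_unique<WriteTxn>();
    }

private:
    const OpenMode m_mode;  // fixed for the lifetime of the handle
};
```

In this model, code that opens read-only for the lock-free extraction phase and only later asks for a write transaction always gets a null pointer back; opening read-write from the beginning, as the commit title says, avoids that.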
Created attachment 189510
baloo_file_extractor backtrace (2026/02/13)

> How about providing a backtrace?

This is what I saw...
*** Bug 515651 has been marked as a duplicate of this bug. ***
*** Bug 515911 has been marked as a duplicate of this bug. ***
I salvaged the assert from the coredump but forgot to save the trace, sorry... With recent git master it works fine again \o/ Thanks so much for the prompt fix!
Git commit abdf26f61fd4de8637d77fd0d51b5ab0fd8b23c5 by Nicolas Fella, on behalf of Stefan Brüns.
Committed on 13/02/2026 at 11:58.
Pushed by nicolasfella into branch 'Frameworks/6.23'.

[Extractor] Open the DB in ReadWrite mode from the beginning

The open mode can not be changed later, open it read-write.

(cherry picked from commit 1a80af307dfc8ea07a2a4623a2e6078c90ecdd2b)

M  +1    -1    src/engine/transaction.cpp
M  +2    -5    src/file/extractor/app.cpp

https://invent.kde.org/frameworks/baloo/-/commit/abdf26f61fd4de8637d77fd0d51b5ab0fd8b23c5
*** Bug 515944 has been marked as a duplicate of this bug. ***