Bug 425017 - baloo_file_extractor crashes in a loop trying to index one particular file
Summary: baloo_file_extractor crashes in a loop trying to index one particular file
Status: RESOLVED FIXED
Alias: None
Product: frameworks-baloo
Classification: Frameworks and Libraries
Component: Baloo File Daemon
Version: unspecified
Platform: Other Linux
Importance: NOR crash
Target Milestone: ---
Assignee: Stefan Brüns
URL:
Keywords:
Duplicates: 414318 419043
Depends on:
Blocks:
 
Reported: 2020-08-05 01:11 UTC by Nate Graham
Modified: 2020-08-06 05:39 UTC

See Also:
Latest Commit:
Version Fixed In: 5.74.0
Sentry Crash Report:


Attachments
The file that makes it crash (3.72 KB, text/plain)
2020-08-05 01:11 UTC, Nate Graham

Description Nate Graham 2020-08-05 01:11:10 UTC
Created attachment 130644
The file that makes it crash

All KDE software from git master. I'm seeing baloo_file_extractor crashing in a loop while trying to index a particular file of mine, which I am attaching. I saw this happen last week and I purged and rebuilt the baloo index. But it started happening again, for the exact same file! There seems to be something about the file that makes baloo very unhappy.


(master) balooctl monitor
Press ctrl+c to stop monitoring
File indexer is running
Indexing file content
Indexing: /home/nate/SpiderOak Hive/Work & Money/Blue Systems/This week's work.txt: Ok
Indexing: /home/nate/SpiderOak Hive/Work & Money/Blue Systems/This week's work.txt: Ok
Indexing: /home/nate/SpiderOak Hive/Work & Money/Blue Systems/This week's work.txt: Ok
Indexing: /home/nate/SpiderOak Hive/Work & Money/Blue Systems/This week's work.txt: Ok
Indexing: /home/nate/SpiderOak Hive/Work & Money/Blue Systems/This week's work.txt: Ok
Indexing: /home/nate/SpiderOak Hive/Work & Money/Blue Systems/This week's work.txt: Ok
Indexing: /home/nate/SpiderOak Hive/Work & Money/Blue Systems/This week's work.txt: Ok


balooctl index "/home/nate/SpiderOak Hive/Work & Money/Blue Systems/This week's work.txt"
Skipping: /home/nate/SpiderOak Hive/Work & Money/Blue Systems/This week's work.txt Reason: Already scheduled for indexing
File(s) indexed


balooctl clear "/home/nate/SpiderOak Hive/Work & Money/Blue Systems/This week's work.txt"
kf.baloo.engine: MTimeDB::del 0 109792322002158340 MDB_BAD_TXN: Transaction must abort, has a child, or is invalid
kf.baloo.engine: Transaction::commit MDB_BAD_TXN: Transaction must abort, has a child, or is invalid


The backtrace is unfortunately not very helpful:


bt
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/home/nate/kde/usr/bin/baloo_file_extractor'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50        return ret;
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007fa1cfdff539 in __GI_abort () at abort.c:79
#2  0x00007fa1d037ac27 in qt_message_fatal (message=<synthetic pointer>..., 
    context=...) at global/qlogging.cpp:1914
#3  QMessageLogger::fatal (this=this@entry=0x7fffbded0770, 
    msg=msg@entry=0x7fa1d0e96f05 "%s") at global/qlogging.cpp:893
#4  0x00007fa1d09cd6d4 in init_platform (argv=<optimized out>, 
    argc=@0x7fffbded097c: 1, platformThemeName=..., platformPluginPath=..., 
    pluginNamesWithArguments=...)
    at ../../include/QtCore/../../src/corelib/tools/qarraydata.h:208
#5  QGuiApplicationPrivate::createPlatformIntegration (this=0x1e5b430)
    at kernel/qguiapplication.cpp:1481
#6  0x00007fa1d09cdb60 in QGuiApplicationPrivate::createEventDispatcher (
    this=<optimized out>) at kernel/qguiapplication.cpp:1498
#7  0x00007fa1d059a696 in QCoreApplicationPrivate::init (
    this=this@entry=0x1e5b430) at kernel/qcoreapplication.cpp:852
#8  0x00007fa1d09d0aaf in QGuiApplicationPrivate::init (
    this=this@entry=0x1e5b430) at kernel/qguiapplication.cpp:1527
#9  0x00007fa1d10f1eb9 in QApplicationPrivate::init (this=0x1e5b430)
    at kernel/qapplication.cpp:513
#10 0x000000000040a291 in main (argc=<optimized out>, argv=0x7fffbded09c0)
    at /home/nate/kde/src/baloo/src/file/extractor/main.cpp:27
Comment 1 Nate Graham 2020-08-05 03:21:05 UTC
Eventually I can get it to stop crashing for a while by disabling the file indexer, killing all baloo* processes, and then enabling it again.
Comment 2 Stefan Brüns 2020-08-05 03:47:25 UTC
I don't think it's the file itself causing the error, but rather a somewhat deterministic chain of events (file index operations) that has somehow brought the DB into a bad state.
Comment 3 Stefan Brüns 2020-08-05 03:55:04 UTC
Can you try https://invent.kde.org/frameworks/baloo/-/merge_requests/4/diffs
Comment 4 Nate Graham 2020-08-05 05:03:20 UTC
Thanks, that seems to fix it for me!
Comment 5 Stefan Brüns 2020-08-05 17:53:10 UTC
Git commit fdc162f3f72eaafec2dd8609470ec0002d5c9517 by Stefan Brüns.
Committed on 05/08/2020 at 17:47.
Pushed by bruns into branch 'master'.

[Engine] Propagate transaction errors

In case a transaction cannot be committed, exit the extractor process. If
the commit only failed because it is too large, the extractor is started
again with half the batch.

If the batch only contains a single file and marking it as failed is not
possible, we are stuck, so also exit from the main process.

M  +6    -3    src/engine/transaction.cpp
M  +1    -1    src/engine/transaction.h
M  +4    -1    src/file/extractor/app.cpp
M  +8    -1    src/file/extractorprocess.cpp
M  +6    -1    src/file/filecontentindexer.cpp
M  +2    -2    src/file/filecontentindexerprovider.cpp
M  +1    -1    src/file/filecontentindexerprovider.h

https://invent.kde.org/frameworks/baloo/commit/fdc162f3f72eaafec2dd8609470ec0002d5c9517
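
The retry strategy described in the commit message above can be illustrated with a minimal sketch. This is not Baloo's actual C++ code. The names `CommitTooLarge`, `IndexerStuck`, `index_in_batches`, `run_extractor` and `mark_failed`, as well as the default batch size of 40, are hypothetical and chosen only for illustration. The sketch assumes that a "too large" commit failure can be told apart from other failures, and it lets any other error propagate to the caller.

```python
from typing import Callable, List, Sequence, Tuple


class CommitTooLarge(Exception):
    """Hypothetical error: the commit failed only because the batch was too large."""


class IndexerStuck(RuntimeError):
    """Hypothetical error: a single file fails and cannot be marked as failed."""


def index_in_batches(
    files: Sequence[str],
    run_extractor: Callable[[List[str]], None],
    mark_failed: Callable[[str], bool],
    batch_size: int = 40,  # assumed default, not taken from Baloo
) -> Tuple[List[str], List[str]]:
    """Index files batch by batch, halving the batch when a commit is too large.

    Returns (indexed, failed). Raises IndexerStuck when a single-file batch
    fails and mark_failed() reports that the failure could not be recorded.
    """
    pending = list(files)
    indexed: List[str] = []
    failed: List[str] = []
    while pending:
        batch = pending[:batch_size]
        try:
            run_extractor(batch)  # stands in for one extractor run plus commit
        except CommitTooLarge:
            if len(batch) > 1:
                # Retry the same files with half the batch.
                batch_size = len(batch) // 2
                continue
            # A single file still fails: record it, or give up entirely.
            if not mark_failed(batch[0]):
                raise IndexerStuck(batch[0])
            failed.append(batch[0])
        else:
            indexed.extend(batch)
        pending = pending[len(batch):]
    return indexed, failed
```

In this sketch the batch size stays at its reduced value after a successful retry. Whether the real indexer grows the batch again is not stated in this report.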
Comment 6 Stefan Brüns 2020-08-05 17:55:55 UTC
@Nate - can you check if the file is now in the failed list (balooctl failed), or indexed (balooshow <file>)?
Comment 7 Nate Graham 2020-08-05 17:58:27 UTC
The file is not in the failed list and appears to be indexed now:

balooshow "/home/nate/SpiderOak Hive/Work & Money/Blue Systems/This week's work.txt"
109792322002158340 66308 25563017 /home/nate/SpiderOak Hive/Work & Money/Blue Systems/This week's work.txt
        Mtime: 1596589388 2020-08-04T19:03:08
        Ctime: 1596589388 2020-08-04T19:03:08
        Cached properties:
                Line Count: 104
Comment 8 Stefan Brüns 2020-08-05 21:27:24 UTC
Anything else in the failed list?

If not, then splitting up the batch/transaction apparently helped, or the batches were just different after restarting baloo_file.
Comment 9 Nate Graham 2020-08-05 21:58:03 UTC
The failed list is empty.
Comment 10 Nate Graham 2020-08-05 22:34:53 UTC
So should we call this fixed?
Comment 11 Stefan Brüns 2020-08-06 02:21:29 UTC
The symptoms are cured, but the cause still exists.

But as I have no idea how to find the cause, let's call it done.
Comment 12 Stefan Brüns 2020-08-06 02:23:17 UTC
*** Bug 414318 has been marked as a duplicate of this bug. ***
Comment 13 Stefan Brüns 2020-08-06 02:24:14 UTC
*** Bug 419043 has been marked as a duplicate of this bug. ***
Comment 14 Christoph Feck 2020-08-06 05:39:31 UTC
5.73.0 is already tagged, see https://community.kde.org/Schedules/Frameworks