Bug 435161 - Baloo crashes with ASSERT in metadatamover.cpp
Summary: Baloo crashes with ASSERT in metadatamover.cpp
Status: ASSIGNED
Alias: None
Product: frameworks-baloo
Classification: Frameworks and Libraries
Component: Baloo File Daemon (show other bugs)
Version: 5.80.0
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Stefan Brüns
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-03-30 18:38 UTC by Oded Arbel
Modified: 2024-01-07 22:58 UTC (History)
4 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Oded Arbel 2021-03-30 18:38:40 UTC
SUMMARY
Another Baloo assert crash, this time:

----8<----
Mar 30 19:04:59 vesho baloo_file[2358496]: kf.baloo: Failed to create watch for "/home/odeda/.cache/mozilla/firefox/i1m74zv1.default/safebrowsing-updating/" No such file or directory
Mar 30 19:04:59 vesho baloo_file[2358496]: kf.baloo: "/home/odeda/.cache/mozilla/firefox/i1m74zv1.default/safebrowsing-updating/" -> "/home/odeda/.cache/mozilla/firefox/i1m74zv1.default/safebrowsing-backup/"
Mar 30 19:04:59 vesho baloo_file[2358496]: ASSERT: "from[from.size()-1] != QLatin1Char('/')" in file [...]src/file/metadatamover.cpp, line 80
----8<----

Looks like `MetadataMover::updateMetadata()` (that is called from `MetadataMover::moveFileMetadata()` is called for a directory.

STEPS TO REPRODUCE
I'm not really sure of the process - it has happened during first scan, but it looks like a directory being moved and I think that it was moved before it was indexed, but I'm not sure of the exact process. Regardless, the directory no longer exists - I think it was moved than deleted before baloo got a chance to index it, but I don't think that is related to the problem.

OBSERVED RESULT
baloo_file crashes and indexing stops.

EXPECTED RESULT
It shouldn't crash

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 
(available in About System)
KDE Plasma Version: 5.21.80
KDE Frameworks Version: 5.81.0
Qt Version: 5.15.2

ADDITIONAL INFORMATION
It looks like the issue is in `KInotify::slotEvent()`, which emits `KInotify::moved` and hits `FileWatch::slotFileMoved`, but looking at the code:

----8<----
                // update the path cache
                if (event->mask & IN_ISDIR) {
                    // ... stuff in case event is for a directory
                }
//                qCDebug(BALOO) << oldPath << "EventMoveTo" << path;
                Q_EMIT moved(QFile::decodeName(oldPath), fname);
----8<----

Which means that `KInotify::moved` is emitted for both directories and files, but `FileWatch::slotFileMoved` expects only files and enforces that with an assert.
Comment 1 Oded Arbel 2021-03-30 18:45:39 UTC
(In reply to Oded Arbel from comment #0)
> Which means that `KInotify::moved` is emitted for both directories and
> files, but `FileWatch::slotFileMoved` expects only files and enforces that
> with an assert.

Lookgin deeper into `FileWatch::slotFileMoved`, it does expect to handle directories in some cases, but for calls that are forwarded to `MetadataMover::moveFileMetadata()` - it doesn't care. `MetadataMover::moveFileMetadata()` itself doesn't seem to care either, but it calls `MetadataMover::updateMetadata()` that will crash on paths that end with '/'.

Maybe the correct approach would be to have `MetadataMover::updateMetadata()` silently ignore paths that end with '/'?
Comment 2 Oded Arbel 2021-03-30 18:58:11 UTC
(In reply to Oded Arbel from comment #0)
> Regardless, the
> directory no longer exists - I think it was moved than deleted before baloo
> got a chance to index it, but I don't think that is related to the problem.

This could actually be related to the problem because I have configured `$HOME/.cache` to be excluded from scanning so it shouldn't have been scanned.

`FileWatch::slotFileMoved` consults ``FileIndexerConfig::shouldBeIndexed()` which in turn uses `QFileInfo::isDir()` which - as far as I understand - check if the path is an actual directory and will return `false` for paths that don't exist. I think the case where a directory was moved and then deleted is not handled correctly and will cause weird behaviors that might trigger asserts.
Comment 3 tagwerk19 2021-04-01 10:35:01 UTC
Not sure whether this is related...

It's a glitch I get "every so often". When I've missed capturing it, I've not found a way of reproducing it...

    cd ~/Documents
    mkdir temp
    echo "Hello Penguin" > temp/file1.txt
    balooshow -x temp/file1.txt

gave:

    144da00000fc01 64513 1330592 temp/file1.txt [/home/test/.kde/share/config/kdeglobals]
            Mtime: 1617136526 2021-03-30T22:35:26
            Ctime: 1617136526 2021-03-30T22:35:26
            Cached properties:
                    Line Count: 158

    Internal Info
    Terms: 0 0.025 ... Mplain Mtext T5 T8 X20-158 activebackground activeblend activeforeground backgroundalternate backgroundnormal ... widgetstyle window wm
    File Name Terms: Fkdeglobals
    XAttr Terms:
    lineCount: 158

That is, the details of something else completely (a different file, indexed at a different time). "Balooctl monitor" did not show "file1.txt" being indexed.

After a reboot I get the more expected:

    144da00000fc01 64513 1330592 temp/file1.txt [/home/test/Documents/temp/file1.txt]
            Mtime: 1617270546 2021-04-01T11:49:06
            Ctime: 1617270546 2021-04-01T11:49:06
            Cached properties:
                    Line Count: 1

    Internal Info
    Terms: Mplain Mtext T5 T8 X20-1 hello penguin
    File Name Terms: Ffile1 Ftxt
    XAttr Terms:
    lineCount: 1

I was hoping to be able to snapshot/clone the "confused" system, seemingly not this time though :-(

It may be that if such a "confused" system is referencing a directory rather than a file then you might get such an assert....

    Neon Unstable
    Plasma: 5.21.80
    Frameworks: 5.81.0
    Qt: 5.15.2
Comment 4 tagwerk19 2021-04-01 10:36:52 UTC
Had to trim the list of terms in the previous post as the full list looked too much like spam.
Comment 5 Bernie Innocenti 2021-12-06 04:10:15 UTC
I just hit this same ASSERT() in baloo built from git master, and the pathname was: "/home/bernie/thunderbird.bak/".

Invalid encoding. Ignoring "/home/bernie/emu/amiga/amigahd/Work/Network/Programs/MicroDot/Data/DBX_60/EDJAEDCOBDJEJLABDMJFJNPKGINIEG"
ASSERT: "from[from.size()-1] != QLatin1Char('/')" in file /home/bernie/kde/src/baloo/src/file/metadatamover.cpp, line 80

(gdb) bt
#0  0x00007ffff70cdd22 in raise () from /usr/lib/libc.so.6
#1  0x00007ffff70b7862 in abort () from /usr/lib/libc.so.6
#2  0x00007ffff7680910 in QMessageLogger::fatal(char const*, ...) const () from /usr/lib/libQt5Core.so.5
#3  0x00007ffff767fcf5 in qt_assert(char const*, char const*, int) () from /usr/lib/libQt5Core.so.5
#4  0x00005555555908b2 in Baloo::MetadataMover::updateMetadata (this=0x5555555d9800, tr=0x7fffffffcc00, from="/home/bernie/thunderbird/", to="/home/bernie/thunderbird.bak/") at /home/bernie/kde/src/baloo/src/file/metadatamover.cpp:80
#5  0x00005555555904aa in Baloo::MetadataMover::moveFileMetadata (this=0x5555555d9800, from="/home/bernie/thunderbird/", to="/home/bernie/thunderbird.bak/") at /home/bernie/kde/src/baloo/src/file/metadatamover.cpp:42
#6  0x000055555558a1ed in Baloo::FileWatch::slotFileMoved (this=0x7fffffffd570, urlFrom="/home/bernie/thunderbird/", urlTo="/home/bernie/thunderbird.bak/") at /home/bernie/kde/src/baloo/src/file/filewatch.cpp:77
...
Comment 6 Bernie Innocenti 2021-12-06 06:18:28 UTC
I'm able to trigger a similar assertion a different location by simply creating a new directory in my home:



...
[Detaching after fork from child process 31519]
QIBusPlatformInputContext: invalid portal bus.
QSocketNotifier: Can only be used with threads started with QThread
[Detaching after fork from child process 31538]
QIBusPlatformInputContext: invalid portal bus.
QSocketNotifier: Can only be used with threads started with QThread
ASSERT: "!filePath.endsWith(QLatin1Char('/'))" in file /home/bernie/kde/src/baloo/src/file/newfileindexer.cpp, line 38

Thread 4 "Thread (pooled)" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff1680640 (LWP 31481)]
0x00007ffff70cdd22 in raise () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff70cdd22 in raise () from /usr/lib/libc.so.6
#1  0x00007ffff70b7862 in abort () from /usr/lib/libc.so.6
#2  0x00007ffff7680910 in QMessageLogger::fatal(char const*, ...) const () from /usr/lib/libQt5Core.so.5
#3  0x00007ffff767fcf5 in qt_assert(char const*, char const*, int) () from /usr/lib/libQt5Core.so.5
#4  0x0000555555579e99 in Baloo::NewFileIndexer::run (this=0x555555870bf0)
    at /home/bernie/kde/src/baloo/src/file/newfileindexer.cpp:38
#5  0x00007ffff76c1332 in ?? () from /usr/lib/libQt5Core.so.5
#6  0x00007ffff76be02f in ?? () from /usr/lib/libQt5Core.so.5
#7  0x00007ffff6016259 in start_thread () from /usr/lib/libpthread.so.0
#8  0x00007ffff718f5e3 in clone () from /usr/lib/libc.so.6
(gdb) f 4
#4  0x0000555555579e99 in Baloo::NewFileIndexer::run (this=0x555555870bf0)
    at /home/bernie/kde/src/baloo/src/file/newfileindexer.cpp:38
38              Q_ASSERT(!filePath.endsWith(QLatin1Char('/')));
(gdb) list
33              : BasicIndexingJob::MarkForContentIndexing;
34
35          Transaction tr(m_db, Transaction::ReadWrite);
36
37          for (const QString& filePath : std::as_const(m_files)) {
38              Q_ASSERT(!filePath.endsWith(QLatin1Char('/')));
39
40              QString mimetype;
41              QFileInfo fileInfo(filePath);
42
(gdb) p filePath
$1 = "/home/bernie/foo/"
Comment 7 Bernie Innocenti 2021-12-06 08:22:36 UTC
I don't see how these asserts could have ever worked in the past. I traced back the pathnames, and the '/' is appended to all directories with normalizeTrailingSlash(): https://invent.kde.org/frameworks/baloo/-/blob/master/src/file/kinotify.cpp#L359

So, new directories will always have a trailing slash, why is NewFileIndexer::run() not expecting to see them?
My first hypothesis was that NewFileIndex is only for processing files. But there's a specific check for directories a few lines after the Q_ASSERT that fails: ​https://invent.kde.org/frameworks/baloo/-/blob/master/src/file/newfileindexer.cpp#L47-51

Code to append slashes to directories was added ~1 year ago:
https://invent.kde.org/frameworks/baloo/-/commit/73183acf00a2b7f10b286a1241716cb7e0785f0d

A few months later, we removed a couple of asserts on directory removal and rename:
https://invent.kde.org/frameworks/baloo/-/commit/96284739b4b63110113af0ba7dedfd18c4459063

The second check was missing a '!', and it was added shortly after by Nate:
https://invent.kde.org/frameworks/baloo/-/commit/dc04879f058dd1eab72e4bd4c80bdc1ea3c06c9b

So I think we missed directory creation and a few other cases. It should be safe to keep trailing slashes, because BasicIndexingJob strips them in the constructor:
https://invent.kde.org/frameworks/baloo/-/blob/master/src/file/basicindexingjob.cpp#L26-28

There are other asserts disallowing trailing slashes. These are placed just before code like this:
  QString fileName = filePath.mid(filePath.lastIndexOf(QLatin1Char('/')) + 1);

So I think we can't simply remove the check, as it will set fileName to an empty string. QFileInfo::fileName() is explicitly documented to behave like this:
  "Note that, if this QFileInfo object is given a path ending in a slash, the name of the file is considered empty."
Comment 8 Bernie Innocenti 2021-12-06 08:23:41 UTC
On a tangential note, why didn't the autotests catch this regression?
Comment 9 Bug Janitor Service 2021-12-06 09:01:33 UTC
A possibly relevant merge request was started @ https://invent.kde.org/frameworks/baloo/-/merge_requests/71
Comment 10 Bernie Innocenti 2021-12-06 09:04:23 UTC
Just to be clear: I could 100% reproduce the crash in baloo_file with these directory operations in my home:

  cd ~
  mkdir foo   <- CRASH!
  mv foo bar  <- CRASH!
  chmod -x bar  <- CRASH!
Comment 11 tagwerk19 2021-12-06 11:41:48 UTC
(In reply to Bernie Innocenti from comment #10)
> ... 100% reproduce ...
I remember having a go at building a "with a clean index" test and got nowhere. Chasing up what would happen if an inode was reused but I couldn't pin down anything.

My recollection is that the Asserts were included (and kept) to flush out "bad cases" before they made things worse. Alas, cannot remember/find where I read that.
Comment 12 tagwerk19 2024-01-07 22:58:10 UTC
(In reply to tagwerk19 from comment #11)
> ... but got nowhere ...
I stumbled on this again (Bug 478854)

I can reliably reproduce on neon-unstable; on newly-installed, as vanilla as possible, systems but if I try the same on a neon-testing, I don't get failures.