Summary: | DB must be purged whenever the DB scheme/contents change in an incompatible way | ||
---|---|---|---|
Product: | [Frameworks and Libraries] frameworks-baloo | Reporter: | Stefan Brüns <stefan.bruens> |
Component: | Baloo File Daemon | Assignee: | baloo-bugs-null |
Status: | REPORTED --- | ||
Severity: | normal | CC: | nate, postix, tagwerk19 |
Priority: | NOR | ||
Version: | 5.112.0 | ||
Target Milestone: | --- | ||
Platform: | Other | ||
OS: | Linux | ||
See Also: |
https://bugs.kde.org/show_bug.cgi?id=474973 https://bugs.kde.org/show_bug.cgi?id=475919 |
||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: |
Description
Stefan Brüns
2023-11-16 00:48:33 UTC
(In reply to Stefan Brüns from comment #0) > ... unfortunate https://invent.kde.org/frameworks/baloo/-/merge_requests/131 introduced such a change ... That was my bad, I should have noticed the impact on normal "real world" systems. A problem of intensively watching unstable branches :-( However the positive was the change did not *force* a structure change on the DB. A change to use a 64 bit FSID would have forced that; one step further and very likely necessary at some point. The issue was there earlier with BTRFS and multiple subvolumes, affecting predominantly OpenSUSE but latterly Fedora (Bug 402154). I've also found it possible to recreate if you have mounted a different filesystem somewhere under your $HOME. It would be nice if the "search" could be proof against ambiguous parent folders. It would be a "belt and braces" check, something the code ought not to meet but could deal with in case it does. There is a test DB showing the issue here: https://bugs.kde.org/attachment.cgi?id=163168 @ngraham: How about notifying the users about the regression caused by the ID change, to keep the impact as small as possible? This is currently not fixed, and even if it were, the next KF5 release (KF5.113) is still some weeks in the future (+ distribution delay). More and more people will be hit by this. Most notably, it will affect also users with Ext* filesystems, which as of now had no problems with the filesystem/device ID. I'm a bit more sanguine here; what might people see? Doubling up of search results, Bug 475919, might catch them unawares (but that is something that is not "out of the ordinary" where your search results contains files with the same name) Confusing "empty" results when searching from a particular directory (as opposed to "Your Files"). That's the Bug 474973 They might notice baloo reindexing in the background but the systemd limit on RAM usage really helps here (even if one could argue about where that limit should be set). Possibly the increase in index size but BTRFS systems such as openSUSE have suffered far, far worse. Of the above, I think the "empty" search results is the most pernicious. That's the one it would be nice to have a fix/workround for in a new release... ... A change in the DB structure would worry me :-] The problems are: - Sudden misbehavior seen by users not affected so for (Ext3/4, probably others) - Doubling of index size for users on e.g. Ext4 Btw, the systemd limit does not actually help so much (if at all), it are the fixes which went in afterwards which made the index behavior significantly better. (In reply to Stefan Brüns from comment #4) > - Doubling of index size for users on e.g. Ext4 I'm not sure whether many people would see a difference, at least those with an SSD. This is where we could discuss systemd limits but that's probably better elsewhere :-) If I look at openSUSE: $ baloosearch -i fred ... 9caf4a24446 /home/test/fred 9ca00000028 /home/test/fred 9ca00000029 /home/test/fred 9ca0000002a /home/test/fred 9ca0000002b /home/test/fred 9ca0000002c /home/test/fred 9ca0000002d /home/test/fred 9ca0000002e /home/test/fred 9ca0000002f /home/test/fred 9ca00000027 /home/test/fred ... Elapsed: 0.465656 msecs This is an oft-rebooted Tumbleweed with BTRFS and I've got 9 copies (as indexed with different minor device numbers) plus a "final" copy with the folded FSID. > - Sudden misbehavior seen by users not affected so for (Ext3/4, probably others) I have been on the lookout for comments about misbehaviour, it's been quieter than I expected. There is a "heads up" here: https://discuss.kde.org/t/baloo-and-frameworks-5-111/6348/1 https://discuss.kde.org/t/baloo-and-frameworks-5-111/6348/1 > Purging and reindexing as above is not necessary but probably sensible… That's obviously incorrect, it is necessary for BTRFS and Ext to have it working correctly. It is better to err on the safe side here. And while I appreciate you have mentioned it on Discuss, the reach is fairly small, thats the reason I asked Nate to get the message out. And the number of bugs filed or other reports is not a good measure of systems affected - people may simply not want to invest the time (and contrary to the automated kcrash reports it takes significantly more time to do so). Also, Debian and Ubuntu are both currently shipping older KF5 version (5.107/5.110) not yet affected. (In reply to Stefan Brüns from comment #6) > ... It is better to err on the safe side here ... As I read it, we are saying the same thing, better to be on the safe side. Good news is Bug 474973 has been around for a while (and is annoying) but we now have an explanation for it. So, although I had asked Nate *twice* about notifying users about the breaking change, he chose to ignore it. (In reply to Stefan Brüns from comment #6) > https://discuss.kde.org/t/baloo-and-frameworks-5-111/6348/1 I've added a heads up: https://discuss.kde.org/t/baloo-and-frameworks-5-111/6348/2 describing the issue with searching "From Here" as opposed to "Your Files" What is somewhat strange behaviour with the test DB (as per attachment) is that balooctl6 check doesn't notice/heal the missing parent (and I don't seem to be able to force it with combinations of balooctl6 clear and balooctl6 index) Maybe when baloo_file does its initial scan through the filestructure to set up iNotify watches, it could do a sanity check of the folder name against the DocId. (It's possible to "balooctl6 clear" a just-indexed entry, but seems not possible to clear one indexed earlier where the DocId no longer matches). (In reply to tagwerk19 from comment #10) > Maybe when baloo_file does its initial scan through the filestructure to set > up iNotify watches, it could do a sanity check of the folder name against > the DocId. OK, I think that would catch this particular issue. It seems that the FSID change triggers the issue of (comment #3): > Confusing "empty" results when searching from a particular directory (as > opposed to "Your Files"). That's the Bug 474973 but there's no guarantee that this is the only cause. Doing a sanity check of ensuring the folder is indexed before creating an iNotify watch would catch any similar issues. Whatever the root cause. (In reply to tagwerk19 from comment #11) > (In reply to tagwerk19 from comment #10) > > Maybe when baloo_file does its initial scan through the filestructure to set > > up iNotify watches, it could do a sanity check of the folder name against > > the DocId. > OK, I think that would catch this particular issue. I mentioned already in the FSID change that there is no guarantee a document in the DB is currently reachable by its path, see network mounts, see thumbdrives, etc. But unfortunately everyone ignored my reservations, as everyone else apparently knows better how Baloo works than me ... > It seems that the FSID change triggers the issue of (comment #3): > > Confusing "empty" results when searching from a particular directory (as > > opposed to "Your Files"). That's the Bug 474973 > but there's no guarantee that this is the only cause. Doing a sanity check > of ensuring the folder is indexed before creating an iNotify watch would > catch any similar issues. Whatever the root cause. You fall into the same trap as others have before, these are not synchronous. First, inotify works with paths, and second, it works with a sequence of events. You will have intermediate states which may no longer be valid, e.g. when doing renames. Trying to "synchronize" inotify and DB state have been the source of inconsistent DB states in the past, this will only make things worse. The DB must be self-consistent, but may temporarily be in a different state than the FS. But actually, these comments are off-topic here. This BR is about DB inconsistencies caused by code changes, nothing else. (In reply to Stefan Brüns from comment #12) > ... everyone else apparently knows better how Baloo works than me ... I think when troubleshooting, you get a view of problems from the "outside, in" and that comes with a fair amount of guesswork and assumptions. If you know the code, you see things from the "inside, out". That's two different ways of "knowing" - it's a question of getting them to meet in the middle. > ... no guarantee a document in the DB is currently reachable by its path, see network mounts, see thumbdrives, etc ... That's quite clear. In this instance though I create a test file, that triggers an iNotify, and baloo indexes it - without the parent directory being in the index (with the right DocID anyway). > ... First, inotify works with paths, and second, it works with a sequence of events ... Yes. I've seen, with inotifywait, some of the holes you can fall into (for example when a move is done with a "link/unlink") and can see that you could get synchronisation issues. I am pretty sure that is not happening here though. For what it is worth, I did the tests with and without content indexing enabled. > ... This BR is about DB inconsistencies caused by code changes, nothing else. The thread has drifted, apologies for that. I think I posted to the wrong thread to start with. |