SUMMARY
Baloo appears to be in very bad shape in KF6. It was working OK until I updated to Plasma 6. I am encountering the following issues:
- The machine is completely unresponsive at startup: baloo_file is saturating the I/O bandwidth, and top reports that the machine is essentially just waiting for I/O.
- Running `balooctl6` shows baloo "Checking for unindexed files"
- Trying `balooctl6 suspend` does nothing
- Trying `balooctl6 disable` causes `balooctl6 status` to show baloo as not running, but the `baloo_file` process stays around, hogging the machine.

STEPS TO REPRODUCE
1. Reboot and log in

OBSERVED RESULT
The machine barely crawls. Getting a cursor on the terminal takes a long time. System load shows huge "wait time" and no system/user time. `balooctl6 disable` leaves the resource-hogging baloo_file process around.

EXPECTED RESULT
Baloo makes limited and fair use of the machine's resources, and balooctl6 can actually suspend, resume and disable baloo.

SOFTWARE/OS VERSIONS
Operating System: Manjaro Linux
KDE Plasma Version: 6.0.5
KDE Frameworks Version: 6.2.0
Qt Version: 6.7.1
Kernel Version: 6.6.32-1-MANJARO (64-bit)
Graphics Platform: Wayland
Processors: 8 × Intel® Core™ i7-4750HQ CPU @ 2.00GHz
Memory: 15.5 GiB of RAM
Graphics Processor: Mesa Intel® Iris® Pro Graphics P5200
Manufacturer: Notebook
Product Name: W740SU
I don't *think* there's anything specific about the big move to KF6 that would trigger this sort of issue. Maybe check a couple of things...

First, see how much memory Baloo is using. Nowadays it is run via systemd by default and constrained to 512 MB of RAM. Check whether this is actually happening with
    $ systemctl --user status kde-baloo
There are two possible concerns. One is that it has been restarted and is *not* running within this constraint. The second, contrasting, possibility is that the index is very, very large (see how big your ~/.local/share/baloo/index is) and Baloo is continuously reading, dropping, rereading and redropping pages while trying to work within its 512 MB. Both options *may* end up with high I/O, particularly if Baloo is starting to swap.
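As a rough way to run the second check above, the sketch below compares the size of the index file with a memory limit given in MiB. `index_fits` is a hypothetical helper name, not part of Baloo or systemd, and comparing on-disk size with the limit is only a coarse heuristic for how much of the index Baloo needs resident at once.

```shell
#!/bin/sh
# Sketch only: compare the Baloo index file size with a memory limit in MiB.
# index_fits is a hypothetical helper, not a Baloo or systemd command.
index_fits() {
    index_file="$1"
    limit_mib="$2"
    # wc -c prints the file size in bytes.
    size_bytes=$(wc -c < "$index_file")
    # Integer division, so sizes are rounded down to whole MiB.
    size_mib=$(( size_bytes / 1048576 ))
    if [ "$size_mib" -le "$limit_mib" ]; then
        echo "index ${size_mib} MiB fits within ${limit_mib} MiB"
    else
        echo "index ${size_mib} MiB exceeds ${limit_mib} MiB"
    fi
}

# Example, using the path and limit mentioned in the comment above:
# index_fits "$HOME/.local/share/baloo/index" 512
```

An index far above the limit would point towards the "rereading and redropping pages" case; one well below it makes that explanation less likely.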
Rebooted to make sure everything is clear. Before the reboot baloo was using 0% I/O. Then tried
```
systemctl --user status kde-baloo
● kde-baloo.service - Baloo File Indexer Daemon
     Loaded: loaded (/usr/lib/systemd/user/kde-baloo.service; disabled; preset: enabled)
     Active: active (running) since Mon 2024-06-03 09:09:49 CEST; 1min 7s ago
    Process: 1618 ExecCondition=/usr/bin/kde-systemd-start-condition --condition baloofilerc:Basic Settings:Indexing-Enabled:true (code=exited, status=0/SUCCE>
   Main PID: 1629 (baloo_file)
      Tasks: 3 (limit: 19061)
     Memory: 249.1M (high: 512.0M available: 262.8M peak: 249.2M)
        CPU: 2.458s
     CGroup: /user.slice/user-1000.slice/user@1000.service/background.slice/kde-baloo.service
             └─1629 /usr/lib/kf6/baloo_file

Jun 03 09:09:49 zagar systemd[1468]: Starting Baloo File Indexer Daemon...
Jun 03 09:09:49 zagar systemd[1468]: Started Baloo File Indexer Daemon.
```
Indeed it is limited to 512MB. Why does it say "disabled", though? If it is disabled, why is it starting, and how?

At this point I am over 5 minutes after the boot. `baloo_file` is still hogging the I/O.

Now about 10 minutes after the boot. `baloo_file` has finally calmed down. On many occasions it took much longer. For these first 10 minutes the machine has been barely usable. This was not happening before my distro updated to Plasma 6 (which may have brought many other changes with it, and one of those may be the actual cause of what I am observing).

Index file is very small:
```
Total files indexed: 382,267
Files waiting for content indexing: 0
Files failed to index: 0
Current size of index is 238.11 MiB
```
Home is on a rotating HD with btrfs. Probably no one tests on rotating HDs anymore, but they are still around.

From a usability point of view, two aspects are IMHO of concern:
1. `balooctl6 suspend` seems to do nothing at all.
2. `balooctl6 disable` sets baloo to be disabled, but does not stop `baloo_file` at all.
(In reply to Sergio from comment #2)
It's interesting that it is "baloo_file" and not "baloo_file_extractor". "baloo_file" is doing a scan through all your files to build a list of what's new, what's changed and what needs to be done. That can be a lot of directory lookups.

It may be that you are caught by an earlier change (related to BTRFS, some months back), for KF6
    https://invent.kde.org/frameworks/baloo/-/merge_requests/131
and cherrypicked for KF5
    https://invent.kde.org/frameworks/baloo/-/merge_requests/169
The original issue was that you could have files appearing multiple times in the index; a "baloosearch -i one-of-your-files.txt" gave you a number of hits. The patch meant that there would be an extra "reindex" but after that you're fine (should be fine).

If you reboot now, has baloo_file done its work or does it still take time?

You can adjust the amount of RAM with
    systemctl --user edit kde-baloo
and you can change the "MemoryHigh=512M" to something like "MemoryHigh=25%". That will give Baloo more breathing room but not allow it to starve other processes of memory. The 512MB might be too tight.

> Why does it say "disabled", though? If it is disabled why is it starting and how?
It's a good question and I've not got a good answer out of Google. I think it might be being "wanted" by another service. There might be something useful in Bug 481101

> 1. `balooctl6 suspend` seems to do nothing at all.
> 2. `balooctl6 disable` sets baloo to be disabled, but does not stop
> `baloo_file` at all.
These work "by asking" Baloo to stop rather than killing it. If Baloo is busy it does not always listen :-/

A disable should stop it running next reboot, although Bug 481101 suggests that something, somewhere might be reenabling it....
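For anyone who prefers to script the MemoryHigh change rather than use the interactive editor, here is a minimal sketch. It assumes the override lives in the usual per-user drop-in directory (`~/.config/systemd/user/kde-baloo.service.d/override.conf`); `write_baloo_override` is a hypothetical helper name, and the function overwrites any existing `override.conf` there.

```shell
#!/bin/sh
# Sketch only: write a drop-in that sets MemoryHigh for the kde-baloo user unit.
# write_baloo_override is a hypothetical helper; the default directory is an
# assumption about where "systemctl --user edit" stores its override.
write_baloo_override() {
    memory_high="$1"
    dropin_dir="${2:-$HOME/.config/systemd/user/kde-baloo.service.d}"
    mkdir -p "$dropin_dir"
    # Overwrites any existing override.conf in that directory.
    printf '[Service]\nMemoryHigh=%s\n' "$memory_high" > "$dropin_dir/override.conf"
}

# Example:
# write_baloo_override 25%
# systemctl --user daemon-reload
# systemctl --user restart kde-baloo
```

After the restart, the "high:" value on the Memory line of `systemctl --user status kde-baloo` should reflect the new limit.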
> If you reboot now, has baloo_file done its work or does it still take time?
I see the issue every time I restart the system. For sure when I reboot, and most likely (I'll check again) even when I log out and log in again. Baloo_file hogs the machine for some time. The first time it was huge; since then it is about 5-10 minutes.

> You can adjust the amount of RAM with
> systemctl --user edit kde-baloo
> and you can change the "MemoryHigh=512M" to something like "MemoryHigh=25%",
> that will give Baloo more breathing room but not allow it to starve other
> processes of memory. The 512MB might be too tight.
I can try that, but baloo does not seem to be under memory pressure. My set of indexed files is actually not that big. Furthermore, systemd always reports a memory usage well below the limit (`Memory: 249.1M (high: 512.0M available: 262.8M peak: 249.2M)`).

> These work "by asking" Baloo to stop rather than killing it. If Baloo is busy
> it does not always listen :-/
Even when busy, baloo should probably try to be a better listener ;-). The issue is that right after boot things like giving a presentation are more or less impossible unless you kill `baloo_file` the hard way, which is not really nice.
(In reply to Sergio from comment #4)
> ... The first time it was huge, after that first time it is about 5-10' ...
5 to 10 minutes (for baloo_file, not the heavier baloo_file_extractor) is way more than expected...

It would be sensible to see whether you've indexed multiple copies of your files (a holdover from the earlier problems with BTRFS). See if
    baloosearch -i one-of-your-files.txt
gives a single hit or multiple hits, where the '-i' lists the internal "Document ID" that Baloo uses. If you get multiple hits, it would make sense to purge the index and leave Baloo reindexing from scratch (when you have time!).

It is possible that there are issues with BTRFS snapshots, so make sure that you have the snapshot directories excluded.

I remember someone posting that during the initial indexing, they placed the index on /tmp (possibly in RAM). That would avoid contention on the disk. Of course, this is the initial indexing, not the scan done when you log on. Putting a symlink in at .local/share/baloo to a different location seems to work (but likely not recommended 8-).

Beyond that, I'm running out of ideas..
(In reply to tagwerk19 from comment #5) > ... Beyond that, I'm running out of ideas ... On the off-chance that you are indexing hidden files and folders, you could exclude the .cache and .local/share/Trash directories. They can be large and busy
Is there a way to force baloo_file to show what it is actually doing at some given time? E.g., by some option, via dbus, or sending a user signal?
(In reply to Sergio from comment #7)
> Is there a way to force baloo_file to show what it is actually doing at some
> given time? E.g., by some option, via dbus, or sending a user signal?
There are conditional debugs... although maybe not as many as you might want, see Bug 460390.

Turn on by creating a:
    ~/.config/QtProject/qtlogging.ini
file and making sure it contains:
    [rules]
    kf.*.debug=true
and then making sure that "stderr" goes to the journal by:
    systemctl edit --user kde-baloo.service
adding:
    [Service]
    StandardError=journal
and saving the override.
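The two files described above can also be written from a script. This is only a sketch: `enable_baloo_debug` is a hypothetical helper, the drop-in file name `logging.conf` is an arbitrary choice (picked so it does not clobber an existing `override.conf`), and the function refuses to run if a `qtlogging.ini` already exists rather than overwrite your rules.

```shell
#!/bin/sh
# Sketch only: enable the kf.* debug rules and send Baloo's stderr to the journal.
# enable_baloo_debug and the file name logging.conf are illustrative assumptions.
enable_baloo_debug() {
    config_home="${1:-$HOME/.config}"
    qt_ini="$config_home/QtProject/qtlogging.ini"
    dropin_dir="$config_home/systemd/user/kde-baloo.service.d"
    # Do not overwrite logging rules the user may already have.
    if [ -e "$qt_ini" ]; then
        echo "refusing to overwrite $qt_ini" >&2
        return 1
    fi
    mkdir -p "$config_home/QtProject" "$dropin_dir"
    printf '[rules]\nkf.*.debug=true\n' > "$qt_ini"
    printf '[Service]\nStandardError=journal\n' > "$dropin_dir/logging.conf"
}

# Example:
# enable_baloo_debug
# systemctl --user daemon-reload
# systemctl --user restart kde-baloo
# journalctl --user -u kde-baloo -f
```

Remember to remove both files again afterwards, since `kf.*.debug=true` turns on debug output for every KDE Frameworks component, not just Baloo.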
I would like to add that limiting baloo to 512M by default is counter-productive on my desktop PC. With just 512M, it just causes unrelated pages to go to swap frequently, as there is no "memory+swap" accounting by default, only "memory". This swap-thrashing affects other apps, such as firefox.

Here is the "iostat -xm 4" output, the swap is on /dev/zram0:
```
Device       r/s  rMB/s  rrqm/s  %rrqm  r_await  rareq-sz       w/s  wMB/s  wrqm/s  %wrqm  w_await  wareq-sz   d/s  dMB/s  drqm/s  %drqm  d_await  dareq-sz   f/s  f_await  aqu-sz  %util
dm-0        4.25   0.03    0.00   0.00     0.41      8.00      0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00   0.18
sda         4.25   0.03    0.00   0.00     0.29      8.00      0.00   0.00    0.00   0.00     0.00      0.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.00   0.18
zram0   11556.00  45.14    0.00   0.00     0.00      4.00  11615.75  45.37    0.00   0.00     0.01      4.00  0.00   0.00    0.00   0.00     0.00      0.00  0.00     0.00    0.17  17.18
```
```
$ balooctl6 status
Baloo File Indexer is running
Indexer state: Indexing file content
Total files indexed: 1,248,607
Files waiting for content indexing: 233,923
Files failed to index: 0
Current size of index is 4.28 GiB
```
Raising the limit to 5120M resolves the swap-thrashing issue, but it's obviously not a good idea to give 1/3 of my memory to something not related to the primary purpose of this PC.
(In reply to Alexander Patrakov from comment #9)
> I would like to add that limiting baloo to 512M by default is
> counter-productive on my desktop PC. With just 512M, it just causes
> unrelated pages to go to swap frequently ...
I tend to agree.... What I've seen is the system drops and rereads "clean" pages and you get a shedload more reads. You also might be forcing large transactions (loads of dirty pages) into swap.

> Raising the limit to 5120M resolves the swap-thrashing issue, but it's
> obviously not a good idea to give 1/3 of my memory to something not related
> to the primary purpose of this PC.
This is really a "pick your number": if you give Baloo more space it indexes more quickly. The memory demands come when content indexing (initial indexing or after a purge...) and you are in the middle of a large job...

When troubleshooting I keep a watch on the Memory line in:
    systemctl --user status kde-baloo
You can stepwise reduce the MemoryHigh value. Baloo will try to live within that limit when indexing and it will release the pages when it finishes (and other bits of the system ask for space). You can see Memory Use and Peak in the systemctl status.

> ... as there is no "memory+swap"
> accounting by default, only "memory". This swap-thrashing affects other
> apps, such as firefox ...
No, you *really* *really* do not want to swap...

You can put a "MemorySwapMax=0" in the unit file, are you saying that this won't work? Maybe I've only dealt with systems where it did work...
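To make "keeping a watch on the Memory line" less manual, the sketch below pulls the current and high values out of the status text. `memory_summary` is a hypothetical helper, and it assumes the line has the same shape as the one quoted earlier in this thread (`Memory: 249.1M (high: 512.0M available: ...)`); other systemd versions may order the fields differently, in which case it prints nothing.

```shell
#!/bin/sh
# Sketch only: extract current and high memory values from
# "systemctl --user status" output supplied on stdin.
# memory_summary is a hypothetical helper; the expected line format is the
# one quoted in this thread and may differ between systemd versions.
memory_summary() {
    sed -n 's/^.*Memory: \([0-9.]*[KMGT]\{0,1\}\) (high: \([0-9.]*[KMGT]\{0,1\}\).*$/current=\1 high=\2/p'
}

# Example, sampling every 5 seconds while Baloo is indexing:
# while true; do
#     systemctl --user status kde-baloo | memory_summary
#     sleep 5
# done
```

If current sits close to high for long stretches while the disk or swap is busy, the limit is probably too tight for the job Baloo is in the middle of.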