Bug 490928 - Baloo constantly writing to disk
Status: RESOLVED NOT A BUG
Alias: None
Product: frameworks-baloo
Classification: Frameworks and Libraries
Component: Baloo File Daemon
Version: 6.4.0
Platform: openSUSE Linux
Importance: NOR major
Target Milestone: ---
Assignee: baloo-bugs-null
Reported: 2024-07-28 15:14 UTC by Josh Robak
Modified: 2024-08-16 20:12 UTC

Description Josh Robak 2024-07-28 15:14:36 UTC
SUMMARY
Baloo randomly started writing to my SSD at a constant 200 MB/s. I had to turn off file indexing and then kill the process to stop it. It ended up writing 5-6 TB to my disk in a short amount of time (<1 day). My entire / partition, including my home folder, is less than 100 GB, so it should not be indexing that much. I believe it might have been triggered after I woke my system from sleep/suspend. Perhaps it was indexing when I put the system to sleep/suspend, and waking it caused Baloo to bug out. Unfortunately, I haven't been able to reproduce this issue, since it only happened yesterday, but I still think it is worth reporting.
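
As a rough consistency check on the figures above, here is a small sketch of the arithmetic. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and a perfectly constant write rate, neither of which is stated in the report.

```python
# Rough check: how long does it take to write 5-6 TB at a sustained 200 MB/s?
# Decimal units and a constant rate are assumptions made for this sketch.
MB = 10**6
TB = 10**12


def hours_to_write(total_bytes: float, rate_bytes_per_s: float) -> float:
    """Return the time in hours needed to write total_bytes at a constant rate."""
    return total_bytes / rate_bytes_per_s / 3600


rate = 200 * MB
low = hours_to_write(5 * TB, rate)
high = hours_to_write(6 * TB, rate)
print(f"{low:.1f} to {high:.1f} hours")
```

Under those assumptions the total corresponds to roughly seven to eight hours of uninterrupted writing, which fits the "<1 day" estimate.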

STEPS TO REPRODUCE
1. Use openSUSE Tumbleweed with the latest snapshot
2. Make sure file indexing is enabled with file names only and "hidden files and folders" box checked (system settings>search>file search>file indexing)
3. Wait (or manually start indexing and set system to sleep and wake it up afterwards)

OBSERVED RESULT
Baloo constantly writes to the disk without end.

EXPECTED RESULT
Baloo writes to the disk for a limited duration before going idle.

SOFTWARE/OS VERSIONS
Operating System: openSUSE Tumbleweed 20240726
KDE Plasma Version: 6.1.3
KDE Frameworks Version: 6.4.0
Qt Version: 6.7.2
Kernel Version: 6.9.9-1-default (64-bit)
Graphics Platform: Wayland
Processors: 12 × AMD Ryzen 5 5600 6-Core Processor
Memory: 15.5 GiB of RAM
Graphics Processor: AMD Radeon RX 6650 XT
Manufacturer: Gigabyte Technology Co., Ltd.
Product Name: AB350M-DS3H

ADDITIONAL INFORMATION
Comment 1 tagwerk19 2024-07-28 17:43:33 UTC
(In reply to robakjoshua from comment #0)
> Operating System: openSUSE Tumbleweed 20240726
Tumbleweed implies you are running with BTRFS? Or at least there's a strong likelihood....

Simple stuff first: how big is your .local/share/baloo/index file, and what happens if you do a command line search for a file you know you've indexed?

    $ baloosearch6 -i "one-of-your-files.txt"

The thing to watch for is whether you get multiple results for the one file. The "-i" option asks baloosearch to show the internal DocID that Baloo has for the file. openSUSE (with BTRFS) was caught by a bug in Baloo (or in the way the BTRFS subvolumes were mounted): after a reboot, the BTRFS disks were mounted with different device numbers. Baloo treated a file with the same inode but a different device number as a new file and reindexed it. Multiple results are bad...
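
The duplicate check described above can be sketched in a few lines of Python. This is only an illustration: it assumes each result line has the form "<hex DocID> <absolute path>", and the sample output and the function name are hypothetical.

```python
from collections import defaultdict


def find_duplicate_paths(output: str) -> dict[str, list[str]]:
    """Map each path that appears under more than one DocID to its DocIDs."""
    ids_by_path = defaultdict(set)
    for line in output.splitlines():
        doc_id, sep, path = line.strip().partition(" ")
        # Result lines are assumed to look like "<hex DocID> <absolute path>";
        # anything else (such as the "Elapsed:" footer) is skipped.
        if not sep or not path.startswith("/"):
            continue
        ids_by_path[path].add(doc_id)
    return {p: sorted(ids) for p, ids in ids_by_path.items() if len(ids) > 1}


# Hypothetical output showing the bad case: one file listed under two DocIDs.
sample = """\
1a2b00000031 /home/user/Documents/Notes.txt
1a2b00000032 /home/user/Documents/Notes.txt
Elapsed: 0.3 msecs
"""
print(find_duplicate_paths(sample))
```

An empty result means no path was reported under more than one DocID.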

That was fixed in spring last year, and most systems should now have the patch. It may be that you are running with a lot of old index data, having gradually upgraded (on a rolling release) to the new system. It's probably best to purge the old index and start afresh....

A possible reason why you are suddenly seeing issues is that you might have deleted a large batch of files. Baloo then needs to look through its "much expanded" index and remove the references from the "indexed words" to the (multiple) "indexed files". I seem to remember that this process continues even if you've switched to indexing just the filenames.

So...

Purge the index and start afresh...

Add folder exclusions for ".cache" and ".local/share/Trash", maybe ".mozilla" if you have it, and also ".snapshots". Watch out specifically for the last. You might not be snapshotting into your Home folder, but if you are, and Baloo is indexing it, that's bad news!

See if you see files being indexed with "balooctl6 monitor"

Sorry, this is a hurried response before going offline for a couple of weeks and contains guesswork. Should be able to reply again mid August...
Comment 2 Josh Robak 2024-07-28 21:25:56 UTC
(In reply to tagwerk19 from comment #1)
> (In reply to robakjoshua from comment #0)
> > Operating System: openSUSE Tumbleweed 20240726
> Tumbleweed implies you are running with BTRFS? Or at least there's a strong likelihood....

Yes

> Simple stuff first, how big is your .local/share/baloo/index file?

Do note that I temporarily disabled file indexing and deleted my previously indexed data, but I do remember it being around the same size as it is now. So this is my current index size after re-enabling it.

josh@localhost:~> balooctl6 status
Baloo File Indexer is running
Indexer state: Idle
Total files indexed: 498,609
Files waiting for content indexing: 0
Files failed to index: 0
Current size of index is 440.40 MiB

> and what happens if you do a command line search for a file you know you've indexed?
>     $ baloosearch6 -i "one-of-your-files.txt"

Here are a couple of searches:

josh@localhost:~> baloosearch6 -i "Notes.txt"
8bc5f70cdda6 /home/josh/Documents/Notes.txt
f187ff70cdda6 /home/josh/.local/share/Steam/ubuntu12_32/steam-runtime.old/usr/share/doc/libtbb2/Release_Notes.txt.gz
175649f70cdda6 /home/josh/.local/share/Steam/ubuntu12_32/steam-runtime/usr/share/doc/libtbb2/Release_Notes.txt.gz
Elapsed: 0.300704 msecs

josh@localhost:~> baloosearch6 -i "baloo_bug_report.txt"
541257f70cdda6 /home/josh/Documents/baloo_bug_report.txt
Elapsed: 0.295343 msecs

So it doesn't look like I'm getting duplicates for the same file.
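
A side observation on the DocIDs above: all four end in the same eight hex digits (f70cdda6). The sketch below splits each ID on the assumption that the low 32 bits hold the device ID and the remaining high bits hold the inode. That layout is an assumption inferred from the output, not something stated in this thread, and the function name is hypothetical.

```python
def split_doc_id(doc_id_hex: str) -> tuple[int, int]:
    """Split a hex DocID into (inode, device_id).

    Assumes the low 32 bits are the device ID and the rest is the inode.
    """
    value = int(doc_id_hex, 16)
    return value >> 32, value & 0xFFFFFFFF


# DocIDs copied from the baloosearch6 output above.
doc_ids = ["8bc5f70cdda6", "f187ff70cdda6", "175649f70cdda6", "541257f70cdda6"]
for doc_id in doc_ids:
    inode, device = split_doc_id(doc_id)
    print(f"{doc_id}: inode={inode} device={device:#x}")

# A single distinct device ID is what one would hope to see on a healthy index.
devices = {split_doc_id(d)[1] for d in doc_ids}
print("distinct device IDs:", len(devices))
```

If that reading of the layout is right, every file here shares one device ID, which is consistent with there being no duplicates from changing device numbers.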

> Purge the index and start afresh...

See above.

> Add folder exclusions for ".cache" and ".local/share/Trash", maybe
> ".mozilla" if you have it, and also ".snapshots". Watch out specifically for
> the last. You might not be snapshotting into your Home folder, but if you
> are, and Baloo is indexing it, that's bad news!

Good idea, I think you're onto something. It's possible that it got stuck in a loop indexing one of those directories or files. Thankfully, my .snapshots are not saved in my home directory, which is the only directory I have indexed.

This is what my .config/baloofilerc looks like now:

[General]
dbVersion=2
exclude filters=*~,*.part,*.o,*.la,*.lo,*.loT,*.moc,moc_*.cpp,qrc_*.cpp,ui_*.h,cmake_install.cmake,CMakeCache.txt,CTestTestfile.cmake,libtool,config.status,confdefs.h,autom4te,conftest,confstat,Makefile.am,*.gcode,.ninja_deps,.ninja_log,build.ninja,*.csproj,*.m4,*.rej,*.gmo,*.pc,*.omf,*.aux,*.tmp,*.po,*.vm*,*.nvram,*.rcore,*.swp,*.swap,lzo,litmain.sh,*.orig,.histfile.*,.xsession-errors*,*.map,*.so,*.a,*.db,*.qrc,*.ini,*.init,*.img,*.vdi,*.vbox*,vbox.log,*.qcow2,*.vmdk,*.vhd,*.vhdx,*.sql,*.sql.gz,*.ytdl,*.tfstate*,*.class,*.pyc,*.pyo,*.elc,*.qmlc,*.jsc,*.fastq,*.fq,*.gb,*.fasta,*.fna,*.gbff,*.faa,po,CVS,.svn,.git,_darcs,.bzr,.hg,CMakeFiles,CMakeTmp,CMakeTmpQmake,.moc,.obj,.pch,.uic,.npm,.yarn,.yarn-cache,__pycache__,node_modules,node_packages,nbproject,.terraform,.venv,venv,core-dumps,lost+found
exclude filters version=9
exclude folders[$e]=$HOME/.cache,$HOME/.local/share/Trash,$HOME/.mozilla
index hidden folders=true
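
To see which paths the exclusions above cover, the config can be read with Python's configparser. This is a sketch under assumptions: it treats "exclude folders[$e]" as a plain key name, expands only $HOME, and assumes an excluded folder covers everything beneath it. The helper names are hypothetical.

```python
import configparser
from pathlib import PurePosixPath


def excluded_folders(rc_text: str, home: str) -> list[PurePosixPath]:
    """Return the excluded folders from baloofilerc text, with $HOME expanded."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(rc_text)
    # The "[$e]" suffix is treated as part of the key name in this sketch.
    raw = parser["General"].get("exclude folders[$e]", "")
    return [PurePosixPath(p.replace("$HOME", home)) for p in raw.split(",") if p]


def is_excluded(path: str, folders: list[PurePosixPath]) -> bool:
    """True if path is one of the excluded folders or lies inside one."""
    candidate = PurePosixPath(path)
    return any(candidate == f or f in candidate.parents for f in folders)


# Shortened copy of the config above (the long "exclude filters" line is omitted).
rc = """\
[General]
dbVersion=2
exclude folders[$e]=$HOME/.cache,$HOME/.local/share/Trash,$HOME/.mozilla
index hidden folders=true
"""
folders = excluded_folders(rc, "/home/josh")
print(is_excluded("/home/josh/.cache/thumbnails/a.png", folders))
print(is_excluded("/home/josh/.local/share/Steam/ubuntu12_32/steam-runtime", folders))
```

With this list, anything under .cache is skipped, while the Steam runtime directories that showed up in the search results are not covered by any exclusion.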

> See if you see files being indexed with "balooctl6 monitor"

Yeah, it's working normally now. Hopefully it stays that way.

> Sorry, this is a hurried response before going offline for a couple of weeks
> and contains guesswork. Should be able to reply again mid August...

Thank you for the timely response. I am looking forward to your next reply.
Comment 3 Josh Robak 2024-07-28 22:17:19 UTC
UPDATE: I can confirm that it happened again, so it seems to be reproducible. This was after applying the changes tagwerk19@innerjoin.org suggested. It's the same constant, high-load writing to the disk. However, it is random and unrelated to the sleep/suspend part of my original report, so ignore that part. It happened when balooctl6 monitor reported new files being indexed multiple times before I killed the baloo-file process. I assume it would've continued to display that message had I not stopped it.
Comment 4 Fabian Vogt 2024-07-29 08:51:49 UTC
(In reply to robakjoshua from comment #3)
> It happened when balooctl6 monitor reported new files being indexed multiple times before I killed the baloo-file process. I assume it would've continued to display that message had I not stopped it.

The same file(s) in a loop?

Do you recall doing anything with those files around the time this happened?
Comment 5 Josh Robak 2024-07-29 17:26:00 UTC
(In reply to Fabian Vogt from comment #4)
> (In reply to robakjoshua from comment #3)
> > It happened when balooctl6 monitor reported new files being indexed multiple times before I killed the baloo-file process. I assume it would've continued to display that message had I not stopped it.
> 
> The same file(s) in a loop?

Yeah, I assume that to be the case.

> Do you recall doing anything with those files around the time this happened?

Well, the only programs I remember being open at the time were Brave, Steam, a file manager, and a terminal. Brave shouldn't be the issue since I already excluded my .cache folder. So maybe it has to do with Steam. For now I'm going to exclude my .steam folder and see if it happens again.
Comment 6 Josh Robak 2024-07-30 18:13:55 UTC
UPDATE2: I have mostly good news and some bad news. The good news is that excluding my ~/.steam and ~/.local/share/Steam folder seemed to resolve the constant high load writing that I was seeing before.

The only bad news is that Baloo will still write to the disk frequently, and alternates between indexing and idling indefinitely. Granted it's not anywhere near as bad as before, but after disabling the indexing of hidden files and folders, everything goes back to normal. No more random constant writing to the disk at all. So it's probably another unknown that would need to be excluded. If there's any additional folders I should blacklist, please let me know. Otherwise, I am not as concerned about it since it doesn't look like a bug.

Nevertheless, I can't be absolutely sure if the main problem is a bug with Baloo, or if Steam is just particularly bad for indexing. It certainly seems to be in a loop whenever it happens though. So I will keep this report open for the time being. I want to hear from others before I come to more of a conclusion.
Comment 7 Bug Janitor Service 2024-08-14 03:46:55 UTC
🐛🧹 ⚠️ This bug has been in NEEDSINFO status with no change for at least 15 days. Please provide the requested information, then set the bug status to REPORTED. If there is no change for at least 30 days, it will be automatically closed as RESOLVED WORKSFORME.

For more information about our bug triaging procedures, please read https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging.

Thank you for helping us make KDE software even better for everyone!
Comment 8 tagwerk19 2024-08-16 13:17:37 UTC
(In reply to robakjoshua from comment #6)
> UPDATE2: I have mostly good news and some bad news. The good news is that
> excluding my ~/.steam and ~/.local/share/Steam folder seemed to resolve the
> constant high load writing that I was seeing before ...
I think Steam (and Wine) are a bit of a Terra Incognita for Baloo: they can drop large structures of files onto the disk, possibly including ones that Baloo does not know how to decode (or avoid!).

> ... The only bad news is that Baloo will still write to the disk frequently, and
> alternates between indexing and idling indefinitely...
That could be considered normal. As an example, if you are indexing a log file, Baloo will jump in and reindex it when it changes. Or if you are creating temporary/work files, Baloo might be seeing those...

> ... Granted it's not
> anywhere near as bad as before, but after disabling the indexing of hidden
> files and folders, everything goes back to normal ...
Sounds good then...

> ... No more random constant
> writing to the disk at all. So it's probably another unknown that would need
> to be excluded. If there's any additional folders I should blacklist, please
> let me know. Otherwise, I am not as concerned about it since it doesn't look
> like a bug.
I think you have to keep watch on the "balooctl6 monitor" output or set up logging (so you can then look through the journal). If you find a particular file (or file type) repeatedly popping up, we can look deeper....
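
One way to spot a file that keeps popping up is to count how often each path appears in captured monitor output. The sketch below assumes monitor lines look like "Indexing: <path>" with an optional trailing ": Ok". That format is an assumption and may differ between Baloo versions. The sample lines, paths and function names are hypothetical.

```python
from collections import Counter
from typing import Iterable

PREFIX = "Indexing:"  # assumed prefix of a monitor line


def count_indexed_paths(lines: Iterable[str]) -> Counter:
    """Count how many times each path appears in monitor-style output."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith(PREFIX):
            continue
        path = line[len(PREFIX):].strip()
        # Drop a trailing status such as ": Ok" if present (assumed format).
        path = path.removesuffix(": Ok")
        counts[path] += 1
    return counts


def repeated_paths(counts: Counter, threshold: int = 3) -> list[str]:
    """Return paths seen at least `threshold` times, most frequent first."""
    return [path for path, n in counts.most_common() if n >= threshold]


# Hypothetical captured output; real lines would come from a saved log.
sample = [
    "Indexing: /home/user/.local/share/Steam/logs/a.txt: Ok",
    "Indexing: /home/user/Documents/Notes.txt: Ok",
    "Indexing: /home/user/.local/share/Steam/logs/a.txt: Ok",
    "Indexing: /home/user/.local/share/Steam/logs/a.txt: Ok",
]
counts = count_indexed_paths(sample)
print(repeated_paths(counts))
```

Any path that crosses the threshold is a candidate for a closer look or for a folder exclusion.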

> Nevertheless, I can't be absolutely sure if the main problem is a bug with
> Baloo, or if Steam is just particularly bad for indexing. It certainly seems
> to be in a loop whenever it happens though. So I will keep this report open
> for the time being. I want to hear from others before I come to more of a
> conclusion.
That's fine, I'll set back to "Reported" for the time being.
Comment 9 Josh Robak 2024-08-16 19:50:15 UTC
UPDATE3: After some time to reflect, I don't believe it's a bug, but simply a potential consequence of enabling a feature. In addition, I discovered that some flatpaks installed on my system contributed to the latter problem of my last update, so excluding my ~/.var helped reduce writes a bit as well.

(In reply to tagwerk19 from comment #8)
> (In reply to robakjoshua from comment #6)
> > ... No more random constant
> > writing to the disk at all. So it's probably another unknown that would need
> > to be excluded. If there's any additional folders I should blacklist, please
> > let me know. Otherwise, I am not as concerned about it since it doesn't look
> > like a bug.
> I think you have to keep watch on the "balooctl6 monitor" output or set up
> logging (so you can then look through the journal). If you find a particular
> file (or file type) repeatedly popping up, we can look deeper....

Steam (and indirectly WINE, as you pointed out) was the main culprit; however, checking the logs, I didn't see anything unusual. It just seemed to be one or more bad files that caused Baloo to freak out. Regardless, after updating my system multiple times, I haven't been able to reproduce the main issue. It might've just been a particular version of Baloo that caused it and has since been patched.

At this point, I don't know if there's anything more that I can add. I'm going to mark this as resolved since I don't want to waste any more time from the developers. Thank you to you all who helped.
Comment 10 tagwerk19 2024-08-16 20:12:04 UTC
(In reply to robakjoshua from comment #9)
> ... In addition, I discovered
> that some flatpaks installed on my system contributed to the latter problem
> of my last update, so excluding my ~/.var helped reduce writes a bit as well ...
That's something very useful to know... 

> At this point, I don't know if there's anything more that I can add. I'm
> going to mark this as resolved since I don't want to waste any more time
> from the developers. Thank you to you all who helped.
And thank you for taking the time and effort to troubleshoot. It is appreciated.