Bug 502086

Summary: Defaulting to "scan file content" causes huge CPU load and index filesize
Product: [Frameworks and Libraries] frameworks-baloo
Reporter: Henning <boredsquirrel>
Component: general
Assignee: baloo-bugs-null
Status: CONFIRMED
Severity: major
CC: meven, tagwerk19
Priority: NOR
Version First Reported In: 6.12.0
Target Milestone: ---
Platform: Fedora RPMs
OS: Linux
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Description Henning 2025-03-27 18:52:40 UTC
I think I have found the reason why baloo is so hated by some.

I enabled it, and by default it indexes file content too. Which could be cool, but not by default.

Indexing took over an hour on a modern laptop, and the index file took up 12GB of space, while only 3GB were used for the actual index (no idea how that makes sense).

I then disabled file content indexing, stopped baloo, purged the index and enabled it again. Indexing was then very fast, and the index is small and unproblematic.
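For anyone wanting to repeat those steps from a terminal, here is a minimal sketch. It assumes the balooctl6 subcommands `config set contentIndexing`, `disable`, `purge` and `enable` behave as on a current Plasma 6 install; check `balooctl6 --help` first. The `run` helper and the `DRY_RUN` variable are hypothetical names added only for illustration, and by default the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: rebuild the Baloo index with content indexing turned off.
# DRY_RUN is a hypothetical switch for this example; it defaults to 1,
# so the commands are only printed. Set DRY_RUN=0 to really run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

reindex_names_only() {
    # Assumption: "no" is an accepted value for contentIndexing.
    run balooctl6 config set contentIndexing no
    run balooctl6 disable   # stop the indexer
    run balooctl6 purge     # delete the existing index
    run balooctl6 enable    # start again with the new setting
}

reindex_names_only
```

Running it with `DRY_RUN=0` performs the four steps for real; the old index is deleted, so everything gets indexed again from scratch.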

https://discuss.kde.org/t/baloo-huge-index-file/32102/

Please change the default for this setting. With it enabled baloo causes a ton of issues, while with it disabled baloo would simply be a useful tool.
Comment 1 tagwerk19 2025-03-27 21:51:43 UTC
What's the version of Fedora? Presumably you've got BTRFS disks, but is there anything you think might be "out of the ordinary"? Maybe BTRFS snapshots?

Have you stuck with the Fedora defaults of indexing ~/Documents, ~/Music, ~/Pictures and ~/Videos?
Comment 2 Henning 2025-04-03 08:35:00 UTC
Fedora 41 Kinoite, default BTRFS stuff, no /home snapshots

I only scan home, and excluded a lot of directories that contain junk, like archives etc.

I don't always use these default directories, instead I scanned my entire home and excluded a lot.

In the end the issue is not the amount of files, but the huge index file. Scanning some directories including file content, but by default only names, would be better for most users I think.

I had severe performance issues and the index was pretty big. That was a problem in the past too, high CPU load.
Comment 3 Méven 2025-04-03 16:29:39 UTC
(In reply to Henning from comment #2)
> Fedora 41 Kinoite, default BTRFS stuff, no /home snapshots
> 
> I only scan home, and excluded a lot of directories that contain junk, like
> archives etc.
> 
> I don't always use these default directories, instead I scanned my entire
> home and excluded a lot.
> 
> In the end the issue is not the amount of files, but the huge index file.
> Scanning some directories including file content, but by default only names,
> would be better for most users I think.
> 
> I had severe performance issues and the index was pretty big. That was a
> problem in the past too, high CPU load.

Can you share the result of the commands:

balooctl6 status
balooctl6 failed

After an index purge you can monitor baloo using `balooctl6 monitor`.
The issue is likely a file that causes baloo to misbehave; identifying the files and sharing them would be very helpful.
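A small sketch for gathering the output of both commands in one go, so it can be pasted into the bug report. The function name `collect_baloo_report` and the `BALOOCTL` variable are hypothetical, introduced only for this example; the only Baloo commands used are the two named above.

```shell
#!/bin/sh
# Sketch: print the output of "balooctl6 status" and "balooctl6 failed"
# under labelled headings. BALOOCTL is a hypothetical override so the
# binary name can be changed (for example to balooctl on Plasma 5).
BALOOCTL="${BALOOCTL:-balooctl6}"

collect_baloo_report() {
    for sub in status failed; do
        printf '== %s %s ==\n' "$BALOOCTL" "$sub"
        if command -v "$BALOOCTL" >/dev/null 2>&1; then
            # Capture stderr too, in case the indexer is not running.
            "$BALOOCTL" "$sub" 2>&1
        else
            printf '(%s not found in PATH)\n' "$BALOOCTL"
        fi
        printf '\n'
    done
}
```

Typical use would be `collect_baloo_report > baloo-report.txt`, then attaching the file here.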

Then if you can/want to use heaptrack (https://github.com/KDE/heaptrack) or gdb to actually diagnose the issue, we can help.
Comment 4 tagwerk19 2025-04-20 08:44:45 UTC
(In reply to Henning from comment #2)
> Fedora 41 Kinoite...
Ooo. New territory...

> I only scan home, and excluded a lot of directories that contain junk, like
> archives etc.
Hidden Files and Folders?

Best exclude .cache, .local/share/Trash, .mozilla and .thunderbird (if you have them). I wonder if Kinoite makes a lot of use of .var/app (where Flatpak apps keep their data); try excluding that as well.
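The exclusions can also be added from a terminal. This is a sketch only: it assumes `balooctl6 config add excludeFolders <path>` is available in your version (check `balooctl6 config`), and that Flatpak data lives in ~/.var/app. The function name and `DRY_RUN` are hypothetical helpers for the example. Folders that do not exist are skipped, and by default the commands are just printed.

```shell
#!/bin/sh
# Sketch: exclude some hidden folders from Baloo indexing.
# DRY_RUN is a hypothetical switch; 1 (the default) only prints commands.
DRY_RUN="${DRY_RUN:-1}"

# Usage: exclude_existing_dirs BASE_DIR REL_PATH...
exclude_existing_dirs() {
    base="$1"
    shift
    for rel in "$@"; do
        dir="$base/$rel"
        # Skip folders this user does not have.
        [ -d "$dir" ] || continue
        if [ "$DRY_RUN" = "1" ]; then
            printf 'balooctl6 config add excludeFolders %s\n' "$dir"
        else
            balooctl6 config add excludeFolders "$dir"
        fi
    done
}

exclude_existing_dirs "$HOME" .cache .local/share/Trash .mozilla .thunderbird .var/app
```

The same exclusions can be made in System Settings under File Search; the script is just quicker to repeat after a purge.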

> ... I had severe performance issues and the index was pretty big. That was a
> problem in the past too, high CPU load ...
You can try "systemctl status --user kde-baloo". That should include a "Memory:" line showing how much memory Baloo is allowed. Nowadays it is capped at 512MB, and there have been bugs where that was too small. When it is too small, Baloo has to work a lot harder to sort and clear memory and to reread pages back from the index. It really is a balancing act: you don't want Baloo affecting performance by taking too much RAM, and you don't want it fighting to work in too small a space. Oooo, and you really don't want it swapping, that really affects you.
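If the status output is long, a tiny filter can pull out just that line. This is a sketch; `memory_line` is a hypothetical helper name, and it only assumes that the status text contains a line whose first word is "Memory:".

```shell
#!/bin/sh
# Sketch: read "systemctl status" text on stdin and print only the
# first line that starts with "Memory:", with the indentation removed.
memory_line() {
    awk '$1 == "Memory:" { sub(/^[ \t]+/, ""); print; exit }'
}
```

Use it as `systemctl status --user kde-baloo | memory_line`. If nothing is printed, the unit is probably not running or has a different name on your system.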