With File Search set to index file names and contents, baloo_file_extractor consumed close to 20GB of RAM while I was extracting a 50GB tgz containing photos. I don't think Baloo should ever consume that much RAM. My whole PC became sluggish because the OS started to swap.
Created attachment 171269: baloo_file_extractor high memory usage
(In reply to Andrea Ippolito from comment #0)
> ... I don't think baloo should ever consume that much RAM ...

Definitely not.

There are protections against this, but these assume you are running on a system with systemd. You can check with:

    $ systemctl --user status kde-baloo

which ought to show a line like:

    Memory: 1008.0K (high: 512.0M available: 511.0M)

This shows the limit (512 MB) and how much memory Baloo is using.

What format are the photos? JPGs and PNGs ought to be perfectly fine. PDFs may possibly be an issue.

As far as I know, Baloo doesn't index within .tgz files (it doesn't unpack the archive to index the contents). It would be a problem if it did that ....
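For a check that is easier to script than reading the `status` output, the limit can also be read as a unit property. This is a minimal sketch, not part of Baloo: `describe_memory_high` is a hypothetical helper name, and it assumes that `systemctl show` prints `MemoryHigh=infinity` when no limit is configured and a byte count otherwise.

```shell
#!/bin/sh
# Interpret one "MemoryHigh=<value>" line as printed by `systemctl show`.
# describe_memory_high is a hypothetical helper, not a Baloo or systemd tool.
describe_memory_high() {
    value="${1#MemoryHigh=}"    # strip the property name, keep the value
    case "$value" in
        "")        echo "could not read MemoryHigh" ;;
        infinity)  echo "no MemoryHigh limit set" ;;
        *[!0-9]*)  echo "unrecognised value: $value" ;;
        *)         echo "MemoryHigh limit: $((value / 1048576)) MiB" ;;
    esac
}

# Query the running user unit only if systemctl is available.
if command -v systemctl >/dev/null 2>&1; then
    describe_memory_high "$(systemctl --user show kde-baloo -p MemoryHigh 2>/dev/null)"
fi
```

If this reports that no limit is set, the unit is running without the memory cap described above.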
(In reply to tagwerk19 from comment #2)
> What format are the photos? JPGs and PNG ought to be perfectly fine. PDFs
> may possibly be an issue.
>
> As far as I know, Baloo doesn't index within .tgz files (doesn't unpack the
> archive to index the contents). It would be a problem if it did that ....

Hi, the photos were simple JPGs. I was extracting a tar.gz archive I got from Google's Takeout service.

I take out my entire Google Photos collection every couple of months by downloading it as tar.gz (actually, several of them, each capped at 50GB). Then I uncompress these archives and upload the contents to a backup location. The problem happens while uncompressing these (admittedly large) archives.

I think that my File Search settings were set to index "file name and contents". Since noticing this issue I've skipped indexing contents, but RAM usage was still way too high for what it does, IIRC. Anyway, that's not really the main point of this report: this is about 18GB of RAM being used while "file name and contents" indexing is enabled.
Thanks.

As for the command you suggested, I don't see any memory stats reported:

```
● kde-baloo.service - Baloo File Indexer Daemon
     Loaded: loaded (/usr/lib/systemd/user/kde-baloo.service; disabled; preset: disabled)
     Active: active (running) since Wed 2024-07-03 09:10:36 CEST; 22min ago
    Process: 2007 ExecCondition=/usr/bin/kde-systemd-start-condition --condition baloofilerc:Basic Settings:Indexing-Enabled:true (code=exited, status=0/SU>
   Main PID: 2010 (baloo_file)
      Tasks: 2 (limit: 37605)
        CPU: 7.555s
     CGroup: /user.slice/user-1000.slice/user@1000.service/background.slice/kde-baloo.service
             └─2010 /usr/libexec/kf6/baloo_file

Jul 03 09:10:36 andromeda systemd[1774]: Starting Baloo File Indexer Daemon...
Jul 03 09:10:36 andromeda systemd[1774]: Started Baloo File Indexer Daemon.
Jul 03 09:10:37 andromeda baloo_file[2010]: qt.dbus.integration: QDBusConnection: name 'org.freedesktop.UDisks2' had owner '' but we thought it was ':1.31'
Jul 03 09:10:37 andromeda baloo_file[2010]: qt.dbus.integration: QDBusConnection: name 'org.freedesktop.UPower' had owner '' but we thought it was ':1.35'
```
(In reply to Andrea Ippolito from comment #3)
> ... As for the command you suggested, I don't see any memory stats reported ...

It's possible that you are running an older distro. The KF5 patch was here:

    https://invent.kde.org/frameworks/baloo/-/merge_requests/124

in April 2023. The KF6 patch was earlier...

However, you can add the memory caps yourself with:

    $ systemctl --user edit kde-baloo

and make sure the override file contains:

    [Service]
    MemoryHigh=25%

You can adjust the value; I find 25% OK, although in some situations I set it to 40%.

A further option is to add:

    MemorySwapMax=0

which forcefully prevents Baloo from swapping (but then, if it hits the limit, it will crash with an Out Of Memory).

I think if you have an older distro (dependent on how old), you might have missed a patch that fixes mimetype lookups reading too much of the file. This is an "unreliable memory", as I'm not sure I can find the patch or its description, but I think that earlier, if you wanted to find out the mimetype of a file (one that depended on magic values within the file), the whole file was read. That's something that might explain your findings...
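The override described above can also be written without the interactive editor. This is a minimal sketch under stated assumptions: `write_baloo_override` is a hypothetical helper name, and the drop-in location shown in the usage comments (`~/.config/systemd/user/kde-baloo.service.d/override.conf`) is the usual per-user path for such overrides rather than something confirmed in this report.

```shell
#!/bin/sh
# Write a systemd drop-in that caps Baloo's memory.
# write_baloo_override is a hypothetical helper, not part of Baloo or systemd.
#   $1: drop-in directory
#   $2: MemoryHigh value, e.g. 25% or 512M (defaults to 25%)
write_baloo_override() {
    dir="$1"
    limit="${2:-25%}"
    mkdir -p "$dir" || return 1
    # Same two lines as the override file contents quoted above.
    printf '[Service]\nMemoryHigh=%s\n' "$limit" > "$dir/override.conf"
}

# Example usage (left commented out so nothing is changed by accident):
#   write_baloo_override "$HOME/.config/systemd/user/kde-baloo.service.d" "25%"
#   systemctl --user daemon-reload
#   systemctl --user restart kde-baloo
#   systemctl --user show kde-baloo -p MemoryHigh
```

After the reload and restart, the last command should report the configured limit rather than `infinity` if the override was picked up.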