SUMMARY:

Baloo's index size and memory usage can balloon when running "balooctl status" while baloo is handling file deletions.

STEPS TO REPRODUCE:

Create a temporary folder and create 50000 one-line files in it:

    mkdir ~/Testdir
    cd ~/Testdir
    for i in {1..50000}; do echo "This is file $i" > file$i.txt; done

Baloo will need a while to index these. Watch with

    balooctl monitor

in one window and check the count of indexed files with

    balooctl status

It's quite likely that creating so many files so quickly hits the inotify "event limit" and baloo doesn't get told of all the new files. Run

    balooctl check

to get it to look for any files it's missed.

Keep an eye on the index size and the memory used by baloo_file as shown by htop. There's nothing remarkable.

Remove the test folder

    rm -r ~/Testdir

Keep watching the index size and the memory used by baloo_file. Still no sign of anything untoward...

Run

    balooctl status; balooctl indexSize

a few times and ...

OBSERVED RESULTS:

... "balooctl status" takes a considerable amount of time to respond.

The "File Size" reported by "balooctl indexSize" increases quite dramatically (while the "Used" count is dropping slowly).

"balooctl monitor" does not report file deletions.

The memory used (MEM%) by baloo_file, as shown by htop, increases similarly, more or less in line with the "File Size".

For my tests, the initial "File Size" was 50 MByte with "Used" 28 MByte. After several runs of "balooctl status" while baloo was dealing with file deletions, "File Size" had risen to 2.8 GByte.

    Baloo File Indexer is running
    Indexer state: Idle
    Total files indexed: 29,476
    Files waiting for content indexing: 0
    Files failed to index: 0
    Current size of index is 2.98 GiB

    File Size: 2.98 GiB
    Used:      16.07 MiB

MEM% was also shown as 2.8 GByte, or about 75% of total memory (in a 4 GByte machine).

The guesswork here is that "balooctl status" is counting the indexed files and is locking the db so that writes don't change the number. However, the process of deleting entries continues and the changes are appended to the DB. This seems strange and better explanations are welcome 8-/

EXPECTED RESULTS:

Baloo maintains a count of indexed files and "balooctl status" can show it without needing to lock the DB and count the entries.

"balooctl monitor" should probably show files as they are deleted (as a bit of reassurance that something is happening).

SOFTWARE/OS VERSIONS:

Checked on Neon Unstable...

    Plasma: 5.22.80
    Frameworks: 5.83.0
    Qt: 5.15.2

ADDITIONAL INFORMATION:

Once baloo_file memory usage has gone up, it does not drop down again. You need to restart baloo.
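The growth described above could be logged rather than watched by eye in htop. The following is a minimal sketch, not part of the original report: the function names are hypothetical, the index path ~/.local/share/baloo/index is the one that appears later in this thread, and it assumes GNU stat and a procps-style ps.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for logging index file size and baloo_file memory.

# Print a file's size in MiB (two decimals); prints 0.00 if it is missing.
size_mib() {
    local bytes
    bytes=$(stat -c %s "$1" 2>/dev/null || echo 0)
    LC_ALL=C awk -v b="$bytes" 'BEGIN { printf "%.2f\n", b / 1048576 }'
}

# Resident memory of baloo_file in MiB; prints 0.00 if it is not running.
baloo_rss_mib() {
    ps -C baloo_file -o rss= 2>/dev/null |
        LC_ALL=C awk '{ kib += $1 } END { printf "%.2f\n", kib / 1024 }'
}

# Log "time, index size, resident memory" SAMPLES times, INTERVAL seconds apart.
watch_baloo() {
    local index=${1:-$HOME/.local/share/baloo/index}
    local samples=${2:-10} interval=${3:-5} n
    for ((n = 0; n < samples; n++)); do
        printf '%s index=%s MiB rss=%s MiB\n' \
            "$(date +%T)" "$(size_mib "$index")" "$(baloo_rss_mib)"
        sleep "$interval"
    done
}
```

Running something like "watch_baloo ~/.local/share/baloo/index 60 5" in one terminal while repeating "balooctl status" in another should show whether the two figures rise together.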
(In reply to tagwerk19 from comment #0)
> It's quite likely that creating so many files so quickly hits the
> inotify "event limit" and baloo doesn't get told of all the new files.

Creating the test files via a script can give you a

    kf.baloo: Inotify - too many event - Overflowed

and you need to run "balooctl check" when the script has finished to find the rest of the newly created files.

It is also possible to get an "Overflowed" message when deleting files. When this happens, baloo stops removing deleted entries and a "balooctl check" does not resolve the situation.

EXPECTED RESULTS:

Ideally, if baloo receives an "inotify overflow", it should queue up a "balooctl check".

"balooctl check" should recognise files that no longer exist in the filesystem and remove the index entries.
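For reference, the kernel's inotify queue limit can be read from /proc. The sketch below only prints the limits; the function name is hypothetical, and whether raising max_queued_events would avoid baloo's "Overflowed" message is an assumption that has not been tested in this report.

```shell
#!/usr/bin/env bash
# Print the kernel's inotify limits; the directory argument is for illustration
# and defaults to the standard Linux location.
inotify_limits() {
    local dir=${1:-/proc/sys/fs/inotify} name
    for name in max_queued_events max_user_instances max_user_watches; do
        if [ -r "$dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s=unknown\n' "$name"
        fi
    done
}

inotify_limits
```

An inotify queue overflows when more than max_queued_events events are waiting to be read, which fits a burst of 50000 file creations or deletions.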
tagwerk, can you show the output of "balooctl indexSize"? For me it currently is:

    % balooctl indexSize
    File Size: 8,12 GiB
    Used:      77,17 MiB

               PostingDB:       1,36 GiB  1801.974 %
              PositionDB:       1,68 GiB  2228.958 %
                DocTerms:     784,75 MiB  1016.891 %
        DocFilenameTerms:      68,79 MiB    89.137 %
           DocXattrTerms:       4,00 KiB     0.005 %
                  IdTree:      17,55 MiB    22.742 %
              IdFileName:      77,23 MiB   100.071 %
                 DocTime:      51,05 MiB    66.157 %
                 DocData:      38,66 MiB    50.101 %
       ContentIndexingDB:       9,32 MiB    12.077 %
             FailedIdsDB:            0 B     0.000 %
                 MTimeDB:      15,06 MiB    19.518 %

I do not claim I understand the output though. 2228% of what? Why 77 MiB used?
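A quick arithmetic check suggests the percentage column is each database's size divided by the "Used" figure. This is an inference from the numbers in this thread, not something confirmed from the baloo source, and the helper name is hypothetical.

```shell
#!/usr/bin/env bash
# Percentage of PART relative to WHOLE (both in MiB), three decimals.
pct_of() {
    LC_ALL=C awk -v part="$1" -v whole="$2" \
        'BEGIN { printf "%.3f\n", 100 * part / whole }'
}

pct_of 77.23 77.17      # IdFileName vs. Used; reported as 100.071 %
pct_of 1720.32 77.17    # PositionDB (1,68 GiB = 1720.32 MiB) vs. Used; reported as 2228.958 %
```

The results land close to the reported values, with the small differences plausibly due to the sizes being rounded to two decimals in the output.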
(In reply to Martin Steigerwald from comment #2)
> % balooctl indexSize
> File Size: 8,12 GiB
> Used: 77,17 MiB

There's some analysis/discussion/confusion about the percentages here:

    https://bugs.kde.org/show_bug.cgi?id=354636#c10

I think the "used" sizes are believable. I have copied/compressed a test index with

    mdb_copy -n -c index index.new

(from lmdb-utils) and this changes "indexSize" details from:

    File Size: 2,28 GiB
    Used:      18,99 MiB

               PostingDB:       4,89 MiB    25.735 %
              PositionDB:       4,92 MiB    25.921 %
                DocTerms:       2,47 MiB    13.001 %
        DocFilenameTerms:       1,70 MiB     8.969 %
           DocXattrTerms:       4,00 KiB     0.021 %
                  IdTree:     240,00 KiB     1.234 %
              IdFileName:       1,94 MiB    10.204 %
                 DocTime:       1,29 MiB     6.809 %
                 DocData:       1,53 MiB     8.044 %
       ContentIndexingDB:            0 B     0.000 %
             FailedIdsDB:            0 B     0.000 %
                 MTimeDB:      12,00 KiB     0.062 %

to:

    File Size: 19,37 MiB
    Used:      18,99 MiB

               PostingDB:       4,89 MiB    25.735 %
              PositionDB:       4,92 MiB    25.921 %
                DocTerms:       2,47 MiB    13.001 %
        DocFilenameTerms:       1,70 MiB     8.969 %
           DocXattrTerms:       4,00 KiB     0.021 %
                  IdTree:     240,00 KiB     1.234 %
              IdFileName:       1,94 MiB    10.204 %
                 DocTime:       1,29 MiB     6.809 %
                 DocData:       1,53 MiB     8.044 %
       ContentIndexingDB:            0 B     0.000 %
             FailedIdsDB:            0 B     0.000 %
                 MTimeDB:      12,00 KiB     0.062 %

Which points at loads of "empty space" created during the deletions/status.

This is on ext4, after having created 50000 files and deleted 20000. I will try the same on BTRFS.

Whether this helps any...?
It helped, but not as much as with your setup:

    % ~/.local/share/baloo> balooctl indexSize
    File Size: 8,12 GiB
    Used:      79,78 MiB

               PostingDB:       1,36 GiB  1743.970 %
              PositionDB:       1,68 GiB  2157.734 %
                DocTerms:     785,45 MiB   984.552 %
        DocFilenameTerms:      68,79 MiB    86.226 %
           DocXattrTerms:       4,00 KiB     0.005 %
                  IdTree:      17,55 MiB    22.000 %
              IdFileName:      77,23 MiB    96.803 %
                 DocTime:      51,05 MiB    63.996 %
                 DocData:      38,67 MiB    48.475 %
       ContentIndexingDB:       9,29 MiB    11.649 %
             FailedIdsDB:            0 B     0.000 %
                 MTimeDB:      15,06 MiB    18.881 %

    % ~/.local/share/baloo> mdb_copy -n -c index index.new
    % ~/.local/share/baloo> LANG=en ls -lh
    total 13G
    -rw-r--r-- 1 martin martin 8.2G Jun 26 15:21 index
    -rw-r--r-- 1 martin martin 8.0K Jun 26 21:02 index-lock
    -rw-r--r-- 1 martin martin 4.1G Jun 26 21:02 index.new
    % ~/.local/share/baloo> mv index.new index
    % ~/.local/share/baloo> balooctl indexSize
    File Size: 4,08 GiB
    Used:      79,78 MiB

               PostingDB:       1,36 GiB  1743.970 %
              PositionDB:       1,68 GiB  2157.734 %
                DocTerms:     785,45 MiB   984.552 %
        DocFilenameTerms:      68,79 MiB    86.226 %
           DocXattrTerms:       4,00 KiB     0.005 %
                  IdTree:      17,55 MiB    22.000 %
              IdFileName:      77,23 MiB    96.803 %
                 DocTime:      51,05 MiB    63.996 %
                 DocData:      38,67 MiB    48.475 %
       ContentIndexingDB:       9,29 MiB    11.649 %
             FailedIdsDB:            0 B     0.000 %
                 MTimeDB:      15,06 MiB    18.881 %
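The compaction step used in the comments above can be wrapped so that the original index is never touched until the sizes have been compared. This is only a sketch: the function name is hypothetical, the "mdb_copy -n -c" invocation is the one quoted in this thread, and GNU stat is assumed. The thread does not say whether baloo_file was stopped before the files were swapped, so doing that first is an untested precaution.

```shell
#!/usr/bin/env bash
# Compact an LMDB file into "<file>.new" and report both sizes.
# The original is left untouched; swapping the files is a separate, manual step.
compact_copy() {
    local src=$1 dst=$1.new
    command -v mdb_copy >/dev/null ||
        { echo "mdb_copy not found (lmdb-utils)" >&2; return 1; }
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    # -n: the environment is a single file, -c: compact while copying
    mdb_copy -n -c "$src" "$dst" || return 1
    printf 'before: %s bytes\nafter:  %s bytes\n' \
        "$(stat -c %s "$src")" "$(stat -c %s "$dst")"
}
```

For example, "compact_copy ~/.local/share/baloo/index" would leave index.new next to the original, after which the "mv index.new index" step shown above can be done by hand.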
*** Bug 449713 has been marked as a duplicate of this bug. ***
Flagging as Confirmed on the basis of Bug 460460