Bug 428416 - Baloo_file use 100% cpu and 4Gb of RAM for Checking for unindexed files
Summary: Baloo_file use 100% cpu and 4Gb of RAM for Checking for unindexed files
Status: RESOLVED WORKSFORME
Alias: None
Product: frameworks-baloo
Classification: Frameworks and Libraries
Component: Baloo File Daemon (show other bugs)
Version: 5.85.0
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: baloo-bugs-null
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-10-29 09:14 UTC by Carl Schwan
Modified: 2023-12-10 03:45 UTC (History)
4 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
Hotspot Stack Collapsed Cycles of baloo_file (2.22 MB, application/zip)
2021-08-30 10:27 UTC, postix
Details
Hotspot Flamegraph of baloo_file (44.25 KB, image/svg+xml)
2021-08-30 10:28 UTC, postix
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Carl Schwan 2020-10-29 09:14:30 UTC
SUMMARY

Using too much RAM and CPU during "Checking for unindexed files"

STEPS TO REPRODUCE
1. Have many files (1TB SSD almost filled)
2. Baloo has indexed everything
3. Baloo checks for unindexed files

OBSERVED RESULT

Checking for unindexed files takes a long time (already more than an hour) and use 100% CPU and 4GB of RAM according to htop.

EXPECTED RESULT

Don't use too much RAM.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 5.20.1 openSUSE tumbleweed
KDE Frameworks Version: 5.75
Qt Version: 5.15.1

ADDITIONAL INFORMATION

$ balooctl status
Baloo File Indexer is running
Indexer state: Checking for unindexed files
Total files indexed: 9,550,325
Files waiting for content indexing: 0
Files failed to index: 0
Current size of index is 8.84 GiB

$ balooctl monitor
Press ctrl+c to stop monitoring
File indexer is running
Checking for unindexed files
Comment 1 Carl Schwan 2020-10-29 14:16:26 UTC
6 hours later it is still running and using 100% of my CPU.
Comment 2 Carl Schwan 2020-10-29 14:28:51 UTC
Using balooctl to disable the baloo doesn't stop the process, this makes me think that there is an infinite loop.

Attaching gdb to the running process gives me:

#0  0x00007f3f921d3527 in __memmove_avx_unaligned_erms () from /lib64/libc.so.6
#1  0x00007f3f92b246ea in memmove (__len=<optimized out>, __src=<optimized out>, __dest=<optimized out>, __dest=<optimized out>, __src=<optimized out>, __len=<optimized out>) at /usr/include/qt5/QtCore/qvector.h:838
#2  QVector<Baloo::PositionInfo>::insert (before=..., before=..., n=1, t=..., this=0x7f3f8d6818e8) at /usr/include/qt5/QtCore/qvector.h:863
#3  QVector<Baloo::PositionInfo>::insert (x=..., before=..., this=0x7f3f8d6818e8) at /usr/include/qt5/QtCore/qvector.h:235
#4  Baloo::sortedIdInsert<QVector<Baloo::PositionInfo>, Baloo::PositionInfo> (id=..., vec=...) at /usr/src/debug/baloo5-5.75.0-1.1.x86_64/src/engine/idutils.h:90
#5  Baloo::WriteTransaction::commit (this=<optimized out>) at /usr/src/debug/baloo5-5.75.0-1.1.x86_64/src/engine/writetransaction.cpp:299
#6  0x00007f3f92b1883b in Baloo::Transaction::commit (this=this@entry=0x7f3f8d681aa0) at /usr/src/debug/baloo5-5.75.0-1.1.x86_64/src/engine/transaction.cpp:275
#7  0x000055da09973ad3 in Baloo::UnindexedFileIndexer::run (this=0x55da0bba8480) at /usr/src/debug/baloo5-5.75.0-1.1.x86_64/src/file/unindexedfileindexer.cpp:61
#8  0x00007f3f926356f2 in ?? () from /usr/lib64/libQt5Core.so.5
#9  0x00007f3f92631911 in ?? () from /usr/lib64/libQt5Core.so.5
#10 0x00007f3f9158deb1 in start_thread () from /lib64/libpthread.so.0
#11 0x00007f3f9216dccf in clone () from /lib64/libc.so.6


and typing next:

Single stepping until exit from function __memmove_avx_unaligned_erms,
which has no line number information.

Another helpful information is that I actually have two baloo_file process. The second one give me this backtrace when attaching gdb:

#0  0x00007f3f91594082 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f3f9263792b in QWaitCondition::wait(QMutex*, QDeadlineTimer) () from /usr/lib64/libQt5Core.so.5
#2  0x00007f3f92634277 in QThreadPoolPrivate::waitForDone(QDeadlineTimer const&) () from /usr/lib64/libQt5Core.so.5
#3  0x00007f3f92634697 in QThreadPoolPrivate::waitForDone(int) () from /usr/lib64/libQt5Core.so.5
#4  0x00007f3f9263472c in QThreadPool::~QThreadPool() () from /usr/lib64/libQt5Core.so.5
#5  0x000055da0996ac8d in ?? ()
#6  0x000055da0996acd8 in ?? ()
#7  0x000055da09961210 in ?? ()
#8  0x00007f3f92095e0a in __libc_start_main () from /lib64/libc.so.6
#9  0x000055da099616aa in ?? ()

Unfortunally I'm using my distro package and they are optimized so I can't inspect the state of the different variables.
Comment 3 Carl Schwan 2020-11-12 13:24:53 UTC
So i removed my baloo database (~/.local/share/baloo/index) and let baloo run again but I find myself in the same problem again. And need to kill the baloo_file process each time I restart my computer.
Comment 4 postix 2021-08-30 10:27:53 UTC
Created attachment 141150 [details]
Hotspot Stack Collapsed Cycles  of baloo_file

I have experienced the same today, whereby Baloo has used 100% of one core for twenty minutes or so right after the login. During that I used Hotspot to record where /usr/lib64/libexec/baloo_file had spent its time. I hope that helps us further!


SOFTWARE/OS VERSIONS
Operating System: openSUSE Tumbleweed 20210823
KDE Plasma Version: 5.22.4
KDE Frameworks Version: 5.85.0
Qt Version: 5.15.2
Kernel Version: 5.13.12-1-default (64-bit)
Memory: 31.2 GiB of RAM
Comment 5 postix 2021-08-30 10:28:41 UTC
Created attachment 141151 [details]
Hotspot Flamegraph of baloo_file
Comment 6 postix 2021-08-30 11:06:27 UTC
Carl Schwan, at least your stracktrace in #2 is definitely in line with the performance data I have recorded above.

Apparently most or almost all cycles are spent in (libKF5BalooEngine.so.5.85.0)

- Baloo:WriteTransaction:commit() 

--->

(69.8%) aggregated sample costs in 
- void Baloo::sortedIdInsert<...>(QVector<...>&, Baloo::PositionInfo const&)

(29.4%) aggregated sample costs in 
- Baloo::sortedIdInsert<...>(QVector<...>&, unsigned long long const&) 

---> 

(69.8%) aggregated sample costs in 
- QVector<...>::insert(QTypedArrayData<...>::iterator, Baloo::PositionInfo const&)

(29.4%) aggregated sample costs in 
- QVector<...>::insert(QTypedArrayData<...>::iterator, unsigned long long const&) 

---> 

(69.8%) aggregated sample costs in 
- QVector<...>::insert(QTypedArrayData<...>::iterator, int, Baloo::PositionInfo const&)::memmove

(29.4%) aggregated sample costs in 
- QVector<...>::insert(QTypedArrayData<...>::iterator, int, unsigned long long const&)::memmove
Comment 7 tagwerk19 2021-08-30 19:47:11 UTC
(In reply to postix from comment #6)
> Apparently most or almost all cycles are spent in
> (libKF5BalooEngine.so.5.85.0)
> 
> - Baloo:WriteTransaction:commit() 
> 
> --->
> 
> (69.8%) aggregated sample costs in 
> - void Baloo::sortedIdInsert<...>(QVector<...>&, Baloo::PositionInfo const&)
> 
> (29.4%) aggregated sample costs in 
> - Baloo::sortedIdInsert<...>(QVector<...>&, unsigned long long const&) 
Don't mind saying this is somewhat over my head :-)

However, my assumptions (and recollections) are...

    The LMDB index requires (or perhaps works better?) if the records
    written are sorted.

    baloo_file_extractor processes batches of 40 files when doing content
    indexing so there are many, relatively frequent, small transactions.
    I don't think the initial indexing by baloo_file is the same; if
    there are a load of changes then there's one pretty large transaction.
    In the latter case, you would gain by having a "better" algorithm - or
    also by splitting up this indexing into batches.

I'm not sure whether there are any dev's following this thread but, if there are, it would be nice to know if my guesswork is OK.

(In reply to postix from comment #4)
> I have experienced the same today, whereby Baloo has used 100% of one core
> for twenty minutes or so right after the login.
The questions to ask is "what's happened" that means that baloo has to catch up with loads of changes.

Interesting that you got the issue today and not every time you log on and also that it lasted for 20 minutes and stopped.

Find one of the files indexed and try the following...

    stat testfile
    balooshow -x testfile

and

    baloosearch -i filename:testfile

The "stat" would give you the device and inode number of the file. You should see the same numbers listed in the "balooshow -x" results. See:

    https://bugs.kde.org/show_bug.cgi?id=402154#c12

If the device/inode numbers change for a file, baloo will think it is a different file and index it again. You can see this evidenced in the "baloosearch -i" results, you could get multiple results (different ID's; same file)

My guess, as you both say you are using Tumbleweed and openSUSE uses BTRFS (with multiple subvols), is that you are "suddenly" getting a different minor device number and "suddenly" there's a load of "new files" to index :-/
Comment 8 tagwerk19 2023-10-02 19:39:00 UTC
Maybe fixed by:
    https://invent.kde.org/frameworks/baloo/-/merge_requests/148

Previously baloo bundled all "new files to index" information into a single transaction. With very many changes this could hit memory limits, bog down the system, overflow to swap or trigger an OOM.

The change followed the move to impose a 512 Mbyte "MemoryHigh" systemd limit on Baloo. The two changes should avoid the "Using too much RAM and CPU" issue...

There's another fix coming along, in Frameworks 5.111, that should resolve the problem of baloo reindexing BTRFS discs (that affects OpenSUSE more than others)
Comment 9 Stefan Brüns 2023-11-10 15:06:33 UTC
Response required
Comment 10 Bug Janitor Service 2023-11-25 03:45:44 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 11 Bug Janitor Service 2023-12-10 03:45:58 UTC
This bug has been in NEEDSINFO status with no change for at least
30 days. The bug is now closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

Thank you for helping us make KDE software even better for everyone!