Bug 428416

Summary: Baloo_file use 100% cpu and 4Gb of RAM for Checking for unindexed files
Product: frameworks-baloo Reporter: Carl Schwan <carl>
Component: Baloo File DaemonAssignee: baloo-bugs-null
Status: CONFIRMED ---    
Severity: normal CC: nate, postix, tagwerk19
Priority: NOR    
Version: 5.85.0   
Target Milestone: ---   
Platform: Other   
OS: Linux   
Latest Commit: Version Fixed In:
Attachments: Hotspot Stack Collapsed Cycles of baloo_file
Hotspot Flamegraph of baloo_file

Description Carl Schwan 2020-10-29 09:14:30 UTC
SUMMARY

Using too much RAM and CPU during "Checking for unindexed files"

STEPS TO REPRODUCE
1. Have many files (1TB SSD almost filled)
2. Baloo has indexed everything
3. Baloo checks for unindexed files

OBSERVED RESULT

Checking for unindexed files takes a long time (already more than an hour) and use 100% CPU and 4GB of RAM according to htop.

EXPECTED RESULT

Don't use too much RAM.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 5.20.1 openSUSE tumbleweed
KDE Frameworks Version: 5.75
Qt Version: 5.15.1

ADDITIONAL INFORMATION

$ balooctl status
Baloo File Indexer is running
Indexer state: Checking for unindexed files
Total files indexed: 9,550,325
Files waiting for content indexing: 0
Files failed to index: 0
Current size of index is 8.84 GiB

$ balooctl monitor
Press ctrl+c to stop monitoring
File indexer is running
Checking for unindexed files
Comment 1 Carl Schwan 2020-10-29 14:16:26 UTC
6 hours later it is still running and using 100% of my CPU.
Comment 2 Carl Schwan 2020-10-29 14:28:51 UTC
Using balooctl to disable the baloo doesn't stop the process, this makes me think that there is an infinite loop.

Attaching gdb to the running process gives me:

#0  0x00007f3f921d3527 in __memmove_avx_unaligned_erms () from /lib64/libc.so.6
#1  0x00007f3f92b246ea in memmove (__len=<optimized out>, __src=<optimized out>, __dest=<optimized out>, __dest=<optimized out>, __src=<optimized out>, __len=<optimized out>) at /usr/include/qt5/QtCore/qvector.h:838
#2  QVector<Baloo::PositionInfo>::insert (before=..., before=..., n=1, t=..., this=0x7f3f8d6818e8) at /usr/include/qt5/QtCore/qvector.h:863
#3  QVector<Baloo::PositionInfo>::insert (x=..., before=..., this=0x7f3f8d6818e8) at /usr/include/qt5/QtCore/qvector.h:235
#4  Baloo::sortedIdInsert<QVector<Baloo::PositionInfo>, Baloo::PositionInfo> (id=..., vec=...) at /usr/src/debug/baloo5-5.75.0-1.1.x86_64/src/engine/idutils.h:90
#5  Baloo::WriteTransaction::commit (this=<optimized out>) at /usr/src/debug/baloo5-5.75.0-1.1.x86_64/src/engine/writetransaction.cpp:299
#6  0x00007f3f92b1883b in Baloo::Transaction::commit (this=this@entry=0x7f3f8d681aa0) at /usr/src/debug/baloo5-5.75.0-1.1.x86_64/src/engine/transaction.cpp:275
#7  0x000055da09973ad3 in Baloo::UnindexedFileIndexer::run (this=0x55da0bba8480) at /usr/src/debug/baloo5-5.75.0-1.1.x86_64/src/file/unindexedfileindexer.cpp:61
#8  0x00007f3f926356f2 in ?? () from /usr/lib64/libQt5Core.so.5
#9  0x00007f3f92631911 in ?? () from /usr/lib64/libQt5Core.so.5
#10 0x00007f3f9158deb1 in start_thread () from /lib64/libpthread.so.0
#11 0x00007f3f9216dccf in clone () from /lib64/libc.so.6


and typing next:

Single stepping until exit from function __memmove_avx_unaligned_erms,
which has no line number information.

Another helpful information is that I actually have two baloo_file process. The second one give me this backtrace when attaching gdb:

#0  0x00007f3f91594082 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f3f9263792b in QWaitCondition::wait(QMutex*, QDeadlineTimer) () from /usr/lib64/libQt5Core.so.5
#2  0x00007f3f92634277 in QThreadPoolPrivate::waitForDone(QDeadlineTimer const&) () from /usr/lib64/libQt5Core.so.5
#3  0x00007f3f92634697 in QThreadPoolPrivate::waitForDone(int) () from /usr/lib64/libQt5Core.so.5
#4  0x00007f3f9263472c in QThreadPool::~QThreadPool() () from /usr/lib64/libQt5Core.so.5
#5  0x000055da0996ac8d in ?? ()
#6  0x000055da0996acd8 in ?? ()
#7  0x000055da09961210 in ?? ()
#8  0x00007f3f92095e0a in __libc_start_main () from /lib64/libc.so.6
#9  0x000055da099616aa in ?? ()

Unfortunally I'm using my distro package and they are optimized so I can't inspect the state of the different variables.
Comment 3 Carl Schwan 2020-11-12 13:24:53 UTC
So i removed my baloo database (~/.local/share/baloo/index) and let baloo run again but I find myself in the same problem again. And need to kill the baloo_file process each time I restart my computer.
Comment 4 postix 2021-08-30 10:27:53 UTC
Created attachment 141150 [details]
Hotspot Stack Collapsed Cycles  of baloo_file

I have experienced the same today, whereby Baloo has used 100% of one core for twenty minutes or so right after the login. During that I used Hotspot to record where /usr/lib64/libexec/baloo_file had spent its time. I hope that helps us further!


SOFTWARE/OS VERSIONS
Operating System: openSUSE Tumbleweed 20210823
KDE Plasma Version: 5.22.4
KDE Frameworks Version: 5.85.0
Qt Version: 5.15.2
Kernel Version: 5.13.12-1-default (64-bit)
Memory: 31.2 GiB of RAM
Comment 5 postix 2021-08-30 10:28:41 UTC
Created attachment 141151 [details]
Hotspot Flamegraph of baloo_file
Comment 6 postix 2021-08-30 11:06:27 UTC
Carl Schwan, at least your stracktrace in #2 is definitely in line with the performance data I have recorded above.

Apparently most or almost all cycles are spent in (libKF5BalooEngine.so.5.85.0)

- Baloo:WriteTransaction:commit() 

--->

(69.8%) aggregated sample costs in 
- void Baloo::sortedIdInsert<...>(QVector<...>&, Baloo::PositionInfo const&)

(29.4%) aggregated sample costs in 
- Baloo::sortedIdInsert<...>(QVector<...>&, unsigned long long const&) 

---> 

(69.8%) aggregated sample costs in 
- QVector<...>::insert(QTypedArrayData<...>::iterator, Baloo::PositionInfo const&)

(29.4%) aggregated sample costs in 
- QVector<...>::insert(QTypedArrayData<...>::iterator, unsigned long long const&) 

---> 

(69.8%) aggregated sample costs in 
- QVector<...>::insert(QTypedArrayData<...>::iterator, int, Baloo::PositionInfo const&)::memmove

(29.4%) aggregated sample costs in 
- QVector<...>::insert(QTypedArrayData<...>::iterator, int, unsigned long long const&)::memmove
Comment 7 tagwerk19 2021-08-30 19:47:11 UTC
(In reply to postix from comment #6)
> Apparently most or almost all cycles are spent in
> (libKF5BalooEngine.so.5.85.0)
> 
> - Baloo:WriteTransaction:commit() 
> 
> --->
> 
> (69.8%) aggregated sample costs in 
> - void Baloo::sortedIdInsert<...>(QVector<...>&, Baloo::PositionInfo const&)
> 
> (29.4%) aggregated sample costs in 
> - Baloo::sortedIdInsert<...>(QVector<...>&, unsigned long long const&) 
Don't mind saying this is somewhat over my head :-)

However, my assumptions (and recollections) are...

    The LMDB index requires (or perhaps works better?) if the records
    written are sorted.

    baloo_file_extractor processes batches of 40 files when doing content
    indexing so there are many, relatively frequent, small transactions.
    I don't think the initial indexing by baloo_file is the same; if
    there are a load of changes then there's one pretty large transaction.
    In the latter case, you would gain by having a "better" algorithm - or
    also by splitting up this indexing into batches.

I'm not sure whether there are any dev's following this thread but, if there are, it would be nice to know if my guesswork is OK.

(In reply to postix from comment #4)
> I have experienced the same today, whereby Baloo has used 100% of one core
> for twenty minutes or so right after the login.
The questions to ask is "what's happened" that means that baloo has to catch up with loads of changes.

Interesting that you got the issue today and not every time you log on and also that it lasted for 20 minutes and stopped.

Find one of the files indexed and try the following...

    stat testfile
    balooshow -x testfile

and

    baloosearch -i filename:testfile

The "stat" would give you the device and inode number of the file. You should see the same numbers listed in the "balooshow -x" results. See:

    https://bugs.kde.org/show_bug.cgi?id=402154#c12

If the device/inode numbers change for a file, baloo will think it is a different file and index it again. You can see this evidenced in the "baloosearch -i" results, you could get multiple results (different ID's; same file)

My guess, as you both say you are using Tumbleweed and openSUSE uses BTRFS (with multiple subvols), is that you are "suddenly" getting a different minor device number and "suddenly" there's a load of "new files" to index :-/