Bug 384372

Summary: baloo_file_extractor always high CPU usage
Product: [Frameworks and Libraries] frameworks-baloo
Reporter: Guo Yunhe <i>
Component: Baloo File Daemon
Assignee: Pinak Ahuja <pinak.ahuja>
Status: RESOLVED DUPLICATE
Severity: normal
CC: alexlong92, chrisito, heinrich.seebauer, kalomel, nate, ottwolt, rainer
Priority: NOR
Version: 5.37.0
Target Milestone: ---
Platform: openSUSE
OS: Linux
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Description Guo Yunhe 2017-09-05 07:28:52 UTC
After updating to KDE Frameworks 5.37, baloo_file_extractor constantly uses about 12% of total CPU and causes the computer to overheat.
Comment 1 Nate Graham 2017-09-05 19:17:22 UTC
Just while indexing, or forever? Does it go away if you restart it (`balooctl restart`)? Does it start immediately on startup, every time?
Comment 2 Guo Yunhe 2017-09-05 19:33:09 UTC
The high CPU usage lasts until I shut down my computer at night, so at least longer than 10 hours.

Restarting it doesn't reduce the CPU usage. It takes up 12% CPU immediately after restarting.

Every time I restart my system, baloo_file_extractor uses the same amount of CPU once I have been on the desktop for a while (maybe one to three minutes?).
Comment 3 Rainer Finke 2017-11-25 09:06:12 UTC
I have the same issue. baloo_file_extractor consumes 100% of one of my CPU cores. If I kill the service and restart it, it consumes 100% again. But this issue is new to me, maybe since Frameworks 5.40 or some other recent change.

Arch Linux
KDE Frameworks 5.40
KDE Applications 17.11.80
Plasma 5.11.3
Qt 5.10 beta 4 (compiled against 5.9.2)
Comment 4 Christoph Roick 2017-12-01 22:19:43 UTC
(In reply to Rainer Finke from comment #3)
> I have the same issue. baloo_file_extractor consumes 100% of one of my CPU
> cores. If I kill the service and restart it, it consumes 100% again. But this
> issue is new to me, maybe since Frameworks 5.40 or some other recent change.

I can confirm. It started doing that just recently.
Arch Linux
KDE Frameworks 5.40
Plasma 5.11.4
Comment 5 Heinrich Seebauer 2018-01-17 09:58:42 UTC
Same problem here. I wonder why no one can give a working fix for it, aside from removing its service completely (at least so it seems to me).
The baloo_file_extractor process consumes a single core completely (100%). Thankfully this thing is not multi-threaded, otherwise it might take up all available cores. baloo_file_extractor spins up when the system starts (e.g. from hibernation) and never goes down until killed manually.

A lot of people seem to struggle with this issue. I wonder how many cpu cores on this planet are senselessly sucking up electricity, executing the indexer, and producing nothing usable but heat.

configuration:
openSUSE Leap 42.3
KDE (plasmashell -v yields) 5.8.7
baloo 5.32.0-1.3-x86_64

There are some real-world consequences besides the annoying sound of the fans spinning up and down to cool this mess. Ever thought about the CO2 footprint?
Comment 6 Christoph Feck 2018-01-31 00:24:49 UTC
It probably depends on the files it tries to index. I think there is a way to monitor which files it currently reads, but I am no expert with baloo.
Comment 7 Michael Heidelbach 2018-02-01 11:15:43 UTC
I agree with Christoph: there's probably a file somewhere that baloo_file_extractor can't handle.

Try this to find it:
$ balooctl stop

Ensure that neither baloo_file nor baloo_file_extractor is running.

On a second command line enter
$ balooctl monitor (do not hit return yet)

On the first command line
$ balooctl start (hit return)

As quickly as possible hit return on the second command line

With a little luck balooctl monitor will report the files currently indexed.
The last reported file might be the culprit. Examine it and report your findings please.

In case this does not work, you can try balooctl disable/enable instead of stop/start. BEWARE: this will rebuild your database from scratch, which is time-consuming and maybe not what you want to do.
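
The stop / start / monitor sequence above can also be scripted, so that no quick typing is needed. This is only a sketch: `watch_indexing` and `WAIT_SECS` are names made up for this example, it assumes `pgrep` is installed, and it uses only the `balooctl` subcommands already mentioned in this comment.

```shell
# Sketch only: automates the stop / start / monitor sequence.
# WAIT_SECS is a hypothetical knob introduced for this example.
WAIT_SECS="${WAIT_SECS:-30}"

watch_indexing() {
    balooctl stop

    # pgrep matches the pattern as a substring of the process name,
    # so "baloo_file" also covers baloo_file_extractor.
    waited=0
    while pgrep baloo_file >/dev/null; do
        if [ "$waited" -ge "$WAIT_SECS" ]; then
            echo "baloo processes still running after ${WAIT_SECS}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done

    balooctl start
    # Start the monitor right after the daemon. The last file it reports
    # before the CPU usage gets stuck is the one to examine.
    balooctl monitor
}
```

Calling `watch_indexing` in a terminal should then print the monitor output until it is interrupted with Ctrl+C. If the processes do not exit within `WAIT_SECS` seconds, the function gives up instead of starting a second instance.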
Comment 8 Heinrich Seebauer 2018-02-01 15:27:43 UTC
(In reply to Michael Heidelbach from comment #7)
Thanks, Michael, for your advice.
I assume the command 'balooctl stop' is intended to stop baloo_file_extractor in the first place.
> Try this to find it:
> $ balooctl stop
> 
> ensure neither baloo_file nor baloo_file_extractor are running

baloo_file_extractor neither stops nor shows any other reaction after 'balooctl stop'; baloo_file_search is not running. I guess that trying the remaining steps makes no sense then.

Or may I kill baloo_file_extractor manually?

Killing the process(es) manually lets balooctl report the dying service: 'Baloo died'.
I killed the baloo_file_extractor process, deleted the index and index-lock files from ~/.local/share/baloo, and then ran 'balooctl start'. The file extractor showed up in the task list with 13% CPU (a full core on an 8-core machine).

After some time, at the terminal where I started balooctl, a message was printed

org.kde.baloo: true "/org/kde/fstab///SWIDC010/MyD"
org.kde.baloo: true "/org/kde/fstab///192.168.2.6/userhome/heinrich"

which obviously refers to two mounted CIFS drives that are (by default?) excluded from search. The above message is printed repeatedly from time to time.

My home directory is about 32GB in size, and after about 45 min of shuffling around, the index has grown to roughly 1.4GB.

It seems to have finished for now: there is a baloo_file process, baloo_file_extractor is gone, and the index size is 1.5 GB.

For the first time since installing 42.3 the indexer remains quiet. Could it have been the deletion of the index files? What about the newly created index, sized 1.5G for some 30GB of user data: is that to be expected, or is that just too big?

Regards
Heinrich
Comment 9 Michael Heidelbach 2018-02-01 18:18:33 UTC
@Heinrich Seebauer: Problem solved, I think.
Comment 10 Nate Graham 2018-02-01 18:20:55 UTC
Great! Should we dupe this to a bug that tracks baloo hanging and burning CPU time when attempting to index certain files, or use this to track that?

Or is it another bug in play?
Comment 11 Michael Heidelbach 2018-02-01 18:33:08 UTC
Deleting ~/.local/share/baloo/index* and restarting baloo is essentially the same as 
  $ balooctl disable
  $ balooctl enable

After that the database is rebuilt from scratch.
As reported in Comment 8 this sometimes solves the problem.

But, if we don't know the file that caused the trouble, we can't learn from that. :)

@Nate: I think it's a dupe.
Comment 12 Nate Graham 2018-02-01 18:35:55 UTC
Great, of which bug?
Comment 13 Michael Heidelbach 2018-02-01 18:47:20 UTC
(In reply to Nate Graham from comment #12)
> Great, of which bug?

BUG:380456 Comment 2 looks similar. But please don't mark it as duplicate.
I need to read the bugs (and baloo's code) thoroughly before I can say anything about it.
I'd really like to get my hands on some files that cause trouble. Maybe baloo is just choking.
Comment 14 Heinrich Seebauer 2018-02-02 14:31:35 UTC
(In reply to Michael Heidelbach from comment #13)
If it's a file that ultimately causes the indexer's behaviour, shouldn't I expect it to occur again when I start indexing (i.e. call 'balooctl start')?

In my case it did not run endlessly anymore after deleting the index.
I would conclude that, having not changed anything in the setup or the indexed data sets, it might take more than a single file to provoke this behaviour.

If the indexer runs amok again, I will compile a debug build and attach gdb to the running process. Maybe one could then tell where it's looping indefinitely.
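
Attaching gdb as described could look roughly like this. A sketch only: `extractor_backtrace` is a name invented here, it assumes `pgrep` and `gdb` are installed, and the output is only useful when debug symbols for Baloo are available.

```shell
# Sketch: dump backtraces of all threads of the running extractor.
# Linux cuts process names to 15 characters, hence the shortened pattern.
extractor_backtrace() {
    pid=$(pgrep -o baloo_file_extr) || {
        echo "baloo_file_extractor is not running" >&2
        return 1
    }
    # -batch makes gdb detach and exit after running the -ex command.
    gdb -p "$pid" -batch -ex "thread apply all bt"
}
```

Running `extractor_backtrace` a few times in a row and comparing the top frames might show whether the process is stuck in the same place.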

Thanks for your kind support
Heinrich
Comment 15 kalomel 2018-02-11 18:06:13 UTC
There are already a number of resource limitations in place for baloo, but I wonder if they cover all conceivable situations: when one core of a multicore system is more or less idle and baloo happens to run on that core, is it possible that nothing really limits baloo's CPU consumption?

As far as I understand, *nice* is without effect in such situations, and baloo will get as much CPU time as it wants, if necessary 100% with all the annoying side effects (heat, fan speed going up, noise).

If that conclusion is right, wouldn't setrlimit() be a remedy?
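
For illustration, the shell builtin `ulimit -t` sets the same limit as setrlimit() with RLIMIT_CPU. Note that this limit caps the total CPU seconds a process may consume rather than a percentage, so it would terminate a runaway process instead of throttling it. The sketch below is not Baloo code; `run_with_cpu_limit` is a hypothetical helper for this example.

```shell
# Sketch: run a command under a CPU-time limit (RLIMIT_CPU via ulimit -t).
# The limit counts total CPU seconds, not a percentage. The kernel sends
# SIGXCPU at the soft limit and SIGKILL at the hard limit.
run_with_cpu_limit() {
    limit_secs=$1
    shift
    # Use a subshell so the limit does not stick to the calling shell.
    ( ulimit -t "$limit_secs" && exec "$@" )
}
```

For example, `run_with_cpu_limit 1 sh -c 'while :; do :; done'` should be killed after about one second of CPU time. As far as I know, baloo_file_extractor is spawned by baloo_file rather than started by hand, so the wrapper only demonstrates the mechanism.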
Comment 16 kalomel 2018-02-11 21:09:09 UTC
No Baloo expert here either, but if you want to know the file it indexes right now, try it with

$ qdbus org.kde.baloo /fileindexer org.kde.baloo.fileindexer.currentFile
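
To see where the indexer gets stuck, the call above can be polled in a loop. A minimal sketch, assuming `qdbus` is installed and the org.kde.baloo service is running; `current_file`, `watch_current_file` and `INTERVAL` are hypothetical names for this example.

```shell
# Sketch: poll the D-Bus call above and print the file whenever it changes.
# INTERVAL (seconds between polls) is a hypothetical knob for this example.
INTERVAL="${INTERVAL:-2}"

current_file() {
    qdbus org.kde.baloo /fileindexer org.kde.baloo.fileindexer.currentFile
}

watch_current_file() {
    last=""
    # The loop ends as soon as the qdbus call fails (e.g. service gone).
    while cur=$(current_file); do
        if [ "$cur" != "$last" ]; then
            printf '%s %s\n' "$(date +%H:%M:%S)" "$cur"
            last=$cur
        fi
        sleep "$INTERVAL"
    done
}
```

If the same path stays on the last line while the CPU usage remains high, that file is a candidate for closer inspection.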
Comment 17 Rainer Finke 2018-03-25 17:19:59 UTC
Today it happened again: baloorunner consumed 100% of one core of my CPU. For now I have deactivated the search again. Plasma 5.12.3, KDE Frameworks 5.44.
Comment 18 kalomel 2018-05-26 21:14:54 UTC
Don't know if the issues are the same, just for reference:

* An old topic on KDE forum with a whopping 65000 visits, the most recent posts are maybe relevant: https://forum.kde.org/viewtopic.php?f=154&t=120468

* A fix mentioned in this topic: https://phabricator.kde.org/D12335
Comment 19 Nate Graham 2018-05-26 22:29:49 UTC
Yes, it's almost certainly the same issue that was recently fixed. Thanks for following up!

*** This bug has been marked as a duplicate of bug 378754 ***