Bug 333655 - Baloo indexing I/O introduces serious noticeable delays
Summary: Baloo indexing I/O introduces serious noticeable delays
Status: RESOLVED MOVED
Alias: None
Product: frameworks-baloo
Classification: Unclassified
Component: Baloo File Daemon
Version: 5.20.0
Platform: Ubuntu Packages Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Pinak Ahuja
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-04-20 16:30 UTC by Jakob Petsovits
Modified: 2018-11-05 15:22 UTC
39 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
logged output of atop (410.96 KB, application/octet-stream)
2014-04-29 10:34 UTC, Vojtěch Zeisek
Details
top part of output of atop (9.65 KB, application/octet-stream)
2014-04-29 10:36 UTC, Vojtěch Zeisek
Details
Strace of baloo_file when Minecraft is running (11.64 KB, text/plain)
2014-04-30 01:31 UTC, Marcin Śliwiński
Details
strace of baloo_file_indexer (162.36 KB, text/x-log)
2014-05-02 11:04 UTC, Vojtěch Zeisek
Details
strace of baloo_file_indexer (234.17 KB, application/zip)
2014-05-02 11:11 UTC, Vojtěch Zeisek
Details
strace of baloo_file_indexer (217.32 KB, text/x-log)
2014-05-02 18:23 UTC, Vojtěch Zeisek
Details
test case for inducing a lot of write activity of baloo (180 bytes, text/x-sh)
2014-05-14 19:39 UTC, Martin Steigerwald
Details
atop logs with test case from comment #38 with current baloo git (200.06 KB, application/x-xz)
2014-05-15 09:36 UTC, Martin Steigerwald
Details
atop raw log (2.14 MB, image/x-panasonic-raw)
2014-05-15 10:38 UTC, Vojtěch Zeisek
Details
atop raw log with baloo stresser running (1.05 MB, image/x-panasonic-raw)
2014-05-15 18:41 UTC, Vojtěch Zeisek
Details

Description Jakob Petsovits 2014-04-20 16:30:43 UTC
I'm running on a ThinkPad T410, which by now is 4 years old. My laptop has 16 GB of RAM, so that's not much of an issue, and the CPU is still fast enough for lots of things; however, the 5,200 rpm hard drive is the bottleneck of the system. My Linux partition is formatted with ext4.

I upgraded to Kubuntu 14.10 three days ago and had the system running for at least a whole night without interruption of other work being done on my laptop. I explicitly excluded my extensive dev/ (development, repositories) directory in the Desktop Search kconfig module.

Despite this, the system info accessed from the Alt-F2 krunner shows me that Baloo continuously keeps indexing with something between 200 KB/s to 2.4 MB/s, either reading or writing or both. It starts immediately after the KDE desktop is loaded and does not seem to end. On my laptop, disk usage like this makes any disk-related activity (save a file, open a menu or new Firefox tab, run a command whose binary is not yet in the RAM cache) go from a fraction of a second to three or five seconds of loading time. This is not acceptable and forces me to exclude my home directory from being indexed.

Is this Baloo's standard behaviour? Will it ever be done indexing? Here are some ideas I can think of that might be helpful for future releases:

* Add a status indicator in the Desktop Search module, showing statuses like "Done indexing everything", "Waiting to index", "Indexing new files" and "Reindexing". [1]
* Stop indexing as long as any input devices are being used (mouse, keyboard, touch - maybe start again 1 min or so after no input activity has been detected).
* Wait with indexing until a couple of minutes after the desktop is fully functional and the previous session has been restored.

[1] While this is not doing anything to improve performance, it will help me to figure out whether Baloo just needs more time to get through my home directory initially or is a hopeless case that I need to disable. My music collection isn't terribly huge, I've got a multi-gigabyte set of emails in Akonadi but not much more significant data in my home directory except the excluded dev/ folder, so right now I don't see why a full night of indexing opportunities won't be enough to make my system run smoothly afterwards. Hence me not giving Baloo a chance until the next release.

Thanks for working on making things better for lots of people! Let me know if I can provide any information to help with this.
Comment 1 Jakob Petsovits 2014-04-20 16:36:52 UTC
Correction: the hard drive speed is actually 5,400 rpm rather than 5,200.
Comment 2 Ignacio Serantes 2014-04-21 08:36:22 UTC
Same problem for me when I enabled it after upgrading to KDE 4.13.0. Now Baloo is disabled and the system works fine.

Maybe the problem is related to database size; in my case the file index directory is 3.2 GiB and growing, which is a little curious because my Nepomuk Virtuoso DB is only 371.6 MiB.

Maybe this is because I did not blacklist all directories properly, but working with a blacklist system when you have so many directories in your home is totally painful.
Comment 3 Vishesh Handa 2014-04-22 10:08:31 UTC
Hi. Could you please let me know which was the exact process responsible for the high io?
Comment 4 Ignacio Serantes 2014-04-22 10:33:36 UTC
I'm not sure, but maybe it was related to my Calibre library.
Comment 5 Alan Ezust 2014-04-23 15:50:28 UTC
I had a similar problem. 1 TB drive, with 200 GB of mp3 files on one of the partitions, and nothing unusual on the others.

The mp3s were on a different partition which was already on the list of ignored directories in the KDE desktop search settings, so I am not sure if it is the cause of this problem.

The drive light never turns off! There is CONSTANT i/o which doesn't result in high CPU usage, but a lot of "wait%" time, as shown in "top".

Sure, baloo has "nice" settings that keep it from eating too much CPU but that is totally useless when the CPU is not the bottleneck. If baloo constantly runs indexing stuff whenever the CPU is idle, that means the hard drive will always be slow in responding to all requests.
Also, if I add /home to the list of ignored directories, it seems to have no effect. Perhaps that is a different bug. But I was unable to disable baloo without doing stuff from the bash shell.
Comment 6 Piotr 2014-04-23 18:26:40 UTC
I have a hard disk like this:
    SATA; cache 64MB; speed: 7200 rot./min.; size: 1TB; 
1/3 taken by over 300,000 files (a huge number of small text files - welcome to bioinformatics).
After an upgrade to Kubuntu 14.10 LTS Baloo was running wild, what I thought was the initial phase of indexing everything.
A day later I started getting lags on keyboard and mouse actions, and some unidentified process took 100% of one of the four cores this computer has. I added my $HOME to Baloo's blacklist and everything works better; only from time to time do I get bursts of processor usage. But before that the computer had become unusable, reacting far too slowly.
It seems to be the same case as Jakob's.

Previously I had Nepomuk disabled for the very same reason. Can we have the semantic-search off button back, please? Baloo is supposed to improve e-mail search, but I'm using Thunderbird anyway.
Comment 7 Piotr 2014-04-23 18:29:26 UTC
(In reply to comment #6)

> After an upgrade to Kubuntu 14.10 LTS Baloo was running wild, what I thought

I mean 14.04 LTS 64-bit, obviously !!! 
My bad, sorry.
Comment 8 pyrkosz 2014-04-23 18:55:19 UTC
After updating to Kubuntu 14.04 everything became slow. I saw it was the Baloo file indexer, but allowed it to work for two days, thinking that after indexing it would not consume so many resources. After two days I gave up. And there is no GUI switch to disable it. I have blacklisted my home dir and added
[Basic Settings]
Indexing-Enabled=false
to ~/.kde/share/config/baloofilerc.
Now everything is as quick as in previous version. I think this is a bug.

My hardware:
AMD Turion(tm) 64 X2 Mobile Technology TL-62
3 GB RAM
Kingston SSDNow V Series 64GB

lsb_release -rd:
Description: Ubuntu 14.04 LTS
Release: 14.04

apt-cache policy baloo:
baloo:
  Installed: 4:4.13.0-0ubuntu3
  Candidate: 4:4.13.0-0ubuntu3
  Version table:
 *** 4:4.13.0-0ubuntu3 0
        500 http://archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages
        100 /var/lib/dpkg/status
Comment 9 Alan Ezust 2014-04-23 23:44:40 UTC
Suggestion: monitor the "wait%" value reported by top, and if it's over 10%, pause indexing until it's under that value again.
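A minimal sketch of this suggestion (hypothetical, not anything Baloo actually implements): sample the iowait field of /proc/stat twice, one second apart, and derive the iowait percentage the comment talks about. The 10% threshold comes from the comment; the field arithmetic follows the proc(5) layout, and everything else is an assumption.

```shell
#!/bin/sh
# Print the aggregate iowait percentage over a 1-second window.
# /proc/stat "cpu " line: user nice system idle iowait irq softirq steal ...
# so $6 is iowait and $2..$9 approximate total jiffies.
snapshot() { awk '/^cpu /{print $6, $2+$3+$4+$5+$6+$7+$8+$9; exit}' /proc/stat; }

set -- $(snapshot); w1=$1; t1=$2
sleep 1
set -- $(snapshot); w2=$1; t2=$2

pct=$(( 100 * (w2 - w1) / (t2 - t1) ))
echo "iowait: ${pct}%"
# An indexer could pause itself here, per the suggested 10% threshold.
if [ "$pct" -gt 10 ]; then echo "would pause indexing"; fi
```

This measures system-wide iowait, so on a multi-core machine a single saturated disk may still show a modest aggregate percentage; a real implementation would probably want per-device statistics instead.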
Comment 10 Martin Steigerwald 2014-04-24 10:50:32 UTC
(In reply to comment #3)
> Hi. Could you please let me know which was the exact process responsible for
> the high io?

You can use atop for this quite nicely. Start atop, press "d", and paste the header information plus enough of the process table (the first 5 to 10 lines).

You can also generate a report of the three processes generating the most disk utilization over time if you allow atop to gather statistics. It writes a value every 10 minutes which is the average of what happened. Some drivers may erroneously report slightly more than 100% disk usage due to inaccuracies, but the general figures should be good enough, I bet.

Here is an example from my server. I currently do not let atop gather statistics on my laptop, but I bet I'll switch it on there as well. It doesn't do much I/O and the SSDs are likely able to handle that little additional I/O.

mondschein:~> atopsar -D -b 12:00 -e 13:00

mondschein  3.2.0-4-686-pae  #1 SMP Debian 3.2.51-1  i686  2014/04/24

-------------------------- analysis date: 2014/04/24 --------------------------

12:00:02    pid command  dsk% |   pid command  dsk% |   pid command  dsk%_top3_
12:10:02  15063 collectd  72% |  1430 jbd2/dm-   7% | 20784 master     6%
12:20:02  15063 collectd  72% | 20784 master    10% |  1430 jbd2/dm-   6%
12:30:02  15063 collectd  68% | 23754 dovecot    9% |  1434 jbd2/dm-   6%
12:40:02  15063 collectd  74% | 15953 mysqld     7% |  1430 jbd2/dm-   6%


This report goes nicely along with:

mondschein:~> atopsar -d -b 12:00 -e 13:00

mondschein  3.2.0-4-686-pae  #1 SMP Debian 3.2.51-1  i686  2014/04/24

-------------------------- analysis date: 2014/04/24 --------------------------

12:00:02  disk           busy read/s KB/read  writ/s KB/writ avque avserv _dsk_
12:10:02  sda              0%    0.0     6.2     6.1     6.2  35.9   0.16 ms
12:20:02  sda              0%    0.0     7.2     5.2     6.1  29.9   0.30 ms
12:30:02  sda              0%    0.0    19.1     6.6     6.9  37.5   0.16 ms
12:40:02  sda              0%    0.0     4.0     5.3     6.1  48.9   0.15 ms

So here in this case you see the disk utilization does not matter at all :)

In Debian and Kubuntu you can just apt-get install atop.

If you have the process that does most of the I/O you can use

strace -e file -f -o strace.out -p <PID>

to see what it is doing. Beware of attaching strace logs unmodified, though, as they contain private data like paths and file names.
Comment 11 Wolfgang Mader 2014-04-25 11:50:16 UTC
I brought this up on the kde-user mailing list [Subject: baloo_file, baloo_file_extractor in disk sleep] and have been asked to add the info here. So this is a copy-paste of the post from the list.

Whenever baloo is at work on my system, I see baloo_file and baloo_file_extractor in the disk sleep state. This always leads to a KDE GUI freeze. This was obviously worst when baloo was initially shuffling through all my files, but even now, when baloo only needs to update newly changed files (I guess), the GUI hangs and the processes hit disk sleep. My disk is a traditional spinning-rust one, no SSD, but the machine is fairly well powered.

Complete interface hang means
- All running apps, e.g. I can no longer change tabs in rekonq, switch to a different mail in KMail, or switch from KMail to Akregator in Kontact

- KWin effects, e.g. cube animation for desktop switch hangs in the middle of the effect

- Text input into e.g. kwrite. Since I am mostly using KDE software, I am not sure if GTK apps are affected too, but I would guess so, since in my view the busy disk is the cause of all evil.

The mouse pointer is movable all the time, but the system does not respond to clicks.

If you need anything more, please let me know.
Comment 12 Piotr 2014-04-25 12:08:20 UTC
I had all what Wolfgang mentioned, plus:
- switch between KDE activities was slow.
- slowdown on launching any new apps.
Comment 13 Vojtěch Zeisek 2014-04-28 14:55:37 UTC
I have the same performance issue with Baloo (running openSUSE 13.1, 64-bit). I still have a lot of free RAM and CPU, but the disk activity caused by Baloo is causing serious lags almost every minute, so the computer is not usable at all. It doesn't depend on my activity; I see lags when working with KMail or Firefox or LibreOffice or anything else. Even when I do nothing and just listen to music in Amarok, there are lags, so that the music stops for 10 seconds. When I stop the Baloo processes, the computer works as expected. I don't know how to trace what exactly happens, but if someone tells me how, I'll do it.
Comment 14 Vojtěch Zeisek 2014-04-28 18:16:22 UTC
To add iotop output:
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
 7740 idle vojta       0.00 B/s    0.00 B/s  0.00 % 99.99 % baloo_file_extractor 176879 176878 176876 176875 176874 176873 176872 176871 176870 176869
All the time baloo_file_extractor is at 100% I/O usage. The computer is unusable. I tried to let it run overnight (twice) in the hope it would index all files, but it isn't any better...
Comment 15 Martin Steigerwald 2014-04-28 21:22:39 UTC
(In reply to comment #14)

Vojtěch, I gave some hints that may help in comment #10:
https://bugs.kde.org/show_bug.cgi?id=333655#c10

Sounds strange; I thought baloo would throttle indexing somehow, like Nepomuk does these days.
Comment 16 Vojtěch Zeisek 2014-04-29 10:33:58 UTC
(In reply to comment #15)

Hi, thank you for the advice. See the output of "sudo atop -d -D 300 15 > atop_log". At the beginning I tried to work (updating some websites, nothing complicated), but it was nearly impossible, so I let it run over lunch. After returning I killed all baloo processes to be able to work...
Comment 17 Vojtěch Zeisek 2014-04-29 10:34:46 UTC
Created attachment 86340 [details]
logged output of atop
Comment 18 Vojtěch Zeisek 2014-04-29 10:36:19 UTC
Created attachment 86341 [details]
top part of output of atop

It is wide, so view it without line wrapping.
Comment 19 Martin Steigerwald 2014-04-29 11:59:10 UTC
Thanks, so baloo_file_ext is indeed occupying about 100% of a 95% busy disk:

read      10 | write   8861 |              |  KiB/r     77 |              | KiB/w     10

10 read accesses of about 77 KiB average per read, but 8861 write accesses of about 10 KiB average per write. Atop measures in 10-second intervals by default, which makes it about 886 write accesses per second. If those are random, it's no wonder the disk is saturated.

CPU load is not a problem here. Basically, as you already said, the CPU is mainly waiting for disks, cores 4 and 0 here:

cpu007 w  0%
cpu003 w  1%
cpu004 w 84%
cpu002 w 11%
cpu000 w 81%
cpu006 w  0%
cpu005 w  1%

I bet atop marked the disk line in red, but nothing else?

This has been going on for a longer time, I bet?

What do the following commands say?

atopsar -d -b 13:00 -e 14:00
atopsar -D -b 13:00 -e 14:00

Replace the begin and end times with a timeframe where atop has been running in the background and baloo has been thrashing the disk. One or two hours of data should be enough for now.


Next step would be to find out what this process is actually doing.

As a first step I'd use strace -e file -p <PID of process> -o strace.log

But beware: before you upload any of it to this bug report, note that the log may contain private data such as filenames and paths, which I recommend cleaning out first.

As to further investigation ideas I leave this to Vishesh.

Setting this to confirmed as several reporters actually found this issue.
Comment 20 Martin Steigerwald 2014-04-29 12:33:53 UTC
Well thanks also for the atop batch output – I missed it initially. Yeah, it is hogging the disk very heavily.

                   *** system and process activity since boot ***
  PID   TID  RDDSK  WRDSK WCANCL  DSK CMD            
    1     -   3.6G  80.2G 34964K  75% systemd        
 1979     -  23.6G  3848K     0K  21% cron           
 3905     - 360.5M   2.2G     0K   2% baloo_file     
 6848     - 14256K 591.2M     0K   1% baloo_file_ext 
 6123     - 17164K 576.2M     0K   1% baloo_file_ext 
 1023     -     0K 237.8M     0K   0% jbd2/dm-3-8    
 1579     -  1568K 223.9M 111.9M   0% freshclam 

Above is since boot.

Then come values like this:

LVM |      cr_home | busy    100% | read     570 | write  49156 | avio 6.00 ms |
LVM | 200S360G_500 | busy      0% | read      25 | write    152 | avio 0.86 ms |
LVM | ocitac-koren | busy      0% | read      25 | write    134 | avio 0.96 ms |
DSK |          sda | busy    100% | read     549 | write  45960 | avio 6.42 ms |
DSK |          sdb | busy      0% | read      25 | write    152 | avio 0.86 ms |
NET | transport    | tcpi     769 | tcpo     890 | udpi     103 | udpo     115 |
NET | network      | ipi     2473 | ipo     1005 | ipfrw      0 | deliv    872 |
NET | eth0      0% | pcki    3245 | pcko    1004 | si   24 Kbps | so    3 Kbps |
NET | lo      ---- | pcki       1 | pcko       1 | si    0 Kbps | so    0 Kbps |

  PID   TID  RDDSK  WRDSK WCANCL  DSK CMD            
 3905     -  1696K 392.2M     0K  58% baloo_file     
 9556     -  3496K 218.6M     0K  33% baloo_file_ext

The disk is 100% busy, and it's these two baloo processes writing about 600 MiB in 5 seconds. (Timestamps in the atop header indicate a 5-second interval.)

Well, I think this does not need any more proof. To me it seems that Baloo basically runs unthrottled? But that doesn't make that much sense either: it's reading way less than it writes. I definitely believe something is not right there and suggest a strace log as I recommended before. Beware though that strace writing out the log will cause additional disk I/O load, so better use a filter like -e file as I recommended.
Comment 21 Vojtěch Zeisek 2014-04-29 12:46:56 UTC
(In reply to comment #20)

Hello, OK, sorry for the missing strace, but it ended with some error I wasn't able to solve at the time. Even from simple observation: Baloo is running -> computer is terribly laggy and unusable; Baloo is off -> perfect performance. It is a powerful computer, but intensive use of the HDD (I use an HDD for /home and an SSD for the system) by Baloo makes it unusable... :-( I like file indexing and I wish it at least for KDE PIM, but this is not usable...
Comment 22 Martin Steigerwald 2014-04-29 13:11:20 UTC
Ah, now I see: you used "atop" instead of "atopsar" :) atopsar produces easier-to-read reports. But in any case, I think it's pretty clear that baloo processes hog the disk.

Well, yes, Vojtěch, it's clear to me that baloo hogs the disk. But I wonder why. It writes an unusually high amount of data, especially compared to the amount of data it reads. An index of files is smaller than the content of all the files it indexes, so I would expect more read traffic than write traffic; reads also don't stall the kernel the way asynchronous, buffered writes do. That's what the strace log may shed light on. But maybe Vishesh has a better idea than that.
Comment 23 Marcin Śliwiński 2014-04-30 01:31:15 UTC
Created attachment 86358 [details]
Strace of baloo_file when Minecraft is running
Comment 24 Marcin Śliwiński 2014-04-30 01:32:22 UTC
Part of the issue may be caused by activity in the ~/.local/share/baloo/file/ folder when an application that constantly writes data to files is running, or by some other operations Baloo performs on the disk in a different process when a file change occurs.

For example, Minecraft writes user stats files every 2-4 seconds, which triggers a lot of calls to open() in the folder mentioned above. Example values on my system (measured by top; game started but sitting on the menu screen):

Baloo running, game not started - io wait: 0.2, load average: 0.5, no visible disk activity
Baloo suspended, game running - io wait: 0.2 - 0.3, load average: 1.8, no visible disk activity
Baloo resumed, game running - io wait: jumping between 5 and 15, load 1.9 - 2.6, visible disk activity

A few seconds of strace output are attached; it's one cycle of operations. Command: strace -p 21954 -e trace=file -t

These operations may also cause additional load during scanning, as .tmp files like postlist.tmp, termlist.tmp and position.tmp are created and deleted continuously.

Additional information:
mount options: noatime,nodev,nosuid,user_xattr
Distribution: Kubuntu 14.04
CPU: Phenom II X4 945
RAM: 8GB, DDR2
Baloo Version: 4.13.0-0ubuntu3
/ is on SSD
/home, /var and /tmp are on HDD.
Comment 25 Alfonso Castro 2014-05-02 09:17:50 UTC
Similar problem in a system with openSUSE 13.1 (updated from KDE:Current) with 16 GB RAM, 1 SSD 128 GB, 1 HDD 1 TB and an Intel i7.
Comment 26 Vishesh Handa 2014-05-02 09:54:14 UTC
Could everyone please try to change their default io scheduler and see if it makes a difference?

I would recommend using the 'cfq' scheduler instead of deadline, which is the default scheduler in Kubuntu. I'm not sure about OpenSuse.

http://stackoverflow.com/questions/1009577/selecting-a-linux-i-o-scheduler
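For anyone following this suggestion: the active scheduler can be inspected per block device through the standard sysfs interface (the bracketed name is the active one); switching it needs root, e.g. `echo cfq | sudo tee /sys/block/sda/queue/scheduler`, where sda is only an example device name. A small read-only sketch:

```shell
#!/bin/sh
# List the I/O scheduler of every block device via the sysfs block ABI.
# The active scheduler is printed in brackets, e.g. "noop deadline [cfq]".
checked=0
for q in /sys/block/*/queue/scheduler; do
    if [ -r "$q" ]; then
        dev=${q#/sys/block/}; dev=${dev%/queue/scheduler}
        printf '%s: %s\n' "$dev" "$(cat "$q")"
        checked=$((checked + 1))
    fi
done
echo "inspected $checked device(s)"
```

A change made this way is not persistent across reboots; for that, distributions typically use a kernel boot parameter (elevator=) or a udev rule.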
Comment 27 Vojtěch Zeisek 2014-05-02 10:12:19 UTC
cfq is the default kernel scheduler on openSUSE; I have been using it...
Comment 28 Martin Steigerwald 2014-05-02 10:44:14 UTC
(In reply to comment #26)

Vishesh, I think Vojtěch proved that Baloo generates an unusual amount of write I/O, as you can see in my atop log analysis – especially relative to its read I/O. I don't believe a different I/O scheduler will make much of a difference, as a hard disk can only do a certain number of I/O operations per second. Granted, the Linux kernel sometimes is not particularly good at keeping I/O responsive when it is oversaturated, and choosing a different I/O scheduler may help a bit here, yet deadline is also supposed to guarantee a deadline, like CFQ. Still the question remains: why that much write I/O?

Vojtěch, do you have a strace log in the meantime? I think it's important to find out what exactly Baloo is doing here – especially as it does not seem to do this on every system. Also, does Baloo have any protection regarding often-changed files, like Nepomuk got? Will it avoid files still / constantly being written to?
Comment 29 Vojtěch Zeisek 2014-05-02 11:04:32 UTC
Created attachment 86402 [details]
strace of baloo_file_indexer

sudo strace -e file -f -p 15546 -o strace.log
Comment 30 Vojtěch Zeisek 2014-05-02 11:11:35 UTC
Created attachment 86403 [details]
strace of baloo_file_indexer

sudo strace -e file -f -p 14613 -o strace.log
strace was running just about 15 seconds...
Comment 31 Vojtěch Zeisek 2014-05-02 18:23:04 UTC
Created attachment 86412 [details]
strace of baloo_file_indexer

This afternoon openSUSE received an updated Baloo with some minor upstream fixes, which were supposed to throttle it a little bit. Well, the situation is significantly better, but still far from being good... Anyway, I hope this is the last strace log needed.
Comment 32 pier andre 2014-05-03 09:29:00 UTC
Not for me. On my PC (Dell Latitude E6510, 8 GB RAM, GT218 NVS 3100M GPU, i7 Q 720 @ 1.60GHz CPU, openSUSE 13.1, KDE 4.13.0, baloo 4.13.0-7.1)
I used the new advanced baloo configuration tool, and it works fine for starting and stopping baloo indexing. But even after leaving indexing active for a whole night, the laptop is unusable whenever indexing is active. My system monitor ksysguard doesn't show more than 10% baloo CPU usage, but the laptop is unusable nevertheless :-)
Comment 33 Jonathan Verner 2014-05-14 15:26:13 UTC
I seem to be affected by this bug too; the symptoms are very similar to the ones described --- interface lockups (probably due to high disk I/O, no big CPU load).

I tried changing the scheduler to cfq (as suggested in #26), however I did not notice a difference, although it is hard to tell. The problem is that I have no idea how to debug it and what triggers the lockups. It seems correlated with me launching KDevelop. The baloo_file (and baloo_file_extractor) processes usually do a lot of I/O (as reported by atop and iotop).

I am on Kubuntu 12.04 (with backports, KDE 4.13.0); / is on an SSD, and /home with most of my files is on a rather slow rotating hard drive.

Would be very happy to help but have no idea how :-)
Comment 34 lucke 2014-05-14 17:09:14 UTC
Jonathan, could you show us the output of "dstat -c --disk-tps --disk-util -dmgs --top-bio --top-cpu" while you're experiencing lockups?
Comment 35 Martin Steigerwald 2014-05-14 19:03:44 UTC
I saw some write bursts from baloo_file_indexer while building XML-based training slides with xsltproc and fop. I had no time to investigate further, but it seems to me that baloo bursts on currently-written files. Maybe it triggers too quickly on (repeated) file changes. I think I will have a look during my next kernel compile. I will update my baloo build first, though. My git is from May 8th.

Vojtěch, on the first two straces, I see only accesses to baloo indexing data (beside .kde / .local and so on):

grep "/home" strace.log | egrep -v "\.local|\.kde" | less

Especially the second, large, zipped one (28 MiB unzipped). I don't see what files baloo_file_indexer is indexing… it really seems to burst on accessing index data, and I don't really understand why. Did you strip anything from the logs for privacy reasons?

The third indeed shows some accesses to actual files in your home directory. This is much more like what I'd expect to see.
Comment 36 Martin Steigerwald 2014-05-14 19:39:37 UTC
Created attachment 86632 [details]
test case for inducing a lot of write activity of baloo

With the attached script, and given:

martin@merkaba:~> find /usr/share/doc -name "*.txt" | wc -l                                              
1001
martin@merkaba:~> TOTALBYTES=0 ; for FILEBYTES in $( find /usr/share/doc -name "*.txt" -printf "%s\n" ) ; do let TOTALBYTES=TOTALBYTES+FILEBYTES ; done ; echo $TOTALBYTES
4057748

baloo_file_indexer writes

martin@merkaba:~/KDE/Test/baloo -> du -sh .
63M     .

(plus the size of the script, which is in the same directory)

and it causes baloo_file_indexer to burst on writing at the beginning and especially after it has finished. I saw it writing 800 MiB every 10 seconds for about a minute after the script finished.

It clearly shows that the amount of indexing write I/O on often-accessed files is out of all proportion to the actual I/O activity, especially relative to the size of the index before and after the operation:

Before:
martin@merkaba:~/.local/share/baloo> du -sh file 
881M    file

After:
martin@merkaba:~/.local/share/baloo> du -sh file
881M    file

It seems that Xapian uses fixed-size files here, or grows files in bigger chunks?

During the run I saw it between 881 and 905 MiB, but I only checked for half a minute, at intervals of several seconds.

My system stays quite responsive, but it's a dual-SSD BTRFS RAID 1 machine.

I think baloo updates too often and too quickly on frequently accessed files.

This is with baloo git f52b4dbc6296bc6be876256d2915d13d316bbca0.

I think it would be easy to make an atopsar -d log of the activity, or just watch iostat -x 1 for the number of write accesses. I bet it would make a hard disk crawl. Can you test? (Save your important work first :)
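The 180-byte attachment itself is not reproduced in this report, so the following is only a plausible reconstruction based on the description above (copying the roughly 1000 *.txt files under /usr/share/doc into an indexed directory). The SRC/DST variables and the file naming are my own assumptions, not taken from the original script, and the simple for-loop assumes paths without spaces:

```shell
#!/bin/sh
# Hypothetical reconstruction of the stresser: copy many small text files
# into a Baloo-indexed directory so the indexer has to process each one.
SRC="${SRC:-/usr/share/doc}"
DST="${DST:-$HOME/KDE/Test/baloo}"
mkdir -p "$DST" || exit 1

count=0
for f in $(find "$SRC" -name "*.txt" 2>/dev/null); do
    count=$((count + 1))
    cp "$f" "$DST/$count.txt"
done
echo "copied $count files into $DST"
```

Running this while watching `iostat -x 1` (or recording with `atop -w`) should reproduce the write bursts described in this comment, provided DST lies inside a directory Baloo indexes.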
Comment 37 Vojtěch Zeisek 2014-05-15 07:44:15 UTC
(In reply to comment #35)
[...]
> Vojtěch, on the first two straces, I see only accesses to baloo indexing data (beside .kde / .local and so on):
> 
> grep "/home" strace.log | egrep -v "\.local|\.kde" | less
> 
> Especially the second large, zipped one (28 MiB unzipped). I don´t see what files baloo_file_indexer is indexing… it really seems to burst on accessing index data, and I don´t really understand why. Did you strip anything from the logs for privacy reasons?
[...]

No, I didn't; I uploaded the whole output of strace.
Comment 38 Martin Steigerwald 2014-05-15 09:19:27 UTC
(In reply to comment #37)
> (In reply to comment #35)
> [...]
> > Especially the second large, zipped one (28 MiB unzipped). I don´t see what files baloo_file_indexer is indexing… it really seems to burst on accessing index data, and I don´t really understand why. Did you strip anything from the logs for privacy reasons?
> [...]
> 
> No, I didn't, I uploaded whole output of strace.

Okay – be careful with that in case it contains private stuff :) – well, that also seems to hint that Baloo can create write bursts.
Comment 39 Martin Steigerwald 2014-05-15 09:36:06 UTC
Created attachment 86642 [details]
atop logs with test case from comment #38 with current baloo git

In comment #36 I said I tested the test case with baloo git f52b4dbc6296bc6be876256d2915d13d316bbca0. Well, that was not the case: it was tested before Vishesh's change to limit the handling of plain text.

After compiling, installing, and starting this current git, I started

atop -w /tmp/atop-baloo-file-stresser.raw 10

which creates a raw atop log with measurements every 10 seconds, before doing the test case from comment #36 again. I finished atop afterwards. Note this is a binary log, so you need atop + atopsar to analyse it.

I see:

merkaba:~> atopsar -d -r /tmp/atop-baloo-file-stresser.raw

merkaba  3.15.0-rc5-tp520  #57 SMP PREEMPT Sat May 10 11:10:55 CEST 2014  x86_64  2014/05/15

-------------------------- analysis date: 2014/05/15 --------------------------

11:14:10  disk           busy read/s KB/read  writ/s KB/writ avque avserv _dsk_
11:14:20  sdb             11%    1.6     5.5   606.1    12.4   1.2   0.18 ms
          sda              7%    2.7     8.0   606.3    12.4   1.2   0.11 ms
11:14:30  sdb              2%    0.7    10.3    56.9    11.6   1.2   0.36 ms
          sda              1%    7.0    11.3    56.9    11.6   1.4   0.15 ms
11:14:40  sdb              1%    1.3     8.9    12.9    30.4   1.3   1.02 ms
          sda              0%    4.4     8.5    13.0    30.1   2.6   0.28 ms
11:14:50  sdb              4%    4.9     5.6   173.9    32.3   5.6   0.24 ms
          sda              4%    0.1     4.0   157.0    35.7  17.1   0.25 ms
11:15:00  sdb              0%    0.4    15.0     2.8    44.1   1.8   0.69 ms
          sda              0%    1.0     8.0     2.8    44.1   2.4   0.37 ms
11:15:10  sdb              0%    2.1     9.3     3.2    13.4   1.1   0.72 ms
          sda              0%    0.1     8.0     3.2    13.4   1.7   0.18 ms
11:15:20  sdb              3%   11.3     6.4   318.6     8.8   4.2   0.08 ms
          sda              2%    1.5     4.8   310.3     9.1  12.1   0.08 ms
11:15:30  sdb             28%   18.1     4.7  4008.3     4.1   1.3   0.07 ms
          sda             27%    9.5     4.9  4008.2     4.1   1.6   0.07 ms
11:15:40  sdb             31%    4.3     4.6  4662.6     4.1   1.3   0.07 ms
          sda             35%   11.5     4.4  4664.6     4.1   1.7   0.08 ms
11:15:50  sdb             30%    8.1     4.6  4715.6     4.1   1.3   0.06 ms
          sda             31%   11.3     4.8  4711.4     4.1   1.7   0.07 ms
11:16:00  sdb             33%   17.1     4.9  4781.3     5.3   3.5   0.07 ms
          sda             34%   16.1     4.6  4672.2     5.4   9.5   0.07 ms
11:16:10  sdb             29%   14.8     5.1  4638.5     4.1   1.3   0.06 ms
          sda             30%    4.2     4.9  4638.2     4.1   1.8   0.06 ms
11:16:20  sdb              1%    0.0     0.0    24.5     5.8   1.1   0.34 ms
          sda              0%    0.0     0.0    24.5     5.8   1.6   0.08 ms
11:16:30  sdb              3%    0.0     0.0   303.8     7.5   3.5   0.12 ms
          sda              3%    0.2     4.0   303.9     7.5  18.0   0.08 ms
11:16:40  
11:16:50  sdb              0%    0.0     0.0     0.7    10.9   1.1   2.43 ms
          sda              0%    0.4    10.0     0.7    10.9   1.7   0.27 ms

and

merkaba:~> atopsar -D -r /tmp/atop-baloo-file-stresser.raw

merkaba  3.15.0-rc5-tp520  #57 SMP PREEMPT Sat May 10 11:10:55 CEST 2014  x86_64  2014/05/15

-------------------------- analysis date: 2014/05/15 --------------------------

11:14:10    pid command  dsk% |   pid command  dsk% |   pid command  dsk%_top3_
11:14:20  24865 baloo_fi  54% | 32679 bash      22% | 29515 kworker/   5%
11:14:30  32679 bash      82% | 24889 akonadi_   9% |  2161 mysqld     2%
11:14:40  32679 bash      77% | 29548 icewease  17% |  2161 mysqld     4%
11:14:50  32679 bash      49% |   747 btrfs-tr  21% | 24889 akonadi_  14%
11:15:00  32679 bash      95% |   285 btrfs-tr   4% |  2161 mysqld     0%
11:15:10  32679 bash      96% |   323 systemd-   1% |  2161 mysqld     1%
11:15:20  20623 bash      90% |   747 btrfs-tr   5% | 24889 akonadi_   3%
11:15:30  20187 baloo_fi  65% | 24865 baloo_fi  12% | 29515 kworker/   4%
11:15:40  20187 baloo_fi  77% | 29462 kworker/   4% | 28540 kworker/   3%
11:15:50  20187 baloo_fi  82% | 28173 kworker/   3% | 29462 kworker/   3%
11:16:00  20187 baloo_fi  72% |   747 btrfs-tr   4% | 11763 kworker/   4%
11:16:10  24865 baloo_fi  94% | 28540 kworker/   1% | 29515 kworker/   1%
11:16:20   2161 mysqld    75% | 28540 kworker/  12% | 11762 kworker/  10%
11:16:30    747 btrfs-tr  37% | 24889 akonadi_  30% |  2161 mysqld     7%
11:16:40   2054 plasma-d   0% |  2962 atop       0% |  1679 Xorg       0%
11:16:50   2161 mysqld    40% |  2115 konsole   23% |  2154 akonadis  23%

which is still hefty. It may be a bit better, though.

You can also single step through this with:

atop -r /tmp/atop-baloo-file-stresser.raw

Some keypresses:
- t: one 10 second step forward
- T: one backward
- g: generic display
- d: disk display
- m: memory display
- A: always display most used resource in resource usage percentage column in process list

This can give you a clear picture of baloo_file's behaviour. In the end it does several write bursts of about 170-180 MiB each, and even one of more than 800 MiB. As can be seen, my SSDs can cope with that. But remember the interval is 10 seconds: in the 800+ MiB case baloo_file easily oversaturates many hard disks, especially given the sheer number of single write requests. If they are partly random, this would push any I/O scheduler to its limits.

I think it would be beneficial if someone with a hard disk repeats this test case. When doing so, please let atop log to a tmpfs filesystem to exclude the logging itself from the I/O measurements; writes to a tmpfs filesystem circumvent the Linux block layer.
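A minimal sketch of such a setup (assuming atop is installed; on most Linux distributions /dev/shm is already a tmpfs mount, which avoids creating a new one):

```shell
# Pick a tmpfs-backed directory for the atop raw log so that writing the log
# itself does not go through the block layer and skew the measurements.
# Assumption: /dev/shm is tmpfs (true on most Linux distributions).
if [ -w /dev/shm ]; then
    LOGDIR=/dev/shm
else
    LOGDIR=/tmp            # may not be tmpfs; check /proc/mounts to be sure
fi
echo "logging to $LOGDIR/atop-baloo.raw"
# atop -w "$LOGDIR/atop-baloo.raw" 10   # 10-second samples; run as root
```

The actual `atop -w` line is commented out since it needs atop installed and root privileges for process accounting.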
Comment 40 Martin Steigerwald 2014-05-15 09:37:21 UTC
Hmmm, you'll need to copy the atopsar output into a text editor with a fixed-width font to read it.

It would be nice to have it rendered like in:

https://bugzilla.mozilla.org/show_bug.cgi?id=307112
Comment 41 Martin Steigerwald 2014-05-15 09:39:54 UTC
Also note that the per-process percentages of atopsar -D need to be related to the *amount* of disk utilization. So while bash had the biggest share in the beginning, the actual amount of disk I/O was much lower during that time. Always read atopsar -D together with atopsar -d.
Comment 42 Jonathan Verner 2014-05-15 10:16:17 UTC
(In reply to comment #34)
> Jonathan, could you show us the output of "dstat -c --disk-tps --disk-util
> -dmgs --top-bio --top-cpu" while you're experiencing lockups?

A bit hard to do, since when the system is locked up, I can't run the command :) 

What I did was run the command and prepend timestamps to each line... When the system locked up I
checked the clock and then grepped the output for the lines around the time of the lockup... Here is the output:

29   2  69   0   0   0|   2     0 |0.40:   0| 104k    0 |4327M 40.0k 2327M 1193M|   0     0 |1312M 7888M|                      |baloo_file_ex 25
28   2  68   2   0   0|  24     0 |0.40:5.20|1148k    0 |4352M 40.0k 2332M 1164M|   0     0 |1312M 7888M|kdevelop   4296k   24k|baloo_file_ex 25
31   2  46  20   0   1|  94  1430 |25.2:60.0|6728k   12M|4376M 40.0k 2337M 1134M|   0     0 |1312M 7888M|kdevelop   3572k   24k|baloo_file_ex 25
30   1  67   1   0   0|   8     0 |   0:2.80| 428k    0 |4400M 40.0k 2338M 1109M|   0     0 |1312M 7888M|kdevelop    428k   12k|baloo_file_ex 25
28   2  69   0   0   0|   0     1 |   0:   0|   0    16k|4425M 40.0k 2338M 1085M|   0     0 |1312M 7888M|kdevelop      0  4096B|baloo_file_ex 25
28   2  70   0   0   0|   0     0 |   0:   0|   0     0 |4452M 40.0k 2337M 1058M|   0     0 |1312M 7888M|skype         0    24k|baloo_file_ex 25
29   4  67   0   0   0|   0     0 |   0:   0|   0     0 |4465M 40.0k 2336M 1046M|   0     0 |1312M 7888M|baloo_file_   0  4472k|baloo_file_ex 25
28   2  70   0   0   0|   0     1 |   0:   0|   0  4096B|4488M 40.0k 2336M 1023M|   0     0 |1312M 7888M|baloo_file_   0  6928k|baloo_file_ex 25
28   1  71   0   0   0|   0     0 |   0:   0|   0     0 |4511M 40.0k 2336M 1000M|   0     0 |1312M 7888M|baloo_file_   0  6856k|baloo_file_ex 25
31   3  67   0   0   0|   0     3 |   0:   0|   0    12k|4544M 40.0k 2339M  965M|   0     0 |1312M 7888M|baloo_file_   0  6968k|baloo_file_ex 25
29   2  69   0   0   0|   0     0 |   0:   0|   0     0 |4605M 40.0k 2337M  906M|   0     0 |1312M 7888M|baloo_file_   0  7536k|baloo_file_ex 25
29   2  69   0   0   0|   1     0 |   0:2.00|  56k    0 |4666M 40.0k 2337M  845M|   0     0 |1312M 7888M|baloo_file_   0  7616k|baloo_file_ex 25
28   1  71   0   0   0|   0     0 |   0:   0|   0     0 |4727M 40.0k 2336M  784M|   0     0 |1312M 7888M|baloo_file_   0  7592k|baloo_file_ex 25
27   2  72   0   0   0|   0     0 |   0:   0|   0     0 |4789M 40.0k 2336M  723M|   0     0 |1312M 7888M|baloo_file_   0  7656k|baloo_file_ex 25
27   1  72   0   0   0|   0     1 |   0:   0|   0  8192B|4847M 40.0k 2336M  665M|   0     0 |1312M 7888M|baloo_file_   0  7440k|baloo_file_ex 25
27   2  71   0   0   0|   0     0 |   0:   0|   0     0 |4905M 40.0k 2336M  606M|   0     0 |1312M 7888M|baloo_file_   0  7456k|baloo_file_ex 25
29   3  57  12   0   0|  37    73 |   0:68.8| 148k   13M|4960M 40.0k 2337M  551M|   0     0 |1312M 7888M|baloo_file_   0  6960k|baloo_file_ex 25
30   8  57   4   0   0|  32    51 |3.20:23.6|2664k 3588k|5016M 40.0k 2340M  492M|   0     0 |1312M 7888M|baloo_file_   0  6616k|baloo_file_ex 25
29   4  43  25   0   0|  48    44 |   0:98.8|4196k   16M|5066M 40.0k 2344M  438M|   0     0 |1312M 7888M|baloo_file_   0  6936k|baloo_file_ex 25
31  25  27  17   0   0|   0    10 |   0:   0|   0    80k|5113M 40.0k 2345M  390M|   0     0 |1312M 7888M|baloo_file_   0  5720k|baloo_file_ex 25
29  28  21  22   0   0|   0     0 |   0:   0|   0     0 |5123M 40.0k 2345M  380M|   0     0 |1312M 7888M|                      |baloo_file_ex 25
28  26  23  23   0   0|   0     0 |   0:   0|   0     0 |5113M 40.0k 2344M  390M|   0     0 |1312M 7888M|baloo_file_   0  5832k|baloo_file_ex 25
28  27  23  23   0   0|   0     9 |   0:2.80|   0  2604k|5113M 40.0k 2344M  390M|   0     0 |1312M 7888M|baloo_file_   0  6784k|baloo_file_ex 25
29  28   7  35   0   0|  29    81 |   0:89.2| 116k   26M|5113M 40.0k 2345M  389M|   0     0 |1312M 7888M|baloo_file_   0  6952k|baloo_file_ex 25
31  35  14  20   0   0|   9     6 |   0:14.4|  36k   20k|5115M 40.0k 2345M  388M|   0     0 |1312M 7888M|baloo_file_   0  6216k|baloo_file_ex 25
31  29  17  23   0   0| 148     0 |0.80:   0|1240k    0 |5115M 40.0k 2346M  386M|   0     0 |1312M 7888M|baloo_file_1240k 5904k|baloo_file_ex 25
29  29  20  23   0   0|  62     0 |   0:   0| 532k    0 |5114M 40.0k 2347M  387M|   0     0 |1312M 7888M|baloo_file_ 532k 5936k|baloo_file_ex 25
28  28  22  23   0   0|  77     2 |0.80:   0| 628k   36k|5114M 40.0k 2348M  386M|   0     0 |1312M 7888M|baloo_file_ 620k 6768k|baloo_file_ex 25
32  27  18  23   0   0| 141     0 |0.40:   0|1156k    0 |5114M 40.0k 2349M  385M|   0     0 |1312M 7888M|baloo_file_1156k 6216k|baloo_file_ex 25
31  27  19  23   0   0|  31     0 |0.40:   0| 272k    0 |5113M 40.0k 2349M  385M|   0     0 |1312M 7888M|baloo_file_ 272k 6160k|baloo_file_ex 25
33  29  16  22   0   0|  93     0 |0.40:   0| 832k    0 |5113M 40.0k 2350M  385M|   0     0 |1312M 7888M|baloo_file_ 832k 5896k|baloo_file_ex 25
30  29  20  21   0   0| 167     0 |1.20:35.6|8680k    0 |5113M 40.0k 2358M  377M|   0     0 |1312M 7888M|baloo_file_ 944k 6888k|baloo_file_ex 25
28  27  22  23   0   0|  79     0 |0.40:   0| 772k    0 |5113M 40.0k 2359M  376M|   0     0 |1312M 7888M|baloo_file_ 772k 6928k|baloo_file_ex 25
28  26  23  23   0   0|  43     0 |0.40:   0| 388k    0 |5113M 40.0k 2359M  375M|   0     0 |1312M 7888M|baloo_file_ 388k 6952k|baloo_file_ex 25
29  27  22  22   0   0|   0     0 |   0:   0|   0     0 |5112M 40.0k 2359M  377M|   0     0 |1312M 7888M|baloo_file_   0  6880k|baloo_file_ex 25
29  26  21  23   0   0|  97     0 |1.60:   0|3200k    0 |5114M 40.0k 2363M  371M|   0     0 |1312M 7888M|baloo_file_3200k 7104k|baloo_file_ex 25
29  28  20  22   0   0|   3     0 |   0:   0|  24k    0 |5114M 40.0k 2363M  371M|   0     0 |1312M 7888M|baloo_file_  24k 5848k|baloo_file_ex 25
31  26   8  35   0   0|   0     2 |   0:   0|   0   136k|5114M 40.0k 2363M  371M|   0     0 |1312M 7888M|baloo_file_   0  7656k|baloo_file_ex 25
31  27   9  33   0   1|  26     0 |   0:   0| 300k    0 |5114M 40.0k 2363M  370M|   0     0 |1312M 7888M|baloo_file_ 260k 8048k|baloo_file_ex 25
30  27   9  34   0   0|   7     0 |   0:   0| 136k    0 |5115M 40.0k 2363M  370M|   0     0 |1312M 7888M|baloo_file_ 136k 7040k|baloo_file_ex 25
29  26  10  35   0   0|   1     0 |   0:   0|8192B    0 |5113M 40.0k 2363M  371M|   0     0 |1312M 7888M|baloo_file_8192B 6960k|baloo_file_ex 25
28  27   0  45   0   0|   0     0 |   0:   0|   0     0 |5113M 40.0k 2363M  371M|   0     0 |1312M 7888M|baloo_file_   0  7248k|baloo_file_ex 25
30  27   0  43   0   0|   0     0 |   0:   0|   0     0 |5114M 40.0k 2363M  371M|   0     0 |1312M 7888M|baloo_file_   0  8176k|baloo_file_ex 25
38  27   0  35   0   0| 237     0 |2.80:10.8| 960k    0 |5114M 40.0k 2365M  369M| 224k    0 |1312M 7888M|baloo_file_   0  6856k|baloo_file_ex 25
32  26   6  35   0   0|   0     0 |   0:   0|   0     0 |5114M 40.0k 2364M  369M|   0     0 |1312M 7888M|baloo_file_   0  6816k|baloo_file_ex 25
32  27   4  37   0   0|   0     0 |   0:   0|   0     0 |5114M 40.0k 2364M  369M|   0     0 |1312M 7888M|baloo_file_   0  7728k|baloo_file_ex 25
34  28   7  31   0   0|  68    34 |   0:20.4| 272k 3312k|5114M 40.0k 2365M  369M| 256k    0 |1312M 7888M|kdevelop    256k   12k|baloo_file_ex 25
17  27   0  56   0   0|   0     7 |   0:7.20|   0  1548k|5115M 40.0k 2365M  367M|   0     0 |1312M 7888M|baloo_file_   0    56k|btrfs-delallo 23
 4  26   0  70   0   0|   2    20 |   0:19.6|8192B 8712k|5115M 40.0k 2366M  367M|   0     0 |1312M 7888M|                      |btrfs-delallo 23
 2  25   0  72   0   0|   0     0 |   0:   0|   0     0 |5115M 40.0k 2366M  367M|   0     0 |1312M 7888M|                      |btrfs-delallo 24
 7  26   0  67   0   0|   0     0 |   0:   0|   0     0 |5115M 40.0k 2366M  367M|   0     0 |1312M 7888M|                      |btrfs-delallo 23
 3  26   0  71   0   0|   0     0 |   0:   0|   0     0 |5115M 40.0k 2364M  368M|   0     0 |1312M 7888M|                      |btrfs-delallo 23
 2  26   0  72   0   0|   0     5 |   0:2.00|   0  3072k|5115M 40.0k 2364M  368M|   0     0 |1312M 7888M|                      |btrfs-delallo 23
 2  26   0  72   0   0|   1    38 |   0:28.4|4096B   17M|5115M 40.0k 2365M  368M|   0     0 |1312M 7888M|                      |btrfs-delallo 23
 2  26   0  73   0   0|   0     0 |   0:   0|   0     0 |5096M 40.0k 2365M  387M|   0     0 |1312M 7888M|                      |btrfs-delallo 23
11  27   0  63   0   0|   5     0 |0.40:   0| 216k    0 |5099M 40.0k 2365M  384M|   0     0 |1312M 7888M|akonadi_ima 216k    0 |btrfs-delallo 23
 3  26   0  71   0   0|   0     0 |   0:   0|   0     0 |5102M 40.0k 2365M  380M|   0     0 |1312M 7888M|                      |btrfs-delallo 24
 3  26   0  71   0   0|   0     0 |   0:   0|   0     0 |5106M 40.0k 2365M  377M|   0     0 |1312M 7888M|firefox       0   176k|btrfs-delallo 23
 2  26   0  72   0   0|   8     0 |   0:2.80|  32k    0 |5109M 40.0k 2365M  373M|  32k    0 |1312M 7888M|                      |btrfs-delallo 23
 3  26   0  72   0   0|   0     0 |   0:   0|   0     0 |5112M 40.0k 2365M  370M|   0     0 |1312M 7888M|                      |btrfs-delallo 23
 4  27   1  68   0   1|  10    96 |1.60:3.60|  40k  864k|5155M 40.0k 2485M  208M|  32k    0 |1311M 7889M|mysqld       32k 2048k|btrfs-delallo 23
Comment 43 Vojtěch Zeisek 2014-05-15 10:38:19 UTC
Created attachment 86645 [details]
atop raw log

I have an SSHD - a 1 TB classical 7200 rpm HDD sped up by an 8 GB flash cache.
$ balooctl start
This morning I upgraded to 4.13.1 from openSUSE packages. I like the balooctl function, but the situation remains as tragic as at the beginning...
$ atop -w atop-baloo-file-stresser.raw 10
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
^C
It was running for over half an hour. I kept all the services running - KMail, PIM, ... - and left the computer alone, as it was unusable for any work...
Comment 44 Martin Steigerwald 2014-05-15 11:14:31 UTC
(In reply to comment #43)
> Created attachment 86645 [details]
> atop raw log
> 
> I have an SSHD - a 1 TB classical 7200 rpm HDD sped up by an 8 GB flash
> cache.
> $ balooctl start
> This morning I upgraded to 4.13.1 from openSUSE packages. I like the
> balooctl function, but the situation remains as tragic as at the beginning...
> $ atop -w atop-baloo-file-stresser.raw 10

Was that while running my test case shell script (I don't see a bash process active in the CPU list), or just while *using* the machine?

For some reason atop missed the per-process disk activity logs and shows the following error on pressing "d":

No disk-activity figures available; request ignored!

Maybe your kernel is missing a configuration option for it. Can you try with the standard openSUSE kernel, in case you use a self-compiled one?

General statistics clearly show that the disk is oversaturated for extended periods of time. An extreme example:

LVM |      cr_home | busy    100% | read       0 | write      1 | KiB/r      0 |              | KiB/w     48 | MBr/s   0.00 | MBw/s   0.00 | avq  3358.85 | avio 10000 ms
DSK |          sda | busy    100% | read       0 | write   1140 | KiB/r      0 |              | KiB/w     11 | MBr/s   0.00 | MBw/s   1.29 | avq   143.21 | avio 8.77 ms |

1140 write accesses at only about 1.29 MB/s of write throughput. That must be random I/O, considering the disk utilization and the insanely high LVM volume latency.
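Those numbers are internally consistent, as a quick calculation shows (1140 requests in the 10-second sample, and avio 8.77 ms from the DSK line above):

```shell
# 1140 writes in the 10 s sample = 114 writes/s
# average write size = throughput / request rate  (matches the KiB/w 11 column)
# device utilization = request rate * average service time
awk 'BEGIN {
    printf "%.1f KiB per write\n", (1.29 * 1024) / 114
    printf "%.0f%% busy\n", 114 * 8.77 / 1000 * 100
}'
```

That is roughly 11.6 KiB per write, and 114 requests/s at 8.77 ms each accounts for the full 100% busy figure.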

Also atopsar -d shows the disk being busy.

$ atopsar -r atop-baloo-file-stresser.raw -d

veles  3.11.10-7-desktop  #1 SMP PREEMPT Mon Feb 3 09:41:24 UTC 2014 (750023e)  x86_64  2014/05/15

-------------------------- analysis date: 2014/05/15 --------------------------

11:54:03  disk           busy read/s KB/read  writ/s KB/writ avque avserv _dsk_                                                                                           
11:54:13  sdb              0%    0.0     0.0     0.1     4.0   1.0   2.00 ms
          sda            100%    0.0     0.0   116.3    10.9 143.4   8.64 ms
11:54:23  sdb              0%    0.0     0.0     0.6     3.3   1.7   1.50 ms
          sda            100%    0.0     0.0   117.8    11.0 143.0   8.52 ms
11:54:33  sda            100%    0.0     0.0   120.0    10.8 143.9   8.36 ms
11:54:43  sda            100%    0.0     0.0   114.4    11.6 143.2   8.77 ms
11:55:08  sdb              0%    0.0     0.0     0.2     3.0   1.3   1.50 ms
[… it goes on like this …]

Well… it would have been nice to see per-process disk I/O statistics here. But I think the case has been made.

Baloo doesn't yet work well with files that are frequently appended to. I may come up with a fio job to simulate random I/O inside a file.

As a temporary work-around I suggest: if you know of any applications that write to files inside the home directory (not hidden files or directories), try to exclude these paths from indexing.
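The fio job mentioned above might look like the following sketch. Every parameter here is an illustrative guess, not a measured reproduction of baloo's write pattern:

```ini
; small random writes inside a single file, with periodic fsyncs,
; roughly mimicking a database-backed indexer (illustrative values)
[baloo-like-writes]
rw=randwrite
bs=4k
size=256m
ioengine=psync
fsync=32
filename=fio-baloo-sim.tmp
```

It could then be run with `fio <jobfile>` from a directory on the filesystem under test.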
Comment 45 Martin Steigerwald 2014-05-15 11:15:53 UTC
Ah, I've got an idea. Did you run the atop -w command as a regular user? If so, please use root. The dump is without kernel process accounting, which may be needed for the per-process I/O statistics.
Comment 46 Vojtěch Zeisek 2014-05-15 11:27:02 UTC
(In reply to comment #45)

Ah, right, sorry, it was without your script, and atop ran as a regular user. I'll redo it later today with the script and as root. I use the standard openSUSE kernel, no modifications here. Right now I don't use any application writing inside files in home. Well, does LibreOffice count? ;-) I have two disks in that machine: one HDD for home (encrypted) and a second SSD for root (an encrypted LVM containing root and swap).
Comment 47 Martin Steigerwald 2014-05-15 12:38:13 UTC
Vojtěch, the script needs some uncompressed text files in /usr/share/doc. And beware: it may eat your disk. Maybe reduce COUNT to 5, depending on the output of

find /usr/share/doc -name "*.txt" | wc -l

or use COUNT=1 or COUNT=2 and raise it from there, unless you can let your system sit at it for a while. But you can also stop baloo after, say, 5 or 10 minutes, I think.
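For readers without access to the attachment, here is a hedged reconstruction of what the 180-byte stress script from comment #38 does; the loop form is quoted later in this comment thread and the output file names follow comment #48, but details of the real script may differ:

```shell
#!/bin/bash
# Reconstruction (an assumption, not the verbatim attachment): repeatedly
# append every plain-text doc file to numbered files named text<COUNT>.txt in
# the current directory, so baloo sees files that keep growing and changing.
stress_baloo() {
    SRC=${1:-/usr/share/doc}   # where the *.txt input files live
    MAX=${2:-10}               # how many target files to produce
    for (( COUNT=0 ; COUNT<MAX ; COUNT=COUNT+1 )); do
        find "$SRC" -name "*.txt" -type f | while read -r FILE; do
            cat "$FILE" >> "text${COUNT}.txt"
        done
    done
}
```

Running `stress_baloo` in an indexed directory with the default MAX of 10 can generate gigabytes of appended text, hence the warning above about it eating your disk.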

If what you have is without the script running and you can reproduce it by other means, that may be fine too. Process accounting is important, as then we can see how much I/O baloo generates versus what other processes generate. This may give hints about which processes generate the I/O that baloo_file indexes afterwards.

I am pondering a simple baloo-top which would show which files it is indexing at any given moment, but first I'll finish writing my article.
Comment 48 Vojtěch Zeisek 2014-05-15 12:49:46 UTC
(In reply to comment #47)

$ find /usr/share/doc -name "*.txt" | wc -l
1515
$ df -h
/dev/mapper/cr_home           917G  679G  239G  74% /home

It writes files named "text##.txt" to the directory where the script is located, right? I think I can try it for some time. In the worst case I just reboot, log in as root and clean up.
Comment 49 Martin Steigerwald 2014-05-15 13:02:41 UTC
(In reply to comment #48)
> It writes files named "text##.txt" to the directory where the script is
> located, right?

Yes.

> I think I can try it for some time. In the worst case I
> just reboot, login as root and clean it.

You can Ctrl-C the script, but in the worst case I suggest stopping Baloo, as I saw most I/O after the script had finished. If unsure, try with COUNT<1 first in the for loop, as in

for (( COUNT=0 ; COUNT<1 ; COUNT=COUNT+1 )); do

and see what it does on your system. You can always raise the COUNT limit again :)
Comment 50 lucke 2014-05-15 14:17:22 UTC
Jonathan, baloo_file_extractor was writing 6-8 MB each second and using 25% of your CPU time (one core fully, on a quad-core?). Later it seems btrfs-delalloc joined the fun and was using one core almost fully; it continued to run after baloo_file_extractor quit. Only twice was your second disk close to being fully utilized. An apparently silly baloo_file_extractor behaviour, but is it something that should trigger a lockup?

You could run "while true; do ps ax | grep extractor; sleep 1; done" while trying to reproduce that baloo_file_extractor activity (lockup). It would give you the numbers of the files the extractor was working on; balooshow <numbers> would then show you information about these files, so you could see what it was trying to index.

On a related note, I ran baloo_file_extractor on a 20 MB PDF while moving files between partitions. It normally takes less than 10 s to run; with that disk activity going on it took 500 s, so the idle I/O class really works.
Comment 51 Vishesh Handa 2014-05-15 14:45:56 UTC
Whoa. This is quite a lot to read.

For people compiling from git on their own: try running the tests in baloo/src/file/tests/, specifically the bindexqueuetest.

In Baloo, indexing is split into 2 phases: Basic and File. Basic is when we do NOT look into the contents of the file; File is when the contents are actually looked into. File indexing is typically done in another process (baloo_file_extractor). Only when the basic indexing has finished does the file indexing run.
Comment 52 Martin Steigerwald 2014-05-15 14:52:21 UTC
(In reply to comment #50)
[…]
> On a related note, I ran baloo_file_extractor on a 20 MB pdf while moving
> files between partitions - it normally takes less than 10 s to run, with
> that disk activity going on it took 500 s, so idle io class really works.

Idle I/O class works if the following conditions are met:
- CFQ I/O scheduler (CFQ v3 time sliced), so it won't work on Ubuntu by default as that uses deadline AFAIK.
- Read accesses and direct write accesses.

It does *not* work on buffered writes. Thus I think that while it may help with read accesses, it does not help with the excessive writes, as I believe the baloo_file indexer uses buffered writes (through the page cache).
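For reference, this is how a process is put into the idle class with the util-linux ionice tool (a sketch; note it only affects I/O the scheduler can attribute to the process, so the page-cache writeback described above still happens at normal priority):

```shell
# Run a command in the idle I/O class (-c 3). Effective only under the CFQ
# scheduler, and only for reads and direct/synchronous writes; buffered
# writes are flushed later by kernel writeback threads, which do not
# inherit this class.
if command -v ionice >/dev/null 2>&1; then
    ionice -c 3 sh -c 'echo "running at idle I/O priority"'
else
    echo "ionice not available on this system"
fi
```

Under deadline or noop schedulers the call still succeeds, but the class assignment has no throttling effect, which matches the Ubuntu caveat above.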
Comment 53 lucke 2014-05-15 16:43:19 UTC
To be precise, the default kernel in Ubuntu 12.04 uses cfq, 14.04's default kernel uses deadline, and the 13.04 kernel available in 12.04 uses deadline.
Comment 54 Vojtěch Zeisek 2014-05-15 18:41:02 UTC
Created attachment 86654 [details]
atop raw log with baloo stresser running

Done. In 3 terminals in parallel.
Terminal 1:
$ balooctl start
Process took too long killing 
Indexing failed. Trying to determine offending file 
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
baloo_file(7344):  
baloo_file(7344): Could not obtain lock for Xapian Database. This is bad 
baloo_file(7344):  
baloo_file(7344): Could not obtain lock for Xapian Database. This is bad 
baloo_file(7344):  
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
Process took too long killing 
Indexing failed. Trying to determine offending file 
QProcess: Destroyed while process is still running.
baloo_file(7344): Could not obtain lock for Xapian Database. This is bad 
baloo_file(7344):  
baloo_file(7344): Could not obtain lock for Xapian Database. This is bad 
baloo_file(7344):  
baloo_file(7344): Could not obtain lock for Xapian Database. This is bad 
baloo_file(7344):  
baloo_file(7344): Could not obtain lock for Xapian Database. This is bad 
baloo_file(7344):  
QProcess: Destroyed while process is still running.
$ balooctl stop
$ balooctl status
Baloo File Indexer is NOT running
Indexed 186319 / 238895 files
Failed to index 0 files
Terminal 2:
$ ./balootest/baloo-file-indexer-stresser.sh
cat : file /usr/share/doc/XXX does not exist
(about 50 such errors from 1515 available files)
Terminal 3:
# atop -w atop-baloo-file-stresser.raw 10
^C
Comment 55 Roger 2014-05-22 16:05:45 UTC
How about splitting the package into two, baloo and libs_baloo, so that those of us who don't want to install baloo (a large portion of the userbase) or even have it on our systems (it used up 2 GB of HD space in a couple of hours, with no signs of stopping, and consumed enormous amounts of CPU) can simply *not* install it? Unfortunately, everything in KDE now depends on Baloo's libraries, so *uninstalling* Baloo is problematic unless it's split.
Comment 56 Vojtěch Zeisek 2014-05-23 07:44:02 UTC
$ balooctl status
balooctl(6885): Could not obtain lock for Xapian Database. This is bad 
Baloo File Indexer is running
Indexed 187546 / 238899 files

After running overnight it added just about 1000 files...
Comment 57 Julius Schwartzenberg 2014-05-31 12:22:54 UTC
I have the same problem, also with Kubuntu 14.04. My home dir is around 200 GB. Baloo has been running for almost 24 hours non-stop with around 50% CPU.
I just found out that Baloo also created some huge files:
-rw-rw-r-- 1 julius julius 2,3G mei 31 14:19 .local/share/baloo/file/position.DB
-rw-rw-r-- 1 julius julius 733M mei 31 14:19 .local/share/baloo/file/postlist.DB
-rw-rw-r-- 1 julius julius  24M mei 31 14:19 .local/share/baloo/file/record.DB
-rw-rw-r-- 1 julius julius 668M mei 31 14:19 .local/share/baloo/file/termlist.DB

I suppose that should not be happening. Maybe by default Baloo should only index the Documents directory?
Comment 58 Vojtěch Zeisek 2014-05-31 12:39:19 UTC
(In reply to comment #57)
Just to compare (openSUSE 13.1, 64 bit, KDE 4.13.1):
$ ls -la .local/share/baloo/file/
celkem 5643864
drwxr-xr-x 2 vojta users       4096 30. kvě 14.01 ./
drwxr-xr-x 7 vojta users       4096  2. kvě 12.57 ../
-rw-r--r-- 1 vojta users  137999360 30. kvě 13.23 fileMap.sqlite3
-rw-r--r-- 1 vojta users      32768 31. kvě 11.42 fileMap.sqlite3-shm
-rw-r--r-- 1 vojta users          0 30. kvě 14.01 fileMap.sqlite3-wal
-rw-r--r-- 1 vojta users          0 30. kvě 13.23 flintlock
-rw-r--r-- 1 vojta users         28  2. kvě 12.57 iamchert
-rw-r--r-- 1 vojta users      55006 29. kvě 15.31 position.baseA
-rw-r--r-- 1 vojta users 3603914752 30. kvě 13.23 position.DB
-rw-r--r-- 1 vojta users      20854 29. kvě 15.29 postlist.baseA
-rw-r--r-- 1 vojta users 1387896832 30. kvě 13.23 postlist.DB
-rw-r--r-- 1 vojta users      20855 29. kvě 15.31 postlist.tmp
-rw-r--r-- 1 vojta users        440 29. kvě 15.31 record.baseA
-rw-r--r-- 1 vojta users   27418624 30. kvě 13.23 record.DB
-rw-r--r-- 1 vojta users       9511 29. kvě 15.31 termlist.baseA
-rw-r--r-- 1 vojta users  621895680 30. kvě 13.23 termlist.DB
$ du -sh .local/share/baloo/file/
5,4G    .local/share/baloo/file/
$ du -sh ~
679G    /home/vojta
It indexed ~75 % of the files. I'm not sure whether it is good to have an index about ~1 % the size of all the data, but maybe it is not so illogical... I don't have a comparison of index sizes with other such tools (I don't remember how big Nepomuk's DB was), so I don't know whether it is appropriate...
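As a side note, the ~1 % figure can be reproduced from the two du numbers above:

```shell
# index size (5.4 GiB) relative to home directory size (679 GiB)
awk 'BEGIN { printf "%.1f%%\n", 5.4 / 679 * 100 }'
```

So the index is currently about 0.8% of the data, with a quarter of the files still unindexed.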
Comment 59 Julius Schwartzenberg 2014-05-31 15:40:10 UTC
Over here, Baloo's index is now 4.6 GB. The total size of my /home partition is 243 GB (235 GB used) and Baloo is still indexing. I don't know how to find out how far along it is.
Comment 60 Vojtěch Zeisek 2014-06-22 05:24:36 UTC
Still no change with KDE 4.13.2, keeping broken state...
Comment 61 Diogo Vinicius Kersting 2014-09-02 19:40:25 UTC
I'm having the same problem (Arch Linux 64-bit, btrfs).
During boot the system becomes completely unresponsive, and when I run iotop, the baloo_file process is using about 97% of available I/O.
Once I kill the process, the boot continues with no major problems.

Is there any more information we can provide to help solving this problem?
Comment 62 Chris Samuel 2014-09-06 23:34:39 UTC
As a data point from someone who isn't seeing this issue (dual disk RAID-1, quad core i7-3770K, 24GB RAM) I've got a Baloo directory of 2.7GB for a home directory of 101GB.  This is with KDE 4.14 under Kubuntu 14.04.
Comment 63 Luke-Jr 2015-05-09 19:30:10 UTC
Another 2 cents: I just found my baloo directory *because* it is 5.6 GB. I have no idea what it is indexing, nor how to use its indexes. It's just wasted space to me.
Comment 64 Alexander Potashev 2016-03-20 14:27:08 UTC
Initial indexing with baloo_file (baloo-5.20) is indeed I/O aggressive.
Comment 65 Michael Freeman 2016-03-27 13:59:05 UTC
It should not be aggressive, it should be NICE. Talking of which, it has the lowest nice priority and yet uses 100% of the CPU; surely something is wrong there? But I'm no expert on nice. I had to use "cpulimit" to throttle the process, which helped a bit, but I still get really annoying GUI freezes. Can someone fix this please? To me it's a major KDE bug that can ruin the experience for new users unaware of the indexer, and it should either be turned OFF by default or removed. I actually still have it running because I'm curious to see whether it eventually provides a useful search function for my 51 GB of user docs, graphics files and so forth.

Until then, the root of this problem may lie in the philosophy of the developers...

"There is no explicit “Enable/Disable” button any more. We would like to promote the use of searching and feel that Baloo should never get in the users way."

http://vhanda.in/blog/2014/04/desktop-search-configuration/

What does "promote the use of searching" mean? It's up to the user, and only the user, what they want or need to use, not some "promotion" thing. Also, "should never get in the users way"... well it does, and really badly as well. This problem goes back to 2014 and has still not been fixed. What is going on here?
Comment 66 Mircea Kitsune 2016-03-27 17:02:37 UTC
Just dropping this here: yesterday I decided to give Baloo another shot, so I re-enabled the desktop search setting. It took Baloo nearly 20 hours to index everything on my drive! And during this entire day, my system was nearly unusable: every process would randomly freeze for several seconds, from any open application to the desktop itself. I was later told this has to do with the I/O scheduler, and that Baloo intensively using the drive causes other processes to have to wait in line.

Whatever the cause, Baloo in its current state is a nightmare... at least if you have a large home partition (I have nearly 800 GB on mine, so yeah). I don't think any other system component has made me hate it this much, because of how badly it can mess with your system! At least now that it has indexed everything, things seem to work alright again.
Comment 67 Julius Schwartzenberg 2016-04-23 22:51:31 UTC
On Kubuntu 16.04, the system is unusable out of the box with Baloo running. This appears to be version 5.18. Disabling Baloo allows me to use the system again.

Maybe Baloo should only run on SSDs?
Comment 68 vigleik.angeltveit 2016-09-13 00:46:24 UTC
After upgrading to Kubuntu 16.04 my desktop would also freeze for a few seconds at random times. In particular it did so every time I ran latex.

I had some trouble tracking down the culprit, but now that I have disabled Baloo in the system settings my desktop is once again responsive.
Comment 69 Mahendra Tallur 2016-12-14 15:09:26 UTC
Hi! I'm using KDE Neon (latest Frameworks 2.29.0 released yesterday with the 4x baloo write speed improvement; I tried with previous versions too, with the same result).

There is something seriously wrong with Baloo :)

- My laptop is almost 5 years old. It's a decent i5 with a regular HD.
- It's pretty responsive overall when Baloo is disabled.
- I made a clean Neon install, except for my documents, which I kept.
- After 2 hours of initial indexing I noticed my system began to crawl (the mouse pointer would get completely stuck every now and then for up to a minute).
- balooctl status told me 140,000 files out of 180,000 were indexed. The database took less than 2 GB.
- After a whole day of indexing, balooctl status reported only 143,000 files out of 180,000 and a 5.35 GB index!!
-> The system constantly crawls, the index grows very quickly, and the number of indexed files grows very slowly.

Even if I disable Baloo, I have to wait a long time (many minutes) before it really stops indexing (sometimes I even have to reboot).

I know indexing is I/O-intensive, but it seems something went wrong when it reached 140,000 files, because it was not THAT slow in the first 2 hours.

What can I do to help? This is a serious problem IMHO.
Shall I file a new bug report? How can I check which files are currently being indexed?

Also, I remember that when I tried Baloo a few months ago I had the same issue, and that even after the initial indexing was done (which is not the case now), downloading some files would cause the PC to crawl because they were being indexed at the same time.

I love KDE, this is my last remaining issue :-)
Thanks to the community, cheers !
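To the "how can I check what is being indexed" question above: balooctl can report this directly. Subcommand availability varies by Baloo version, so treat this as a sketch:

```shell
#!/bin/sh
# Query Baloo's indexing progress; guarded so the snippet is a no-op
# on systems without Baloo installed.
if command -v balooctl >/dev/null 2>&1; then
    balooctl status      # indexed-file count and database size
    # In newer releases, "balooctl monitor" streams the path of the
    # file currently being indexed (press Ctrl-C to stop).
else
    echo "balooctl not installed"
fi
```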
Comment 70 joergister 2018-05-01 12:49:39 UTC
This bug is a duplicate of https://bugs.kde.org/show_bug.cgi?id=332421
Comment 71 Axel Braun 2018-10-25 12:01:29 UTC
I'm running Baloo 5.45.0 on openSUSE Leap 15 and notice that my complete desktop freezes regularly for 1-2 minutes(!). The CPU monitor reports 100% load on both cores during that time, but top does not show any process with considerable CPU load. The problem is rather the combination of Baloo and Akonadi, as iotop shows:

Total DISK READ :      10.37 M/s | Total DISK WRITE :    1060.53 K/s
Actual DISK READ:      10.37 M/s | Actual DISK WRITE:     197.36 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                      
 4497 idle axel        9.73 M/s    0.00 B/s  0.00 % 99.52 % baloo_file_extractor
 2847 idle axel      651.54 K/s 1058.15 K/s  0.00 % 97.97 % akonadi_indexing_agent --identifier akonadi_indexing_agent
   23 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.10 % [kworker/1:1]
 2479 be/4 axel        0.00 B/s    0.00 B/s  0.00 %  0.08 % plasmashell
  849 be/4 root        4.76 K/s    0.00 B/s  0.00 %  0.00 % [xfsaild/sda2]

(interesting percentage calculation of iotop by the way)

The system disk is an SSD; the data disk is a hybrid 1 TB disk with an 8 GB cache.
Don't know if this is the right bug for this, but it looks like it.
Any idea how to tune/tweak the system to prevent these system freezes?
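One common workaround for the tuning question above is to pause indexing around latency-sensitive work. A hedged sketch; note that this sidesteps rather than fixes the I/O contention, and the iotop output above already shows the extractor running at idle I/O priority:

```shell
#!/bin/sh
# Pause Baloo before latency-sensitive work and resume it afterwards.
# Guarded so it is a no-op on systems without Baloo installed.
if command -v balooctl >/dev/null 2>&1; then
    balooctl suspend   # stop indexing for now
    # ... do the latency-sensitive work here ...
    balooctl resume    # indexing continues where it left off
fi
```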
Comment 72 Axel Braun 2018-10-25 12:36:06 UTC
PS: I have configured the search not to index file content. That's why the heavy I/O surprises me even more.
Comment 73 Stefan Brüns 2018-10-31 17:25:19 UTC
This bug report started with the KDE4 version which still used a Xapian DB.

Adding comments to it messes up any attempts to pinpoint current issues. Please open a new bug report if you have any persisting issues.
Comment 74 Axel Braun 2018-11-05 15:22:57 UTC
(In reply to Stefan Brüns from comment #73)
> This bug report started with the KDE4 version which still used a Xapian DB.
> 
> Adding comments to it messes up any attempts to pinpoint current issues.
> Please open a new bug report if you have any persisting issues.

Done: https://bugs.kde.org/show_bug.cgi?id=400704