Bug 282704

Summary: KDE 4.7 is very slow on ext3 root partition
Product: [I don't know] kde Reporter: Nicola Mori <nicolamori>
Component: general Assignee: Unassigned bugs mailing-list <unassigned-bugs>
Status: RESOLVED WORKSFORME    
Severity: normal CC: acerspyro, adawit, andrew.crouthamel, cfeck, ingo.gb, jtamate, linuzlover, nicolamori, zanetu, zanetu
Priority: NOR Keywords: triaged
Version: 4.7   
Target Milestone: ---   
Platform: Arch Linux   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:

Description Nicola Mori 2011-09-24 17:50:03 UTC
Version:           4.7 (using KDE 4.7.1) 
OS:                Linux

After upgrading to 4.7.0 I noticed a slowdown in my i686 installation. In particular, KDE startup is very slow, and opening lancelot for the first time after boot makes the system unresponsive for 5-10 seconds. I see similar "short freezes" with the shutdown dialog and some application launches. During these freezes the HDD is very busy, so I suspected some I/O problem, and I finally tracked it down to having an ext3 root partition. I did two fresh installs of Archlinux, one using ext3 as root fs and the other one using ext4: the first is problematic while the latter works perfectly.

Reproducible: Always

Steps to Reproduce:
Do a fresh install of Linux (I tried with Archlinux) using an ext3 root partition, then install KDE and lancelot. Launch KDE and click on lancelot launcher.

Actual Results:  
Startup is slow; the lancelot menu pops up very slowly at first launch, then pops up quickly on subsequent clicks.

Expected Results:  
Startup should be quick and lancelot should pop up quickly also the first time.

Bug report on Archlinux bugtracker:

https://bugs.archlinux.org/task/25585
Comment 1 Fabian Schwarz 2011-09-25 11:53:39 UTC
I'm experiencing the same problem. KDE 4.7 is much slower than 4.6.5, especially on startup!
I don't use lancelot, but I also have an ext3 partition.
Comment 2 ingo 2011-10-05 10:23:41 UTC
For me it is kswapd0 which eats up CPU (vanilla KDE on 64-bit with 2GB RAM, 3.0 kernel) on an ext4 partition. Switched to xfce and everything is fine. Tried to reduce swappiness but KDE remains unusable on that machine (works fine on 4GB, mind).
Comment 3 Nicola Mori 2011-10-05 10:27:31 UTC
@ingo I have 64 bit KDE on an Intel E6600 with 2 GB ram, NVidia 8800 GTS 320 and ext4 root partition. It works like a charm, while on the very same machine KDE on ext3 shows the problems I reported, even on a fresh installation.
Comment 4 Nicola Mori 2011-10-22 15:13:29 UTC
I found a similar bug report on Novell's bugzilla:

https://bugzilla.novell.com/show_bug.cgi?id=709265

No mention of ext3, but it seems that OpenSUSE users are also experiencing freezes and slowdowns with 4.7.
Comment 5 Nicola Mori 2011-10-22 17:45:31 UTC
Tried also with Kubuntu 11.10 i386, same bad behavior with ext3. The problem is especially visible with the lancelot launcher: boot into KDE, click on the lancelot launcher and compare the responsiveness on ext3 and ext4.
I really hope this bug report will sooner or later catch the attention of some KDE dev...
Comment 6 Jaime Torres 2011-11-04 12:40:43 UTC
Do you have /var/tmp in the root partition?
That is where most of the temporary data of the KDE session is stored.
I'm just guessing, but could this be the reason?

Possible workaround: mount /var/tmp as a tmpfs.

But let's try to isolate the problem:

within an open konsole, let's quit plasma and start it with strace.

>kquitapp plasma-desktop
(probably you must write this 3 or more times until there is no more plasma)
>strace -f -r -o plasma.strace.log -e trace=open,close,read,write,stat plasma-desktop
Wait some time, to let the strace calm down, then use the lancelot launcher
>kquitapp plasma-desktop
(to stop stracing plasma)
>plasma-desktop
(to work normally)
Doing this on both filesystems should show whether the file operations take longer on ext3 than on ext4, or whether there are a lot of unneeded file operations.
If the file plasma.strace.log is quite big, keep only the lancelot part.
Comment 7 Jaime Torres 2011-11-04 12:44:46 UTC
better strace option:
>strace -f -r -o plasma.strace.log -e trace=file plasma-desktop
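A log produced this way is large, so it helps to look only at the biggest time gaps. The following is a minimal sketch, not part of the original advice: `slowest_gaps` is a hypothetical helper name, and it assumes the usual `strace -r` layout where each line carries a `seconds.microseconds` relative timestamp, optionally preceded by a PID when `-f -o` is used. Because `-r` reports the time between the start of successive system calls, a large value on a line points at the time spent just before that call, so the line above it in the log is the one to inspect.

```shell
# Print the N largest relative timestamps from an "strace -r" log, largest first.
# Usage: slowest_gaps LOGFILE [N]      (hypothetical helper, N defaults to 10)
slowest_gaps() {
    awk '{
        # With -f -o each line may start with a PID, so the relative
        # timestamp is the first of the two leading fields that looks
        # like seconds.microseconds.
        for (i = 1; i <= 2 && i <= NF; i++)
            if ($i ~ /^[0-9]+\.[0-9]+$/) { print $i "\t" $0; break }
    }' "$1" | sort -rn | head -n "${2:-10}"
}
```

Running `slowest_gaps plasma.strace.log 20` on the ext3 and ext4 logs and comparing the two lists is one way to see whether the same calls are slow on both filesystems. With `-f`, lines from different processes are interleaved, so the gaps are only a rough guide.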
Comment 8 Nicola Mori 2011-11-05 21:32:04 UTC
Jaime, I tried with strace but I'm really confused, it generates tons of info. Anyway, from what I can see the logs do not differ as far as lancelot is concerned, so maybe the anomalous disk activity is generated by another process related to lancelot. I can also paste the logs, if you believe it is useful.

When killing and relaunching plasma-desktop I also monitored disk activity with iotop. For ext4 I had basically no disk activity when launching plasma-desktop for the second or third time, due to caching I think, and this sounds fair. But with ext3 I saw a lot of I/O done mainly by kjournald and flush-8:0. I think they are not KDE processes, but maybe some KDE process is triggering them too frequently when on ext3...
I hope this information means something to you, I'm completely puzzled...
Comment 9 Jaime Torres 2011-11-07 08:13:53 UTC
If they do not differ for lancelot, then it should be another problem.
I've been looking for disk sync() calls in kde code and I've seen none. Only fsync() calls, that are normal when you want to ensure all the data stored in the memory buffers is actually stored in the disk file (unless they are called in a loop, and I've seen none so far).

Let's try another approach then:
http://www.redhat.com/support/wpapers/redhat/ext3/

Let's change the data option with the three possibilities to see if it is really a problem with ext3 options:

in the /etc/fstab, in the line for the / partition, in the fourth column, add or change (the options are separated with ,):
* data=writeback
* data=ordered
* data=journal
Comment 10 Nicola Mori 2011-11-13 17:05:22 UTC
I think I'm doing something wrong. In /etc/fstab, I had "defaults" on column 4 for the / entry, so I changed it to "defaults,data=writeback". The result was that / was mounted read-only and I was unable to start X/KDE. So I tried with "noauto,rw,exec,data=writeback", and obtained the same read-only mount. Same behavior with "noauto,rw,exec,data=journal". The only combination that worked is "noauto,rw,exec,data=ordered", but it didn't help in resolving my problem with KDE.

I'm sorry, for sure I must be messing up something when modifying /etc/fstab. I don't know what I am doing wrong, but I can try again if you explain to me how to do it properly.
Comment 11 Nicola Mori 2011-11-13 17:13:47 UTC
One more thing about mount options: some time ago I tried to mount the ext3 root partition as ext4 by using the rootfstype kernel parameter and changing ext3 to ext4 in /etc/fstab, but it didn't help.
Comment 12 Nicola Mori 2011-11-13 17:35:12 UTC
I re-read Jaime's first post (#6), and I realized that he suggested to check /var/tmp and that I forgot to do it. I actually had /var/tmp in the root partition on all of my machines, so I tried to mount it as tmpfs on my ext3 machine. It worked! Now KDE startup is as fast as on ext4, lancelot lag has disappeared and the system is much more responsive. 
Still, there must be some problem with ext3 since with ext4 everything is smooth even with /var/tmp in root partition. But anyway it seems we have a workaround...
One more question: is there any potential problem in running KDE with /var/tmp mounted as tmpfs?
Comment 13 Jaime Torres 2011-11-13 18:37:08 UTC
>Is there any potential problem in running KDE with /var/tmp mounted as tmpfs?
The only "problem", I think, is that at every machine restart you'll lose any new MIME associations you made, and the icon cache will be recreated.

As your ext3 partition only mounts read-write with data=ordered, we can not test the other options.

If you do not want to keep the tmpfs for /var/tmp... there is another option for ext3 that could fix the problem (I forgot about it, and I'm using it). It is the noatime option, which makes Linux not update the last access time of a file every time it is read.
Comment 14 Nicola Mori 2011-11-13 21:56:56 UTC
I found this on Archlinux wiki:

https://wiki.archlinux.org/index.php/Fstab#tmpfs

It is explicitly advised to not use tmpfs for /var/tmp. I think I will need to search for some other solution... I will try noatime tomorrow...
Comment 15 Nicola Mori 2011-11-14 08:40:26 UTC
I reverted to having /var/tmp in the root partition, and tried to mount with "defaults,noatime". It does not help at all...
Comment 16 Jaime Torres 2011-11-14 18:30:07 UTC
@Nicola

In the strace logs, did you see lots of lines like:
0.000375 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2593, ...}) = 0

If so, I've found another solution (at least for me):
setting the environment variable TZ=":/etc/localtime" does "fix" this for me, and plasma and kdesc now feel faster than before.
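To judge whether this applies to a given log, the number of such lines can be counted rather than eyeballed. A minimal sketch, assuming the log was written by one of the strace commands above; `count_localtime_stats` is a hypothetical helper name, and the pattern only covers the `stat`, `lstat` and `stat64` spellings seen in this report (other variants such as `newfstatat` would need the pattern extended).

```shell
# Count stat-family calls on /etc/localtime in an strace log.
# Usage: count_localtime_stats LOGFILE      (hypothetical helper)
count_localtime_stats() {
    # Matches stat(...), lstat(...) and stat64(...) on exactly this path.
    grep -Ec 'stat(64)?\("/etc/localtime"' "$1"
}
```

Comparing the count from a session started without TZ set against one started with `TZ=":/etc/localtime"` exported should show whether the variable removes the repeated lookups on that system.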
Comment 17 Nicola Mori 2011-11-15 08:40:47 UTC
I found a lot of stat64("/etc/localtime" ...) lines, so I booted to a console login, set the TZ variable as you suggested and then started X. Same crappy behavior as before.
I tried again to stop plasma-desktop and strace it, and I noticed that plasma startup hangs when it outputs to console this line:

## Loading catalog liblancelot-datamodels 

After 5 seconds it moves on and finishes the startup. Maybe looking at what lancelot does when it loads its catalog would be of some help, what do you think?
BTW, am I right that you were able to reproduce my problem and then fixed it by setting the TZ variable, Jaime?
Comment 18 Jaime Torres 2011-11-15 08:58:24 UTC
Yes, I set in /etc/profile.local (in opensuse, may be different in other distributions)
TZ=":/etc/localtime"
export TZ
and everything feels faster.

I can not reproduce your lancelot problem, because now in 4.8 beta, it is as fast as the default launch applet (and I've not seen the line ## Loading catalog liblancelot-datamodels, and I have almost all debug output enabled)

Reducing the debug output could also speed up kdesc a little more: run kdebugdialog and disable all debug output.

In two days (new opensuse version) I'll change my root partition to ext4. I'll see then if kdesc is even faster than now.
Comment 19 Nicola Mori 2011-11-15 09:16:11 UTC
Lancelot is only one piece of the problem, maybe the most evident one but also startup and shutdown of KDE take ages. You say that you see no problem in 4.8 beta, let's hope that 4.8 will fix the problem also for me.
Comment 20 Nicola Mori 2011-11-19 16:09:06 UTC
I found the way to mount filesystem as journal or writeback. As explained here:

http://wiki.centos.org/HowTos/Disk_Optimization

"To use any other mode than 'data=ordered' on the root file system, you have to pass the mode to the kernel as a boot parameter, by adding it to the kernel command line: rootflags=data=writeback."

So I did it, and I was able to mount with both writeback and journal. But it did not help at all; I see the same behavior regardless of the setting.

But in doing this I found a very interesting thing. I set writeback, then rebooted and logged into console. I checked the mount by typing "cat /etc/mtab" and this is what I got for root partition:

/dev/disk/by-uuid/a6010950-6a72-4422-98a3-0e9f7e2a49f7 / ext3 rw,noatime,errors=continue,barrier=1,data=writeback

OK, it looked fine so I typed startx and started KDE. After KDE's start, I checked again /etc/mtab and I was really surprised to see this:

/dev/disk/by-uuid/a6010950-6a72-4422-98a3-0e9f7e2a49f7 / ext3 rw,noatime,data=writeback,commit=0 0 0

It seems that something in the KDE startup procedure changes the mount options. I cannot figure out how this can happen, nor whether it is some mistake made by me. But I tried many times and the outcome is always the same. I cannot even say whether this is the source of my problems or not, but it is quite weird in my opinion. I also checked that this does not happen when using ext4.
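A before/after comparison like the one above is easier to read when the options are split one per line. The following is a minimal sketch with a hypothetical helper name, `root_opts`. It reads a mounts-format file and defaults to /proc/mounts, which is the kernel's own view of the mount; that default is an assumption made here because /etc/mtab on systems of this era is a file maintained by the mount tools and may omit or rewrite options.

```shell
# Print the mount options of "/" from a mounts-format file, one per line, sorted.
# Usage: root_opts [FILE]      (hypothetical helper, FILE defaults to /proc/mounts)
root_opts() {
    # If "/" is listed more than once (e.g. rootfs plus the real device),
    # the last entry wins.
    awk '$2 == "/" { opts = $4 } END { print opts }' "${1:-/proc/mounts}" |
        tr ',' '\n' | sort
}

# Example session (commented out so that sourcing this file has no side effects):
#   root_opts > /tmp/opts.before      # on the console, before startx
#   root_opts > /tmp/opts.after       # inside the running KDE session
#   diff /tmp/opts.before /tmp/opts.after
```

Applied to the two /etc/mtab lines quoted above, this would list `commit=0` as added and `barrier=1` and `errors=continue` as no longer shown. Repeating the comparison against /proc/mounts would indicate whether the kernel's options really changed or only the mtab bookkeeping did.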
Comment 21 Nicola Mori 2011-11-19 16:33:59 UTC
The change of mount parameters also happens with KDE 4.6.5, so it is apparently the normal behavior. Maybe some parts of 4.7 don't like the modified options...
Comment 22 Jaime Torres 2011-11-19 20:14:47 UTC
I'm still on ext3. I need time to do a backup before taking such a big step.
This is what I have in /etc/mtab (using two active kde sessions)
/dev/sdb1 / ext3 rw,noatime,errors=continue,user_xattr,acl,commit=5,barrier=1,data=ordered 0 0
And this in /etc/fstab
/dev/disk/by-id/scsi-SATA_WDC_WD20EARS-00_WD-WMAZA3056259-part1 /                    ext3       noatime,acl,user_xattr        1 1
Comment 23 Sergei Andreev 2011-11-20 16:20:08 UTC
*** This bug has been confirmed by popular vote. ***
Comment 24 Jaime Torres 2011-11-21 08:28:02 UTC
I have a question...
Do you have to wait around 10 seconds after pressing "open file" until the open file dialog is shown?
I noticed it yesterday (for the first time, after a forced restart caused by a power outage), and I do not know whether it is related to this problem or not (no time to check further yet).
Comment 25 Nicola Mori 2011-11-21 09:28:32 UTC
Jaime, do you mean for example in okular? I have big lag problems like the one you described (more than a minute sometimes), but I think they're due to some misconfigured autofs+nfs on my laptop. At work, where the nfs share I mount with autofs is visible, I have no problem at all, but at home it is a pain.
Anyway, I have had this problem for a long time, since well before the 4.7 release.
Comment 26 Nicola Mori 2011-11-26 14:16:46 UTC
Problems persist with 4.8 beta1.
I found a sort of solution/workaround: I formatted a spare partition on my HDD as ext4 and mounted it as /var/tmp. Everything is smooth now, so the problem is not that / is ext3 but that /var/tmp is ext3.
I hope this information could help some dev...
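Anyone trying to confirm this on their own machine first needs to know which filesystem actually backs /var/tmp. A minimal sketch, assuming a `df` that supports the `-T` option (GNU coreutils or BusyBox; BSD `df` uses `-T` differently); `fs_type_of` is a hypothetical helper name.

```shell
# Print the filesystem type backing a path.
# Usage: fs_type_of [PATH]      (hypothetical helper, PATH defaults to /var/tmp)
fs_type_of() {
    # -P keeps each filesystem on a single line; -T adds the Type column,
    # which is the second field of the data row.
    df -PT "${1:-/var/tmp}" | awk 'NR == 2 { print $2 }'
}
```

If `fs_type_of /var/tmp` and `fs_type_of /` print the same type and the same device is shown by `df /var/tmp`, then /var/tmp is simply a directory on the root partition, as in the original report.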
Comment 27 zanetu 2012-07-05 09:09:15 UTC
Same problem with kde v4.8.4. Startup is rather slow compared to kde v4.5, which I used to work with.
Comment 28 zanetu 2012-07-05 09:12:16 UTC
(In reply to comment #8)
> Jaime, I tried with strace but I'm really confused, it generates tons of
> info. Anyway, for what I can see they do not differ for what concerns
> lancelot, so maybe the anomalous disk activity is generated by another
> process related to lancelot. I can also paste the logs, if you believe it is
> useful.
> 
> When killing and relaunching plasma-desktop I also monitored disk activity
> with iotop. For ext4 I had basically no disk activity when launching
> plasma-desktop for the second or third time, due to caching I think, and
> this sounds fair. But with ext3 I saw a lot of I/O done mainly by kjournald
> and flush-8:0. I think they are not KDE processes, but maybe some KDE
> process is triggering them too frequently when on ext3...
> I hope these informations mean something to you, I'm completely puzzled...

I also observed a lot of I/O done by kjournald and flush-8:0. Just FYI.
Comment 29 Jaime Torres 2012-07-06 07:11:08 UTC
Can you execute the following commands as root to see what kind of I/O is happening and which processes are causing it?

before you start the slow process (or KDE session)

echo 1 > /proc/sys/vm/block_dump

after it is running

dmesg | awk '/READ/ {sub(/\(.*\):/,"",$2); print $2}' | sort | uniq -c | sort -rn | head
dmesg | awk '/WRITE/ {sub(/\(.*\):/,"",$2); print $2}' | sort | uniq -c | sort -rn | head
dmesg | awk '/dirtied/ {sub(/\(.*\):/,"",$2); print $2}' | sort | uniq -c | sort -rn | head
dmesg | awk '/WRITE|READ|dirtied/ {sub(/\(.*\):/,"",$2); print $2}' | sort | uniq -c | sort -rn | head
echo 0 > /proc/sys/vm/block_dump

to know the processes to blame. 
NOTE: If you do not see program names, only numbers, replace $2 with $3.

seen at
http://balajitheone.blogspot.com.es/2011/04/io-wait-load-tracking-to-process.html

The main problem with this bug is that there are no tools (or none that I know of) to find where the I/O bottlenecks in a program are, something like callgrind but for I/O.
Therefore, we do not know (yet) whether this is caused by opening/closing a lot of files, syncing files, or ...
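The `$2` versus `$3` note exists because the width of the dmesg timestamp changes how awk splits the line. The following is a minimal sketch that removes the timestamp first, so the same command works in both cases. `io_by_process` is a hypothetical helper name, and the assumed line shape is `[  123.456789] name(pid): WRITE block ... on sda1`; this only applies on kernels that still provide /proc/sys/vm/block_dump.

```shell
# Count block_dump events per process name, reading dmesg output from stdin.
# Usage: dmesg | io_by_process 'READ'      (hypothetical helper)
# $1 is an awk regex selecting the event type, e.g. 'READ', 'WRITE', 'dirtied'.
io_by_process() {
    awk -v pat="$1" '
        $0 ~ pat {
            line = $0
            sub(/^\[ *[0-9.]+\] */, "", line)    # drop the "[  123.456]" timestamp, if present
            split(line, f, /[ \t]+/)
            name = f[1]
            sub(/\([0-9]+\):$/, "", name)        # "kjournald(123):" -> "kjournald"
            print name
        }' | sort | uniq -c | sort -rn | head
}
```

For example, `dmesg | io_by_process 'WRITE|READ|dirtied'` gives the combined ranking. Like the original commands, this keeps only the first word of the process name.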
Comment 30 zanetu 2012-07-06 09:49:45 UTC
(In reply to comment #29)
> Can you execute the followings commands as root to see what kind of I/O and
> what processes are causing it?
> 
> before you start the slow process (or KDE session)
> 
> echo 1 > /proc/sys/vm/block_dump
> 
> after it is running
> 
> dmesg | awk '/READ/ {sub(/\(.*\):/,"",$2); print $2}' | sort | uniq -c |
> sort -rn | head
> dmesg | awk '/WRITE/ {sub(/\(.*\):/,"",$2); print $2}' | sort | uniq -c |
> sort -rn | head
> dmesg | awk '/dirtied/ {sub(/\(.*\):/,"",$2); print $2}' | sort | uniq -c |
> sort -rn | head
> dmesg | awk '/WRITE|READ|dirtied/ {sub(/\(.*\):/,"",$2); print $2}' | sort |
> uniq -c | sort -rn | head
> echo 0 > /proc/sys/vm/block_dump
> 
> to know the processes to blame. 
> NOTE: If you not see program names, only numbers, replace $2 with $3.
> 
> seen at
> http://balajitheone.blogspot.com.es/2011/04/io-wait-load-tracking-to-process.html
> 
> The main problem with this bug is that there are no tools to know where are
> the I/O bottlenecks in a program (or I do not know any). (just like
> callgrind but for I/O)
> Therefore, we do not know (yet) if this is for opening/closing a lot of
> files, syncing files, or ...

# echo 1 > /proc/sys/vm/block_dump
# dmesg | awk '/READ/ {sub(/\(.*\):/,"",$3); print $3}' | sort | uniq -c | sort -rn | head
    152 chrome
    149 knotify4
    117 Chrome_FileThre
     64 Chrome_DBThread
     63 firefox-bin
     51 Chrome_CacheThr
     36 Chrome_SafeBrow
     19 PAC
      4 Chrome_IOThread
      1 Chrome_ProcessL
# dmesg | awk '/WRITE/ {sub(/\(.*\):/,"",$3); print $3}' | sort | uniq -c | sort -rn | head
    210 kjournald
    174 flush-8:0
     57 firefox-bin
      5 konsole
      3 rs:main
      1 chrome-sandbox
# dmesg | awk '/dirtied/ {sub(/\(.*\):/,"",$3); print $3}' | sort | uniq -c | sort -rn | head
    253 chrome-sandbox
     93 knotify4
     54 Chrome_FileThre
      7 Chrome_CacheThr
      5 firefox-bin
      1 Chrome_DBThread
# dmesg | awk '/WRITE|READ|dirtied/ {sub(/\(.*\):/,"",$3); print $3}' | sort | uniq -c | sort -rn | head
   1212 kjournald
    170 flush-8:0
    120 Chrome_SyncThre
    112 Chrome_FileThre
     33 Chrome_HistoryT
     25 Chrome_DBThread
     18 chrome
     12 firefox-bin
      5 Chrome_CacheThr
      2 WorkerPool/6761
# echo 0 > /proc/sys/vm/block_dump
Comment 31 Jaime Torres 2012-07-13 10:12:38 UTC
(In reply to comment #30)
knotify4 is doing a lot of reads, probably searching for sounds to play.
In systemsettings, disable the sound in the notifications, test and, please, tell us whether this makes any difference in speed.
If it does, then knotify4 needs a cache for the sound files it plays.
If not, we will continue searching for the I/O bottleneck.

Also, do not forget to take a look at:
http://userbase.kde.org/Tutorials
Comment 32 Zane Tu 2012-07-15 02:34:23 UTC
(In reply to comment #31)
> (In reply to comment #30)
> knotify4 is doing a lot of reads, probably searching for sounds to reproduce.
> in systemsettings, disable the sound in the notifications, test and, please,
> tell if this makes any  difference in speed.
> If it does, then knotify4 needs a cache for sound files to reproduce.
> If not, we will continue searching for the I/O bottleneck.
> 
> Also, do not forget to take a look at:
> http://userbase.kde.org/Tutorials

# echo 1 > /proc/sys/vm/block_dump
# dmesg | awk '/READ/ {sub(/\(.*\):/,"",$3); print $3}' | sort | uniq -c | sort -rn | head
    462 firefox-bin
    268 Chrome_FileThre
    202 dolphin
    171 chrome
     79 Chrome_DBThread
     45 Chrome_SafeBrow
      1 block
# dmesg | awk '/WRITE/ {sub(/\(.*\):/,"",$3); print $3}' | sort | uniq -c | sort -rn | head
    298 kjournald
     89 Chrome_FileThre
      3 konsole
      2 rs:main
      1 dmesg
# dmesg | awk '/dirtied/ {sub(/\(.*\):/,"",$3); print $3}' | sort | uniq -c | sort -rn | head
    287 chrome-sandbox
      9 Chrome_FileThre
      2 exe
      2 Chrome_SafeBrow
      2 Chrome_HistoryT
      2 Chrome_DBThread
# dmesg | awk '/WRITE|READ|dirtied/ {sub(/\(.*\):/,"",$3); print $3}' | sort | uniq -c | sort -rn | head
    317 kjournald
    309 Chrome_FileThre
    253 chrome
    187 Chrome_HistoryT
    148 firefox-bin
    125 Chrome_DBThread
    121 dolphin
    105 chrome-sandbox
    103 flush-8:0
      4 rs:main
# echo 0 > /proc/sys/vm/block_dump

The above output was obtained after I selected "No audio output" in "Player Settings" of "Manage Notifications". There was no noticeable change in startup speed. After thinking for a while, however, I suppose that KDE might not be the only one to blame. The reason is that I also upgraded chrome and firefox after upgrading to KDE 4.8. It is entirely possible that the upgraded versions of chrome and firefox require much more I/O than the previous versions. After I chose not to restore chrome and firefox at startup, but only the remaining processes such as dolphin, konsole and kate, the period between the time KDE was launched and the time the hard disk light went off was around 80 seconds, which is acceptable.
Comment 33 Dawit Alemayehu 2013-06-23 13:15:29 UTC
Is anyone still using ext3 for the /var/tmp partition? If so, is this still an issue in KDE 4.10?
Comment 34 Nicola Mori 2013-06-23 17:31:42 UTC
Sorry, I moved to an SSD and ext4 so I can't provide any information...
Comment 35 Maxim Therrien 2014-06-22 23:12:06 UTC
I am on Arch, using KDE 4.13.2, and am affected by this bug, too. All KDE apps take a long time to respond and hang a lot. Right-clicking in Dolphin takes a long time to respond, and so does starting Dolphin, opening a file, changing a property... Opening a link in Konversation always hangs it for a few seconds. As I said, the same happens to all KDE applications. I haven't found a way to debug this.
Comment 36 Christoph Feck 2014-07-21 16:19:08 UTC
Maxim, are you sure you are using ext3? What you see might be completely unrelated, and sounds like some blocking issue with dbus or kded. If you can reproduce such a hang, for example in Dolphin, please try to get a backtrace of it. See also https://community.kde.org/Dolphin/FAQ/Freeze
Comment 37 Andrew Crouthamel 2018-09-25 03:55:33 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least 15 days. Please provide the requested information as soon as possible and set the bug status as REPORTED. Due to regular bug tracker maintenance, if the bug is still in NEEDSINFO status with no change in 30 days, the bug will be closed as RESOLVED > WORKSFORME due to lack of needed information.

For more information about our bug triaging procedures please read the wiki located here: https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please set the bug status as REPORTED so that the KDE team knows that the bug is ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 38 Nicola Mori 2018-09-25 07:46:55 UTC
Well, while I appreciate that someone cares about this bug, I'd say that it's coming a bit late: 7 years after my original report, 4 years after the last post, and after many confirmations and much info posted by me and other users... Never mind, please close the report, I'm no longer affected and probably nobody still uses KDE4 and/or ext3.
Comment 39 Andrew Crouthamel 2018-09-25 13:46:56 UTC
Thank you for the update, I'll close this out.