Bug 261180

Summary: dbus-daemon runaway at 100% CPU
Product: [Frameworks and Libraries] kdelibs Reporter: Russell Adams <demo>
Component: kdeuiAssignee: Steven M. Parrish <smparrish>
Status: RESOLVED FIXED    
Severity: normal CC: billm, cfeck, dantti12, kde-bugs, kdebugreport, phong, wjones
Priority: NOR    
Version: 4.6   
Target Milestone: ---   
Platform: Ubuntu   
OS: Linux   
Latest Commit: Version Fixed In:
Attachments: [kdelibs] delete DBus object in KStatusNotifierItem

Description Russell Adams 2010-12-24 21:31:01 UTC
Version:           unspecified (using KDE 4.5.1) 
OS:                Linux

Frequent problems with dbus-daemon entering a runaway state, consuming 100% of one core for extended durations. KDE applications fail to open or react in a timely manner while this is happening. Screensaver never happens.

Haven't been able to identify a particular usage pattern. Using Xmonad as KDEWM, launching KDE apps and using KDE's screensaver. Typically takes several days to a week to reoccur.

Reproducible: Always

Steps to Reproduce:
Seems linked to login duration, running KDE apps (or dbus enabled apps) include: Firefox, Amarok, Pidgin, Network Manager. Common non-KDE apps include urxvt and emacs.

Actual Results:  
One core at sustained 100% cpu. dbus-daemon highest process in "top" output. Killing dbus-daemon clears CPU utilization, but then all KDE and dbus applications are flakey or fail to run until logout / login.

Expected Results:  
dbus-daemon should not runaway.

ps -ef output for dbus process:

rladams   2461     1 10 Dec19 ?        11:24:48 /bin/dbus-daemon --fork --print-pid 5 --print-address 7 --sessio

That 11:24 isn't process start time, its total CPU consumed...
Comment 1 Christoph Feck 2010-12-25 14:57:16 UTC
Do you see any other application running at high CPU usage while dbus-daemon consumes CPU? If it is only the daemon, it might be a D-Bus bug.
Comment 2 Russell Adams 2010-12-25 15:26:22 UTC
No, in fact I'm having the issue now, here's a paste from top:

top - 08:21:44 up 5 days, 11:01,  1 user,  load average: 2.57, 2.50, 2.60
Tasks: 335 total,   3 running, 331 sleeping,   0 stopped,   1 zombie
Cpu(s): 15.4%us, 11.8%sy,  0.0%ni, 72.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   3850548k total,  3766888k used,    83660k free,   314860k buffers
Swap:  3903484k total,   143300k used,  3760184k free,  1928040k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                           
 2461 rladams   20   0 55492  20m  616 R  100  0.5   1739:13 dbus-daemon                                        
 1635 root      20   0  276m  49m  18m S    4  1.3 131:04.94 Xorg                                               
13208 rladams   20   0  235m  48m  16m R    2  1.3   5:00.01 npviewer.bin                                       
 2709 rladams    9 -11  344m 6188 5048 S    1  0.2   3:40.21 pulseaudio                                         
17826 rladams   20   0  415m  25m 5256 S    1  0.7  31:03.51 sqliteman                                          
 2704 rladams   20   0  257m 8040 6080 S    1  0.2  73:52.91 gkrellm                                            
29170 rladams   20   0  467m  19m 9740 S    1  0.5   0:03.72 VirtualBox                                         
 1021 root      20   0     0    0    0 S    0  0.0  68:29.03 rtl819xSE/2                                        
 26

It may be a dbus bug, unfortunately I'm not sure where to begin troubleshooting. I've tried tapping tools like dbus-monitor, but they don't show any activity. It seems to be exclusively KDE applications affected by this, so I can't tell if dbus has an issue or a KDE app is spamming it. I'm open to suggestions.

I have noticed that sometimes if I close a few KDE apps that use dbus, it will stop briefly. When I try to reopen them it starts again, and always fewer applications until I'm forced to logout / login to clear the issue. I've only been up 5 days at the moment, logout is a major distraction...

Thanks.
Comment 3 Christoph Feck 2010-12-25 16:45:22 UTC
> It seems to be exclusively KDE applications affected

Sure, because KDE applications use D-Bus a lot. So if the daemon (which is reponsible to transfer messages cross applications or from the system to applications) hangs, KDE applications lock up, too.

The other entries I am seeing in your "top" output are not KDE applications, so I would suggest your report this bug to https://bugs.freedesktop.org/

Of course, a log of D-Bus activity would be helpful, but the people there can be more helpful to explain how to get such.
Comment 4 Warren Jones 2011-02-16 00:14:25 UTC
I've observed the same problem with KDE 4.5.95-3.1 (openSUSE 11.4 milestone
6).  I monitored dbus-daemon with strace, and found this sequence repeated over
and over:

------------------------ strace.out ------------------------
clock_gettime(CLOCK_MONOTONIC, {503730, 307255985}) = 0
poll([{fd=3, events=POLLIN}, {fd=5, events=POLLIN}, {fd=6, events=POLLIN},
{fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=11, events=POLLIN}, {fd=12,
events=POLLIN}, {fd=15, events=POLLIN}, {fd=14, events=POLLIN}, {fd=16,
events=POLLIN}, {fd=13, events=POLLIN}, {fd=17, events=POLLIN}, {fd=19,
events=POLLIN}, {fd=21, events=POLLIN}, {fd=18, events=POLLIN}, {fd=22,
events=POLLIN}, {fd=20, events=POLLIN}, {fd=24, events=POLLIN}, {fd=25,
events=POLLIN}, {fd=26, events=POLLIN}, {fd=27, events=POLLIN}, {fd=28,
events=POLLIN}, {fd=29, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31,
events=POLLIN}, {fd=32, events=POLLIN}, {fd=33, events=POLLIN}, {fd=34,
events=POLLIN}, {fd=35, events=POLLIN}, {fd=37, events=POLLIN}, {fd=38,
events=POLLIN}, {fd=39, events=POLLIN}, ...], 1019, -1) = 1 ([{fd=3,
revents=POLLIN}])
clock_gettime(CLOCK_MONOTONIC, {503730, 310057000}) = 0
accept4(3, 0xbfaacfec, [16], SOCK_CLOEXEC) = -1 EMFILE (Too many open files)
fcntl64(-1, F_GETFD)                    = -1 EBADF (Bad file descriptor)
------------------------ strace.out ------------------------

Note the error "EMFILE (Too many open files)".  I found that dbus-daemon had
1024 files open:

    eg> ls /proc/$DBUS_PID/fd | wc -l
    1024

It looks like dbus-daemon has hit the limit for open files:

    eg> ulimit -n
    1024

This looks like a report of the same problem:

    https://bugs.launchpad.net/ubuntu/+source/dbus/+bug/381063

This report includes a suggested work-around:  increasing the limit on open
files.  This seems like a questionable idea, since dbus doesn't normally come
anywhere close to the limit.

I looked for other applications that had a lot of open files open, and found
that "kpackagekitsmarticon" had 976.  This seems to be an applet that sits in
the system tray at tells you if software updates are available.  I killed the
applet, and found that dbus-daemon returned to normal.

I'm not sure if the problem is with dbus-daemon, kpackagekitsmarticon, or some
interaction between the two.  Perhaps other applications can open too many dbus
connections.  For the time being, my workaround is to uninstall kpackagekit.

Found another report that confirms a problem with kpackagekitsmarticon here:

    https://bugzilla.redhat.com/show_bug.cgi?id=667787
Comment 5 Warren Jones 2011-02-18 00:53:18 UTC
After running more or less normally for 24 hours, kpackagekitsmarticon starts opening a new socket every six minutes.  At that rate, it will take about four days to reach the ulimit of 1024 open file descriptors (and so will dbus-daemon, which is connected to each of those sockets).
Comment 6 Christoph Feck 2011-02-18 01:06:19 UTC
Thanks for the information. Hopefully the KPackageKit maintainers can reproduce the problem and find a fix.
Comment 7 Warren Jones 2011-02-18 01:19:43 UTC
The bug in kpackagekitsmarticon has been reported here:

     https://bugs.kde.org/show_bug.cgi?id=266213

It's clear that the primary bug is in kpackagekit, but would it be reasonable for dbus-daemon to limit the number of connections it allows from any one process?  Or perhaps dbus-daemon could selectively close connections when it gets starts getting EMFILE errors?
Comment 8 Christoph Feck 2011-02-18 01:31:19 UTC
Sure, please report that to the D-Bus developers. See comment #3 for the link.
Comment 9 michel kollenhoven 2011-04-22 16:25:07 UTC
hi i want to say 3 things: 
1. ill report for dbus
2. this bug is now verified(only change name to kpackagekitsmarticon open too many files
3. i use this line to find the prosess that have manny open files
ls /proc/| awk '{if($1+0==0) print "  "; else system("echo `ls /proc/"$1+0"/fd|wc -l` "$1" `cat /proc/"$1+0"/cmdline` ")}' | sort -nr | head
top has most open files, 
format files | pid | cmdline
Comment 10 Daniel Nicoletti 2011-05-27 13:58:34 UTC
After digging a bit on this the cause was actually easy to find, KStatusNotifierItem does not properly releases it's connection to DBus.
When I comment the code that creates a KStatusNotifierItem and then deletes it I see that the socket list keeps growing one for each KStatusNotifierItem. When I remove the code that creates a KStatusNotifierItem the FDs don't grow anymore.

I'm reassigning the bug to kdelibs.
Comment 11 Christoph Feck 2011-05-27 16:13:47 UTC
Created attachment 60386 [details]
[kdelibs] delete DBus object in KStatusNotifierItem

Thanks Daniel for the investigation. Can you test this patch for kdelibs?
Comment 12 Daniel Nicoletti 2011-05-27 20:57:32 UTC
(In reply to comment #11)
> Created an attachment (id=60386) [details]
> [kdelibs] delete DBus object in KStatusNotifierItem
> 
> Thanks Daniel for the investigation. Can you test this patch for kdelibs?

No problem, I'll try to test this next week (if I find some time :P ) as I don't have a trunk setup for testing... But probably that patch should fix the issue.
Comment 13 Aaron J. Seigo 2011-06-20 12:07:02 UTC
Git commit d6cc53cd80e326bfea9382b475977e0184a4223b by Aaron Seigo.
Committed on 20/06/2011 at 12:05.
Pushed by aseigo into branch 'master'.

close the dbus connection properly

BUG:261180
REVIEW:101527

M  +2    -0    kdeui/notifications/kstatusnotifieritemdbus_p.cpp     

http://commits.kde.org/kdelibs/d6cc53cd80e326bfea9382b475977e0184a4223b
Comment 14 Daniel Nicoletti 2011-06-20 20:33:40 UTC
*** Bug 266213 has been marked as a duplicate of this bug. ***
Comment 15 Christoph Feck 2011-08-04 21:42:23 UTC
*** Bug 279377 has been marked as a duplicate of this bug. ***