Bug 394360 - kded5 hangs in QProcess::waitForFinished and does not reap setxkbmap
Summary: kded5 hangs in QProcess::waitForFinished and does not reap setxkbmap
Status: REPORTED
Alias: None
Product: frameworks-kded
Classification: Frameworks and Libraries
Component: general (show other bugs)
Version: 5.45.0
Platform: openSUSE Linux
: NOR normal
Target Milestone: ---
Assignee: David Faure
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-17 08:35 UTC by Jiri Slaby
Modified: 2021-08-24 21:40 UTC (History)
4 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments
strace -f of kded5 (packed by xz due to size) (351.98 KB, text/plain)
2018-05-17 12:14 UTC, Jiri Slaby
Details
strace -f of kded5 (packed by xz due to size) (309.04 KB, text/plain)
2018-08-06 11:31 UTC, Jiri Slaby
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Jiri Slaby 2018-05-17 08:35:39 UTC
Sometime, when I reconnect my notebook back to docking station, kded5 daemon hangs. I can no longer use global key shortcuts etc.

The stack trace of the thread is as follows:
> #0  0x00007f09368ce276 in __GI_ppoll (fds=fds@entry=0x7ffcd2d514d0, nfds=nfds@entry=4, timeout=<optimized out>, timeout@entry=0x0, sigmask=sigmask@entry=0x0) at ../sysdeps/unix/sysv/linux/ppoll.c:39
> #1  0x00007f0936e7f9f9 in qt_ppoll (timeout_ts=0x0, nfds=4, fds=0x7ffcd2d514d0) at kernel/qcore_unix.cpp:112
> #2  qt_safe_poll (fds=fds@entry=0x7ffcd2d514d0, nfds=4, timeout_ts=<optimized out>) at kernel/qcore_unix.cpp:133
> #3  0x00007f0936db78ab in qt_poll_msecs (timeout=<optimized out>, nfds=<optimized out>, fds=0x7ffcd2d514d0) at ../../include/QtCore/5.10.0/QtCore/private/../../../../../src/corelib/kernel/qcore_unix_p.h:373
> #4  (anonymous namespace)::QProcessPoller::poll (timeout=<optimized out>, this=0x7ffcd2d514d0) at io/qprocess_unix.cpp:189
> #5  QProcessPrivate::waitForFinished (this=this@entry=0x55fae58b8e40, msecs=msecs@entry=-1) at io/qprocess_unix.cpp:841
> #6  0x00007f0936dad76d in QProcess::waitForFinished (this=this@entry=0x7ffcd2d51610, msecs=msecs@entry=-1) at io/qprocess.cpp:1916
> #7  0x00007f0936581b9a in KProcess::execute (this=0x7ffcd2d51610, msecs=-1) at /usr/src/debug/kcoreaddons-5.45.0-1.1.x86_64/src/lib/io/kprocess.cpp:268
> #8  0x00007f090a8ad8d4 in ?? () from /usr/lib64/qt5/plugins/kf5/kded/keyboard.so
> #9  0x00007f090a8aefac in ?? () from /usr/lib64/qt5/plugins/kf5/kded/keyboard.so
> #10 0x00007f090a896763 in KeyboardDaemon::configureKeyboard() () from /usr/lib64/qt5/plugins/kf5/kded/keyboard.so
> #11 0x00007f0936e5a36c in QtPrivate::QSlotObjectBase::call (a=0x7ffcd2d51840, r=0x55fae56981d0, this=0x55fae598b940) at ../../include/QtCore/../../src/corelib/kernel/qobjectdefs_impl.h:378
> #12 QMetaObject::activate (sender=0x55fae5987830, signalOffset=<optimized out>, local_signal_index=<optimized out>, argv=<optimized out>) at kernel/qobject.cpp:3749
> #13 0x00007f090a8ad590 in ?? () from /usr/lib64/qt5/plugins/kf5/kded/keyboard.so
> #14 0x00007f090a8a9a90 in ?? () from /usr/lib64/qt5/plugins/kf5/kded/keyboard.so
> #15 0x00007f0936e294bf in QAbstractEventDispatcher::filterNativeEvent (this=<optimized out>, eventType=..., message=0x55fae553c200, result=0x7ffcd2d51938) at kernel/qabstracteventdispatcher.cpp:466
> #16 0x00007f09273e6b60 in QXcbConnection::handleXcbEvent(xcb_generic_event_t*) () from /usr/lib64/libQt5XcbQpa.so.5
> #17 0x00007f09273e779c in QXcbConnection::processXcbEvents() () from /usr/lib64/libQt5XcbQpa.so.5
> #18 0x00007f0936e5ac52 in QObject::event (this=0x55fae5359c40, e=<optimized out>) at kernel/qobject.cpp:1246
> #19 0x00007f09354e401c in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /usr/lib64/libQt5Widgets.so.5
> #20 0x00007f09354eb314 in QApplication::notify(QObject*, QEvent*) () from /usr/lib64/libQt5Widgets.so.5
> #21 0x00007f0936e2c148 in QCoreApplication::notifyInternal2 (receiver=0x55fae5359c40, event=event@entry=0x7f091c003e00) at kernel/qcoreapplication.cpp:1044
> #22 0x00007f0936e2eb25 in QCoreApplication::sendEvent (event=0x7f091c003e00, receiver=<optimized out>) at kernel/qcoreapplication.h:234
> #23 QCoreApplicationPrivate::sendPostedEvents (receiver=receiver@entry=0x0, event_type=event_type@entry=0, data=0x55fae52dcee0) at kernel/qcoreapplication.cpp:1719
> #24 0x00007f0936e2f0a8 in QCoreApplication::sendPostedEvents (receiver=receiver@entry=0x0, event_type=event_type@entry=0) at kernel/qcoreapplication.cpp:1573
> #25 0x00007f0936e84393 in postEventSourceDispatch (s=0x55fae54dab10) at kernel/qeventdispatcher_glib.cpp:276
> #26 0x00007f0932dab277 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
> #27 0x00007f0932dab4b0 in ?? () from /usr/lib64/libglib-2.0.so.0
> #28 0x00007f0932dab53c in g_main_context_iteration () from /usr/lib64/libglib-2.0.so.0
> #29 0x00007f0936e83a0f in QEventDispatcherGlib::processEvents (this=0x55fae53acd40, flags=...) at kernel/qeventdispatcher_glib.cpp:423
> #30 0x00007f0927466781 in ?? () from /usr/lib64/libQt5XcbQpa.so.5
> #31 0x00007f0936e2a95a in QEventLoop::exec (this=this@entry=0x7ffcd2d51f20, flags=..., flags@entry=...) at kernel/qeventloop.cpp:212
> #32 0x00007f0936e33494 in QCoreApplication::exec () at kernel/qcoreapplication.cpp:1332
> #33 0x00007f0927914bf8 in kdemain () from /usr/lib64/libkdeinit5_kded5.so
> #34 0x000055fae46acacf in ?? ()
> #35 0x000055fae46a9cc3 in ?? ()
> #36 0x00007f0936800a87 in __libc_start_main (main=0x55fae46a8d20, argc=5, argv=0x7ffcd2d52498, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffcd2d52488) at ../csu/libc-start.c:308
> #37 0x000055fae46aa24a in _start ()

The pid of the KProcess is:
(gdb) p this->pid()
$7 = 13989

which is:
> xslaby   13989  0.0  0.0      0     0 ?        Z    kvě16   0:00  |   \_ [setxkbmap] <defunct>

Well, so KProcess/QProcess waits for setxkbmap to finish which is already a zombie, i.e. died a long time ago.

Bug in QProcess?
Comment 1 Fabian Vogt 2018-05-17 09:06:23 UTC
How well can you reproduce the issue? Can you try to strace -f kded5 while it happens?
Comment 2 Jiri Slaby 2018-05-17 10:56:53 UTC
(In reply to Fabian Vogt from comment #1)
> How well can you reproduce the issue? Can you try to strace -f kded5 while
> it happens?

Not really well. Could you tell me what event triggers the stack trace?

BTW I have just updated to 5.12.5 because I could not find -debuginfo for plasma5-desktop-5.12.4 from Tumbleweed anymore. When this happens again, I will have better stack trace (#8 and #9 filled in the trace too).
Comment 3 Fabian Vogt 2018-05-17 11:06:23 UTC
 (In reply to Jiri Slaby from comment #2)
> (In reply to Fabian Vogt from comment #1)
> > How well can you reproduce the issue? Can you try to strace -f kded5 while
> > it happens?
> 
> Not really well. Could you tell me what event triggers the stack trace?

connect(xEventNotifier, &XInputEventNotifier::newKeyboardDevice, this, &KeyboardDaemon::configureKeyboard);

i.e. plugging in a keyboard.

> BTW I have just updated to 5.12.5 because I could not find -debuginfo for
> plasma5-desktop-5.12.4 from Tumbleweed anymore. When this happens again, I
> will have better stack trace (#8 and #9 filled in the trace too).

I don't think a more detailed stack trace can help here - an strace would be ideal.
Comment 4 Jiri Slaby 2018-05-17 12:14:13 UTC
Created attachment 112709 [details]
strace -f of kded5 (packed by xz due to size)

I cannot reproduce easily. It happened only once during unplug/replug and not in this strace run.
Comment 5 Jiri Slaby 2018-08-06 11:31:11 UTC
Created attachment 114320 [details]
strace -f of kded5 (packed by xz due to size)

Now I reproduced even with strace capturing the process. (It still recurs and is painful.)

13987 is kdeinit.

16233 is the pid of the setxkbmap process which became a zombie.

16213 is the last process which kdeinit called wait4 for. It is another instance of setxkbmap:
> 16213 exit_group(0)                     = ? 
> 16213 +++ exited with 0 +++
> 13987 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=16213, si_uid=500, si_status=0, si_utime=0, si_stime=1} ---
> 13987 waitid(P_ALL, 0, {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=16213, si_uid=500, si_status=0, si_utime=0, si_stime=0}, WNOHANG|WEXITED|WNOWAIT, NULL) = 0
> 13987 wait4(16213,  <unfinished ...>
> 13987 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], WNOHANG, {ru_utime={tv_sec=0, tv_usec=2524}, ru_stime={tv_sec=0, tv_usec=10098}, ...}) = 16213

setxkbmap is not the only unreaped process. I also saw zombie of xterm pinned to kdeinit (opened via ctrl-alt-t which is a custom shortcut to run xterm).
Comment 6 Fabian Vogt 2018-08-06 13:28:37 UTC
That strace output is very useful:

16233 exit_group(0)                     = ?
16233 +++ exited with 0 +++
13987 <... ppoll resumed> )             = ? ERESTARTNOHAND (To be restarted if no handler)
13987 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=16233, si_uid=500, si_status=0, si_utime=0, si_stime=0} ---
13987 ppoll([{fd=-1}, {fd=-1}, {fd=-1}, {fd=46, events=POLLIN}], 4, NULL, NULL, 8 <unfinished ...>
13990 <... poll resumed> )              = 1 ([{fd=3, revents=POLLIN}])

If I interpret this correctly, kdeinit was currently doing a ppoll (no sigmask, so == poll) and it had no handler, as otherwise it would've done one of the calls in the handler and not continue with ppoll.

This seems to explain it:

13987 rt_sigaction(SIGCHLD, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f0242406110}, NULL, 8) = 0

It's the last call to rt_sigaction in that process, so it removed its signal handler, so it's no surprise it stays a zombie...

Now the question is where the rt_sigaction call comes from. It might be somewhere in forkfd.
Comment 7 Fabian Vogt 2018-08-06 13:43:24 UTC
Please try to run kded5 without the glib event loop:

kquitapp5 kded5; QT_NO_GLIB=1 kded5

If that works fine, it's probably a bug in either glib itself or in its integration.
Comment 8 Jiri Slaby 2018-08-13 08:10:16 UTC
(In reply to Fabian Vogt from comment #7)
> Please try to run kded5 without the glib event loop:
> 
> kquitapp5 kded5; QT_NO_GLIB=1 kded5
> 
> If that works fine, it's probably a bug in either glib itself or in its
> integration.

I don't want to be too conclusive now, but after quite few plugs/unplugs of such kded5:
# grep -z GLIB /proc/`pidof kded5`/environ; echo
QT_NO_GLIB=1

I haven't seen the issue yet.

But to me it seems that it is enough to just re-run kded5 after power on:
kquitapp5 kded5; kded5

I haven't seen the issue with such a restart too. But I only tried few replugs.
Comment 9 Michael Pyne 2018-08-13 08:24:41 UTC
If glib integration is an issue, there's a separate Qt bug we're tracking where timer events can be starved in the glib event loop which may be related, at https://bugs.kde.org/show_bug.cgi?id=230184
Comment 10 Fabian Vogt 2018-08-13 08:27:08 UTC
(In reply to Michael Pyne from comment #9)
> If glib integration is an issue, there's a separate Qt bug we're tracking
> where timer events can be starved in the glib event loop which may be
> related, at https://bugs.kde.org/show_bug.cgi?id=230184

That's unrelated AFAICT - this issue is about glib removing the SIGCHLD handler.
Comment 11 Fabian Vogt 2018-08-13 08:36:56 UTC
Reported upstream as https://bugreports.qt.io/browse/QTBUG-69907
Comment 12 Fabian Vogt 2018-08-13 11:47:35 UTC
Please re-do the strace with "-e trace=rt_sigaction -k" so that we can see which call chain leads to the glib issue. It might be a bug in Qt, glib, kded or any of the currently loaded modules.
Comment 13 Jiri Slaby 2018-08-21 11:05:40 UTC
(In reply to Jiri Slaby from comment #8)
> I don't want to be too conclusive now, but after quite few plugs/unplugs of
> such kded5:
> # grep -z GLIB /proc/`pidof kded5`/environ; echo
> QT_NO_GLIB=1
> 
> I haven't seen the issue yet.
> 
> But to me it seems that it is enough to just re-run kded5 after power on:
> kquitapp5 kded5; kded5

Correct – it is enough to re-run kded5 like this and kded5 no longer spawns children. So neither setxkbmap is exec'ed nor other processes. So unless you have an idea how to run a new (strace'd) instance kded5 which would behave as the one run by the session, I cannot provide the requested info.
Comment 14 Ahmad Samir 2021-08-24 21:40:59 UTC
Might be related to https://mail.kde.org/pipermail/kde-frameworks-devel/2021-January/115829.html