Bug 354250

Summary: "Switch User" sometimes sends computer to Standby
Product: [Plasma] Powerdevil Reporter: Gregor Mi <codestruct>
Component: generalAssignee: Plasma Development Mailing List <plasma-devel>
Status: RESOLVED FIXED    
Severity: normal CC: akarsh.simha, bhush94, kde, koesterreich, mgraesslin, nate, notuxius, oliver.henshaw, plasma-bugs, rdieter
Priority: NOR    
Version: 5.4.2   
Target Milestone: ---   
Platform: openSUSE   
OS: Linux   
Latest Commit: Version Fixed In: 5.5.0
Sentry Crash Report:
Attachments: kded5 log
stacktrace showing that main thread of other user's kded5 is blocked in xcb/xlib
gstack log
extra logging to help understand the problem

Description Gregor Mi 2015-10-23 12:47:16 UTC
This is not always reproducible but these are the steps which I do when it happens:

(- Choose Leave -> Switch user)
- Click on "New session" in krunner
- Confirm the appearing message box
- Computer goes to Standby.

When I wake up the computer by pressing the powerkey, SDDM appears as if nothing had happened. (Though, the list of users is scrolled to the right so that only two of my five users are visible. But that's another issue.)
Comment 1 David Edmundson 2015-10-24 11:40:58 UTC
which distro?
Comment 2 Gregor Mi 2015-10-24 19:31:08 UTC
openSUSE 13.2
Comment 3 Gregor Mi 2015-10-24 19:31:50 UTC
.
Comment 4 Fabian Köster 2015-11-23 13:05:25 UTC
Having a similar problem here on Gentoo Linux with Plasma 5.4 and Plasma 5.5 Beta.

1.) Boot computer into SDDM
2.) Login with user A
3.) From the running Plasma session start a new session for user B
4.) Work more than 10 minutes in user B's session
5.) Logout user B or switch back to user A's session
=> The system suspends

Obviously the problem here is that the activity in user B's session is not accounted as activity in user A's session and therefore after returning to the session which has been inactive for more than 10 minutes (my configured time before standby), the system enters sleep mode.

Proposed solution: When switching sessions, reset inactivity counter. But I am not sure if is worth to do try to fix this for X-based sessions, maybe this problem does not occur on Wayland-sessions anymore? So maybe we just need to be patient as this is not really a huge problem.
Comment 5 Kai Uwe Broulik 2015-11-23 13:44:22 UTC
Seen that too. Looks like PowerDevil resumes and notices the session has, of course, been idle for quite a while and sends it to sleep.
Comment 6 Kai Uwe Broulik 2015-11-23 14:12:33 UTC
Git commit 3c41d4c3eca34c54e0d6a2b8196f9d8e3dd8dc03 by Kai Uwe Broulik.
Committed on 23/11/2015 at 14:10.
Pushed by broulik into branch 'Plasma/5.5'.

Simulate user activity when session becomes active

Sessions usually don't switch themselves. This fixes the computer suspending after
switching to a session that has been idle for a while.
FIXED-IN: 5.5.0

M  +5    -0    daemon/powerdevilpolicyagent.cpp

http://commits.kde.org/powerdevil/3c41d4c3eca34c54e0d6a2b8196f9d8e3dd8dc03
Comment 7 Fabian Köster 2015-11-24 08:10:32 UTC
Nice, thanks!
Comment 8 Fabian Köster 2015-12-10 09:44:55 UTC
The problem still occurs with Plasma 5.5.0 (and powerdevil 5.5.0) on Gentoo. Maybe the fix was not sufficient?
Comment 9 Kai Uwe Broulik 2015-12-20 11:52:02 UTC
*** Bug 356941 has been marked as a duplicate of this bug. ***
Comment 10 Fabian Köster 2015-12-20 11:59:02 UTC
Same with 5.5.1
Comment 11 Gregor Mi 2015-12-29 19:47:41 UTC
Reproduced on 5.5.2
Comment 12 Fabian Köster 2016-06-28 07:38:59 UTC
Still happens in 5.7 Beta (5.6.95)
Comment 13 Kai Uwe Broulik 2016-06-28 11:41:42 UTC
If you can, try https://phabricator.kde.org/D2033 as I don't switch sessions that often :)
Comment 14 Fabian Köster 2016-06-28 17:41:41 UTC
(In reply to Kai Uwe Broulik from comment #13)
> If you can, try https://phabricator.kde.org/D2033 as I don't switch sessions
> that often :)

I applied your patch and tested again, but the issue still happens :-(
Comment 15 Kai Uwe Broulik 2016-06-28 18:06:53 UTC
Meh. I cannot reproduce anymore here. All I get is a bunch of "Idle timeout reached but we're not the active session, not triggering actions" but no suspend/screen dimming.

Can you run kded5 from console on the other session and switch to it after a while? Maybe debug output there can give us a clue what's going on. Look for "powerdevil: Current session is now (in)active" and what's happening around there. Do you have logind?
Comment 16 Fabian Köster 2016-06-28 18:27:47 UTC
(In reply to Kai Uwe Broulik from comment #15)
> Can you run kded5 from console on the other session and switch to it after a
> while? Maybe debug output there can give us a clue what's going on. Look for
> "powerdevil: Current session is now (in)active" and what's happening around
> there.

OK, I will do that and report!

> Do you have logind?

Yes, I am using systemd with logind.

loginctl --version
systemd 226
+PAM -AUDIT -SELINUX +IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP -GCRYPT -GNUTLS +ACL -XZ +LZ4 +SECCOMP +BLKID -ELFUTILS +KMOD +IDN
Comment 17 Fabian Köster 2016-06-28 18:46:25 UTC
Created attachment 99750 [details]
kded5 log

So, thats the requested log of kded5. Hope it helps!
Comment 18 Oliver Henshaw 2016-06-28 19:00:40 UTC
Created attachment 99751 [details]
stacktrace showing that main thread of other user's kded5 is blocked in xcb/xlib

Can you identify the kded5 process PID belonging to the user you're about to switch to with
$ ps aux | grep kded5

Then do
$ pstack PID
(as root or in logged in as the other user via your terminal) and paste the results here?

On Fedora 22 I get the attached result. It looks like powerdevil gets stuck in xcb while waiting for a reply from an Xlib call (one that should return straight away). And I see QXcbConnection::processXcbEvents further up the stack. Maybe there's a problem with nesting the xlib-on-xcb eventloop inside an xcb event?

Whatever's happening, it seems to get unstuck when you actually switch to the other user and all the kidletime events come flooding in. I also see floods of other events in kded, e.g. the software update applet spams me with a load of identical messages telling me how many packages are due for update.

If this is indeed the problem, completing the port of kidletime to xcb looks like part of the long-term solution. I think there were problems when that was tried before but hopefully everything that is called from the xcb event can be ported.
Comment 19 Kai Uwe Broulik 2016-06-28 19:32:53 UTC
@Fabian: Thanks a lot. It seems PowerDevil never actually notices that its no longer the current session, ie. above the gap you can see "Session is now active", and a while down you see it triggering suspend and only thereafter it says again it's now active.

@Oliver: Thanks for the investigation. Indeed, this looks like the cause for the issue. When PowerDevil gets stuck because X blocks it never notices it's no longer in the active session and shit hits the fan.

CC'ing Martin as maintainer of KIdleTime.
Comment 20 Fabian Köster 2016-06-28 19:51:33 UTC
Created attachment 99752 [details]
gstack log

pstack was not available for my platform so I used gstack. Hope it is useful nevertheless.
Comment 21 Oliver Henshaw 2016-06-28 20:04:31 UTC
> @Oliver: Thanks for the investigation. Indeed, this looks like the cause for
> the issue. When PowerDevil gets stuck because X blocks it never notices it's
> no longer in the active session and shit hits the fan.
> 
Actually if you set QT_MESSAGE_PATTERN so you see timestamps, e.g. following https://woboq.com/blog/nice-debug-output-with-qt.html , and compare it to your watch: then you see that the suspend session calls only happen after you've switched to the other user. The "Current session is now active" lines only happen later because (a) that's how the timing of dbus-vs-xcb events works out, and (b) powerdevil does a roundtrip back to the logind dbus service to see what the currently active session is.

So it gets events fired when it wasn't active, but not until it's active again. Then the policy code checks whether it's active and of course it is, so the suspend goes ahead. Then the active-tracking parts of powerdevil realise it's active just after it resumes.
Comment 22 Kai Uwe Broulik 2016-06-28 20:14:44 UTC
Thanks. 
What I get looks ok, safe for the idle event mid-air:

2016-06-28T22:09:55 ACTIVE SESSION PATH CHANGED: "/org/freedesktop/login1/session/_32"
2016-06-28T22:09:55 Current session is now inactive
2016-06-28T22:10:25 Idle timeout reached but we're not the active session, not triggering actions
2016-06-28T22:12:24 Created alarm 18874377
2016-06-28T22:12:24 Idle timeout reached but we're not the active session, not triggering actions
[...]
2016-06-28T22:12:25 ACTIVE SESSION PATH CHANGED: "/org/freedesktop/login1/session/_320"
2016-06-28T22:12:25 Current session is now active

However indeed it should probably unregister all idle events when the session becomes inactive.
Comment 23 Kai Uwe Broulik 2016-06-28 20:38:45 UTC
I just updated the patch at [1] to clear idle timeouts when the session becomes inactive.

[1] https://phabricator.kde.org/D2033
Comment 24 Martin Flöser 2016-06-29 05:50:16 UTC
From comment @18 it looks like X freezes while being on the inactive session. As Fedora is mentioned I assume it's an X not run as root? If X freezes all xcb requests are going to block. From client side there is nothing which can be done about that. There is no timeout and no way to cancel.

I would be interested in seeing a backtrace of X at that time as I assume it's also frozen.
Comment 25 Fabian Köster 2016-06-29 06:03:41 UTC
(In reply to Kai Uwe Broulik from comment #23)
> I just updated the patch at [1] to clear idle timeouts when the session
> becomes inactive.
> 
> [1] https://phabricator.kde.org/D2033

This patch does not solve the issue for me :-/
Comment 26 Oliver Henshaw 2016-06-29 13:35:53 UTC
(In reply to Martin Gräßlin from comment #24)
> From comment @18 it looks like X freezes while being on the inactive
> session. As Fedora is mentioned I assume it's an X not run as root? If X
> freezes all xcb requests are going to block. From client side there is
> nothing which can be done about that. There is no timeout and no way to
> cancel.

Yep, it's xorg-x11-server-Xorg-1.17.4-1.fc22.x86_64 with the server running as non-root.

> 
> I would be interested in seeing a backtrace of X at that time as I assume
> it's also frozen.

# pstack 19966
Thread 2 (Thread 0x7f133d647700 (LWP 19995)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1  0x00007f13401db6da in cnd_wait (mtx=0x27ce4b0, cond=0x27ce4d8) at ../../../../../include/c11/threads_posix.h:154
#2  pipe_semaphore_wait (sema=0x27ce4b0) at ../../../../../src/gallium/auxiliary/os/os_thread.h:259
#3  radeon_drm_cs_emit_ioctl (param=param@entry=0x27ce180) at radeon_drm_winsys.c:636
#4  0x00007f13401dae37 in impl_thrd_routine (p=<optimized out>) at ../../../../../include/c11/threads_posix.h:87
#5  0x00007f134963d555 in start_thread (arg=0x7f133d647700) at pthread_create.c:333
#6  0x00007f1349955ded in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 1 (Thread 0x7f134be499c0 (LWP 19966)):
#0  0x00007f134994c2e3 in select () at ../sysdeps/unix/syscall-template.S:81
#1  0x00000000005935af in WaitForSomething (pClientsReady=pClientsReady@entry=0x2aa4530) at WaitFor.c:227
#2  0x000000000043a3e1 in Dispatch () at dispatch.c:358
#3  0x000000000043e6eb in dix_main (argc=14, argv=0x7fff89842b38, envp=<optimized out>) at main.c:298
#4  0x00007f1349873700 in __libc_start_main (main=0x428ae0 <main>, argc=14, argv=0x7fff89842b38, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff89842b28) at libc-start.c:289
#5  0x0000000000428b19 in _start ()
Comment 27 Martin Flöser 2016-06-29 14:34:30 UTC
both threads are waiting (X can do multi-threading, that's like wow I'm impressed). Though the one is a graphics thread and is somewhere in radeon. That could be an indication for being blocked and waiting for something to happen when resumed. But overall I'm not familiar enough with the code to make a guess.

So I just assume that non-root X is not doing anything when the session is switched away. AFAIU their architecture is very similar to what kwin_wayland does for that situation.
Comment 28 Oliver Henshaw 2016-06-29 14:43:06 UTC
Note that this wasn't a problem on F21, so this is a regression somewhere, not necessarily a misdesign feature. Could be an X regression, could be a kde 4 -> 5 regression, could be a qt 4 -> 5 regression.

Note that I see a similar backtrace for Xorg for the session I'm actually in - that might just be business as usual.
Comment 29 Fabian Köster 2016-06-29 14:47:50 UTC
Please note that I do not use Fedora but Gentoo Linux which still executes X as root.

"ps aux | grep /usr/bin/X" confirms that: 

root      2229  2.2  0.5 407300 89944 tty1     Ssl+ 15:39   1:32 /usr/bin/X -nolisten tcp -auth /var/run/sddm/{c31eee08-f7e1-45f3-974b-14c6c8998aad} -background none -noreset -displayfd 18 vt1
Comment 30 Oliver Henshaw 2016-07-04 15:26:32 UTC
Created attachment 99846 [details]
extra logging to help understand the problem

(In reply to Fabian Köster from comment #14)
> (In reply to Kai Uwe Broulik from comment #13)
> > If you can, try https://phabricator.kde.org/D2033 as I don't switch sessions
> > that often :)
> 
> I applied your patch and tested again, but the issue still happens :-(

Kai's first patch (Diff 1, i.e. https://phabricator.kde.org/D2033?id=4799) seems to work for me, could you try it again with the attached patch on top and post the logs?
Comment 31 Kai Uwe Broulik 2016-09-15 07:17:41 UTC
Git commit ece8adacb35b8897407d92d995f362a6aea644be by Kai Uwe Broulik.
Committed on 15/09/2016 at 07:15.
Pushed by broulik into branch 'master'.

Clear idle timeouts when session becomes inactive

When the session becomes inactive, clear all idle timeouts, so we don't get them delivered
while we're inactive or just when we're about to become active again.

Depending on your setup, PowerDevil might be blocked on X before it gets the "session is now
inactive" signal, so it still suspends when becoming active again :/ It should still make it
a bit less likely (and works for me :D)

Differential Revision: https://phabricator.kde.org/D2033

M  +15   -0    daemon/powerdevilcore.cpp
M  +8    -5    daemon/powerdevilpolicyagent.cpp
M  +1    -0    daemon/powerdevilpolicyagent.h

http://commits.kde.org/powerdevil/ece8adacb35b8897407d92d995f362a6aea644be
Comment 32 Fabian Köster 2016-09-15 08:04:10 UTC
(In reply to Kai Uwe Broulik from comment #31)
> When the session becomes inactive, clear all idle timeouts, so we don't get
> them delivered
> while we're inactive or just when we're about to become active again.
> 
> Depending on your setup, PowerDevil might be blocked on X before it gets the
> "session is now
> inactive" signal, so it still suspends when becoming active again :/ It
> should still make it
> a bit less likely (and works for me :D)

That would be great! Will this patch be included in 5.7.95?
Comment 33 Kai Uwe Broulik 2016-09-15 08:44:32 UTC
I think so. That's why I pushed it this morning, tagging is usually around 10 to 12
Comment 34 Fabian Köster 2016-09-15 16:19:19 UTC
(In reply to Kai Uwe Broulik from comment #33)
> I think so. That's why I pushed it this morning, tagging is usually around
> 10 to 12

Great, will test your patch as soon as I updated my system.
Comment 35 Fabian Köster 2016-09-16 06:52:07 UTC
I can confirm that this bug is fixed in 5.7.95 or at least the probability for it has been reduced.

@Kai
So I guess your patch did help, thank you very much for your persistent effort!
Comment 36 Kai Uwe Broulik 2016-09-30 21:50:21 UTC
*** Bug 369584 has been marked as a duplicate of this bug. ***
Comment 37 Alexander Mentyu 2018-07-23 13:31:22 UTC
Is the issue can be closed?
Comment 38 Nate Graham 2018-07-23 13:35:07 UTC
Let's close it. If anyone an still reproduce this in Plasma 5.13 or later, please re-open and leave a comment.