| Summary: | Session management loses state between sessions. | ||
|---|---|---|---|
| Product: | [Plasma] plasmashell | Reporter: | Unknown <null> |
| Component: | Session Management | Assignee: | Plasma Bugs List <plasma-bugs-null> |
| Status: | CLOSED FIXED | ||
| Severity: | normal | CC: | akontsevich, bugs.kde, imaginator, kairo, kde, KDE, kde, kishore96, natalie_clarius, nate, postix, vaclav.masin, wenzezlaus |
| Priority: | NOR | Keywords: | regression |
| Version First Reported In: | 5.27.11 | ||
| Target Milestone: | 1.0 | ||
| Platform: | openSUSE | ||
| OS: | Linux | ||
| See Also: |
https://bugs.kde.org/show_bug.cgi?id=488520 https://bugs.kde.org/show_bug.cgi?id=491421 |
||
| Latest Commit: | https://invent.kde.org/plasma/plasma-workspace/-/commit/a336dd70bf51cf4d9be8cf1b66b4819b78850838 | Version Fixed/Implemented In: | 6.2.4 |
| Sentry Crash Report: | |||
| Attachments: |
Patch to avoid out-of-bounds access or truncation of magic cookie data in KSMServer
Patch to avoid out-of-bounds access or truncation of magic cookie data in KSMServer |
||
|
Description
Unknown
2024-08-01 15:20:42 UTC
I also seem to have a similar issue. When this happens again, can you check if you have messages like the following in your system logs? ``` $ journalctl -b -0 | grep "Session management error" Aug 04 22:00:34 kishore-thinkpad-e495 kate[4510]: Qt: Session management error: Authentication Rejected, reason : MIT-MAGIC-COOKIE-1 authentication rejected ... [many more messages like this] ``` Usually this authentication works fine (and the session restores properly), but sometime I get this. Perhaps some sort of race condition? (In reply to Kishore Gopalakrishnan from comment #1) > I also seem to have a similar issue. When this happens again, can you check > if you have messages like the following in your system logs? > > ``` > $ journalctl -b -0 | grep "Session management error" > Aug 04 22:00:34 kishore-thinkpad-e495 kate[4510]: Qt: Session management > error: Authentication Rejected, reason : MIT-MAGIC-COOKIE-1 authentication > rejected > ... [many more messages like this] > ``` Good hint! Similar, but slightly different, messages here – on all machines affected, one example of many identical: 2024-08-06T01:12:45.845617+02:00 systemname kscreenlocker_greet[25446]: Qt: Session management error: networkIdsList argument is NULL > Usually this authentication works fine (and the session restores properly), > but sometime I get this. Perhaps some sort of race condition? I suspect a bug in the newly implemented option to explicitly manually save and restore sessions … (Which IMHO is conceived quite sub-optimally: The save option is only present when the system is configured to only manually restore sessions – but you have to restart your session to have the "save option" turn up :-( Much better would have been IMHO: You always have the option to manually save your session state with a name of your choice. The automatic saving of the session at logout (name "last session") can be toggled. If there is more than one saved and named session state available, you are offered the choice … ) It seems my bug is different from yours, then. I have now opened a separate report (bug 491421) Have same in 6.1.4. Redirected here from https://bugs.kde.org/show_bug.cgi?id=487912 (In reply to Aleksey Kontsevich from comment #4) > Have same in 6.1.4. Redirected here from > https://bugs.kde.org/show_bug.cgi?id=487912 Some more details here: https://bugs.kde.org/show_bug.cgi?id=487912#c61 > Suspect something still wrong with session restore: > - with Akregator - happens after session restore: Bug 494071 > - with Chromium: if it has multiple windows, after session restore it opens more windows than it had on logout. Every KDE application I start emits this warning on stderr: > Qt: Session management error: Authentication Rejected, reason : MIT-MAGIC-COOKIE-1 authentication rejected Indeed, it appears as though the apps are failing to register with KSMServer. If I start an app (e.g., KCalc) and then save my current session with a new name… > $ qdbus6 org.kde.ksmserver /KSMServer saveCurrentSessionAs bugtest > $ sed -e '/bugtest/,/^$/!d' ~/.config/ksmserverrc > [LegacySession: bugtest] > count=0 > > [Session: bugtest] > count=0 There's nothing saved in the session. KCalc evidently did indeed fail to authenticate with KSMServer. Here's the relevant bit from an strace of KCalc's startup: > socket(AF_UNIX, SOCK_STREAM, 0) = 14 > getsockopt(14, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0 > uname({sysname="Linux", nodename="Crushinator", ...}) = 0 > connect(14, {sa_family=AF_UNIX, sun_path="@/tmp/.ICE-unix/4318"}, 22) = -1 ENOENT (No such file or directory) > close(14) = 0 > socket(AF_UNIX, SOCK_STREAM, 0) = 14 > getsockopt(14, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0 > uname({sysname="Linux", nodename="Crushinator", ...}) = 0 > connect(14, {sa_family=AF_UNIX, sun_path="/tmp/.ICE-unix/4318"}, 21) = 0 > fcntl(14, F_SETFD, FD_CLOEXEC) = 0 > write(14, "\0\1\0\0\0\0\0\0", 8) = 8 > read(14, "\0\1\0H\0\0\0\0", 8) = 8 > access("/run/user/1000/iceauth_pnzFrq", R_OK) = 0 > openat(AT_FDCWD, "/run/user/1000/iceauth_pnzFrq", O_RDONLY) = 15 > fstat(15, {st_mode=S_IFREG|0600, st_size=338, ...}) = 0 > read(15, "\0\3ICE\0\0\0&local/Crushinator:@/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20\226J'\364\v\326m~\0\0\0\0\0\0\0\0\0\4XSMP\0\0\0&local/Crushinator:@/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20t\342\30E\351\16i\221\344\250\226W\213\24\252\26\0\3ICE\0\0\0$unix/Crushinator:/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20<S\226\341I\216R\17\34\0\0\0\0\0\0\0\0\4XSMP\0\0\0$unix/Crushinator:/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20<PCu\23\262\377\277\241\301 Q\262d\f\4", 4096) = 338 > read(15, "", 4096) = 0 > close(15) = 0 > write(14, "\0\2\1\1\6\0\0\0\0\0\0\0\0\0\0\0\3\0MIT\0\0\0\3\0001.0\0\0\0\22\0MIT-MAGIC-COOKIE-1\1\0\0\0", 56) = 56 > read(14, "\0\3\0H\1\0\0\0", 8) = 8 > read(14, "\0\0\0\0\0\0\0\0", 8) = 8 > access("/run/user/1000/iceauth_pnzFrq", R_OK) = 0 > openat(AT_FDCWD, "/run/user/1000/iceauth_pnzFrq", O_RDONLY) = 15 > fstat(15, {st_mode=S_IFREG|0600, st_size=338, ...}) = 0 > read(15, "\0\3ICE\0\0\0&local/Crushinator:@/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20\226J'\364\v\326m~\0\0\0\0\0\0\0\0\0\4XSMP\0\0\0&local/Crushinator:@/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20t\342\30E\351\16i\221\344\250\226W\213\24\252\26\0\3ICE\0\0\0$unix/Crushinator:/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20<S\226\341I\216R\17\34\0\0\0\0\0\0\0\0\4XSMP\0\0\0$unix/Crushinator:/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20<PCu\23\262\377\277\241\301 Q\262d\f\4", 4096) = 338 > close(15) = 0 > write(14, "\0\4\1\1\3\0\0\0\20\0\0\0\0\0\0\0<S\226\341I\216R\17\34\0\0\0\0\0\0\0", 32) = 32 > read(14, "\0\0\4\0\7\0\0\0", 8) = 8 > read(14, "\4\1\0\0\3\0\0\0", 8) = 8 > read(14, "*\0MIT-MAGIC-COOKIE-1 authentication rejectedV\0\0\0", 48) = 48 > close(14) = 0 > write(2, "Qt: Session management error: Authentication Rejected, reason : MIT-MAGIC-COOKIE-1 authentication rejected\n", 107) = 107 Oddly, it appears as though KCalc first attempts to open a Unix socket connection to path "@/tmp/.ICE-unix/4318". That's an invalid path. What it apparently means to do is to connect to an abstract socket named "/tmp/.ICE-unix/4318", but someone forgot to overwrite that ampersat with a null byte. (`socat - ABSTRACT-CONNECT:/tmp/.ICE-unix/4318` does connect successfully, so the abstract socket exists and is listening; KCalc just isn't specifying the correct .sun_path for it.) But no matter: KCalc then falls back to the pathname socket at /tmp/.ICE-unix/4318 and connects successfully. After connecting to the ICE server (which is running inside KSMServer), KCalc sends a ByteOrder declaration and reads the server's ByteOrder declaration. (Why the ICE protocol didn't just specify a fixed byte order is anyone's guess.) Then it opens and reads the magic cookie file located at /run/user/1000/iceauth_pnzFrq, which does indeed exist and was last modified at the time I started my Plasma X11 session. KCalc sends a ConnectionSetup message that offers one version (1.0) with a vendor of "MIT" and a release of "1.0" and one authentication protocol name ("MIT-MAGIC-COOKIE-1"). KCalc reads an AuthenticationRequired reply that requests the MIT-MAGIC-COOKIE-1 authentication protocol. KCalc sends an AuthenticationReply message that contains the magic cookie, which has the correct value as read from /run/user/1000/iceauth_pnzFrq. KCalc reads an Error reply of class AuthenticationRejected, concerning an AuthenticationReply message, of FatalToProtocol severity, with the error message "MIT-MAGIC-COOKIE-1 authentication rejected". So everything looks fine on the client side (other than the mistake in the socket path handling). I am guessing something has become broken in KSMServer. I am suspicious that the ICE magic cookies for my session both end in runs of NUL bytes rather than being a full 16 random bytes. The XSMP magic cookies do not show such oddity: > $ iceauth -f /run/user/1000/iceauth_pnzFrq list > ICE "" local/Crushinator:@/tmp/.ICE-unix/4318 MIT-MAGIC-COOKIE-1 964a27f40bd66d7e0000000000000000 > XSMP "" local/Crushinator:@/tmp/.ICE-unix/4318 MIT-MAGIC-COOKIE-1 74e21845e90e6991e4a896578b14aa16 > ICE "" unix/Crushinator:/tmp/.ICE-unix/4318 MIT-MAGIC-COOKIE-1 3c5396e1498e520f1c00000000000000 > XSMP "" unix/Crushinator:/tmp/.ICE-unix/4318 MIT-MAGIC-COOKIE-1 3c50437513b2ffbfa1c12051b2640c04 Maybe KSMServer is actually expecting magic cookies that haven't had their trailing bytes zeroed out. It could be that there is a use-after-free bug when the iceauth file is being written out during session startup. Perhaps the magic cookie data bytes are getting trashed before they are encoded to hex and written to the file. Notably, KSMServer also fails to clear the padding bytes in the ICE protocol messages it sends. That's bad for protocol extensibility. Created attachment 175571 [details] Patch to avoid out-of-bounds access or truncation of magic cookie data in KSMServer Found the problem. It's actually more laughable than use-after-free. David Edmundson rewrote[1] part of KSMServer to eliminate a dependency on the `iceauth` utility, but he botched the handling of magic cookie data. Specifically, he calls strdup() on a fixed-length character sequence that is not guaranteed to be NUL-terminated and may in fact contain NUL bytes. It's a little amazing that this never crashes KSMServer outright since it is an out-of-bounds memory access. [1] https://invent.kde.org/plasma/plasma-workspace/-/commit/9398f6cf8933055b31506ba155aef2fc2b3561d7 Attached is an untested patch that should fix the issue. I'm rebuilding plasma-workspace with the patch now and will test it out the next time I restart my Plasma session (which could be weeks from now). Created attachment 175572 [details]
Patch to avoid out-of-bounds access or truncation of magic cookie data in KSMServer
Sorry, I forgot that C++ doesn't allow implicit conversions from void*. Revised patch attached.
Thanks for looking into it. I'm not going to argue I botched it, but I'm not sure your comment contains the whole story either
Data we're copying comes from:
(*authDataEntries)[i].auth_data = IceGenerateMagicCookie(MAGIC_COOKIE_LEN);
which is documented at: https://www.x.org/releases/X11R7.7/doc/libICE/ICElib.html
char *IceGenerateMagicCookie(int length);
"The magic cookie returned will be null-terminated."
(In reply to David Edmundson from comment #9) > char *IceGenerateMagicCookie(int length); > "The magic cookie returned will be null-terminated." That would explain why it never crashes. No out-of-bounds access after all. Seems bizarre and pointless to NUL-terminate a character sequence that may contain NUL characters since such a sequence could never correctly be treated as a NUL-terminated string. Presumably this all may also explain why the problem is intermittent. You have a ~11.8% chance of randomly choosing at least one NUL byte among the 32 bytes chosen for the two 16-byte magic cookies. Not quite the "1 out of 5" estimated in Comment #0 but around the right order of magnitude. Actually, there *is* still an out-of-bounds access; it just happens inside IceWriteAuthFileEntry(). You're telling it that auth_data points to an array of auth_data_length==16 characters, yet the actual size of the object allocated by strdup may be less than that. You may be saved only because the minimum size of an object allocated by glibc's heap allocated is 16 bytes. > Seems bizarre and pointless to NUL-terminate a character sequence that may contain NUL characters since such a sequence could never correctly be treated as a NUL-terminated string.
So bizarre you could hardly blame any dev for making incorrect assumptions!
Everything adds up at least, and your patch makes sense given our new findings.
Do you want to make a MR? Or I can and write an attribution to you?
I don't care about attribution. I posted my patch here in the hopes that it would save other people some frustration and that hopefully a developer with commit access would see it and commit a similar or identical fix. That developer is you, I think. :) And yes, I do agree: reading "The magic cookie returned will be null-terminated" absolutely implies that the cookie won't contain NUL bytes. I would even argue, given that the type of IceAuthFileEntry::auth_data is char* (rather than unsigned char*), that it's reasonable to assume that the auth data would be textual. However, I'd also argue that any cursory check of your implementation's output using the iceauth utility should have revealed that the cookies comprise opaque binary data, not text. Anyway, I really don't care to debate blame; the bug is identified, and a fix is known. Moving on. Thanks for your prompt attention. A possibly relevant merge request was started @ https://invent.kde.org/plasma/plasma-workspace/-/merge_requests/4895 (In reply to Matt Whitlock from comment #13) > I don't care about attribution. I posted my patch here in the hopes that it > would save other people some frustration and that hopefully a developer with > commit access would see it and commit a similar or identical fix. That > developer is you, I think. :) Modesty is an adornment but... This wasn't easy to find out and fix. Thus _I_ think that it's only fair enough the commit message contains a reference to your analysis and patch. And to those it may concern: This is a victory of perseverance by Plasma-users. So for the future please - take (more) seriously what users report - don't close bugs prematurely as "resolved/fixed" unless they really are Thanks. This is possibly not relevant to this thread and the latest patch here at all but I guess I'm gonna mention it anyway just in case: I've been struggling with this empty session bug using plasma-workspace-5.27.11.1 since right after upgrading to it. What helped somewhat (though not completely) to resolve the issue for me was the patch on this page: https://bugs.gentoo.org/933435 (in comment #1). The patch on this page doesn't seem to help with my issue at all (tried to apply it twice to the clean plasma-workspace-5.27.11.1 directory every time with the same result). I guess the issue on this page is about something completely different. Sorry for polluting the discussion if that's really the case - just thought I'll let you know. (In reply to Matt Whitlock from comment #8) > Created attachment 175572 [details] > Patch to avoid out-of-bounds access or truncation of magic cookie data in > KSMServer > > Sorry, I forgot that C++ doesn't allow implicit conversions from void*. > Revised patch attached. I rebuilt/installed plasma-workspace-5.27.11.1 with your patch applied in addition to an earlier patch for this problem. Due to the random nature of the bug and its rather seldom occurrence in this version, it's probably too early to say that it's definitely solved. But so far everything is looking good and session-restore works as it should. Thanks for your work and effort! *** Bug 491421 has been marked as a duplicate of this bug. *** Git commit 2c2f7a848006361a1e9d719de05d2ffcc633205c by David Edmundson. Committed on 13/11/2024 at 09:31. Pushed by davidedmundson into branch 'master'. ksmserver: Fix copying of auth data Despite IceGenerateMagicCookie returning null terminated char*, the data itself is pure data may contain null bytes. For this reason we must copy the entire data_length, and not use string functions. This fixes a bug where 1 in 10 logins would not have working session management if the generated data happened to contain a null byte. Thanks to Matt Whitlock kde@mattwhitlock.name M +6 -2 ksmserver/server.cpp https://invent.kde.org/plasma/plasma-workspace/-/commit/2c2f7a848006361a1e9d719de05d2ffcc633205c What version it will be fixed? 6.3? Git commit 8ee3defdf81e9b5ad3edef498d04fca0f4289e4a by David Edmundson, on behalf of David Edmundson. Committed on 18/11/2024 at 10:05. Pushed by davidedmundson into branch 'Plasma/6.2'. ksmserver: Fix copying of auth data Despite IceGenerateMagicCookie returning null terminated char*, the data itself is pure data may contain null bytes. For this reason we must copy the entire data_length, and not use string functions. This fixes a bug where 1 in 10 logins would not have working session management if the generated data happened to contain a null byte. Thanks to Matt Whitlock kde@mattwhitlock.name (cherry picked from commit 2c2f7a848006361a1e9d719de05d2ffcc633205c) c567ada7 ksmserver: Fix copying of auth data Co-authored-by: David Edmundson <kde@davidedmundson.co.uk> M +6 -2 ksmserver/server.cpp https://invent.kde.org/plasma/plasma-workspace/-/commit/8ee3defdf81e9b5ad3edef498d04fca0f4289e4a Git commit a336dd70bf51cf4d9be8cf1b66b4819b78850838 by Kai Uwe Broulik. Committed on 20/11/2024 at 07:52. Pushed by broulik into branch 'Plasma/5.27'. ksmserver: Fix copying of auth data Despite IceGenerateMagicCookie returning null terminated char*, the data itself is pure data may contain null bytes. For this reason we must copy the entire data_length, and not use string functions. This fixes a bug where 1 in 10 logins would not have working session management if the generated data happened to contain a null byte. Thanks to Matt Whitlock kde@mattwhitlock.name (cherry picked from commit 2c2f7a848006361a1e9d719de05d2ffcc633205c) c567ada7 ksmserver: Fix copying of auth data Co-authored-by: David Edmundson <kde@davidedmundson.co.uk> (cherry picked from commit 8ee3defdf81e9b5ad3edef498d04fca0f4289e4a) Co-authored-by: David Edmundson <david@davidedmundson.co.uk> M +6 -2 ksmserver/server.cpp https://invent.kde.org/plasma/plasma-workspace/-/commit/a336dd70bf51cf4d9be8cf1b66b4819b78850838 (In reply to Aleksey Kontsevich from comment #5) > > Suspect something still wrong with session restore: > > - with Akregator - happens after session restore: Bug 494071 > > - with Chromium: if it has multiple windows, after session restore it opens more windows than it had on logout. Above not fixed in 6.2.4. Still happens! Those sound like different issues, and one is even already tracked with a different bugzilla ticket. Please open one for the other issue too. Thanks! (In reply to Nate Graham from comment #25) > Those sound like different issues, and one is even already tracked with a > different bugzilla ticket. Please open one for the other issue too. Thanks! I did it 2 times, not sure which one is it. I think this ticket is the right one, previous name was: [Bug 491130] Session management loses state between sessions. (In reply to Aleksey Kontsevich from comment #27) > I think this ticket is the right one, previous name was: [Bug 491130] > Session management loses state between sessions. I don't know whether 491130 is already fixed in your version. But if so, check your system-log for "Qt: Session management error: Authentication Rejected, reason : MIT-MAGIC-COOKIE-1 authentication rejected". Only if this coincides with that Chromium-problem 491130 would be the right place to report it. Otherwise I suggest you open a new ticket. (In reply to imaginator from comment #28) > (In reply to Aleksey Kontsevich from comment #27) > > I think this ticket is the right one, previous name was: [Bug 491130] > > Session management loses state between sessions. > > I don't know whether 491130 is already fixed in your version. But if so, > check your system-log for > "Qt: Session management error: Authentication Rejected, reason : > MIT-MAGIC-COOKIE-1 authentication rejected". > > Only if this coincides with that Chromium-problem 491130 would be the right > place to report it. Otherwise I suggest you open a new ticket. > journalctl -b -0 | grep "Session management error" Hint: You are currently not seeing messages from other users and the system. Users in the 'systemd-journal' group can see all messages. Pass -q to turn off this notice. dec 03 01:02:20 Aleksey env[3066]: Qt: Session management error: None of the authentication protocols specified are supported (In reply to Aleksey Kontsevich from comment #29) > (In reply to imaginator from comment #28) > > (In reply to Aleksey Kontsevich from comment #27) > > > I think this ticket is the right one, previous name was: [Bug 491130] > > > Session management loses state between sessions. > > > > I don't know whether 491130 is already fixed in your version. But if so, > > check your system-log for > > "Qt: Session management error: Authentication Rejected, reason : > > MIT-MAGIC-COOKIE-1 authentication rejected". > > > > Only if this coincides with that Chromium-problem 491130 would be the right > > place to report it. Otherwise I suggest you open a new ticket. > > > journalctl -b -0 | grep "Session management error" > Hint: You are currently not seeing messages from other users and the system. > Users in the 'systemd-journal' group can see all messages. Pass -q to > turn off this notice. > dec 03 01:02:20 Aleksey env[3066]: Qt: Session management error: None of the > authentication protocols specified are supported That's a different error-message which probably indicates a different problem. It _may_ be related to 491130 but in order to keep things clearly apart it may be best to open a new ticket. One already opened: Bug 494071. Then let's close this one again. (In reply to Nate Graham from comment #32) > Then let's close this one again. Not agree: initial bug was not about "session management error: Authentication Rejected", but wider cases. So let's rename topic back and fix all the cases. We created a lot, but you closed as duplicates or ignored. Why we should create tickets again and again?! Not fixed! Created the bug: https://bugs.kde.org/show_bug.cgi?id=497520 |