Bug 491130 - On X11, session management error: Authentication Rejected
Summary: On X11, session management error: Authentication Rejected
Status: RESOLVED FIXED
Alias: None
Product: plasmashell
Classification: Plasma
Component: Session Management (show other bugs)
Version: 5.27.11
Platform: openSUSE Linux
: NOR normal
Target Milestone: 1.0
Assignee: Plasma Bugs List
URL:
Keywords: regression
: 491421 (view as bug list)
Depends on:
Blocks:
 
Reported: 2024-08-01 15:20 UTC by Unknown
Modified: 2024-11-20 08:35 UTC (History)
13 users (show)

See Also:
Latest Commit:
Version Fixed In: 6.2.4
Sentry Crash Report:


Attachments
Patch to avoid out-of-bounds access or truncation of magic cookie data in KSMServer (1.51 KB, text/x-patch)
2024-11-06 08:19 UTC, Matt Whitlock
Details
Patch to avoid out-of-bounds access or truncation of magic cookie data in KSMServer (1.53 KB, text/x-patch)
2024-11-06 08:47 UTC, Matt Whitlock
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Unknown 2024-08-01 15:20:42 UTC
SUMMARY
After logout or shutdown, when logging in again, session is occasionally not restored correctly.
When multiple windows of an application were open, sometimes they are not restarted.
(This never happened with Plasma 5.27.9 over a period of more than 20 months, but happens roughly in 1 out of 5 re-logins with Plasma 5.27.11 – so the bug was introduced between)

STEPS TO REPRODUCE
Plasma is configured to restore the session from the last logout.

So far no reproducible trigger could be identified – it seems to happen randomly on several different systems (with the same SW versions).

OBSERVED RESULT
After logout sometimes multiple windows of several programs – observed with Dolphin, Konsole (and some tabs of Firefox, but that might be an uncorrelated issue) – are not restored. Instead either no window or a single window of the program is opened (at the default location – i.e. user home)

Funnily: If then you logout immediately and then re-login, often the windows are restored correctly as you left them with the session second to last … (nice bug – bug, because it should have restored the incomplete state of the previous session – but recovers the state of the penultimate (which the previous session should have had recovered) …

EXPECTED RESULT
KDE/Plasma correctly and reliable recovers the window states of the previous session.
(as can and is expected from X11 window managers since the days of "olvwm" – 35 years ago …)

SOFTWARE/OS VERSIONS
Operating System: openSUSE Leap 15.6
KDE Plasma Version: 5.27.11
KDE Frameworks Version: 5.115.0
Qt Version: 5.15.12
Kernel Version: 6.4.0-150600.23.14-default (64-bit)
Graphics Platform: X11
Processors: 8 × Intel® Core™ i5-10210U CPU @ 1.60GHz
Memory: 7.6 GiB of RAM
Graphics Processor: Mesa Intel® UHD Graphics

ADDITIONAL INFORMATION
–
Comment 1 Kishore Gopalakrishnan 2024-08-04 17:00:57 UTC
I also seem to have a similar issue. When this happens again, can you check if you have messages like the following in your system logs?

```
$ journalctl -b -0 | grep "Session management error"
Aug 04 22:00:34 kishore-thinkpad-e495 kate[4510]: Qt: Session management error: Authentication Rejected, reason : MIT-MAGIC-COOKIE-1 authentication rejected
... [many more messages like this]
```

Usually this authentication works fine (and the session restores properly), but sometime I get this. Perhaps some sort of race condition?
Comment 2 Unknown 2024-08-06 14:30:43 UTC
(In reply to Kishore Gopalakrishnan from comment #1)
> I also seem to have a similar issue. When this happens again, can you check
> if you have messages like the following in your system logs?
> 
> ```
> $ journalctl -b -0 | grep "Session management error"
> Aug 04 22:00:34 kishore-thinkpad-e495 kate[4510]: Qt: Session management
> error: Authentication Rejected, reason : MIT-MAGIC-COOKIE-1 authentication
> rejected
> ... [many more messages like this]
> ```

Good hint!

Similar, but slightly different, messages here – on all machines affected, one example of many identical:

2024-08-06T01:12:45.845617+02:00 systemname kscreenlocker_greet[25446]: Qt: Session management error: networkIdsList argument is NULL


> Usually this authentication works fine (and the session restores properly),
> but sometime I get this. Perhaps some sort of race condition?

I suspect a bug in the newly implemented option to explicitly manually save and restore sessions …

(Which IMHO is conceived quite sub-optimally: 
The save option is only present when the system is configured to only manually restore sessions – but you have to restart your session to have the "save option" turn up :-(

Much better would have been IMHO:
You always have the option to manually save your session state with a name of your choice.
The automatic saving of the session at logout (name "last session") can be toggled.
If there is more than one saved and named session state available, you are offered the choice …
)
Comment 3 Kishore Gopalakrishnan 2024-08-08 04:46:40 UTC
It seems my bug is different from yours, then. I have now opened a separate report (bug 491421)
Comment 4 Aleksey Kontsevich 2024-08-30 20:28:38 UTC
Have same in 6.1.4. Redirected here from https://bugs.kde.org/show_bug.cgi?id=487912
Comment 5 Aleksey Kontsevich 2024-10-20 09:07:26 UTC
(In reply to Aleksey Kontsevich from comment #4)
> Have same in 6.1.4. Redirected here from
> https://bugs.kde.org/show_bug.cgi?id=487912

Some more details here:
https://bugs.kde.org/show_bug.cgi?id=487912#c61

> Suspect something still wrong with session restore:
> - with Akregator - happens after session restore: Bug 494071
> - with Chromium: if it has multiple windows, after session restore it opens more windows than it had on logout.
Comment 6 Matt Whitlock 2024-11-06 07:38:21 UTC
Every KDE application I start emits this warning on stderr:

> Qt: Session management error: Authentication Rejected, reason : MIT-MAGIC-COOKIE-1 authentication rejected

Indeed, it appears as though the apps are failing to register with KSMServer. If I start an app (e.g., KCalc) and then save my current session with a new name…

> $ qdbus6 org.kde.ksmserver /KSMServer saveCurrentSessionAs bugtest
> $ sed -e '/bugtest/,/^$/!d' ~/.config/ksmserverrc 
> [LegacySession: bugtest]
> count=0
> 
> [Session: bugtest]
> count=0

There's nothing saved in the session. KCalc evidently did indeed fail to authenticate with KSMServer.

Here's the relevant bit from an strace of KCalc's startup:

> socket(AF_UNIX, SOCK_STREAM, 0)         = 14
> getsockopt(14, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0
> uname({sysname="Linux", nodename="Crushinator", ...}) = 0
> connect(14, {sa_family=AF_UNIX, sun_path="@/tmp/.ICE-unix/4318"}, 22) = -1 ENOENT (No such file or directory)
> close(14)                               = 0
> socket(AF_UNIX, SOCK_STREAM, 0)         = 14
> getsockopt(14, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0
> uname({sysname="Linux", nodename="Crushinator", ...}) = 0
> connect(14, {sa_family=AF_UNIX, sun_path="/tmp/.ICE-unix/4318"}, 21) = 0
> fcntl(14, F_SETFD, FD_CLOEXEC)          = 0
> write(14, "\0\1\0\0\0\0\0\0", 8)        = 8
> read(14, "\0\1\0H\0\0\0\0", 8)          = 8
> access("/run/user/1000/iceauth_pnzFrq", R_OK) = 0
> openat(AT_FDCWD, "/run/user/1000/iceauth_pnzFrq", O_RDONLY) = 15
> fstat(15, {st_mode=S_IFREG|0600, st_size=338, ...}) = 0
> read(15, "\0\3ICE\0\0\0&local/Crushinator:@/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20\226J'\364\v\326m~\0\0\0\0\0\0\0\0\0\4XSMP\0\0\0&local/Crushinator:@/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20t\342\30E\351\16i\221\344\250\226W\213\24\252\26\0\3ICE\0\0\0$unix/Crushinator:/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20<S\226\341I\216R\17\34\0\0\0\0\0\0\0\0\4XSMP\0\0\0$unix/Crushinator:/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20<PCu\23\262\377\277\241\301 Q\262d\f\4", 4096) = 338
> read(15, "", 4096)                      = 0
> close(15)                               = 0
> write(14, "\0\2\1\1\6\0\0\0\0\0\0\0\0\0\0\0\3\0MIT\0\0\0\3\0001.0\0\0\0\22\0MIT-MAGIC-COOKIE-1\1\0\0\0", 56) = 56
> read(14, "\0\3\0H\1\0\0\0", 8)          = 8
> read(14, "\0\0\0\0\0\0\0\0", 8)         = 8
> access("/run/user/1000/iceauth_pnzFrq", R_OK) = 0
> openat(AT_FDCWD, "/run/user/1000/iceauth_pnzFrq", O_RDONLY) = 15
> fstat(15, {st_mode=S_IFREG|0600, st_size=338, ...}) = 0
> read(15, "\0\3ICE\0\0\0&local/Crushinator:@/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20\226J'\364\v\326m~\0\0\0\0\0\0\0\0\0\4XSMP\0\0\0&local/Crushinator:@/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20t\342\30E\351\16i\221\344\250\226W\213\24\252\26\0\3ICE\0\0\0$unix/Crushinator:/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20<S\226\341I\216R\17\34\0\0\0\0\0\0\0\0\4XSMP\0\0\0$unix/Crushinator:/tmp/.ICE-unix/4318\0\22MIT-MAGIC-COOKIE-1\0\20<PCu\23\262\377\277\241\301 Q\262d\f\4", 4096) = 338
> close(15)                               = 0
> write(14, "\0\4\1\1\3\0\0\0\20\0\0\0\0\0\0\0<S\226\341I\216R\17\34\0\0\0\0\0\0\0", 32) = 32
> read(14, "\0\0\4\0\7\0\0\0", 8)         = 8
> read(14, "\4\1\0\0\3\0\0\0", 8)         = 8
> read(14, "*\0MIT-MAGIC-COOKIE-1 authentication rejectedV\0\0\0", 48) = 48
> close(14)                               = 0
> write(2, "Qt: Session management error: Authentication Rejected, reason : MIT-MAGIC-COOKIE-1 authentication rejected\n", 107) = 107

Oddly, it appears as though KCalc first attempts to open a Unix socket connection to path "@/tmp/.ICE-unix/4318". That's an invalid path. What it apparently means to do is to connect to an abstract socket named "/tmp/.ICE-unix/4318", but someone forgot to overwrite that ampersat with a null byte. (`socat - ABSTRACT-CONNECT:/tmp/.ICE-unix/4318` does connect successfully, so the abstract socket exists and is listening; KCalc just isn't specifying the correct .sun_path for it.) But no matter: KCalc then falls back to the pathname socket at /tmp/.ICE-unix/4318 and connects successfully.

After connecting to the ICE server (which is running inside KSMServer), KCalc sends a ByteOrder declaration and reads the server's ByteOrder declaration. (Why the ICE protocol didn't just specify a fixed byte order is anyone's guess.) Then it opens and reads the magic cookie file located at /run/user/1000/iceauth_pnzFrq, which does indeed exist and was last modified at the time I started my Plasma X11 session. KCalc sends a ConnectionSetup message that offers one version (1.0) with a vendor of "MIT" and a release of "1.0" and one authentication protocol name ("MIT-MAGIC-COOKIE-1"). KCalc reads an AuthenticationRequired reply that requests the MIT-MAGIC-COOKIE-1 authentication protocol. KCalc sends an AuthenticationReply message that contains the magic cookie, which has the correct value as read from /run/user/1000/iceauth_pnzFrq. KCalc reads an Error reply of class AuthenticationRejected, concerning an AuthenticationReply message, of FatalToProtocol severity, with the error message "MIT-MAGIC-COOKIE-1 authentication rejected".

So everything looks fine on the client side (other than the mistake in the socket path handling). I am guessing something has become broken in KSMServer. I am suspicious that the ICE magic cookies for my session both end in runs of NUL bytes rather than being a full 16 random bytes. The XSMP magic cookies do not show such oddity:

> $ iceauth -f /run/user/1000/iceauth_pnzFrq list
> ICE "" local/Crushinator:@/tmp/.ICE-unix/4318 MIT-MAGIC-COOKIE-1 964a27f40bd66d7e0000000000000000
> XSMP "" local/Crushinator:@/tmp/.ICE-unix/4318 MIT-MAGIC-COOKIE-1 74e21845e90e6991e4a896578b14aa16
> ICE "" unix/Crushinator:/tmp/.ICE-unix/4318 MIT-MAGIC-COOKIE-1 3c5396e1498e520f1c00000000000000
> XSMP "" unix/Crushinator:/tmp/.ICE-unix/4318 MIT-MAGIC-COOKIE-1 3c50437513b2ffbfa1c12051b2640c04

Maybe KSMServer is actually expecting magic cookies that haven't had their trailing bytes zeroed out. It could be that there is a use-after-free bug when the iceauth file is being written out during session startup. Perhaps the magic cookie data bytes are getting trashed before they are encoded to hex and written to the file.

Notably, KSMServer also fails to clear the padding bytes in the ICE protocol messages it sends. That's bad for protocol extensibility.
Comment 7 Matt Whitlock 2024-11-06 08:19:31 UTC
Created attachment 175571 [details]
Patch to avoid out-of-bounds access or truncation of magic cookie data in KSMServer

Found the problem. It's actually more laughable than use-after-free. David Edmundson rewrote[1] part of KSMServer to eliminate a dependency on the `iceauth` utility, but he botched the handling of magic cookie data. Specifically, he calls strdup() on a fixed-length character sequence that is not guaranteed to be NUL-terminated and may in fact contain NUL bytes. It's a little amazing that this never crashes KSMServer outright since it is an out-of-bounds memory access.

[1] https://invent.kde.org/plasma/plasma-workspace/-/commit/9398f6cf8933055b31506ba155aef2fc2b3561d7

Attached is an untested patch that should fix the issue. I'm rebuilding plasma-workspace with the patch now and will test it out the next time I restart my Plasma session (which could be weeks from now).
Comment 8 Matt Whitlock 2024-11-06 08:47:46 UTC
Created attachment 175572 [details]
Patch to avoid out-of-bounds access or truncation of magic cookie data in KSMServer

Sorry, I forgot that C++ doesn't allow implicit conversions from void*. Revised patch attached.
Comment 9 David Edmundson 2024-11-06 12:25:09 UTC
Thanks for looking into it. I'm not going to argue I botched it, but I'm not sure your comment contains the whole story either

Data we're copying comes from:
        (*authDataEntries)[i].auth_data = IceGenerateMagicCookie(MAGIC_COOKIE_LEN);

which is documented at:  https://www.x.org/releases/X11R7.7/doc/libICE/ICElib.html

char *IceGenerateMagicCookie(int length);
"The magic cookie returned will be null-terminated."
Comment 10 Matt Whitlock 2024-11-06 12:53:17 UTC
(In reply to David Edmundson from comment #9)
> char *IceGenerateMagicCookie(int length);
> "The magic cookie returned will be null-terminated."

That would explain why it never crashes. No out-of-bounds access after all. Seems bizarre and pointless to NUL-terminate a character sequence that may contain NUL characters since such a sequence could never correctly be treated as a NUL-terminated string.

Presumably this all may also explain why the problem is intermittent. You have a ~11.8% chance of randomly choosing at least one NUL byte among the 32 bytes chosen for the two 16-byte magic cookies. Not quite the "1 out of 5" estimated in Comment #0 but around the right order of magnitude.
Comment 11 Matt Whitlock 2024-11-06 12:56:35 UTC
Actually, there *is* still an out-of-bounds access; it just happens inside IceWriteAuthFileEntry(). You're telling it that auth_data points to an array of auth_data_length==16 characters, yet the actual size of the object allocated by strdup may be less than that. You may be saved only because the minimum size of an object allocated by glibc's heap allocated is 16 bytes.
Comment 12 David Edmundson 2024-11-06 13:11:53 UTC
> Seems bizarre and pointless to NUL-terminate a character sequence that may contain NUL characters since such a sequence could never correctly be treated as a NUL-terminated string.

So bizarre you could hardly blame any dev for making incorrect assumptions!

Everything adds up at least, and your patch makes sense given our new findings.
Do you want to make a MR?  Or I can and write an attribution to you?
Comment 13 Matt Whitlock 2024-11-06 13:21:20 UTC
I don't care about attribution. I posted my patch here in the hopes that it would save other people some frustration and that hopefully a developer with commit access would see it and commit a similar or identical fix. That developer is you, I think. :)
Comment 14 Matt Whitlock 2024-11-06 13:27:57 UTC
And yes, I do agree: reading "The magic cookie returned will be null-terminated" absolutely implies that the cookie won't contain NUL bytes. I would even argue, given that the type of IceAuthFileEntry::auth_data is char* (rather than unsigned char*), that it's reasonable to assume that the auth data would be textual. However, I'd also argue that any cursory check of your implementation's output using the iceauth utility should have revealed that the cookies comprise opaque binary data, not text. Anyway, I really don't care to debate blame; the bug is identified, and a fix is known. Moving on. Thanks for your prompt attention.
Comment 15 Bug Janitor Service 2024-11-06 14:20:14 UTC
A possibly relevant merge request was started @ https://invent.kde.org/plasma/plasma-workspace/-/merge_requests/4895
Comment 16 imaginator 2024-11-06 14:31:05 UTC
(In reply to Matt Whitlock from comment #13)
> I don't care about attribution. I posted my patch here in the hopes that it
> would save other people some frustration and that hopefully a developer with
> commit access would see it and commit a similar or identical fix. That
> developer is you, I think. :)

Modesty is an adornment but...
This wasn't easy to find out and fix.  Thus _I_ think that it's only fair enough the commit message contains a reference to your analysis and patch.  

And to those it may concern:
This is a victory of perseverance by Plasma-users.  So for the future please
- take (more) seriously what users report
- don't close bugs prematurely as "resolved/fixed" unless they really are

Thanks.
Comment 17 Vaclav Masin 2024-11-06 16:00:42 UTC
This is possibly not relevant to this thread and the latest patch here at all but I guess I'm gonna mention it anyway just in case: I've been struggling with this empty session bug using plasma-workspace-5.27.11.1 since right after upgrading to it. What helped somewhat (though not completely) to resolve the issue for me was the patch on this page: https://bugs.gentoo.org/933435 (in comment #1).

The patch on this page doesn't seem to help with my issue at all (tried to apply it twice to the clean plasma-workspace-5.27.11.1 directory every time with the same result).

I guess the issue on this page is about something completely different. Sorry for polluting the discussion if that's really the case - just thought I'll let you know.
Comment 18 imaginator 2024-11-07 14:48:23 UTC
(In reply to Matt Whitlock from comment #8)
> Created attachment 175572 [details]
> Patch to avoid out-of-bounds access or truncation of magic cookie data in
> KSMServer
> 
> Sorry, I forgot that C++ doesn't allow implicit conversions from void*.
> Revised patch attached.

I rebuilt/installed plasma-workspace-5.27.11.1 with your patch applied in addition to an earlier patch for this problem.  Due to the random nature of the bug and its rather seldom occurrence in this version, it's probably too early to say that it's definitely solved.  But so far everything is looking good and session-restore works as it should.  Thanks for your work and effort!
Comment 19 Kishore Gopalakrishnan 2024-11-09 09:46:52 UTC
*** Bug 491421 has been marked as a duplicate of this bug. ***
Comment 20 David Edmundson 2024-11-13 09:31:16 UTC
Git commit 2c2f7a848006361a1e9d719de05d2ffcc633205c by David Edmundson.
Committed on 13/11/2024 at 09:31.
Pushed by davidedmundson into branch 'master'.

ksmserver: Fix copying of auth data



Despite IceGenerateMagicCookie returning null terminated char*, the data
itself is pure data may contain null bytes. For this reason we must copy
the entire data_length, and not use string functions.

This fixes a bug where 1 in 10 logins would not have working session
management if the generated data happened to contain a null byte.

Thanks to Matt Whitlock kde@mattwhitlock.name

M  +6    -2    ksmserver/server.cpp

https://invent.kde.org/plasma/plasma-workspace/-/commit/2c2f7a848006361a1e9d719de05d2ffcc633205c
Comment 21 Aleksey Kontsevich 2024-11-13 11:16:16 UTC
What version it will be fixed? 6.3?
Comment 22 David Edmundson 2024-11-18 14:48:59 UTC
Git commit 8ee3defdf81e9b5ad3edef498d04fca0f4289e4a by David Edmundson, on behalf of David Edmundson.
Committed on 18/11/2024 at 10:05.
Pushed by davidedmundson into branch 'Plasma/6.2'.

ksmserver: Fix copying of auth data



Despite IceGenerateMagicCookie returning null terminated char*, the data
itself is pure data may contain null bytes. For this reason we must copy
the entire data_length, and not use string functions.

This fixes a bug where 1 in 10 logins would not have working session
management if the generated data happened to contain a null byte.

Thanks to Matt Whitlock kde@mattwhitlock.name


(cherry picked from commit 2c2f7a848006361a1e9d719de05d2ffcc633205c)

c567ada7 ksmserver: Fix copying of auth data

Co-authored-by: David Edmundson <kde@davidedmundson.co.uk>

M  +6    -2    ksmserver/server.cpp

https://invent.kde.org/plasma/plasma-workspace/-/commit/8ee3defdf81e9b5ad3edef498d04fca0f4289e4a
Comment 23 Kai Uwe Broulik 2024-11-20 08:35:39 UTC
Git commit a336dd70bf51cf4d9be8cf1b66b4819b78850838 by Kai Uwe Broulik.
Committed on 20/11/2024 at 07:52.
Pushed by broulik into branch 'Plasma/5.27'.

ksmserver: Fix copying of auth data



Despite IceGenerateMagicCookie returning null terminated char*, the data
itself is pure data may contain null bytes. For this reason we must copy
the entire data_length, and not use string functions.

This fixes a bug where 1 in 10 logins would not have working session
management if the generated data happened to contain a null byte.

Thanks to Matt Whitlock kde@mattwhitlock.name


(cherry picked from commit 2c2f7a848006361a1e9d719de05d2ffcc633205c)

c567ada7 ksmserver: Fix copying of auth data

Co-authored-by: David Edmundson <kde@davidedmundson.co.uk>

(cherry picked from commit 8ee3defdf81e9b5ad3edef498d04fca0f4289e4a)

Co-authored-by: David Edmundson <david@davidedmundson.co.uk>

M  +6    -2    ksmserver/server.cpp

https://invent.kde.org/plasma/plasma-workspace/-/commit/a336dd70bf51cf4d9be8cf1b66b4819b78850838