516038 – kwin_wayland loses DRM master during S3 suspend and never re-acquires it on resume, resulting in a permanent black screen. The GPU and NVIDIA kernel modules are fully functional after resume (proven by SysRq test), but kwin enters an unrecoverable error l

Status:	REPORTED

Alias:	None

Product:	kwin
Classification:	Plasma
Component:	general (other bugs)
Version First Reported In:	unspecified
Platform:	Fedora RPMs Linux

Importance:	NOR major
Target Milestone:	---
Assignee:	KWin default assignee

Reported:	2026-02-15 14:11 UTC by thedarkbird
Modified:	2026-02-15 14:32 UTC (History)
CC List:	0 users

Description thedarkbird 2026-02-15 14:11:38 UTC

Please note that I used Claude AI to analyze this problem, but I did so while thinking along with it and asking critical questions over the course of several days. I do not have the extensive Linux knowledge required for this kind of analysis (although I understand the basic functionality of a Linux system). Its final conclusion made enough sense to me to post it here.

What follows below is a Claude-generated summary of the issue:

SYSTEM INFORMATION

Distro: Fedora 43
Plasma: 6.5
KWin: 6.5.5 (Wayland)
Kernel: 6.18.9-200.fc43.x86_64
GPU: NVIDIA RTX 4080 (proprietary driver 580.119.02, open kernel modules)
CPU/iGPU: AMD Ryzen 7000 (Raphael iGPU, no displays connected)
Displays: DP-1 + HDMI-A-1, both on NVIDIA GPU
DRM devices: card1 = NVIDIA (pci-0000:01:00.0), card2 = amdgpu (pci-0000:14:00.0)
Sleep mode: S3 deep sleep ([deep] in /sys/power/mem_sleep)
Initramfs: NVIDIA modules are included so that the LUKS password prompt displays correctly
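
For anyone reproducing this, the active sleep mode can be confirmed by reading /sys/power/mem_sleep, where the kernel marks the selected mode with square brackets. The following is a minimal sketch of that check; the function name active_mem_sleep_mode is made up for illustration.

```python
from typing import Optional


def active_mem_sleep_mode(contents: str) -> Optional[str]:
    """Return the bracketed (currently selected) mode from the text of
    /sys/power/mem_sleep, e.g. "s2idle [deep]" -> "deep"."""
    for token in contents.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # No bracketed entry found (empty or unexpected file contents).
    return None
```

On the affected machine, passing the file contents (for example via pathlib.Path("/sys/power/mem_sleep").read_text()) should yield "deep" if the setup matches the one described here.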

STEPS TO REPRODUCE

1. Boot normally, log into Plasma Wayland session
2. Let the system suspend to RAM (S3 deep sleep) on its own after the configured idle timeout, rather than triggering the suspend manually
3. Wake the system (press power button or keyboard)

EXPECTED BEHAVIOR

Displays turn on, lock screen appears, session resumes normally.

ACTUAL BEHAVIOR

System wakes (fans, disks spin up), but both displays remain permanently black. Ctrl+Alt+F3 (VT switch) also produces no output. The system is otherwise alive (accessible via SSH). Without intervention, a hard reboot is required.

KEY FINDING: GPU IS FUNCTIONAL AFTER RESUME

Pressing Alt+SysRq+REISUB after a failed resume brings the display back at the S (sync) step. By that point, E (SIGTERM) and I (SIGKILL) have killed every userspace process except init, including kwin_wayland. The kernel reclaims DRM master and fbcon takes over the display successfully.

This proves the GPU hardware and NVIDIA kernel modules are fully functional after resume. The failure is in kwin_wayland, not in the kernel driver.
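
Reproducing the SysRq test requires that the relevant SysRq functions are enabled in /proc/sys/kernel/sysrq, which is not the case on every installation. The sketch below decodes that value using the bit assignments from the kernel's SysRq documentation; the helper name allowed_keys and the table layout are illustrative assumptions, not part of any existing tool.

```python
from typing import Dict

# Bit values as listed in the kernel's admin-guide SysRq documentation.
SYSRQ_KEY_BITS = {
    "R": 0x04,  # keyboard control (unraw)
    "E": 0x40,  # signalling of processes (SIGTERM)
    "I": 0x40,  # signalling of processes (SIGKILL)
    "S": 0x10,  # sync
    "U": 0x20,  # remount read-only
    "B": 0x80,  # reboot/poweroff
}


def allowed_keys(sysrq_value: int, sequence: str = "REISUB") -> Dict[str, bool]:
    """Map each key in `sequence` to whether the given kernel.sysrq value
    permits it. 0 disables everything, 1 enables everything, and any other
    value is treated as a bitmask."""
    if sysrq_value == 1:
        return {key: True for key in sequence}
    return {key: bool(sysrq_value & SYSRQ_KEY_BITS[key]) for key in sequence}
```

If the result for the local value shows E or I as False, the kill steps of REISUB are ignored and the test above will not behave as described until kernel.sysrq is raised.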

JOURNAL EVIDENCE

Resume timeline (journalctl -b -1):

14:09:18 nvidia-suspend.service runs successfully
14:09:19 System enters S3 deep sleep
14:31:05 System wakes — kernel resumes, CPUs come back online
14:31:05 amdgpu resumes normally (no displays connected, expected)
14:31:06 session-2.scope thawed — kwin_wayland is unfrozen
14:31:06 kwin_wayland: Failed to open drm node: "/dev/dri/card0" (card0 doesn't exist, harmless)
14:31:06 nvidia-resume.service starts
14:31:06 kwin_wayland: Atomic modeset test failed! Permission denied <-- FIRST FAILURE
14:31:06 kwin_wayland: Applying output configuration failed!
14:31:06 nvidia-resume.service finishes successfully
14:31:06 kwin_wayland: Setting dpms mode failed!
14:31:15 Hundreds of "Atomic modeset test failed! Permission denied" — never recovers
14:32:20 Still spamming errors — kwin is permanently stuck
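
To put numbers on the timeline, the intervals can be computed directly from the timestamps above. This is a small sketch that assumes both timestamps fall on the same day; seconds_between is a hypothetical helper name.

```python
from datetime import datetime


def seconds_between(start: str, end: str) -> int:
    """Whole seconds between two HH:MM:SS timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# Time spent in S3: from entering sleep to wake-up.
asleep = seconds_between("14:09:19", "14:31:05")
# How long the errors were still being logged after nvidia-resume.service
# had finished (14:31:06) up to the last timeline entry (14:32:20).
still_failing = seconds_between("14:31:06", "14:32:20")
```

With the values from this timeline, the machine slept for 1306 seconds, and kwin was still failing 74 seconds after nvidia-resume.service had completed.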

Relevant kwin_wayland messages:

kwin_wayland[2972]: Failed to open drm node: "/dev/dri/card0"
kwin_wayland[2972]: Failed to open drm node: "/dev/dri/card0"
kwin_wayland[2972]: Atomic modeset test failed! Permission denied
kwin_wayland[2972]: Applying output configuration failed!
kwin_wayland[2972]: Atomic modeset test failed! Permission denied
kwin_wayland[2972]: Setting dpms mode failed!
(repeats hundreds of times, never recovers)
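
Because the errors repeat hundreds of times, it may help to count each signature and note where it first appears rather than reading the raw journal. The sketch below does that for lines in the format shown above; the SAMPLE list is copied from this report, while summarize and the regular expression are illustrative assumptions about the line layout.

```python
import re
from collections import Counter

# Message fragments taken from the journal excerpt in this report.
SIGNATURES = (
    "Failed to open drm node",
    "Atomic modeset test failed! Permission denied",
    "Applying output configuration failed!",
    "Setting dpms mode failed!",
)

# Matches "kwin_wayland[PID]: message" anywhere in a journal line.
LINE_RE = re.compile(r"kwin_wayland\[(?P<pid>\d+)\]: (?P<msg>.*)")

SAMPLE = [
    'kwin_wayland[2972]: Failed to open drm node: "/dev/dri/card0"',
    'kwin_wayland[2972]: Failed to open drm node: "/dev/dri/card0"',
    "kwin_wayland[2972]: Atomic modeset test failed! Permission denied",
    "kwin_wayland[2972]: Applying output configuration failed!",
    "kwin_wayland[2972]: Atomic modeset test failed! Permission denied",
    "kwin_wayland[2972]: Setting dpms mode failed!",
]


def summarize(lines):
    """Return (counts, first_seen): how often each signature occurs and the
    index of the line where it first appears."""
    counts = Counter()
    first_seen = {}
    for index, line in enumerate(lines):
        match = LINE_RE.search(line)
        if not match:
            continue  # not a kwin_wayland line
        for signature in SIGNATURES:
            if signature in match.group("msg"):
                counts[signature] += 1
                first_seen.setdefault(signature, index)
    return counts, first_seen


counts, first_seen = summarize(SAMPLE)
```

Feeding the full output of journalctl -b -1 through summarize instead of SAMPLE should show whether any message other than these four appears after the first "Permission denied", which would hint at a recovery attempt.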

logind only logs "Operation 'suspend' finished." — there is no evidence of DRM master being re-granted to the session.

nvidia-resume.service ran and completed successfully. The NVIDIA kernel driver resumed without errors.

ANALYSIS

The "Permission denied" error from drmModeAtomicCommit() indicates kwin has lost DRM master status during S3 suspend. Two problems prevent recovery:

1. DRM master is not re-granted after resume. logind does not appear to re-issue DRM master to the active session's kwin instance after S3 resume completes.

2. kwin has no recovery mechanism. Once the first atomic modeset fails, kwin enters an infinite error loop, retrying the same failing operation without ever attempting to re-acquire DRM master. A fresh kwin instance (started after the old one is killed) acquires DRM master from logind without issues.

There is also a possible race condition: kwin is unfrozen and attempts modesetting at the same moment nvidia-resume.service is still running. However, the errors persist long after nvidia-resume.service completes, so the race is at most a trigger — the lack of DRM master recovery is the root cause.
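
The recovery behaviour that appears to be missing can be sketched abstractly: when a commit fails with "Permission denied", try to get DRM master back before retrying, instead of repeating the same commit. The code below is only a model of that idea, not kwin's actual code and not a proposed patch. commit_with_recovery, FakeDevice, and the reacquire_master callback are hypothetical names; in a real compositor the re-acquisition would presumably go through logind rather than a direct call.

```python
import errno
import time


def commit_with_recovery(commit, reacquire_master, attempts=3, delay=0.0):
    """Call commit(); on a permission error, call reacquire_master() and
    retry, up to `attempts` commits in total. Re-raises the last error if
    every attempt fails."""
    for attempt in range(attempts):
        try:
            return commit()
        except PermissionError:
            if attempt == attempts - 1:
                raise
            reacquire_master()
            time.sleep(delay)  # optional back-off between attempts


class FakeDevice:
    """Stand-in for a DRM device that has lost master (illustration only)."""

    def __init__(self):
        self.master = False
        self.reacquired = 0

    def commit(self):
        if not self.master:
            raise PermissionError(errno.EACCES, "Permission denied")
        return "ok"

    def reacquire(self):
        self.reacquired += 1
        self.master = True


device = FakeDevice()
result = commit_with_recovery(device.commit, device.reacquire)
```

In this model a single re-acquisition is enough for the commit to succeed, which mirrors the observation that a freshly started kwin works immediately. Whether the real fix belongs in kwin, in logind, or in how the two interact after resume is not something this sketch can answer.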

WHY THIS IS A KWIN BUG (NOT NVIDIA)

- The SysRq test proves the GPU and nvidia-drm kernel module are fully operational after resume — fbcon can drive the displays via the same hardware.
- A freshly started kwin_wayland (after killing the stuck one) acquires DRM master and works perfectly.
- The failure is kwin not recovering from a lost DRM master state, regardless of why the DRM master was lost.

Bug 477738 was closed as RESOLVED DOWNSTREAM, attributing this to NVIDIA. The SysRq evidence contradicts that conclusion — the kernel driver works, but kwin does not attempt to re-acquire DRM master when it loses it during suspend.

RELATED BUGS

Bug 477738 — Same error signature ("Atomic commit failed! Permission denied" after resume). Closed DOWNSTREAM. The SysRq evidence shows the issue is in kwin's lack of DRM master recovery.

Bug 509439 — Fixed in KWin 6.5.0 (EGL context handling on resume). We run 6.5.5; this fix is present but insufficient.

Bug 478090 — Fixed in Plasma 6.3.1 (lock screen black screen). Present in our version, not our issue.

WORKAROUND

Pressing Alt+SysRq+E sends SIGTERM to all userspace processes except init. SDDM restarts, a fresh kwin acquires DRM master, and the session can be restored (unsaved work is lost). This avoids a hard reboot, but it is a technical workaround rather than a practical one.