Bug 456298

Summary: UI freezes and kwin_wayland goes into an infinite loop when turning on TV screen with VR headset connected
Product: [Plasma] kwin
Reporter: Maciej Stanczew <maciej.stanczew>
Component: wayland-generic
Assignee: KWin default assignee <kwin-bugs-null>
Status: RESOLVED FIXED
Severity: normal
CC: nate, xaver.hugl
Priority: NOR
Keywords: regression
Version: 5.25.2
Target Milestone: ---
Platform: Arch Linux
OS: Linux
See Also: https://bugs.kde.org/show_bug.cgi?id=456369
Latest Commit:
Version Fixed In: 5.25.3
Attachments:
'journalctl -b | grep kwin_wayland_drm' for 5.25.1 and 5.25.2
journal for 5.25.2 with VR headset unplugged
journal for 5.25.2 + MR 2600

Description Maciej Stanczew 2022-07-04 01:03:16 UTC
Created attachment 150374
'journalctl -b | grep kwin_wayland_drm' for 5.25.1 and 5.25.2

SUMMARY
I have an LG TV connected to an AMD Radeon RX 5700 XT through HDMI. Whenever the TV auto-powers off and I then turn it back on (with the remote), the UI freezes and becomes completely unresponsive: music keeps playing, but everything on the graphical and input side is frozen. I can SSH into the system and see that kwin_wayland is consuming 100% of the CPU and endlessly printing log messages about DRM objects (attached).

At first I thought it was bug 455814; however, a) the patch (merge request 2602) did not help, and b) my issue was only introduced in 5.25.2 (it works fine in 5.25.1).

I'm in the process of running a bisection, but it takes some time: the issue does not seem to reproduce if I turn off the TV manually, only if it turns off automatically after a set time due to power saving (on the TV's side, not the PC's).

In the meantime I have attached two logs. In the log from 5.25.2 (TV turned on at 14:57:15) we can see that kwin_wayland gets stuck in a loop until I manually kill the process (at 15:05:15). In 5.25.1, on the other hand (TV turned on at 02:13:20), it initially looks similar, but then we get "Applying KScreen config failed!", a 7-second pause, and then the screen is detected correctly and the system is functional.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Arch Linux
KDE Plasma Version: 5.25.2
KDE Frameworks Version: 5.95.0
Qt Version: 5.15.5
Kernel Version: 5.18.8
CPU: AMD Ryzen 3700X
GPU: AMD Radeon RX 5700 XT
Display: LG C1
Comment 1 Maciej Stanczew 2022-07-04 19:26:38 UTC
Created attachment 150395
journal for 5.25.2 with VR headset unplugged

Looking through the commits between 5.25.1 and 5.25.2, I see mentions of setups with VR headsets, and I do in fact have a Valve Index connected to the PC.

As a test I unplugged the headset, and the issue disappeared. Now there are just two short sets of messages in the logs (attached): one about the TV output disappearing (18:20:51 = when the power-on button is pressed on the remote), and one about the output reappearing a couple of seconds later (18:20:58 = when the TV actually turns on).

Bisection (with the VR headset connected) was a bit problematic, because after commit ea8b0d962 and before 3bac0cd07 kwin would just hang on boot; I had to either not include any of those commits, or include both of them. Ultimately I confirmed that commit cb5981a16 (= v5.25.1) works (despite still printing some "Atomic test for (0) failed! Invalid argument" messages), while commit ea8b0d962 with cherry-picked 3bac0cd07 doesn't work (leads to the infinite loop in kwin_wayland).
Comment 2 Zamundaaa 2022-07-05 10:53:34 UTC
Can you test https://invent.kde.org/plasma/kwin/-/merge_requests/2600?
Comment 3 Maciej Stanczew 2022-07-05 19:02:16 UTC
Created attachment 150423
journal for 5.25.2 + MR 2600

Yup, with commit ecff4a2c43e7c10dfa709c4ca71cf1a37dae6121 applied on top of 5.25.2, the issue is gone.
I'm attaching journal logs for completeness; the TV power-on button was pressed at 17:43:24 and then again at 20:52:25.
Thank you.

(I assume I should wait for the change to be merged before marking this as resolved, right?)
Comment 4 Zamundaaa 2022-07-05 19:23:24 UTC
Git commit a71146c999728b06654ba247644b6b22609f0dff by Xaver Hugl.
Committed on 05/07/2022 at 19:11.
Pushed by zamundaaa into branch 'master'.

backends/drm: don't remove connectors the kernel doesn't consider removed

Removing connectors that are still powered leads to a mismatch in atomic
commits: the crtc is still powered, but the connector also still there.
If KWin tries to disable the crtc afterwards, the atomic commits fail because
the connector needs to be disabled at the same time and it's missing from the
atomic commit request.

To fix this, whenever we fail to fetch information or get wrong data from
the kernel (like 0 modes), use the cached information instead and keep the
connector.

M  +22   -28   src/backends/drm/drm_gpu.cpp
M  +5    -4    src/backends/drm/drm_object_connector.cpp

https://invent.kde.org/plasma/kwin/commit/a71146c999728b06654ba247644b6b22609f0dff
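
For readers following along, the rule described in the commit message above can be sketched roughly as follows. This is only an illustration, not the actual KWin code: ConnectorInfo, Connector, RefreshResult and refresh() are hypothetical names made up for this sketch, and a failed kernel query is modeled as std::nullopt. The assumption is that a connector is dropped only when the kernel itself reports it as disconnected; a failed query or implausible data (such as 0 modes) leaves the cached state in place.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for what the kernel reports about a connector.
struct ConnectorInfo {
    bool connected = false;
    std::vector<std::string> modes;
};

enum class RefreshResult { Updated, KeptCached, Removed };

// Hypothetical connector object that remembers the last good state.
struct Connector {
    ConnectorInfo cached;

    // 'fresh' is std::nullopt when fetching information from the kernel failed.
    RefreshResult refresh(const std::optional<ConnectorInfo> &fresh) {
        if (!fresh) {
            // Query failed: keep the connector and its cached information.
            return RefreshResult::KeptCached;
        }
        if (!fresh->connected) {
            // The kernel itself considers the connector gone: safe to remove.
            cached = *fresh;
            return RefreshResult::Removed;
        }
        if (fresh->modes.empty()) {
            // Connected but 0 modes: treat as wrong data, keep the cache.
            return RefreshResult::KeptCached;
        }
        cached = *fresh;
        return RefreshResult::Updated;
    }
};
```

With this shape, a caller would take a connector out of its output list only on RefreshResult::Removed, so a crtc that is still powered never loses its connector from later atomic commit requests.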
Comment 5 Zamundaaa 2022-07-05 19:36:34 UTC
Git commit 913ca1d6e8cf06602816cd3c7f26b0c2a47bdbd7 by Xaver Hugl.
Committed on 05/07/2022 at 19:36.
Pushed by zamundaaa into branch 'Plasma/5.25'.

backends/drm: don't remove connectors the kernel doesn't consider removed

Removing connectors that are still powered leads to a mismatch in atomic
commits: the crtc is still powered, but the connector also still there.
If KWin tries to disable the crtc afterwards, the atomic commits fail because
the connector needs to be disabled at the same time and it's missing from the
atomic commit request.

To fix this, whenever we fail to fetch information or get wrong data from
the kernel (like 0 modes), use the cached information instead and keep the
connector.


(cherry picked from commit a71146c999728b06654ba247644b6b22609f0dff)

M  +22   -28   src/backends/drm/drm_gpu.cpp
M  +5    -4    src/backends/drm/drm_object_connector.cpp

https://invent.kde.org/plasma/kwin/commit/913ca1d6e8cf06602816cd3c7f26b0c2a47bdbd7