Bug 473734 - Screen not restored after awakening from hibernate
Summary: Screen not restored after awakening from hibernate
Status: RESOLVED UPSTREAM
Alias: None
Product: kwin
Classification: Plasma
Component: wayland-generic (other bugs)
Version First Reported In: 5.27.7
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KWin default assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-08-24 20:17 UTC by sk.griffinix
Modified: 2023-09-11 21:43 UTC (History)

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:


Attachments
journalctl log (120.13 KB, text/plain)
2023-08-24 20:17 UTC, sk.griffinix
dmsg log with bug NOT Reproduced (2.20 MB, text/x-log)
2023-08-25 18:00 UTC, sk.griffinix
dmesg log with bug reproduced (96.13 KB, text/x-log)
2023-08-31 11:51 UTC, sk.griffinix

Description sk.griffinix 2023-08-24 20:17:15 UTC
Created attachment 161164 [details]
journalctl log

Created in response to bug https://bugs.kde.org/show_bug.cgi?id=473235 being labelled as not a bug.

In many cases after waking from hibernation or sleep, I get a blank screen. In past KDE versions, switching tty sessions usually solved the issue: switching back and forth once would lead to a screen with artefacts, and repeating it a couple of times restored the display. In the latest version, no amount of switching helps, since I only ever get a screen with artefacts. The only solutions appear to be restarting kwin or resetting the AMD driver, both of which end the applications running in that session (which is undesirable).

STEPS TO REPRODUCE
1. Hibernate the Plasma session
2. Power the computer back on to resume from hibernation

OBSERVED RESULT
The screen is blank. On switching between ttys, screen artefacts appear (unusable, with horizontal banding across the entire screen). Resolved by restarting kwin. Reproducible about 50% of the time. No particular trigger or cause that I could identify.

EXPECTED RESULT
Everything should be restored as normal

SOFTWARE/OS VERSIONS
Linux Kernel Version: 6.4.6-1-MANJARO (64-bit)
AMD driver: amdgpu
Mesa version: 23.0.2-2
KDE Frameworks Version: 5.108.0
Qt Version: 5.15.10
GPU: Advanced Micro Devices, Inc. [AMD/ATI] Bonaire XTX [Radeon R7 260X/360]

ADDITIONAL INFORMATION
Attaching the journalctl log from a few minutes before hibernating until the plasma session was restored. Awakening from hibernate starts at Aug 25 01:24:56.
Comment 1 Zamundaaa 2023-08-25 11:35:43 UTC
Please enable drm debug logging with
> echo 0x1FF | sudo tee /sys/module/drm/parameters/debug
start writing the log with
> sudo dmesg -w > dmesg.log
then reproduce the problem and stop the last command. The resulting log file might contain information telling us what's happening.

Note that the logging is *very* verbose, so if reproducing the problem doesn't work the first time, it's best to stop the dmesg command and start it again before trying again, so that it's easier to find the relevant parts of the log.
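
As a rough sketch of the procedure above, the debug mask can be set for a single reproduction attempt and then put back to its previous value, which keeps the extra logging confined to the run that matters. The helper name `drm_debug` and its `path` argument are made up for illustration; only the sysfs path and the 0x1FF value come from the instructions above, and writing to that file needs root.

```python
import contextlib
from pathlib import Path

# Path taken from the instructions above; writing to it requires root.
DRM_DEBUG_PARAM = Path("/sys/module/drm/parameters/debug")


@contextlib.contextmanager
def drm_debug(mask=0x1FF, path=DRM_DEBUG_PARAM):
    """Hypothetical helper: set the drm debug mask, restore the old value on exit."""
    path = Path(path)
    previous = path.read_text().strip()  # remember the current setting
    path.write_text(f"{mask:#x}\n")      # written as hex, e.g. "0x1ff"
    try:
        yield previous
    finally:
        path.write_text(previous + "\n")  # put the earlier setting back
```

One possible use, run as root while `sudo dmesg -w > dmesg.log` is active in another terminal, is `with drm_debug(): input("Reproduce the problem, then press Enter")`.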
Comment 2 sk.griffinix 2023-08-25 17:37:01 UTC
(In reply to Zamundaaa from comment #1)
> Please enable drm debug logging with
> > echo 0x1FF | sudo tee /sys/module/drm/parameters/debug
> start writing the log with
> > sudo dmesg -w > dmesg.log
> then reproduce the problem and stop the last command. The resulting log file
> might contain information telling us what's happening.
> 
> Note that the logging is *very* verbose, so if reproducing the problem
> doesn't work the first time, it's best to stop the dmesg command and start
> it again before trying again, so that it's easier to find the relevant parts
> of the log.

So if I am understanding correctly, I should hibernate after running the above commands so that the log gets written, and upload the log file if I hit the bug. OK, I will get on with it. One question, though. On running `modinfo -p drm`, I get a bunch of codes but 0x1FF is not one of them. What does this code do?
Comment 3 Zamundaaa 2023-08-25 17:47:43 UTC
Yes, that is correct.

> On running `modinfo -p drm`, I get a bunch of codes but 0x1FF is not one of them. What does this code do?
0x1FF is just the combination of all the flags, with all bits set. The goal is to get all drm-related logging from the kernel, so that we don't miss anything that might be relevant.
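
To illustrate how single-bit flags combine into one value, here is a small sketch. The category names are written from memory of the kernel's drm debug documentation and should be treated as an assumption; `modinfo -p drm` on the affected machine is the authoritative list. The function names `combine` and `categories` are made up for this example.

```python
# Category names as recalled from the kernel's drm debug documentation;
# treat the exact names as an assumption and compare with `modinfo -p drm`.
DRM_DEBUG_BITS = {
    "CORE": 0x001,
    "DRIVER": 0x002,
    "KMS": 0x004,
    "PRIME": 0x008,
    "ATOMIC": 0x010,
    "VBL": 0x020,
    "STATE": 0x040,
    "LEASE": 0x080,
    "DP": 0x100,
}


def combine(flags):
    """OR the bits of the named categories into a single mask value."""
    mask = 0
    for name in flags:
        mask |= DRM_DEBUG_BITS[name]
    return mask


def categories(mask):
    """List the category names whose bit is set in mask."""
    return [name for name, bit in DRM_DEBUG_BITS.items() if mask & bit]
```

With these nine bits, `combine(DRM_DEBUG_BITS)` equals 0x1FF, which is why that value does not show up as an individual code. A narrower mask such as `combine(["DRIVER", "KMS"])` would produce less output, at the risk of missing something relevant.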
Comment 4 sk.griffinix 2023-08-25 18:00:57 UTC
Created attachment 161180 [details]
dmsg log with bug NOT Reproduced

I tried a couple of times but couldn't reproduce the bug. I am attaching the dmesg log in case it's needed for comparison with a log where the bug is reproduced. It's really very verbose.
Comment 5 sk.griffinix 2023-08-31 11:51:57 UTC
Created attachment 161308 [details]
dmesg log with bug reproduced

Had this bug today after hibernating.
Comment 6 sk.griffinix 2023-09-06 16:15:53 UTC
Should I provide any further info? Had this issue again and lost data again. This is making it very difficult for me to use KDE for any serious endeavour. If nothing else, is there some way to get rid of the flickering horizontal lines (restarting kwin, resetting the graphics drivers, or any other method) that does not close all running programs?
Comment 7 Zamundaaa 2023-09-06 18:43:35 UTC
Sorry, it's been a busy week and I haven't been able to go through all my emails yet. The most suspicious things in your log are
> [35707.959388] [drm] Fence fallback timer expired on ring sdma0
a bunch of times (which also happens once in your first log, so not necessarily related) and
> [35707.977814] [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* channel eq failed: 5 tries
> [35707.978425] [drm:amdgpu_atombios_dp_link_train [amdgpu]] *ERROR* channel eq failed
which does sound related. Link training is about creating a working connection with your monitor, so if that doesn't work properly it could definitely cause issues like you describe.
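
For anyone going through a similarly verbose log, a filter along these lines can pull out the two kinds of messages quoted above. This is only a sketch: the function name `suspicious_lines` is invented here, and the substrings it searches for are simply copied from the quoted lines, not an exhaustive list of what can go wrong.

```python
import re

# Substrings copied from the two kinds of messages quoted above.
SUSPICIOUS = ("Fence fallback timer expired", "*ERROR*")

# dmesg timestamp prefix such as "[35707.977814] "
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def suspicious_lines(lines):
    """Yield (timestamp, message) for log lines containing a suspicious substring."""
    for line in lines:
        match = TIMESTAMP.match(line.strip())
        if not match:
            continue  # skip lines without a dmesg timestamp
        stamp, message = match.groups()
        if any(marker in message for marker in SUSPICIOUS):
            yield float(stamp), message
```

It accepts any iterable of lines, so an open file object for `dmesg.log` can be passed in directly.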

Please put `KWIN_DRM_NO_AMS=1` into /etc/environment, reboot and then check if your workaround starts working again with that.
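
A quick way to confirm the line was added correctly is sketched below. It assumes /etc/environment holds only plain `KEY=VALUE` lines and `#` comments, and it is not a full parser for that file's syntax; the function names `parse_environment` and `has_no_ams` are hypothetical.

```python
def parse_environment(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    variables = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        variables[key.strip()] = value.strip().strip('"')
    return variables


def has_no_ams(text):
    """Return True if the given file content sets KWIN_DRM_NO_AMS to 1."""
    return parse_environment(text).get("KWIN_DRM_NO_AMS") == "1"
```

Calling `has_no_ams(open("/etc/environment").read())` checks the file itself; the variable still only takes effect after the reboot mentioned above.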
Comment 8 sk.griffinix 2023-09-09 05:58:59 UTC
(In reply to Zamundaaa from comment #7)
> Please put `KWIN_DRM_NO_AMS=1` into /etc/environment, reboot and then check if your workaround starts working again with that

The workaround of switching ttys to restore the screen didn't work after adding the above variable.
Comment 9 Zamundaaa 2023-09-10 13:39:44 UTC
Okay, then I'm pretty certain that it's not caused by something that KWin does wrong. Please report the bug at https://gitlab.freedesktop.org/drm/amd/-/issues, hopefully they can help fix the underlying kernel problem. It should be helpful for them if you attach the dmesg log from here as well.
Comment 10 sk.griffinix 2023-09-11 21:43:13 UTC
Submitted to https://gitlab.freedesktop.org/drm/amd/-/issues/2845