Bug 486735 - System completely locking down "randomly?" when using an eGPU
Summary: System completely locking down "randomly?" when using an eGPU
Status: RESOLVED UPSTREAM
Alias: None
Product: kwin
Classification: Plasma
Component: core (other bugs)
Version First Reported In: 6.0.4
Platform: Fedora RPMs Linux
Importance: NOR grave
Target Milestone: ---
Assignee: KWin default assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-05-07 15:20 UTC by Lachlan Craig
Modified: 2024-05-09 09:54 UTC

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:


Attachments
Some (maybe useful) logs captured via sudo journalctl -b-1 | tail -n50 after a crash. (7.63 KB, text/plain)
2024-05-07 15:20 UTC, Lachlan Craig
Description Lachlan Craig 2024-05-07 15:20:13 UTC
Created attachment 169278
Some (maybe useful) logs captured via sudo journalctl -b-1 | tail -n50 after a crash.

SUMMARY
System completely locking down "randomly?" when using an eGPU


STEPS TO REPRODUCE
1. Use KDE Plasma 6 Wayland
2. Have an eGPU connected (for me, it's an AMD RX580 connected via TB3)
3. Wait. After an unpredictable and possibly variable amount of time, the whole system freezes completely.

OBSERVED RESULT
When using my eGPU (with the laptop screen turned off), KDE Plasma (kwin) runs fine, but I had to set KWIN_DRM_DEVICES in /etc/environment to make kwin use the eGPU by default and get the full 144 fps.
However, at random times (I have not found any pattern), the system locks up completely:

    the external screen stays on, but is completely frozen

    NO input is taken (not even the REISUB method works)

    even SSH doesn't connect

I have to force a shutdown of my laptop by long-pressing the power button.
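
For reference, a minimal sketch of the KWIN_DRM_DEVICES setting mentioned above. The variable takes a colon-separated list of DRM device nodes, with the first entry used as the primary GPU. The helper name join_drm_devices is hypothetical, and the card numbering (card1 as the eGPU, card0 as the iGPU) is an assumption that varies between systems and boots; it is not taken from this report.

```shell
# Hypothetical helper: join DRM device nodes into the colon-separated
# form that KWIN_DRM_DEVICES expects (first entry = primary GPU).
# The function body runs in a subshell so IFS is not changed globally.
join_drm_devices() (
    IFS=':'
    printf '%s\n' "$*"
)

# Assumption: card1 is the eGPU and card0 is the iGPU.
# Check the actual mapping first, e.g. with: ls -l /dev/dri/by-path/
join_drm_devices /dev/dri/card1 /dev/dri/card0

# Resulting line to place in /etc/environment (under the same assumption):
# KWIN_DRM_DEVICES=/dev/dri/card1:/dev/dri/card0
```

The change only takes effect after logging out and back in (or rebooting), since /etc/environment is read at session start.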

EXPECTED RESULT

The system shouldn't freeze / lock down at all.


SOFTWARE/OS VERSIONS
Windows: N/A
macOS: N/A
Linux/KDE Plasma: Fedora 40 Linux 6.8.8-300
(available in About System)
KDE Plasma Version: 6.0.4
KDE Frameworks Version: 6.1.0
Qt Version: 6.7.0

ADDITIONAL INFORMATION

Full Specs :

i5-1335U
16GB DDR5
1TB NVMe
Intel Iris Xe (iGPU) -> 1920x1200 @ 60 Hz laptop screen
AMD RX580 (eGPU) -> connected via HDMI to a 1920x1080 @ 144 Hz external display

The eGPU is connected via Thunderbolt 3 (TB3)
Comment 1 Zamundaaa 2024-05-07 21:35:26 UTC
> Some (maybe useful) logs captured via sudo journalctl -b-1 | tail -n50 after a crash
Unfortunately afaict there's nothing that would even hint towards a crash in there.

> NO input is taken (not even the REISUB method works)
> even SSH doesn't connect
In that case, it's some sort of kernel bug. As it only happens with the eGPU connected, https://gitlab.freedesktop.org/drm/amd/-/issues is likely the best place to find answers.
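
When reporting there, kernel-only messages from the boot that froze are usually more useful than the general journal tail. A rough sketch of how they could be collected after the forced reboot follows; the helper name filter_gpu_lines and the keyword list are assumptions chosen for illustration, not an official triage procedure. Note that on a hard lockup the last messages may never have been written to disk, so the output can be empty or incomplete.

```shell
# Hypothetical helper: keep only lines that mention the GPU driver,
# the DRM subsystem, or the Thunderbolt/PCIe link (keywords are a guess).
filter_gpu_lines() {
    grep -iE 'amdgpu|drm|thunderbolt|pcieport'
}

# Kernel messages (-k) from the previous boot (-b -1), i.e. the boot
# that ended in the freeze. Guarded so the sketch does nothing on
# systems without journalctl; may need sudo to read the system journal.
if command -v journalctl >/dev/null 2>&1; then
    journalctl -k -b -1 --no-pager 2>/dev/null | filter_gpu_lines | tail -n 50 || true
fi
```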
Comment 2 Lachlan Craig 2024-05-09 09:54:32 UTC
(In reply to Zamundaaa from comment #1)
> > Some (maybe useful) logs captured via sudo journalctl -b-1 | tail -n50 after a crash
> Unfortunately afaict there's nothing that would even hint towards a crash in
> there.
> 
> > NO input is taken (not even the REISUB method works)
> > even SSH doesn't connect
> In that case, it's some sort of kernel bug. As it only happens with the eGPU
> connected, https://gitlab.freedesktop.org/drm/amd/-/issues is likely the
> best place to find answers

I see, thank you. I had my doubts about that, but decided to post here anyway because I didn't recall it happening in GNOME. I'll file a bug there, then.