Bug 499523 - Kwin or AMDGPU crashes while in-game on Halo Infinite
Status: RESOLVED UPSTREAM
Alias: None
Product: kwin
Classification: Plasma
Component: wayland-generic (other bugs)
Version First Reported In: 6.2.5
Platform: Fedora RPMs Linux
Importance: NOR crash
Target Milestone: ---
Assignee: KWin default assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2025-02-04 22:42 UTC by Roguefort
Modified: 2025-02-21 23:51 UTC

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Journal ctl (584.27 KB, text/plain)
2025-02-04 22:42 UTC, Roguefort
Another journalctl log (401.66 KB, text/plain)
2025-02-05 22:32 UTC, Roguefort

Description Roguefort 2025-02-04 22:42:38 UTC
Created attachment 177975
Journal ctl

SUMMARY

During a match in Halo Infinite, there is a chance that either KWin or AMDGPU (or Mesa, I think) freezes and crashes, leaving the monitor showing "no signal" with no way to recover.

STEPS TO REPRODUCE
1. Have an AMD graphics card
2. Play Halo Infinite

OBSERVED RESULT

The screen freezes for a few seconds and then turns black. A few seconds later the monitor shows no signal. Sound still works. The AMDGPU driver tries to reset the GPU but fails to do so.

EXPECTED RESULT

The game continues with no problem

SOFTWARE/OS VERSIONS

Operating System: Fedora Linux 41
KDE Plasma Version: 6.2.5
KDE Frameworks Version: 6.10.0
Qt Version: 6.8.1
Kernel Version: 6.12.11-200.fc41.x86_64 (64-bit)
Graphics Platform: Wayland


ADDITIONAL INFORMATION

Might be a Mesa bug, but I'm unsure.
Comment 1 Vlad Zahorodnii 2025-02-05 12:53:49 UTC
Relevant bits:

fev 04 21:55:22 fedora kernel: gmc_v10_0_process_interrupt: 45 callbacks suppressed
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32799)
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:  in process HaloInfinite.ex pid 12774 thread vkd3d_queue pid 12936
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:   in page starting at address 0x0000000000000000 from client 0x1b (UTCL2)
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00401431
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:          Faulty UTCL2 client ID: SQC (data) (0xa)
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:          MORE_FAULTS: 0x1
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:          WALKER_ERROR: 0x0
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:          PERMISSION_FAULTS: 0x3
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:          MAPPING_ERROR: 0x0
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:          RW: 0x0
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32799)
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:  in process HaloInfinite.ex pid 12774 thread vkd3d_queue pid 12936
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:   in page starting at address 0x0000000000000000 from client 0x1b (UTCL2)
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00000000
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:          Faulty UTCL2 client ID: CB/DB (0x0)
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:          MORE_FAULTS: 0x0
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:          WALKER_ERROR: 0x0
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:          PERMISSION_FAULTS: 0x0
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:          MAPPING_ERROR: 0x0
fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu:          RW: 0x0
fev 04 21:55:27 fedora kwin_wayland[2039]: kwin_scene_opengl: 0x3: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
fev 04 21:55:27 fedora kwin_wayland[2039]: kwin_scene_opengl: 0x3: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
fev 04 21:55:27 fedora kwin_wayland[2039]: kwin_scene_opengl: 0x3: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
fev 04 21:55:28 fedora kwin_wayland[2039]: kwin_scene_opengl: 0x3: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
fev 04 21:55:28 fedora kwin_wayland[2039]: kwin_scene_opengl: 0x3: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
fev 04 21:55:28 fedora kwin_wayland[2039]: kwin_scene_opengl: 0x3: GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
fev 04 21:55:32 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: Dumping IP State
fev 04 21:55:32 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: Dumping IP State Completed
fev 04 21:55:32 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=12426568, emitted seq=12426570
fev 04 21:55:32 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: Process information: process HaloInfinite.ex pid 12774 thread vkd3d_queue pid 12935
fev 04 21:55:32 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: GPU reset begin!
fev 04 21:55:32 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: MODE1 reset
fev 04 21:55:32 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: GPU mode1 reset
fev 04 21:55:32 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: GPU smu mode1 reset
fev 04 21:55:44 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: GPU reset succeeded, trying to resume
fev 04 21:55:44 fedora kernel: [drm] PCIE GART of 512M enabled (table at 0x0000008000300000).
fev 04 21:55:44 fedora kernel: [drm] VRAM is lost due to GPU reset!
fev 04 21:55:44 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: PSP is resuming...
fev 04 21:55:52 fedora kernel: [drm:psp_v11_0_memory_training [amdgpu]] *ERROR* send training msg failed.
fev 04 21:55:52 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: Failed to process memory training!
fev 04 21:55:52 fedora kernel: [drm:amdgpu_device_fw_loading [amdgpu]] *ERROR* resume of IP block <psp> failed -62
fev 04 21:55:52 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: GPU reset(2) failed
fev 04 21:55:52 fedora kwin_wayland_wrapper[2167]: amdgpu: The CS has cancelled because the context is lost. This context is innocent.
...
fev 04 21:55:57 fedora kwin_wayland[2039]: kwin_wayland_drm: Pageflip timed out! This is a kernel bug

---

It looks like a graphics reset occurs but the system doesn't recover properly.
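
To check another journal for the same pattern, the sequence above (page fault, ring timeout, reset, failed resume) can be counted with a small script. This is only a rough sketch: the regular expressions are copied from the message text quoted in this comment, the names `PATTERNS` and `summarize_gpu_reset` are made up for illustration, and other kernel versions may word these messages differently.

```python
import re

# Patterns based on the amdgpu kernel messages quoted in this report.
# Assumption: other kernel versions may phrase these lines differently.
PATTERNS = {
    "page_fault": re.compile(r"amdgpu: \[\w+\] page fault"),
    "ring_timeout": re.compile(r"amdgpu: ring \S+ timeout"),
    "reset_begin": re.compile(r"amdgpu: GPU reset begin!"),
    "vram_lost": re.compile(r"VRAM is lost due to GPU reset"),
    "reset_failed": re.compile(r"amdgpu: GPU reset\(\d+\) failed"),
}


def summarize_gpu_reset(lines):
    """Count reset-related events and give a coarse outcome string."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    if counts["reset_failed"]:
        outcome = "reset failed"
    elif counts["reset_begin"]:
        outcome = "reset attempted, no failure logged"
    else:
        outcome = "no reset seen"
    return counts, outcome


# A few lines taken from the excerpt above.
SAMPLE = [
    "fev 04 21:55:22 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:4 pasid:32799)",
    "fev 04 21:55:32 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=12426568, emitted seq=12426570",
    "fev 04 21:55:32 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: GPU reset begin!",
    "fev 04 21:55:44 fedora kernel: [drm] VRAM is lost due to GPU reset!",
    "fev 04 21:55:52 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: GPU reset(2) failed",
]

counts, outcome = summarize_gpu_reset(SAMPLE)
print(counts)
print(outcome)
```

The function takes any iterable of lines, so the text of a saved `journalctl -k` dump can be passed in after splitting it into lines.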
Comment 2 Roguefort 2025-02-05 22:32:29 UTC
Created attachment 178003
Another journalctl log

This time I waited 5 minutes to see whether the GPU would reset, but nothing happened. There are also some weird boot messages.
Comment 3 TraceyC 2025-02-12 02:26:35 UTC
From bug 494440 this can be ignored:
 GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)

The crash looks to be entirely in the AMD GPU drivers.
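
When reading the attached journals, it may help to drop the ignorable message first so the amdgpu lines stand out. A minimal sketch, assuming the message text is exactly as quoted above (the names `IGNORABLE` and `drop_ignorable` are hypothetical):

```python
# Message text copied from the log excerpt in comment 1.
IGNORABLE = "GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)"


def drop_ignorable(lines):
    """Return the log lines that do not contain the known-harmless message."""
    return [line for line in lines if IGNORABLE not in line]


sample = [
    "fev 04 21:55:27 fedora kwin_wayland[2039]: kwin_scene_opengl: 0x3: "
    "GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)",
    "fev 04 21:55:32 fedora kernel: amdgpu 0000:2d:00.0: amdgpu: GPU reset begin!",
]
filtered = drop_ignorable(sample)
print(filtered)
```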
Comment 4 Roguefort 2025-02-21 23:51:29 UTC
(In reply to TraceyC from comment #3)
> From bug 494440 this can be ignored:
>  GL_INVALID_OPERATION in glDrawBuffers(unsupported buffer GL_BACK_LEFT)
> 
> The crash looks to be entirely in the AMD GPU drivers

Then it is a Mesa bug. Wasn't sure which one crapped the bed as there are many components that could've done it.