Bug 478308 - VMware kernel oops with KWIN_DRM_NO_AMS=0; desktop does not repaint
Summary: VMware kernel oops with KWIN_DRM_NO_AMS=0; desktop does not repaint
Status: RESOLVED UPSTREAM
Alias: None
Product: kwin
Classification: Plasma
Component: platform-drm
Version: git master
Platform: Other Other
Importance: NOR normal
Target Milestone: ---
Assignee: KWin default assignee
Reported: 2023-12-09 15:24 UTC by Stefan Hoffmeister
Modified: 2023-12-25 19:41 UTC

Description Stefan Hoffmeister 2023-12-09 15:24:01 UTC
SUMMARY

Running with `export KWIN_DRM_NO_AMS=0` on Wayland causes KDE Plasma 6 to trigger a kernel oops in the VMware graphics driver (vmwgfx), in `vmw_du_cursor_plane_cleanup_fb`. After the oops, the desktop no longer repaints.

I set `KWIN_DRM_NO_AMS=0` explicitly to force rendering onto the default path (i.e. the one that does not work around virtual machine challenges), as this execution path will become active in January 2024, with kernel 6.8 (see recent commits on kwin) and some additions there.

I was expecting offset cursors (as on Plasma 5) to be the challenge, but right now KDE does indeed oops the kernel, and the desktop is unusable.

STEPS TO REPRODUCE
1. configure Wayland + KWIN_DRM_NO_AMS=0
2. log into desktop
3. do some UI work
// ... after a very short while, the kernel oopses

This is on Fedora Rawhide (40) with kernel 6.7-rc4 and KDE Plasma 6 git master (as of this writing; past beta 1).

OBSERVED RESULT

```
Dec 09 16:09:08 fedora kernel: BUG: kernel NULL pointer dereference, address: 0000000000000028
Dec 09 16:09:08 fedora kernel: #PF: supervisor read access in kernel mode
Dec 09 16:09:08 fedora kernel: #PF: error_code(0x0000) - not-present page
Dec 09 16:09:08 fedora kernel: PGD 0 P4D 0
Dec 09 16:09:08 fedora kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Dec 09 16:09:08 fedora kernel: CPU: 4 PID: 710 Comm: kworker/u256:10 Not tainted 6.7.0-0.rc4.20231206gitbee0e7762ad2.37.fc40.x86_64 #1
Dec 09 16:09:08 fedora kernel: Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
Dec 09 16:09:08 fedora kernel: Workqueue: events_unbound commit_work
Dec 09 16:09:08 fedora kernel: RIP: 0010:vmw_du_cursor_plane_cleanup_fb+0x14d/0x170 [vmwgfx]
Dec 09 16:09:08 fedora kernel: Code: 00 00 00 00 00 00 48 8b 44 24 08 65 48 2b 04 25 28 00 00 00 75 29 48 83 c4 10 5b 5d 41 5c c3 cc cc cc cc 48 8b 86 98 00 00 00 <48> 8b 78 28 e8 0a f1 00 00 c6 83 c0 00 00 00 00 e9 d2 fe ff ff e8
Dec 09 16:09:08 fedora kernel: RSP: 0018:ffffc90000857e00 EFLAGS: 00010202
Dec 09 16:09:08 fedora kernel: RAX: 0000000000000000 RBX: ffff888105edac00 RCX: 0000000000000000
Dec 09 16:09:08 fedora kernel: RDX: ffff88810bc40000 RSI: ffff888105edac00 RDI: ffff88810d1f4c38
Dec 09 16:09:08 fedora kernel: RBP: ffff88810d1f4c38 R08: ffff8881834582e0 R09: 0000000000000040
Dec 09 16:09:08 fedora kernel: R10: 000000000000000f R11: fefefefefefefeff R12: 0000000000000000
Dec 09 16:09:08 fedora kernel: R13: 0000000000000000 R14: ffff8881001ce005 R15: ffff88810c5f72e0
Dec 09 16:09:08 fedora kernel: FS:  0000000000000000(0000) GS:ffff88842df00000(0000) knlGS:0000000000000000
Dec 09 16:09:08 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 09 16:09:08 fedora kernel: CR2: 0000000000000028 CR3: 0000000183414002 CR4: 0000000000f70ef0
Dec 09 16:09:08 fedora kernel: PKRU: 55555554
Dec 09 16:09:08 fedora kernel: Call Trace:
Dec 09 16:09:08 fedora kernel:  <TASK>
Dec 09 16:09:08 fedora kernel:  ? __die+0x23/0x70
Dec 09 16:09:08 fedora kernel:  ? page_fault_oops+0x171/0x4e0
Dec 09 16:09:08 fedora kernel:  ? exc_page_fault+0x7f/0x180
Dec 09 16:09:08 fedora kernel:  ? asm_exc_page_fault+0x26/0x30
Dec 09 16:09:08 fedora kernel:  ? vmw_du_cursor_plane_cleanup_fb+0x14d/0x170 [vmwgfx]
Dec 09 16:09:08 fedora kernel:  drm_atomic_helper_cleanup_planes+0x9b/0xc0
Dec 09 16:09:08 fedora kernel:  commit_tail+0xd1/0x130
Dec 09 16:09:08 fedora kernel:  process_one_work+0x171/0x340
Dec 09 16:09:08 fedora kernel:  worker_thread+0x27b/0x3a0
Dec 09 16:09:08 fedora kernel:  ? __pfx_worker_thread+0x10/0x10
Dec 09 16:09:08 fedora kernel:  kthread+0xe5/0x120
Dec 09 16:09:08 fedora kernel:  ? __pfx_kthread+0x10/0x10
Dec 09 16:09:08 fedora kernel:  ret_from_fork+0x31/0x50
Dec 09 16:09:08 fedora kernel:  ? __pfx_kthread+0x10/0x10
Dec 09 16:09:08 fedora kernel:  ret_from_fork_asm+0x1b/0x30
Dec 09 16:09:08 fedora kernel:  </TASK>
Dec 09 16:09:08 fedora kernel: Modules linked in: uinput snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defra>
Dec 09 16:09:08 fedora kernel: CR2: 0000000000000028
Dec 09 16:09:08 fedora kernel: ---[ end trace 0000000000000000 ]---
Dec 09 16:09:08 fedora kernel: RIP: 0010:vmw_du_cursor_plane_cleanup_fb+0x14d/0x170 [vmwgfx]
Dec 09 16:09:08 fedora kernel: Code: 00 00 00 00 00 00 48 8b 44 24 08 65 48 2b 04 25 28 00 00 00 75 29 48 83 c4 10 5b 5d 41 5c c3 cc cc cc cc 48 8b 86 98 00 00 00 <48> 8b 78 28 e8 0a f1 00 00 c6 83 c0 00 00 00 00 e9 d2 fe ff ff e8
Dec 09 16:09:08 fedora kernel: RSP: 0018:ffffc90000857e00 EFLAGS: 00010202
Dec 09 16:09:08 fedora kernel: RAX: 0000000000000000 RBX: ffff888105edac00 RCX: 0000000000000000
Dec 09 16:09:08 fedora kernel: RDX: ffff88810bc40000 RSI: ffff888105edac00 RDI: ffff88810d1f4c38
Dec 09 16:09:08 fedora kernel: RBP: ffff88810d1f4c38 R08: ffff8881834582e0 R09: 0000000000000040
Dec 09 16:09:08 fedora kernel: R10: 000000000000000f R11: fefefefefefefeff R12: 0000000000000000
Dec 09 16:09:08 fedora kernel: R13: 0000000000000000 R14: ffff8881001ce005 R15: ffff88810c5f72e0
Dec 09 16:09:08 fedora kernel: FS:  0000000000000000(0000) GS:ffff88842df00000(0000) knlGS:0000000000000000
Dec 09 16:09:08 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 09 16:09:08 fedora kernel: CR2: 0000000000000028 CR3: 0000000183414002 CR4: 0000000000f70ef0
Dec 09 16:09:08 fedora kernel: PKRU: 55555554
Dec 09 16:09:08 fedora kernel: note: kworker/u256:10[710] exited with irqs disabled
Dec 09 16:09:08 fedora systemd[1287]: Finished plasma-ksplash.service - Splash screen shown during boot.
```

EXPECTED RESULT

no oops
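
For triage, the fault address in the oops can be cross-checked against the `Code:` bytes. The following is a minimal sketch, not a general disassembler: it decodes only the plain `REX.W 8B /r` load form that appears here, and it assumes that the instruction before the faulting one starts right after the `cc` padding. The names `decode_mov_load`, `CODE`, `RAX` and `CR2` are illustrative, with the values copied from the log above.

```python
# Code bytes and registers copied from the oops above; <48> marks the
# first byte of the faulting instruction.
CODE = "48 8b 86 98 00 00 00 <48> 8b 78 28 e8 0a f1 00 00"
RAX = 0x0000000000000000
CR2 = 0x0000000000000028

# Register numbering used by the ModRM byte (no REX.R/REX.B extension).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_load(insn: bytes):
    """Decode `REX.W 8B /r`, i.e. mov r64, [base + disp].

    Handles only mod=01 (disp8) and mod=10 (disp32) without a SIB byte.
    Returns (destination, base, displacement, instruction length).
    """
    if len(insn) < 4 or insn[0] != 0x48 or insn[1] != 0x8B:
        raise ValueError("not a plain REX.W mov load")
    modrm = insn[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if rm == 4:
        raise ValueError("SIB addressing is not handled in this sketch")
    if mod == 1:
        disp, length = int.from_bytes(insn[3:4], "little", signed=True), 4
    elif mod == 2:
        disp, length = int.from_bytes(insn[3:7], "little", signed=True), 7
    else:
        raise ValueError("addressing mode is not handled in this sketch")
    return REG64[reg], REG64[rm], disp, length


tokens = CODE.split()
fault_idx = next(i for i, t in enumerate(tokens) if t.startswith("<"))
raw = bytes(int(t.strip("<>"), 16) for t in tokens)

prev = decode_mov_load(raw[:fault_idx])    # instruction before the fault
fault = decode_mov_load(raw[fault_idx:])   # faulting instruction
fault_address = RAX + fault[2]             # base register is rax, which is 0

print(f"mov {prev[0]}, [{prev[1]}+{prev[2]:#x}]")
print(f"mov {fault[0]}, [{fault[1]}+{fault[2]:#x}]  -> fault at {fault_address:#x}")
```

Under these assumptions the bytes read as a pointer loaded from offset 0x98 of the object in RSI, followed by a read at offset 0x28 through that pointer. With RAX being 0, the read lands at address 0x28, which matches CR2. Which structure members these offsets correspond to is for the vmwgfx maintainers to confirm.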
Comment 1 Zamundaaa 2023-12-10 18:46:08 UTC
Sorry, but we can't do anything about this. Please report this to the kernel developers.
Comment 2 Neal Gompa 2023-12-11 08:04:08 UTC
The issue tracker for Linux Direct Rendering Manager (DRM) is: https://gitlab.freedesktop.org/drm/misc/-/issues

Please file an issue there.
Comment 3 Stefan Hoffmeister 2023-12-12 11:41:42 UTC
(In reply to Neal Gompa from comment #2)
> The issue tracker for Linux Direct Rendering Manager (DRM) is:
> https://gitlab.freedesktop.org/drm/misc/-/issues
> 
> Please file an issue there.

Reported https://gitlab.freedesktop.org/drm/misc/-/issues/34
Comment 4 Stefan Hoffmeister 2023-12-12 11:46:55 UTC
(In reply to Stefan Hoffmeister from comment #3)
> (In reply to Neal Gompa from comment #2)
> > The issue tracker for Linux Direct Rendering Manager (DRM) is:
> > https://gitlab.freedesktop.org/drm/misc/-/issues
> > 
> > Please file an issue there.
> 
> Reported https://gitlab.freedesktop.org/drm/misc/-/issues/34

FWIW - in https://gitlab.freedesktop.org/drm/misc/-/issues/33#note_2168879 a maintainer writes: "No one reads this issue tracker."

I'll try to find the proper channel to report this.
Comment 5 Stefan Hoffmeister 2023-12-14 15:10:27 UTC
As the drm project does not read the issue tracker that they have open for use, I have sent an email to the mailing list, see https://lore.kernel.org/all/20231214122709.Horde.5IIbIXWYbtITSEoTi0k2e1H@webmail.your-server.de/
Comment 6 Stefan Hoffmeister 2023-12-14 18:16:48 UTC
A really good way to make this problem appear is
* have KDE Plasma run with atomic mode-setting on, no software cursors
* switch to a TTY

--> immediate hang of system (in virtual machine)

So KDE Plasma + hardware cursor + atomic mode-setting on vmwgfx / VMware Workstation == really, really serious trouble.
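
A minimal sketch for checking whether a machine matches this combination, assuming the usual sysfs layout where `/sys/class/drm/cardN/device/driver` is a symlink to the bound kernel driver. The helper names `drm_drivers` and `matches_report` are hypothetical, and the check only covers the explicit `KWIN_DRM_NO_AMS=0` case from this report, not whatever default KWin picks when the variable is unset.

```python
import os
from pathlib import Path


def drm_drivers(sysfs_root="/sys/class/drm"):
    """Map DRM card names (card0, card1, ...) to their kernel driver names."""
    drivers = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return drivers
    for card in sorted(root.glob("card[0-9]*")):
        if "-" in card.name:
            continue  # skip connector entries such as card0-Virtual-1
        link = card / "device" / "driver"
        if link.exists():
            # The symlink target's last path component is the driver name.
            drivers[card.name] = os.path.basename(os.path.realpath(link))
    return drivers


def matches_report(drivers, environ):
    """True if a vmwgfx card is present and KWIN_DRM_NO_AMS is explicitly 0."""
    return "vmwgfx" in drivers.values() and environ.get("KWIN_DRM_NO_AMS") == "0"


if __name__ == "__main__":
    found = drm_drivers()
    print(found, matches_report(found, os.environ))
```

The variable has to be checked in the environment that KWin itself was started from, which is not necessarily the environment of an interactive shell.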

systemd-logind in this case seems to issue DRM_IOCTL_DROP_MASTER, and there are many more interesting things (for upstream) to look at in the kernel logs from the DRM subsystem.

The backtrace below is just so much longer (and possibly more meaningful) than what I had before.

I'll try to communicate this on the dri-devel mailing list, too.
```
Dec 14 19:06:01 fedora kernel: BUG: kernel NULL pointer dereference, address: 0000000000000028
Dec 14 19:06:01 fedora kernel: #PF: supervisor read access in kernel mode
Dec 14 19:06:01 fedora kernel: #PF: error_code(0x0000) - not-present page
Dec 14 19:06:01 fedora kernel: PGD 0 P4D 0 
Dec 14 19:06:01 fedora kernel: Oops: 0000 [#1] PREEMPT SMP NOPTI
Dec 14 19:06:01 fedora kernel: CPU: 6 PID: 899 Comm: systemd-logind Not tainted 6.7.0-0.rc5.20231212git26aff849438c.42.fc40.x86_64 #1
Dec 14 19:06:01 fedora kernel: Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.21805430.B64.2305221830 05/22/2023
Dec 14 19:06:01 fedora kernel: RIP: 0010:vmw_du_cursor_plane_cleanup_fb+0x14d/0x170 [vmwgfx]
Dec 14 19:06:01 fedora kernel: Code: 00 00 00 00 00 00 48 8b 44 24 08 65 48 2b 04 25 28 00 00 00 75 29 48 83 c4 10 5b 5d 41 5c c3 cc cc cc cc 48 8b 86 98 00 00 00 <48> 8b 78 28 e8 0a f1 00 00 c6 83 c0 00 00 00 00 e9 d2 fe ff ff e8
Dec 14 19:06:01 fedora kernel: RSP: 0018:ffffc90000f4b8c8 EFLAGS: 00010202
Dec 14 19:06:01 fedora kernel: RAX: 0000000000000000 RBX: ffff88836c2ada00 RCX: ffff88810bad0000
Dec 14 19:06:01 fedora kernel: RDX: ffffffffc02f9500 RSI: ffff88836c2ada00 RDI: ffff888103417c38
Dec 14 19:06:01 fedora kernel: RBP: ffff888103417c38 R08: 0000000000000000 R09: 0000000000000000
Dec 14 19:06:01 fedora kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Dec 14 19:06:01 fedora kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810bad0000
Dec 14 19:06:01 fedora kernel: FS:  00007f1cf9ae59c0(0000) GS:ffff88842df80000(0000) knlGS:0000000000000000
Dec 14 19:06:01 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 14 19:06:01 fedora kernel: CR2: 0000000000000028 CR3: 000000010e8fe001 CR4: 0000000000f70ef0
Dec 14 19:06:01 fedora kernel: PKRU: 55555554
Dec 14 19:06:01 fedora kernel: Call Trace:
Dec 14 19:06:01 fedora kernel:  <TASK>
Dec 14 19:06:01 fedora kernel:  ? __die+0x23/0x70
Dec 14 19:06:01 fedora kernel:  ? page_fault_oops+0x171/0x4e0
Dec 14 19:06:01 fedora kernel:  ? exc_page_fault+0x7f/0x180
Dec 14 19:06:01 fedora kernel:  ? asm_exc_page_fault+0x26/0x30
Dec 14 19:06:01 fedora kernel:  ? __pfx_vmw_du_cursor_plane_cleanup_fb+0x10/0x10 [vmwgfx]
Dec 14 19:06:01 fedora kernel:  ? vmw_du_cursor_plane_cleanup_fb+0x14d/0x170 [vmwgfx]
Dec 14 19:06:01 fedora kernel:  drm_atomic_helper_cleanup_planes+0x47/0x70
Dec 14 19:06:01 fedora kernel:  commit_tail+0xd1/0x130
Dec 14 19:06:01 fedora kernel:  drm_atomic_helper_commit+0x11a/0x140
Dec 14 19:06:01 fedora kernel:  drm_atomic_commit+0x97/0xd0
Dec 14 19:06:01 fedora kernel:  ? __pfx___drm_printfn_info+0x10/0x10
Dec 14 19:06:01 fedora kernel:  drm_client_modeset_commit_atomic+0x203/0x250
Dec 14 19:06:01 fedora kernel:  drm_client_modeset_commit_locked+0x5a/0x160
Dec 14 19:06:01 fedora kernel:  drm_fb_helper_pan_display+0xc9/0x1f0
Dec 14 19:06:01 fedora kernel:  fb_pan_display+0x83/0x140
Dec 14 19:06:01 fedora kernel:  fb_set_var+0x21a/0x420
Dec 14 19:06:01 fedora kernel:  ? __cond_resched+0x36/0x50
Dec 14 19:06:01 fedora kernel:  ? __flush_work.isra.0+0x1aa/0x280
Dec 14 19:06:01 fedora kernel:  ? update_load_avg+0x7e/0x7d0
Dec 14 19:06:01 fedora kernel:  fbcon_blank+0x213/0x310
Dec 14 19:06:01 fedora kernel:  do_unblank_screen+0xa9/0x160
Dec 14 19:06:01 fedora kernel:  complete_change_console+0x54/0x120
Dec 14 19:06:01 fedora kernel:  vt_ioctl+0xd8b/0x13f0
Dec 14 19:06:01 fedora kernel:  tty_ioctl+0x4ea/0x8b0
Dec 14 19:06:01 fedora kernel:  __x64_sys_ioctl+0x94/0xd0
Dec 14 19:06:01 fedora kernel:  do_syscall_64+0x61/0xe0
Dec 14 19:06:01 fedora kernel:  ? do_syscall_64+0x70/0xe0
Dec 14 19:06:01 fedora kernel:  ? do_syscall_64+0x70/0xe0
Dec 14 19:06:01 fedora kernel:  entry_SYSCALL_64_after_hwframe+0x6e/0x76
Dec 14 19:06:01 fedora kernel: RIP: 0033:0x7f1cfa5039ed
Dec 14 19:06:01 fedora kernel: Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
Dec 14 19:06:01 fedora kernel: RSP: 002b:00007fff61c54b80 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Dec 14 19:06:01 fedora kernel: RAX: ffffffffffffffda RBX: 000000000000001c RCX: 00007f1cfa5039ed
Dec 14 19:06:01 fedora kernel: RDX: 0000000000000001 RSI: 0000000000005605 RDI: 000000000000001c
Dec 14 19:06:01 fedora kernel: RBP: 00007fff61c54bd0 R08: 00007fff61c54b80 R09: 000055cacc5754a8
Dec 14 19:06:01 fedora kernel: R10: 00007fff61c54bb0 R11: 0000000000000246 R12: 0000000000000000
Dec 14 19:06:01 fedora kernel: R13: 000055cacc575d30 R14: 00007fff61c54c68 R15: 00007fff61c54c70
Dec 14 19:06:01 fedora kernel:  </TASK>
Dec 14 19:06:01 fedora kernel: Modules linked in: uinput snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr snd_seq_midi snd_seq_midi_event vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci>
Dec 14 19:06:01 fedora kernel: CR2: 0000000000000028
Dec 14 19:06:01 fedora kernel: ---[ end trace 0000000000000000 ]---
Dec 14 19:06:01 fedora kernel: RIP: 0010:vmw_du_cursor_plane_cleanup_fb+0x14d/0x170 [vmwgfx]
Dec 14 19:06:01 fedora kernel: Code: 00 00 00 00 00 00 48 8b 44 24 08 65 48 2b 04 25 28 00 00 00 75 29 48 83 c4 10 5b 5d 41 5c c3 cc cc cc cc 48 8b 86 98 00 00 00 <48> 8b 78 28 e8 0a f1 00 00 c6 83 c0 00 00 00 00 e9 d2 fe ff ff e8
Dec 14 19:06:01 fedora kernel: RSP: 0018:ffffc90000f4b8c8 EFLAGS: 00010202
Dec 14 19:06:01 fedora kernel: RAX: 0000000000000000 RBX: ffff88836c2ada00 RCX: ffff88810bad0000
Dec 14 19:06:01 fedora kernel: RDX: ffffffffc02f9500 RSI: ffff88836c2ada00 RDI: ffff888103417c38
Dec 14 19:06:01 fedora kernel: RBP: ffff888103417c38 R08: 0000000000000000 R09: 0000000000000000
Dec 14 19:06:01 fedora kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Dec 14 19:06:01 fedora kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810bad0000
Dec 14 19:06:01 fedora kernel: FS:  00007f1cf9ae59c0(0000) GS:ffff88842df80000(0000) knlGS:0000000000000000
Dec 14 19:06:01 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 14 19:06:01 fedora kernel: CR2: 0000000000000028 CR3: 000000010e8fe001 CR4: 0000000000f70ef0
Dec 14 19:06:01 fedora kernel: PKRU: 55555554
Dec 14 19:06:01 fedora kernel: note: systemd-logind[899] exited with irqs disabled
```
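
The user-space registers in this trace identify the system call that systemd-logind was in when the oops hit. A minimal sketch of that decoding follows, assuming the x86-64 syscall convention (ORIG_RAX holds the syscall number, RDI/RSI/RDX the first three arguments). The lookup tables are deliberately tiny and hand-copied from the Linux UAPI headers (`asm/unistd_64.h`, `linux/vt.h`, `drm/drm.h`); `describe_syscall` and `drm_io` are illustrative names.

```python
# User-space registers copied from the trace above.
USER_REGS = {"orig_rax": 0x10, "rdi": 0x1C, "rsi": 0x5605, "rdx": 0x1}

# Only the entries needed for this trace.
SYSCALLS = {16: "ioctl"}
VT_IOCTLS = {
    0x5600: "VT_OPENQRY",
    0x5601: "VT_GETMODE",
    0x5602: "VT_SETMODE",
    0x5603: "VT_GETSTATE",
    0x5604: "VT_SENDSIG",
    0x5605: "VT_RELDISP",
    0x5606: "VT_ACTIVATE",
    0x5607: "VT_WAITACTIVE",
}


def drm_io(nr):
    """Equivalent of DRM_IO(nr), i.e. _IO('d', nr): no direction, no payload."""
    return (ord("d") << 8) | nr


DRM_IOCTL_DROP_MASTER = drm_io(0x1F)


def describe_syscall(regs):
    """Render the syscall and its first arguments in a readable form."""
    name = SYSCALLS.get(regs["orig_rax"], f"syscall_{regs['orig_rax']}")
    if name != "ioctl":
        return name
    request = regs["rsi"]
    if request == DRM_IOCTL_DROP_MASTER:
        request_name = "DRM_IOCTL_DROP_MASTER"
    else:
        request_name = VT_IOCTLS.get(request, hex(request))
    return f"ioctl(fd={regs['rdi']}, {request_name}, {regs['rdx']})"


print(describe_syscall(USER_REGS))  # ioctl(fd=28, VT_RELDISP, 1)
```

Read this way, the oops in this trace is reached from a VT ioctl (`VT_RELDISP` with a non-zero argument, which lets the pending console switch proceed) rather than from a DRM ioctl, which fits the `complete_change_console` and `fbcon_blank` frames in the call trace. `DRM_IOCTL_DROP_MASTER` would show up as request `0x641f`.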
Comment 7 Stefan Hoffmeister 2023-12-22 17:58:42 UTC
VMware have reacted on the dri gitlab issue tracker, ACKed the report, confirmed reproduction on their end, and noted that they now have an internal tracking item for this.
Comment 8 Stefan Hoffmeister 2023-12-25 19:41:46 UTC
There is a working patch in https://lore.kernel.org/all/20231225202541.Horde.tXckv5NJBOomrZjolmTSDS4@webmail.your-server.de/ which fixes the oops.

With that patch in place, it is now possible to detect that atomic mode-setting cursors are _not shown_, see https://bugs.kde.org/show_bug.cgi?id=479002