Summary: | (KDE / NVidia / Wayland) Monitor freezes when another monitor is turned on | ||
---|---|---|---|
Product: | [Plasma] kwin | Reporter: | nickc01 |
Component: | multi-screen | Assignee: | KWin default assignee <kwin-bugs-null> |
Status: | RESOLVED UPSTREAM | ||
Severity: | major | CC: | kde, nate, xaver.hugl |
Priority: | NOR | Keywords: | wayland-only |
Version First Reported In: | 5.27.4 | ||
Target Milestone: | --- | ||
Platform: | Arch Linux | ||
OS: | Linux | ||
See Also: | https://bugs.kde.org/show_bug.cgi?id=461068 | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: |
JournalCTL Log. The "kscreen_osd_ser" core dump begins at line - 11814
Night Color option already set to Always Off
New journalctl log where all I do is boot the os, turn on the third monitor, and reboot
The display settings for each of my three screens
A log after adding QT_LOGGING_RULES="kwin_wayland_*.debug=true" to /etc/environment
Result of running drm-debug log commands |
Description
nickc01
2023-04-25 21:16:16 UTC
Created attachment 158424 [details]
JournalCTL Log. The "kscreen_osd_ser" core dump begins at line - 11814
The kscreen_osd_ser crash is perhaps related, but probably a secondary symptom of the issue rather than its cause. Moving to KWin.

There's two important parts here that could be the cause:

Apr 25 14:46:34 nickarchlinux kwin_wayland[906]: kwin_wayland_drm: Failed to create gamma blob! Invalid argument

This matches this bug report: https://bugs.kde.org/show_bug.cgi?id=468895 where a similar symptom occurs.
Can you confirm if this is reproducible? If so does disabling night colour make a difference?

Apr 25 14:46:34 nickarchlinux kwin_wayland[906]: kwin_scene_opengl: A graphics reset attributable to the current GL context occurred.

This is also quite important and needs following up on

(In reply to David Edmundson from comment #3)
> There's two important parts here that could be the cause:
>
> Apr 25 14:46:34 nickarchlinux kwin_wayland[906]: kwin_wayland_drm: Failed to
> create gamma blob! Invalid argument
>
> This matches this bug report: https://bugs.kde.org/show_bug.cgi?id=468895
> where a similar symptom occurs.
> Can you confirm if this is reproducible? If so does disabling night colour
> make a difference?
>
>
> Apr 25 14:46:34 nickarchlinux kwin_wayland[906]: kwin_scene_opengl: A
> graphics reset attributable to the current GL context occurred.
>
> This is also quite important and needs following up on

Yes, the bug is reproducible. So far, I haven't had a situation where the monitor didn't freeze up. My night color option has already been set to "Always Off" (see attachment), and the bug still occurs. I've also attached a new journalctl log, where all I do is boot up the OS, switch on the third monitor (causing a freeze), and reboot. While the gamma blob error seems to show up in 3 locations in the file, the "kwin_scene_opengl" error doesn't seem to show up in this new file at all.

Also, this may or may not be relevant to the bug, but running "eglinfo" in the console causes it to crash with "corrupted size vs. prev_size".
The crash also occurs on a separate Manjaro installation I have had set up for a while now.

----------EGL_INFO_OUTPUT----------
[nicholas@nickarchlinux ~]$ eglinfo
EGL client extensions string:
    EGL_EXT_client_extensions, EGL_EXT_device_base, EGL_EXT_device_enumeration, EGL_EXT_device_query, EGL_EXT_explicit_device, EGL_EXT_platform_base, EGL_EXT_platform_device, EGL_EXT_platform_wayland, EGL_EXT_platform_x11, EGL_EXT_platform_xcb, EGL_KHR_client_get_all_proc_addresses, EGL_KHR_debug, EGL_KHR_platform_gbm, EGL_KHR_platform_wayland, EGL_KHR_platform_x11, EGL_MESA_platform_gbm, EGL_MESA_platform_surfaceless

GBM platform:
EGL API version: 1.5
EGL vendor string: NVIDIA
EGL version string: 1.5
EGL client APIs: OpenGL_ES OpenGL
EGL extensions string:
    EGL_EXT_buffer_age, EGL_EXT_client_sync, EGL_EXT_create_context_robustness, EGL_EXT_image_dma_buf_import, EGL_EXT_image_dma_buf_import_modifiers, EGL_EXT_output_base, EGL_EXT_output_drm, EGL_EXT_present_opaque, EGL_EXT_protected_content, EGL_EXT_stream_acquire_mode, EGL_EXT_stream_consumer_egloutput, EGL_EXT_sync_reuse, EGL_IMG_context_priority, EGL_KHR_config_attribs, EGL_KHR_context_flush_control, EGL_KHR_create_context, EGL_KHR_create_context_no_error, EGL_KHR_fence_sync, EGL_KHR_get_all_proc_addresses, EGL_KHR_gl_colorspace, EGL_KHR_gl_renderbuffer_image, EGL_KHR_gl_texture_2D_image, EGL_KHR_gl_texture_3D_image, EGL_KHR_gl_texture_cubemap_image, EGL_KHR_image, EGL_KHR_image_base, EGL_KHR_no_config_context, EGL_KHR_partial_update, EGL_KHR_reusable_sync, EGL_KHR_stream, EGL_KHR_stream_attrib, EGL_KHR_stream_consumer_gltexture, EGL_KHR_stream_cross_process_fd, EGL_KHR_stream_fifo, EGL_KHR_stream_producer_eglsurface, EGL_KHR_surfaceless_context, EGL_KHR_swap_buffers_with_damage, EGL_KHR_wait_sync, EGL_MESA_image_dma_buf_export, EGL_NV_nvrm_fence_sync, EGL_NV_output_drm_flip_event, EGL_NV_quadruple_buffer, EGL_NV_robustness_video_memory_purge, EGL_NV_stream_attrib, EGL_NV_stream_consumer_eglimage, EGL_NV_stream_consumer_gltexture_yuv, EGL_NV_stream_cross_display, EGL_NV_stream_cross_object, EGL_NV_stream_cross_process, EGL_NV_stream_cross_system, EGL_NV_stream_dma, EGL_NV_stream_fifo_next, EGL_NV_stream_fifo_synchronous, EGL_NV_stream_flush, EGL_NV_stream_metadata, EGL_NV_stream_origin, EGL_NV_stream_remote, EGL_NV_stream_reset, EGL_NV_stream_socket, EGL_NV_stream_socket_inet, EGL_NV_stream_socket_unix, EGL_NV_stream_sync, EGL_NV_system_time, EGL_NV_triple_buffer, EGL_WL_bind_wayland_display, EGL_WL_wayland_eglstream
corrupted size vs. prev_size
Aborted (core dumped)

----------OUTPUT IN JOURNALCTL----------
Apr 27 11:52:06 nickarchlinux systemd-coredump[4921]: [🡕] Process 4919 (eglinfo) of user 1000 dumped core.

Stack trace of thread 4919:
#0 0x00007f58f43128ec n/a (libc.so.6 + 0x878ec)
#1 0x00007f58f42c3ea8 raise (libc.so.6 + 0x38ea8)
#2 0x00007f58f42ad53d abort (libc.so.6 + 0x2253d)
#3 0x00007f58f42ae29e n/a (libc.so.6 + 0x2329e)
#4 0x00007f58f431c657 n/a (libc.so.6 + 0x91657)
#5 0x00007f58f431d15e n/a (libc.so.6 + 0x9215e)
#6 0x00007f58f431d2d0 n/a (libc.so.6 + 0x922d0)
#7 0x00007f58f431f7d0 n/a (libc.so.6 + 0x947d0)
#8 0x00007f58f43209f2 malloc (libc.so.6 + 0x959f2)
#9 0x00007f58f2496155 n/a (libnvidia-eglcore.so.530.41.03 + 0x1496155)
#10 0x00007f58f2489bfd n/a (libnvidia-eglcore.so.530.41.03 + 0x1489bfd)
#11 0x00007f58f248a91c n/a (libnvidia-eglcore.so.530.41.03 + 0x148a91c)
#12 0x00007f58f248aa3c n/a (libnvidia-eglcore.so.530.41.03 + 0x148aa3c)
#13 0x00007f58f24a457b n/a (libnvidia-eglcore.so.530.41.03 + 0x14a457b)
#14 0x00007f58f24a4700 n/a (libnvidia-eglcore.so.530.41.03 + 0x14a4700)
#15 0x00007f58f3e42d12 n/a (libEGL_nvidia.so.0 + 0x42d12)
#16 0x00007f58f3e48344 n/a (libEGL_nvidia.so.0 + 0x48344)
#17 0x0000557781c64823 n/a (eglinfo + 0x8823)
#18 0x0000557781c683a5 n/a (eglinfo + 0xc3a5)
#19 0x0000557781c6026e n/a (eglinfo + 0x426e)
#20 0x00007f58f42ae790 n/a (libc.so.6 + 0x23790)
#21 0x00007f58f42ae84a __libc_start_main (libc.so.6 + 0x2384a)
#22 0x0000557781c606e5 n/a (eglinfo + 0x46e5)
ELF object binary architecture: AMD x86-64

----------RUNNING COREDUMPCTL GDB ON THE EGL_INFO CRASH----------
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
Downloading separate debug info for /usr/bin/eglinfo
Reading symbols from /home/nicholas/.cache/debuginfod_client/39a6232bddf644b420c136dd02d436f65fd1b3ea/debuginfo...
[New LWP 4919]
warning: Section `.reg-xstate/4919' in core file too small.
Downloading separate debug info for /usr/lib/libEGL_nvidia.so.0
Downloading separate debug info for /usr/lib/libnvidia-glsi.so.530.41.03
Downloading separate debug info for /usr/lib/libnvidia-eglcore.so.530.41.03
Downloading separate debug info for /usr/lib/libnvidia-egl-gbm.so.1
Downloading separate debug info for /usr/lib/gbm/nvidia-drm_gbm.so
Downloading separate debug info for system-supplied DSO at 0x7ffe2e2b6000
--Type <RET> for more, q to quit, c to continue without paging--c
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `eglinfo'.
Program terminated with signal SIGABRT, Aborted.
warning: Section `.reg-xstate/4919' in core file too small.
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007f58f4312953 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007f58f42c3ea8 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007f58f42ad53d in __GI_abort () at abort.c:79
#4 0x00007f58f42ae29e in __libc_message (fmt=fmt@entry=0x7f58f442577e "%s\n") at ../sysdeps/posix/libc_fatal.c:150
#5 0x00007f58f431c657 in malloc_printerr (str=str@entry=0x7f58f44231d9 "corrupted size vs. prev_size") at malloc.c:5651
#6 0x00007f58f431d15e in unlink_chunk (p=<optimized out>, av=0x7f58f4463aa0 <main_arena>) at malloc.c:1605
#7 0x00007f58f431d2d0 in malloc_consolidate (av=av@entry=0x7f58f4463aa0 <main_arena>) at malloc.c:4766
#8 0x00007f58f431f7d0 in _int_malloc (av=av@entry=0x7f58f4463aa0 <main_arena>, bytes=1472) at malloc.c:3951
#9 0x00007f58f43209f2 in __GI___libc_malloc (bytes=<optimized out>) at malloc.c:3297
#10 0x00007f58f2496155 in ?? () from /usr/lib/libnvidia-eglcore.so.530.41.03
#11 0x00007f58f2489bfd in ?? () from /usr/lib/libnvidia-eglcore.so.530.41.03
#12 0x00007f58f248a91c in ?? () from /usr/lib/libnvidia-eglcore.so.530.41.03
#13 0x00007f58f248aa3c in ?? () from /usr/lib/libnvidia-eglcore.so.530.41.03
#14 0x00007f58f24a457b in ?? () from /usr/lib/libnvidia-eglcore.so.530.41.03
#15 0x00007f58f24a4700 in ?? () from /usr/lib/libnvidia-eglcore.so.530.41.03
#16 0x00007f58f3e42d12 in ?? () from /usr/lib/libEGL_nvidia.so.0
#17 0x00007f58f3e48344 in ?? () from /usr/lib/libEGL_nvidia.so.0
#18 0x0000557781c64823 in createEGLContext (d=d@entry=0x557783a926a0, conf=conf@entry=0xcaf32c, api=api@entry=12450, khr_create_context=khr_create_context@entry=1, core_profile=core_profile@entry=1, context_version=context_version@entry=0x7ffe2e20b09c) at ../mesa-demos-9.0.0/src/egl/opengl/eglinfo.c:436
#19 0x0000557781c683a5 in doOneDisplay (d=0x557783a926a0, name=<optimized out>, opts=...) at ../mesa-demos-9.0.0/src/egl/opengl/eglinfo.c:575
#20 0x0000557781c6026e in main (argc=<optimized out>, argv=<optimized out>) at ../mesa-demos-9.0.0/src/egl/opengl/eglinfo.c:850

Created attachment 158491 [details]
Night Color option already set to Always Off
Created attachment 158492 [details]
New journalctl log where all I do is boot the os, turn on the third monitor, and reboot
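The comparison made above (the gamma blob error appearing several times while the graphics reset message is absent) can be repeated on any saved journal with a small helper. This is only a sketch: the function name `count_kwin_messages` and the file name `journal.log` are hypothetical, and the two search strings are copied from the log lines quoted in this report.

```shell
#!/bin/sh
# Count how often the two KWin messages quoted in this report appear
# in a saved journal file. The function name is illustrative only.
count_kwin_messages() {
    log_file="$1"
    for pattern in 'Failed to create gamma blob' \
                   'A graphics reset attributable to the current GL context occurred'; do
        # -c prints the number of matching lines, -F matches the text literally.
        # grep exits non-zero when there are no matches, hence "|| true".
        count=$(grep -c -F -- "$pattern" "$log_file" || true)
        printf '%s: %s\n' "$pattern" "$count"
    done
}

# Example (journal.log is a hypothetical file name):
#   journalctl --boot 0 > journal.log && count_kwin_messages journal.log
```

A count of zero for one of the messages only says that the message is missing from that particular log; it does not rule the message out as related to the freeze.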
UPDATE

The bug doesn't seem to occur if my GIGABYTE monitor on the left is disconnected from the setup. If I start out with only my ASUS monitor (my right monitor) turned on, and then turn on the TV screen after I fully booted and logged into the OS, no freezing occurs. I can turn off and on either of the two screens and no freezing occurs. However, if I then connect the GIGABYTE monitor to the setup, that is when the freezing begins. My GIGABYTE monitor would freeze up, and occasionally my ASUS monitor would freeze too. I've attached my monitor display settings below.

I also just tested to see if setting "Adaptive Sync" to "Never" on the Gigabyte monitor would change anything, but it still freezes up.

My GIGABYTE monitor is connected to my computer via DisplayPort.
My ASUS monitor is connected to my computer via DisplayPort.
The TV screen is connected to my computer via HDMI.

Created attachment 158493 [details]
The display settings for each of my three screens
The GIGABYTE monitor is also the only monitor in the setup that has support for HDR, if that is relevant to the problem.

Please enable debug logging for KWin by putting
QT_LOGGING_RULES="kwin_wayland_*.debug=true"
into /etc/environment and rebooting, then cause the issue again and attach the output of `journalctl --user-unit plasma-kwin_wayland --boot 0` afterwards

(In reply to Zamundaaa from comment #10)
> Please enable debug logging for KWin by putting
> QT_LOGGING_RULES="kwin_wayland_*.debug=true"
> into /etc/environment and rebooting, then cause the issue again and attach
> the output of `journalctl --user-unit plasma-kwin_wayland --boot 0`
> afterwards

I added QT_LOGGING_RULES="kwin_wayland_*.debug=true" to /etc/environment, rebooted, and triggered the freeze. I turned on the third monitor, which caused the first monitor to freeze, then I turned the third monitor back off again, which brought the first monitor back to life. I added the journalctl log as an attachment.

Created attachment 160800 [details]
A log after adding QT_LOGGING_RULES="kwin_wayland_*.debug=true" to /etc/environment
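For anyone repeating the debug-logging step above, the rule can be added without creating duplicate lines on repeated runs. The following is a minimal sketch, not an official KDE tool: the function name `enable_kwin_debug_logging` is hypothetical, and it only checks for the exact line, so a different pre-existing QT_LOGGING_RULES entry would need to be merged by hand.

```shell
#!/bin/sh
# Append the KWin debug logging rule to an environment file unless the
# exact line is already present. Defaults to /etc/environment, which
# needs root to modify. The function name is illustrative only.
enable_kwin_debug_logging() {
    env_file="${1:-/etc/environment}"
    rule='QT_LOGGING_RULES="kwin_wayland_*.debug=true"'
    # -q suppresses output, -x matches whole lines, -F matches literally.
    if ! grep -qxF -- "$rule" "$env_file" 2>/dev/null; then
        printf '%s\n' "$rule" >> "$env_file"
    fi
}
```

After a reboot and after reproducing the freeze, the log requested in the thread can be saved with `journalctl --user-unit plasma-kwin_wayland --boot 0 > kwin-debug.log`, where `kwin-debug.log` is an arbitrary file name.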
Hmm, that is surprising, there are no relevant warnings or anything. We'll need to dig deeper. Please enable drm debug logging with
> echo 0x1FF | sudo tee /sys/module/drm/parameters/debug
start writing the log with
> sudo dmesg -w > drm-debug.log
leave that running, and then trigger the freeze again. Afterwards stop the command and attach the log here.

Alright, I ran the commands you told me to, and I triggered the freeze again while the dmesg command was running. Because the output drm-debug.log was over 6.8 megabytes in size, I couldn't attach it to this bug report normally. I didn't want to risk trimming the file and losing critical information, so I uploaded it to my OneDrive, and here's the link so you can view it: https://1drv.ms/u/s!Aj62egREH4PT0nNny4mmBaDxsm2t?e=FlxMdK

I also attached the output of running the commands in the terminal, if that's relevant.

Created attachment 160894 [details]
Result of running drm-debug log commands
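Since the drm debug log was too large to attach directly, one option is to compress it rather than trim it. The sketch below assumes `gzip` is installed; the function name `compress_log` is hypothetical, and this thread does not confirm whether the compressed file would fit under the tracker's attachment limit.

```shell
#!/bin/sh
# Write a gzip-compressed copy next to the original log and print both
# sizes in bytes. The original file is left untouched.
compress_log() {
    log_file="$1"
    # -9 selects maximum compression, -c writes the result to stdout.
    gzip -9 -c -- "$log_file" > "$log_file.gz" || return 1
    printf 'original: %s bytes\n' "$(wc -c < "$log_file" | tr -d ' ')"
    printf 'compressed: %s bytes\n' "$(wc -c < "$log_file.gz" | tr -d ' ')"
}

# Example: compress_log drm-debug.log
```

Kernel debug logs are highly repetitive text, so they tend to compress well, but the actual ratio depends on the log.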
Sorry, this got completely lost in my emails. Looking at that log, I see that KWin does the modeset for the three outputs correctly... but then KWin only presents stuff on the one output until the next hotplug event comes in. So this is at least unlikely to be a driver bug, as the driver never gets the requests to show stuff.

Can you still reproduce this on KWin 5.27.10?

Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least 15 days. Please provide the requested information as soon as possible and set the bug status as REPORTED. Due to regular bug tracker maintenance, if the bug is still in NEEDSINFO status with no change in 30 days the bug will be closed as RESOLVED > WORKSFORME due to lack of needed information.

For more information about our bug triaging procedures please read the wiki located here: https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please mark the bug as REPORTED so that the KDE team knows that the bug is ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!

(In reply to Zamundaaa from comment #16)
> Sorry, this got completely lost in my emails. Looking at that log, I see
> that KWin does the modeset for the three outputs correctly... but then KWin
> only presents stuff on the one output until the next hotplug event comes in.
> So this is at least unlikely to be a driver bug, as the driver never gets
> the requests to show stuff.
>
> Can you still reproduce this on KWin 5.27.10?

Sorry for the delay. I actually did two tests: one 17 days ago (January 23rd) and another today (February 9th). The crash still occurred when I tested on January 23rd, but because I forgot to report it here, I tested again today to see if I could reproduce it. I updated my system again today and spent about 20 minutes turning the different monitors on and off, but I wasn't able to get it to crash. It seems like the issue is fixed now.

Here's the journalctl log and pacman.log from today: https://1drv.ms/f/s!Aj62egREH4PT1DTAgJABgF6XJ448?e=aaof5Q

The crash on January 23rd took me roughly 10-15 minutes to trigger. When it triggered, the GIGABYTE monitor was completely frozen, and once it was frozen, no amount of turning any of the monitors on and off fixed it again.

Here's the journalctl log and crash logs from January 23rd, if it helps: https://1drv.ms/f/s!Aj62egREH4PT1BPdebxJyAMYROt3?e=2NBx3e

I don't see anything odd in those logs. I assume this was fixed on the driver side; if it happens again, just reopen this bug report. |