Summary: | composited mpv fullscreen video induces a GPU reset in double buffering vsync processing | ||
---|---|---|---|
Product: | [Plasma] kwin | Reporter: | Mark <markg85> |
Component: | scene-opengl | Assignee: | KWin default assignee <kwin-bugs-null> |
Status: | RESOLVED WORKSFORME | ||
Severity: | crash | Flags: | thomas.luebking: NVIDIA+ |
Priority: | NOR | ||
Version: | 5.3.1 | ||
Target Milestone: | --- | ||
Platform: | Other | ||
OS: | Linux | ||
See Also: | https://bugs.kde.org/show_bug.cgi?id=362955 | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: |
support information output, right after reset
support information output, before reset |
Description
Mark
2015-08-09 12:34:54 UTC
Note: ticking "Suspend compositor for full screen windows" seems to fix this issue for me. But this seems like a workaround to me, since that feature warns that not all hardware might support it. My hardware apparently does.

That's actually no bug (in KWin) - everything acts as expected :-P
When you set mpv fullscreen, for some™ reason, the nvidia driver starts to hang. KWin detects that and restarts the compositor.

The only question is *why* fullscreen mpv AND kwin compositing makes the nvidia driver burp. I've some suspicions here, but let's see ;-)

a) please dump "qdbus org.kde.KWin /KWin supportInformation" before *and* after the hang (and compositor restart)
b) attach /var/log/Xorg.0.log *after* the reset
c) attach "dmesg | tail -100" (right) *after* the reset
d) attach "dmesg | grep NVRM" *after* the reset

Please check the output of mpv, notably what video output (vdpau? opengl?) is used.

----

Suspending redirection for fullscreen windows is not known to cause trouble on nvidia - the unnamed child here is intel; and we meanwhile forcefully disable the feature on intel chips because of the daily bug reports ;-)
It's left as a general warning (and because people believe it gains them much performance, which hasn't been true for ages. Suspending the compositor altogether does. Just not redirecting one window has little impact in this regard)

Weird, I've had this issue for weeks! Now - while trying to reproduce it to provide the requested information - I'm unable to reproduce it.. sigh..
I will post the info you requested as soon as I get this issue again :)

setting to waitingforinfo till you are able to reproduce it again.

Just had the issue again. The information:

dmesg | tail -100 --- right after the reset (I don't think it tells anything though.)
└─> $ dmesg | tail -100
[ 1.225055] usb 5-1.3: new full-speed USB device number 3 using xhci_hcd
[ 1.275812] systemd-journald[84]: Received SIGTERM from PID 1 (systemd).
[ 1.378217] usb 5-1.4: new low-speed USB device number 4 using xhci_hcd [ 1.398167] tsc: Refined TSC clocksource calibration: 3503.445 MHz [ 1.398172] clocksource tsc: mask: 0xffffffffffffffff max_cycles: 0x32800736690, max_idle_ns: 440795289710 ns [ 1.450458] systemd[1]: RTC configured in localtime, applying delta of 120 minutes to system time. [ 1.473919] usb 5-1.4: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes [ 1.504941] ip_tables: (C) 2000-2006 Netfilter Core Team [ 1.505003] systemd[1]: Inserted module 'ip_tables' [ 1.805326] EXT4-fs (sda4): re-mounted. Opts: data=ordered [ 2.184891] input: Power Button as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input2 [ 2.184899] ACPI: Power Button [PWRB] [ 2.184975] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input3 [ 2.184978] ACPI: Power Button [PWRF] [ 2.197915] wmi: Mapper loaded [ 2.229445] random: nonblocking pool is initialized [ 2.234438] ACPI Warning: SystemIO range 0x0000000000000428-0x000000000000042F conflicts with OpRegion 0x0000000000000400-0x000000000000047F (\PMIO) (20150410/utaddress-254) [ 2.234447] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver [ 2.234455] ACPI Warning: SystemIO range 0x0000000000000540-0x000000000000054F conflicts with OpRegion 0x0000000000000500-0x0000000000000563 (\GPIO) (20150410/utaddress-254) [ 2.234458] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver [ 2.234460] ACPI Warning: SystemIO range 0x0000000000000530-0x000000000000053F conflicts with OpRegion 0x0000000000000500-0x0000000000000563 (\GPIO) (20150410/utaddress-254) [ 2.234465] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver [ 2.234473] ACPI Warning: SystemIO range 0x0000000000000500-0x000000000000052F conflicts with OpRegion 0x0000000000000500-0x000000000000051F (\LED_) (20150410/utaddress-254) [ 2.234478] ACPI 
Warning: SystemIO range 0x0000000000000500-0x000000000000052F conflicts with OpRegion 0x0000000000000500-0x0000000000000563 (\GPIO) (20150410/utaddress-254) [ 2.234486] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver [ 2.234488] lpc_ich: Resource conflict(s) found affecting gpio_ich [ 2.234583] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4 [ 2.235218] thermal LNXTHERM:00: registered as thermal_zone0 [ 2.235221] ACPI: Thermal Zone [TZ00] (28 C) [ 2.235427] EDAC MC: Ver: 3.0.0 [ 2.235918] thermal LNXTHERM:01: registered as thermal_zone1 [ 2.235921] ACPI: Thermal Zone [TZ01] (30 C) [ 2.268674] input: PC Speaker as /devices/platform/pcspkr/input/input4 [ 2.280528] i801_smbus 0000:00:1f.3: enabling device (0001 -> 0003) [ 2.280844] ACPI Warning: SystemIO range 0x000000000000F000-0x000000000000F01F conflicts with OpRegion 0x000000000000F000-0x000000000000F00F (\_SB_.PCI0.SBUS.SMBI) (20150410/utaddress-254) [ 2.280850] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver [ 2.281276] systemd-journald[223]: Received request to flush runtime journal from PID 1 [ 2.292340] [drm] Initialized drm 1.1.0 20060810 [ 2.295682] alx 0000:06:00.0 eth0: Qualcomm Atheros AR816x/AR817x Ethernet [90:2b:34:5a:1f:67] [ 2.305076] iTCO_vendor_support: vendor-support=0 [ 2.309150] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11 [ 2.309182] iTCO_wdt: unable to reset NO_REBOOT flag, device disabled by hardware/BIOS [ 2.309246] alx 0000:06:00.0 enp6s0: renamed from eth0 [ 2.314199] AVX version of gcm_enc/dec engaged. 
[ 2.314201] AES CTR mode by8 optimization enabled [ 2.342820] snd_hda_intel 0000:00:1b.0: enabling device (0000 -> 0002) [ 2.342974] snd_hda_intel 0000:01:00.1: enabling device (0000 -> 0002) [ 2.342997] snd_hda_intel 0000:01:00.1: Disabling MSI [ 2.343002] snd_hda_intel 0000:01:00.1: Handle VGA-switcheroo audio client [ 2.364641] snd_hda_codec_via hdaudioC0D2: autoconfig for VT2020: line_outs=3 (0x24/0x25/0x26/0x0/0x0) type:line [ 2.364645] snd_hda_codec_via hdaudioC0D2: speaker_outs=0 (0x0/0x0/0x0/0x0/0x0) [ 2.364647] snd_hda_codec_via hdaudioC0D2: hp_outs=1 (0x28/0x0/0x0/0x0/0x0) [ 2.364649] snd_hda_codec_via hdaudioC0D2: mono: mono_out=0x0 [ 2.364650] snd_hda_codec_via hdaudioC0D2: dig-out=0x2d/0x2e [ 2.364652] snd_hda_codec_via hdaudioC0D2: inputs: [ 2.364654] snd_hda_codec_via hdaudioC0D2: Front Mic=0x29 [ 2.364656] snd_hda_codec_via hdaudioC0D2: Rear Mic=0x2b [ 2.364658] snd_hda_codec_via hdaudioC0D2: Line=0x2a [ 2.373982] input: HDA Digital PCBeep as /devices/pci0000:00/0000:00:1b.0/sound/card0/input6 [ 2.374150] input: HDA Intel PCH Front Mic as /devices/pci0000:00/0000:00:1b.0/sound/card0/input7 [ 2.374185] input: HDA Intel PCH Rear Mic as /devices/pci0000:00/0000:00:1b.0/sound/card0/input8 [ 2.374224] input: HDA Intel PCH Line as /devices/pci0000:00/0000:00:1b.0/sound/card0/input9 [ 2.374259] input: HDA Intel PCH Line Out Front as /devices/pci0000:00/0000:00:1b.0/sound/card0/input10 [ 2.374297] input: HDA Intel PCH Line Out Surround as /devices/pci0000:00/0000:00:1b.0/sound/card0/input11 [ 2.374344] input: HDA Intel PCH Line Out CLFE as /devices/pci0000:00/0000:00:1b.0/sound/card0/input12 [ 2.374397] input: HDA Intel PCH Front Headphone as /devices/pci0000:00/0000:00:1b.0/sound/card0/input13 [ 2.390605] hidraw: raw HID events driver (C) Jiri Kosina [ 2.397249] Switched to clocksource tsc [ 2.411198] intel_rapl: Found RAPL domain package [ 2.411201] intel_rapl: Found RAPL domain core [ 2.412010] usbcore: registered new interface driver usbhid [ 2.412012] 
usbhid: USB HID core driver [ 2.421488] nvidia: module license 'NVIDIA' taints kernel. [ 2.421492] Disabling lock debugging due to kernel taint [ 2.435196] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=io+mem [ 2.435493] [drm] Initialized nvidia-drm 0.0.0 20150116 for 0000:01:00.0 on minor 0 [ 2.435497] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 352.30 Tue Jul 21 18:53:45 PDT 2015 [ 2.444635] input: Logitech Logitech Illuminated Keyboard as /devices/pci0000:00/0000:00:1c.4/0000:03:00.0/usb5/5-1/5-1.3/5-1.3:1.0/0003:046D:C318.0001/input/input14 [ 2.497402] hid-generic 0003:046D:C318.0001: input,hidraw0: USB HID v1.11 Keyboard [Logitech Logitech Illuminated Keyboard] on usb-0000:03:00.0-1.3/input0 [ 2.497958] input: Logitech Logitech Illuminated Keyboard as /devices/pci0000:00/0000:00:1c.4/0000:03:00.0/usb5/5-1/5-1.3/5-1.3:1.1/0003:046D:C318.0002/input/input15 [ 2.550562] hid-generic 0003:046D:C318.0002: input,hiddev0,hidraw1: USB HID v1.11 Device [Logitech Logitech Illuminated Keyboard] on usb-0000:03:00.0-1.3/input1 [ 2.550660] input: Logitech USB Laser Mouse as /devices/pci0000:00/0000:00:1c.4/0000:03:00.0/usb5/5-1/5-1.4/5-1.4:1.0/0003:046D:C069.0003/input/input16 [ 2.550733] hid-generic 0003:046D:C069.0003: input,hidraw2: USB HID v1.10 Mouse [Logitech USB Laser Mouse] on usb-0000:03:00.0-1.4/input0 [ 2.553677] mousedev: PS/2 mouse device common for all mice [ 2.730921] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input17 [ 2.730965] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input18 [ 2.731009] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input19 [ 2.731046] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:01.0/0000:01:00.1/sound/card1/input20 [ 3.255145] Guest personality initialized and is inactive [ 3.255193] VMCI host device registered (name=vmci, major=10, 
minor=58)
[ 3.255195] Initialized host personality
[ 3.304296] NET: Registered protocol family 40
[ 3.341431] fuse init (API version 7.23)
[ 3.489668] ppdev: user-space parallel port driver
[ 4.637196] NVRM: Your system is not currently configured to drive a VGA console
[ 4.637200] NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
[ 4.637202] NVRM: requires the use of a text-mode VGA console. Use of other console
[ 4.637203] NVRM: drivers including, but not limited to, vesafb, may result in
[ 4.637204] NVRM: corruption and stability problems, and is not supported.
[ 5.296941] alx 0000:06:00.0 enp6s0: NIC Up: 1 Gbps Full

dmesg | grep NVRM --- right after the reset.
└─> $ dmesg | grep NVRM
826:[ 2.435497] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 352.30 Tue Jul 21 18:53:45 PDT 2015
844:[ 4.637196] NVRM: Your system is not currently configured to drive a VGA console
845:[ 4.637200] NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
846:[ 4.637202] NVRM: requires the use of a text-mode VGA console. Use of other console
847:[ 4.637203] NVRM: drivers including, but not limited to, vesafb, may result in
848:[ 4.637204] NVRM: corruption and stability problems, and is not supported.

I will attach the support information outputs right after this post.

Created attachment 94062 [details]
support information output, right after reset
Created attachment 94063 [details]
support information output, before reset
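
The four items requested earlier in the thread (a to d) can be gathered in one pass. The following is only a minimal sketch: the function name `collect_reset_info`, the output directory, and the file names are made up for illustration, while the commands themselves are the ones quoted in the request. Every step tolerates failure, so a missing tool or log file does not stop the remaining steps.

```shell
#!/bin/sh
# Sketch: gather the diagnostics requested in this thread into one directory.
# collect_reset_info and the file names are hypothetical; the commands are
# the ones quoted in the thread.
collect_reset_info() {
    out="${1:-./kwin-reset-info}"   # where to store the files
    label="${2:-after}"             # "before" or "after" the reset
    mkdir -p "$out" || return 1

    # a) KWin support information (run once before and once after the reset)
    qdbus org.kde.KWin /KWin supportInformation \
        > "$out/supportInformation-$label.txt" 2>&1 || true

    # b) X server log
    cp /var/log/Xorg.0.log "$out/Xorg.0.log" 2>/dev/null || true

    # c) last 100 kernel messages
    dmesg 2>&1 | tail -100 > "$out/dmesg-tail.txt" || true

    # d) messages from the NVIDIA kernel module only
    dmesg 2>&1 | grep NVRM > "$out/dmesg-NVRM.txt" || true

    # Print the directory so the caller knows where to look.
    echo "$out"
}
```

Calling `collect_reset_info ./info before` while everything still works, and `collect_reset_info ./info after` right after the compositor restart, would leave both support information dumps side by side for comparison.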
> [ 4.637196] NVRM: Your system is not currently configured to drive a VGA console
> [ 4.637200] NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
> [ 4.637202] NVRM: requires the use of a text-mode VGA console. Use of other console
> [ 4.637203] NVRM: drivers including, but not limited to, vesafb, may result in
> [ 4.637204] NVRM: corruption and stability problems, and is not supported.

Please see https://wiki.archlinux.org/index.php/GRUB/Tips_and_tricks#Disable_framebuffer
Most distros use /etc/default/grub

before:
> Painting blocks for vertical retrace: yes
after:
> Painting blocks for vertical retrace: no

Do you use triple buffering
grep -i triple /var/log/Xorg.0.log
or export __GL_YIELD to "USLEEP"?
tr '\0' '\n' < "/proc/`pidof kwin_x11`/environ" | grep -i yield

Hi Thomas,

Thank you very much for looking into this. Really appreciated!
Yes, I use USLEEP:
└─> $ env | grep "GL"
27:__GL_YIELD=USLEEP
30:__GL_THREADED_OPTIMIZATIONS=1

I remember adding that because video wasn't playing smoothly without it. The xorg log shows nothing for triple buffering.

Ok, please run "kcmshell5 kwincompositing" and set the "tearing prevention" to "never".

Double buffering + usleep will get you swapcontrol (v'sync), but because swapping blocks (the function stalls kwin while waiting for the next vblank signal from your screen), things act a bit differently (we process events between glFlush and glSwap so as not to lose too much time waiting for the next estimated vblank)

As triple buffering is misdetected on resuming the compositor, you're suddenly in the non-blocking swap path (which may bypass the collision with mpv) - as you'll be w/o v'sync (as obviously glSwap doesn't have to wait here)

Hi Thomas,

Thank you for the suggestion and the elaborate explanation. I changed that setting to "never". I hope this fixes it :)
I will report back within a few weeks with my findings.
If i haven't done that in - lets say - 3 weeks, then please do remind me :)

Just wondering, would it make sense for KWin to check the __GL_YIELD env variable and when it's set at USLEEP to just put tearing prevention on the "never" value by default? And probably warn the user that if it's changed, the behavior could have unexpected consequences? Or would that be too vendor specific?

(In reply to Mark from comment #12)
> Just wondering, would it make sense for KWin to check the __GL_YIELD env
> variable and when it's set at USLEEP to just put tearing prevention on the
> "never" value by default?

No ;-)
The entire purpose of this complex system is to allow vertical synchronisation (so the screen doesn't "tear" on updates)

We require __GL_YIELD to be USLEEP because otherwise the nvidia driver starts busy waits (you can see one core spike and, despite other claims, hear your fan start off ;-) - ie. for efficiency reasons.

Without triple buffering, you'll however always run into the shifted cycle with event processing between flush and swap. We'll have to figure out why this is a problem in order to actually fix this bug (or blame nvidia =)

A maybe better workaround on your side would be to enable triple buffering in the driver (after we determined that the shifted cycle *is* the trigger here)

> If i haven't done that in - lets say - 3 weeks, then please do remind me :)
3 weeks passed, here's the reminder :-)
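
One detail from the exchange above is easy to miss: the check that was asked for reads the environment of the running kwin_x11 process, whereas `env | grep "GL"` shows the environment of the current shell, and the two are not guaranteed to match. The sketch below wraps the `/proc/<pid>/environ` approach from the thread in a small helper; the name `gl_env_value` is hypothetical and not part of KWin or the NVIDIA driver.

```shell
#!/bin/sh
# Sketch: print the value of one variable from a NUL-separated environment
# dump such as /proc/<pid>/environ. gl_env_value is a hypothetical helper.
gl_env_value() {
    environ_file="$1"   # e.g. /proc/1234/environ
    name="$2"           # e.g. __GL_YIELD
    # Entries are separated by NUL bytes: turn them into lines, then print
    # the value of the first entry that starts with NAME=.
    tr '\0' '\n' < "$environ_file" | sed -n "s/^${name}=//p" | head -n 1
}
```

With KWin running, `gl_env_value "/proc/$(pidof kwin_x11)/environ" __GL_YIELD` should print `USLEEP` if the variable reached the compositor, and nothing if it did not. The companion check for triple buffering stays as given in the thread: `grep -i triple /var/log/Xorg.0.log`.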
Thank you for the reminder, Martin.

I followed the tips given by Thomas and haven't observed the reported issue since then. However, having the tearing prevention set to never did present me with tearing (most notable when watching videos). So a few days ago I decided to go for the real solution (also suggested by Thomas): enable triple buffering and re-enable tearing prevention in kwin. Videos seem to be playing smoothly now, with no hangs in mpv at all.

We ("I") need to test double buffering on the nvidia blob, there're too many related bug reports :-(

Dear Bug Submitter,

This bug has been stagnant for a long time. Could you help us out and re-test if the bug is valid in the latest version? I am setting the status to NEEDSINFO pending your response, please change the Status back to REPORTED when you respond.

Thank you for helping us make KDE software even better for everyone!

Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least 15 days. Please provide the requested information as soon as possible and set the bug status as REPORTED. Due to regular bug tracker maintenance, if the bug is still in NEEDSINFO status with no change in 30 days the bug will be closed as RESOLVED > WORKSFORME due to lack of needed information.

For more information about our bug triaging procedures please read the wiki located here: https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please mark the bug as REPORTED so that the KDE team knows that the bug is ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!

This bug has been in NEEDSINFO status with no change for at least 30 days. The bug is now closed as RESOLVED > WORKSFORME due to lack of needed information.

For more information about our bug triaging procedures please read the wiki located here: https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

Thank you for helping us make KDE software even better for everyone! |
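
The thread never shows how triple buffering was enabled. One common way, offered here only as a sketch, is the NVIDIA X driver's `TripleBuffer` option in a `Device` section. The helper name `print_triple_buffer_conf`, the `Identifier` value, and the drop-in file name below are placeholders, and a system that already has a `Device` section in xorg.conf would more likely add the `Option` line there instead.

```shell
#!/bin/sh
# Sketch: print an X.Org configuration fragment that asks the NVIDIA driver
# for triple buffering. The Identifier is a placeholder; adapt it to the
# existing configuration if there is one.
print_triple_buffer_conf() {
    cat <<'EOF'
Section "Device"
    Identifier "nvidia-gpu"
    Driver     "nvidia"
    Option     "TripleBuffer" "True"
EndSection
EOF
}
```

Something like `print_triple_buffer_conf | sudo tee /etc/X11/xorg.conf.d/20-nvidia-triple-buffer.conf` followed by an X server restart would apply it. Afterwards, `grep -i triple /var/log/Xorg.0.log` (the check used earlier in the thread) should show whether the driver picked the option up.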