| Summary: | KWIN broken vsync and wrong fps | | |
|---|---|---|---|
| Product: | [Plasma] kwin | Reporter: | Alexey Dyachenko <adotfive> |
| Component: | general | Assignee: | KWin default assignee <kwin-bugs-null> |
| Status: | RESOLVED FIXED | | |
| Severity: | normal | CC: | adotfive, jan, vmlinuz386, walmartshopper |
| Priority: | NOR | Flags: | thomas.luebking: NVIDIA+, thomas.luebking: ReviewRequest+ |
| Version: | 5.2.95 | | |
| Target Milestone: | --- | | |
| Platform: | Arch Linux | | |
| OS: | Linux | | |
| URL: | https://git.reviewboard.kde.org/r/125659/ | | |
| See Also: | https://bugs.kde.org/show_bug.cgi?id=344433 | | |
| Latest Commit: | http://commits.kde.org/kwin/8bea96d7018d02dff9462326ca9456f48e9fe9fb | Version Fixed In: | 5.5 |

Attachments:
- kwin 5.2.95 support information dump
- Triple buffered vs double buffered
- Workaround for double buffered case
Description
Alexey Dyachenko
2015-04-16 19:34:31 UTC
a) Do not enforce a false "KWIN_TRIPLE_BUFFER=1". This is to skip heuristic detection.
b) If it tears, vsync is likely not enabled.
c) If you get 90 fps, vsync is _certainly_ not enabled.
d) Why is NoFlip enabled? -> That explains the tearing (whether waiting for the retrace or not, copying the buffer is not fast enough to happen during the retrace). There is also a runtime setting in nvidia-settings; don't turn flipping off unless you have a *REALLY* good reason (performance killer).
e) __GL_YIELD=usleep is only required if triple buffering is NOT enabled.
f) There is a reported bug that triple-buffering detection fails during login with KDE 5. Suspending/resuming the compositor (Shift+Alt+F12) or restarting with "kwin_x11 --replace &" usually "fixes" it.

a,d,e -- I tried a lot of options in various combinations just to diagnose things and to see if it gets any better.
b -- Indeed.
c -- The point I was trying to make is that the sleeping interval somehow gets modified: with vsync enabled and a misdetected 50Hz the rate is a steady 47-50 or 70 fps with no fluctuations, while with vsync enabled and 60Hz it is a steady 90-91 fps. 60Hz is the correct rate (my display and xrandr both say 60Hz), but that is probably another bug. The main clue here is that with vsync disabled my fps is capped at 60-61. Another clue is that setting VBlankTime to 2 gets me a correct 60 fps with vsync enabled in kwin, but tearing is still present.
e,f -- I am aware of the longstanding bug and the kwin code involving __GL_YIELD. My previous 5.2 configuration was the default one, with an empty xorg.conf and flipping enabled (by default). Note that I also did not set any environment variables in 5.2 (and before): I waited until vsync got disabled by the faulty triple-buffering detection, then simply flipped GL3->GL2 or vice versa to get working vsync and a smooth 60 fps.

(In reply to Alexey Dyachenko from comment #2)
> Main clue here is that with disabled vsync my fps is capped at 60-61fps.
That's no clue; there's simply a timer hitting every 16ms to repaint.

> a,d,e -- I tried a lot of options in various combinations just to diagnose
> things and to see if it gets any better.
Ok, please
a) return to a vanilla state w/o random silly config settings (that will expectably fail)
b) choose and record the choice for triple buffering
c) login, wait a few minutes, and then
d) dump and attach the output of "qdbus org.kde.KWin /KWin supportInformation"

> c -- the point I was trying to make is that somehow sleeping interval gets
> modified
You don't understand: if vsync is enabled, the driver waits for the next retrace signal with every swap, so you cannot possibly *increase* the framerate beyond the refresh rate of the screen (at 60Hz a retrace occurs every 1/60 s ≈ 16.7 ms, capping a synced compositor at 60 fps). If you could, that would indicate a driver bug for sure.

> Another clue is that modifying VBlankTime to 2
Literally "2"? That's as good as "0" and will mean constantly missing retraces (lagging a frame). The default is 6144.

> My previous 5.2 configuration was default one, with empty xorg.conf and
> flipping enabled (by default). Note that I also did not set any environment
> variables in 5.2 (and before).
Errr... FYI: no xorg settings means no triple buffering, and no environment (i.e. no __GL_YIELD=usleep) would result in vsync being forcefully disabled because of the correctly detected absence of triple buffering (and a CPU-hungry driver performing busy waits).

(In reply to Thomas Lübking from comment #3)
> Ok, please
> a) return to a vanilla state w/o random silly config settings [...]
> d) dump and attach the output of "qdbus org.kde.KWin /KWin supportInformation"
I will. If there is a reasonable number of dependencies I will also try to bisect; that will probably be much faster for finding the root of the evil.

> You don't understand: if vsync is enabled, the driver waits for the next
> retrace signal with every swap [...]
But if there is no vsync actually being done, then I can draw at any fps up to the hardware limit, right?

> literally "2"? - that's as good as "0" [...] The default is 6144.
The value in the config is multiplied by 1000, so that's 2000.

> No environment (ie. no __GL_YIELD=usleep) would result in vsync being
> forcefully disabled [...]
Plain simple double-buffered vsync should work, no? Well, there is a piece of code inside kwin that forcefully disables vsync if it sees no __GL_YIELD set, and it actually was triggered every time I logged in. So it was either setting __GL_YIELD or KWIN_TRIPPLE_BUFFER to live peacefully, or getting vsync disabled after 10 seconds and re-enabling it manually. Yes, the situation was quite ridiculous, living through 4.x like that and seeing the same behaviour in 5.x, but what can I do? I could simply rip out that piece of code, but that still costs some effort on every version change. Apart from that I had no reported increase in CPU usage, and vsync worked after the workarounds mentioned above. But none of that applies since 5.2.95.

(In reply to Alexey Dyachenko from comment #4)
> I will. If there is reasonable number of dependencies I will also try to
> bisect
Later; first let's get an idea of the state. The problem may have been caused by a libGL or nvidia driver update.

> But if there is no vsync being actually done than I can draw with any fps
> till hardware limit, right?
Yes. No. There's still a timer, but it will run too fast if it assumes a locking buffer swap.

> Value in config is multiplied by 1000 so thats 2000.
Yes. As the comment suggests: 2000 NANOseconds, i.e. "nothing".

> Plain simple double buffered vsync should work, no?
Yes, but it is vastly expensive due to the busy-waiting driver, unless vsync is disabled by your changes OR GL yields by a usleep (that's why it's required).

> So it was setting either __GL_YIELD
Good.

> or KWIN_TRIPPLE_BUFFER to live
BAD! However, "KWIN_TRIPPLE_BUFFER" [sic!] would do nothing anyway.

The problem w/ __GL_YIELD is that the workload of kwin is linked in from libkdeinit5_* and that broke linking in "libnvidiahack" to set environments before the GL lib is linked. If you manage to set the environment from the process (and in time), any patch would be more than welcome!

Created attachment 92100 [details]
kwin 5.2.95 support information dump
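A minimal sketch of what that setting-the-environment-from-the-process suggestion could look like, assuming the driver only reads __GL_YIELD at GL initialization time rather than at library load (if it is read earlier, e.g. when libGL is pulled in by a libkdeinit process, a setenv() like this would come too late):

```cpp
// Hypothetical sketch, not kwin code: export __GL_YIELD before any GL
// initialization runs in this process. setenv() is POSIX; the final 0 means
// a value the user already exported is not overwritten.
#include <cstdlib>

int main() {
    setenv("__GL_YIELD", "USLEEP", 0); // must happen before libGL reads it
    // ... continue with normal startup, create the GLX context, etc.
    return 0;
}
```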
(In reply to Thomas Lübking from comment #5)
> (In reply to Alexey Dyachenko from comment #4)
> > or KWIN_TRIPPLE_BUFFER to live
> BAD!
> However "KWIN_TRIPPLE_BUFFER" [sic!] would do nothing anyway.
I can assure you that the actual variables were set without typos (=

https://bugs.kde.org/attachment.cgi?id=92100
I added the requested support information.
xorg.conf.d/20-nvidia.conf has only NoLogo.
The .nvidia-settings-rc file was removed ("Sync to VBlank" and "Allow Flipping" are both enabled, I checked this).
The dump was taken immediately after login; it shows about 55 fps (tearing is present). Then the fps jumps to 99 and the following changes can be seen:

glPreferBufferSwap: 99
Painting blocks for vertical retrace: no

changes to

glPreferBufferSwap: 0
Painting blocks for vertical retrace: yes

Then I restart compositing or kwin, observe 60 fps (tearing is present), and the variable changes to

Painting blocks for vertical retrace: no

The following step was putting "export __GL_YIELD=usleep" into .xprofile and repeating the procedure. The results were the same.

I actually only just noticed that kwin_x11 sits at 10% of one core, something that was never observed before. When I move windows it jumps to 25%. __GL_YIELD is set. Restarting kwin or compositing changes nothing. The situation is exactly the same with 346.59 from the Arch repos. I also tried simply going back to 5.2.2 to obtain some info, but it crashes without a full KDE downgrade.

(In reply to Alexey Dyachenko from comment #7)
> Following step was putting "export __GL_YIELD=usleep" into .xprofile and
> repeating the procedure. Results were same.
Sure .xprofile is interpreted (in time)?

What's the output of
tr '\0' '\n' < /proc/`pidof kwin_x11`/environ

(In reply to Thomas Lübking from comment #10)
> Sure .xprofile is interpreted (in time)?
> what's the output of
> tr '\0' '\n' < /proc/`pidof kwin_x11`/environ
Yep, __GL_YIELD is there. And I found out that vsync fails only in the double-buffered case. If I run without the variable (I don't like it being set globally) and use Option "TripleBuffer" "True", then vsync works if I re-enable it after the faulty triple-buffering detection. I don't need triple buffering forced on all GL programs, so I need to find out why the double-buffered case, which worked up to 5.2.2, stopped working in 5.2.95. And btw, I experience no weird CPU usage; my previous message was caused by the kwin fps plugin.

If you run "kwin_x11 --replace &" from konsole (w/ double buffering), do you receive a warning about no triple buffering nor usleep yielding (and thus vsync deactivation)?

(In reply to Thomas Lübking from comment #12)
> If you "kwin_x11 --replace &" from konsole (w/ double buffering) do receive
> a warning about no triple buffering nor usleep yielding (and thus vsync
> deactivation)?
Yes, I get:
kwin_core: Triple buffering detection: "NOT available" - Mean block time: 8.01076 ms
as expected, but then I do "__GL_YIELD=usleep kwin_x11 --replace" and still get the same message.

(In reply to Alexey Dyachenko from comment #13)
> Yes I get:
> kwin_core: Triple buffering detection: "NOT available" - Mean block time:
> 8.01076 ms
> as expected, but then I do " __GL_YIELD=usleep kwin_x11 --replace" and still
> get the same message.
Yes, because you don't have it enabled. The crucial message would be:

It seems you are using the nvidia driver without triple buffering
You must export __GL_YIELD="USLEEP" to prevent large CPU overhead on synced swaps
Preferably, enable the TripleBuffer Option in the xorg.conf Device
For this reason, the tearing prevention has been disabled.
See https://bugs.kde.org/show_bug.cgi?id=322060

__GL_YIELD=usleep vs. __GL_YIELD=USLEEP - "Size does matter!"™

Created attachment 92104 [details]
Triple buffered vs double buffered
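Worth spelling out why the case matters: the check kwin performs (the qstrcmp(qgetenv("__GL_YIELD"), "USLEEP") visible in the patches later in this report) is a byte-exact comparison, so a lowercase value silently fails it. A minimal sketch:

```cpp
// Why "usleep" is not enough: the comparison kwin uses (quoted later in
// this thread) is case-sensitive.
#include <QByteArray>
#include <QDebug>

int main() {
    qputenv("__GL_YIELD", "usleep");                       // lowercase, as tried above
    bool accepted = qstrcmp(qgetenv("__GL_YIELD"), "USLEEP") == 0;
    qDebug() << "accepted:" << accepted;                   // false: vsync gets disabled
    return 0;
}
```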
Alright, it does have some effect! I guess there is no tearing, but it's very laggy and feels like 20 fps.
https://bugs.kde.org/attachment.cgi?id=92104
Please check this attachment with the kwin fps plugin for both the triple-buffered and the double-buffered (with USLEEP) case. In 5.2.2 and prior I did not use __GL_YIELD, was reactivating vsync every login, and it was very smooth, like now with triple buffering. Now triple buffered is the only usable scenario.

I started having this problem when I upgraded from 5.2 to 5.3 on Arch. Here's what I've seen: I have four monitors, 3 at 60Hz and 1 at 110Hz. With kwin 5.2, after logging in, I would switch vsync to Never and then back to Automatic. After that, it was buttery smooth on the 110Hz monitor. After upgrading to 5.3, I noticed it looked choppy (60 fps looks choppy after getting used to 110). When kwin first starts with vsync enabled, it's running at 48 fps (half of 96Hz). Then after a few seconds, it jumps to 72 fps (half of 144Hz). If I disable vsync, it runs at 60 fps. I have tried playing with triple buffering and __GL_YIELD, but I can't get it to run above 72 fps. It's almost like kwin has some standard refresh rates hard-coded. But there are overclockable monitors that could be running at any refresh rate via a custom EDID file (that's why mine is 110Hz). So I don't know what changed, but in 5.2 I got 110 fps, and with 5.3 I can't get more than 72 fps, and it's a very noticeable difference.

(In reply to walmartshopper from comment #17)
> I started having this problem when I upgraded from 5.2 to 5.3 on Arch. [...]
Can you post KWin support information (see posts above)? Also check the kwin output; I have two 60Hz monitors but kwin detects my refresh rate as 50Hz, so I have to set RefreshRate=60 in ~/.config/kwinrc. This is another bug introduced in 5.3, and hopefully it's reported elsewhere. I have replaced the 670 with a 980 and double buffering works as pre-5.3. I don't export __GL_YIELD and have triple buffering disabled; all I have to do is the usual voodoo dance of re-enabling vsync. Actually, can we have a checkbox for double-buffered vsync? E.g. not checking for triple buffering and simply doing swaps without blocking on retraces (as I understand it is doing now), seeing as I don't need to export __GL_YIELD.

Thanks, updating my kwinrc fixed it. I added RefreshRate=110 and MaxFPS=120, and now it's running smoothly again.
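For reference, those overrides live in ~/.config/kwinrc. A sketch of the entries, assuming they belong in the [Compositing] group (the group placement is an assumption; check your Plasma version):

```ini
[Compositing]
# Override the misdetected refresh rate (Hz); 110 matches the custom-EDID monitor
RefreshRate=110
# Raise kwin's internal frame cap above the refresh rate
MaxFPS=120
```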
> I have replaced 670 with 980 and double buffering works as pre 5.3.
This really has no impact from our side (i.e. you have a faster GPU now, that's all; both should, however, be overdimensioned for KWin's needs).
Without triple buffering, synced swaps will _always_ block.
What you may want is "ignore that the driver performs a busy wait", but that's actually silly: just export __GL_YIELD=USLEEP
I linked another bug which may be related; something randomly causes slow fence syncing (w/ triple buffering, though).
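To illustrate what the busy wait means in practice, a sketch modeled on NVIDIA's documented __GL_YIELD semantics (illustrative only; waitForRetrace and retraceDone are made-up names, not driver code):

```cpp
// Illustrative model of the driver's wait loop during a synced swap.
// Default: sched_yield() per iteration, so one core stays busy polling;
// __GL_YIELD=USLEEP: usleep() per iteration, so the core can actually sleep.
#include <cstring>
#include <sched.h>
#include <unistd.h>

void waitForRetrace(bool (*retraceDone)(), const char *yieldMode) {
    while (!retraceDone()) {
        if (yieldMode && std::strcmp(yieldMode, "USLEEP") == 0)
            usleep(100);   // sleeps: near-zero CPU cost
        else
            sched_yield(); // yields: still burns the timeslice polling
    }
}
```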
The refresh rate detection is a bit odd.
The 50Hz "issue" was originally a design defect in the nvidia driver, due to their TwinView implementation. We worked around that by avoiding xrandr and calling into xf86vm (xvidmode) or even asking nvidia-settings as last resort.
Then nvidia introduced full xrandr 1.3 support and "xrandr -q" actually still reports proper refresh rates here.
So after a long time we removed the workarounds.
And now, lately, the xcb connections return that 50Hz again ... :-(
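For anyone who wants to check what the server reports, a minimal standalone query via the Xlib XRandR interface; this fetches the screen-level rate, i.e. exactly the value that can come back as a bogus 50Hz on such setups:

```cpp
// Build with: g++ rate.cpp -o rate -lX11 -lXrandr
#include <X11/Xlib.h>
#include <X11/extensions/Xrandr.h>
#include <cstdio>

int main() {
    Display *dpy = XOpenDisplay(nullptr);
    if (!dpy) { std::fprintf(stderr, "no X display\n"); return 1; }
    XRRScreenConfiguration *cfg = XRRGetScreenInfo(dpy, DefaultRootWindow(dpy));
    if (cfg) {
        short rate = XRRConfigCurrentRate(cfg); // screen rate, not per-output
        std::printf("reported refresh rate: %d Hz\n", rate);
        XRRFreeScreenConfigInfo(cfg);
    }
    XCloseDisplay(dpy);
    return 0;
}
```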
(In reply to Thomas Lübking from comment #21)
> that's actually silly: just export __GL_YIELD=USLEEP
I would be happy to, but if I do that everything is laggy and kwin timings become sawtoothy, as in attachment https://bugs.kde.org/attachment.cgi?id=92104 (left is just triple buffering enabled, right is only __GL_YIELD="USLEEP"), as I said before.

Yes, triple buffering is supposed to produce better results, but I thought it does not for you, thus you *want* to use double buffering?

For a detailed explanation of the behavior (a sketch after this list illustrates the swap cadence in these cases):
----------------------------------------------------------
a) If triple buffering is enabled in the driver AND (not! "or") this is correctly detected or enforced, KWin will swap buffers every ~16ms, or as fast as possible (if painting takes longer), and rely on the driver to move the 3rd buffer to scanout in sync with the screen.
b) If triple buffering is disabled AND this is correctly detected or enforced AND __GL_YIELD=USLEEP is exported, KWin will swap buffers faster (10ms) to be early for the retrace and rely on the buffer swap to block until the retrace is finished.
c) If triple buffering is assumed to be not available (correctly or not) AND __GL_YIELD=USLEEP is NOT exported, vsync will be disabled. Whether for this reason or by explicit configuration, the behavior is exactly the same as for triple buffering, just that you'll get tearing.
d) If triple buffering is enabled, but KWin falsely assumes "permitted" (__GL_YIELD=USLEEP) double buffering, KWin will flood the buffer, i.e. swap ~every 10ms, and the driver will block the third swap for a frame.
e) If triple buffering is disabled, but KWin falsely assumes it enabled, it will swap too slowly. It'll be too late for almost every frame and then spend nearly the entire next frame waiting for the next retrace. The FPS will drop to 30Hz.

Your latest comment would fit the (d) condition.
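To make that cadence concrete, a minimal sketch (names and structure are illustrative, not kwin's actual scheduler; the 16ms/10ms figures come from the explanation above):

```cpp
// Sketch of the swap cadence described in cases (a)-(e) above.
#include <chrono>

std::chrono::milliseconds nextSwapDelay(bool tripleBuffered,
                                        std::chrono::milliseconds vblankInterval) {
    if (tripleBuffered) {
        // (a) one swap per frame (~16ms at 60Hz); the driver moves the
        // third buffer to scanout in sync with the screen.
        return vblankInterval;
    }
    // (b) double buffered: swap early (~10ms) and let the swap itself block
    // until the retrace; sane only with __GL_YIELD=USLEEP, otherwise the
    // driver busy-waits for the rest of every frame.
    return std::chrono::milliseconds(10);
}
// Mismatches give (d): flooding ~every 10ms against a triple-buffered driver,
// or (e): swapping too slowly and missing almost every retrace (~30 fps).
```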
There are currently 3 scenarios I encounter:
1) I set "TripleBuffer" "True" in xorg.conf, no __GL_YIELD or anything. VSYNC works out of the box. Satisfying, but triple buffering is undesirable for video playback or maybe games.
2) Empty xorg.conf, no __GL_YIELD. I have vsync but it gets disabled; I flip the backend OGL3->OGL2 and vice versa and VSYNC is back. Timings in the fps plugin are the same as in (1), i.e. everything is smooth (with the 670 on 5.2 and before, and the 980 on 5.3, currently). It is the best scenario, except for having to manually re-enable vsync.
3) Empty xorg.conf, __GL_YIELD="USLEEP": I get lag and sawtooth timings.
Given what you said and what is happening, I don't see any use in setting __GL_YIELD="USLEEP", as kwin correctly detects the lack of triple buffering, and double-buffered vsync simply works after re-enabling. Then why does kwin assume the hacky behaviour as default and make the user suffer?

Created attachment 92474 [details]
Workaround for double buffered case
(In reply to Alexey Dyachenko from comment #26)
> Created attachment 92474 [details]
> Workaround for double buffered case
Re-enabling vsync on each login is getting on my nerves, so I decided to hack around the problem. As one can see from the support information above, re-enabling vsync after kwin automatically disables it in the double-buffered case sets BlocksForRetrace to false. So instead of disabling vsync and re-enabling it manually later, I made that if-block simply set BlocksForRetrace to false. Now vsync works OOB in the double-buffered case like in any other compositor ^ ^
Check the attachment for details.

That patch makes *zero* sense.
You claim triple buffering to be disabled (thus swapping does actually block).
In total, you're enforcing case (e) => unless you artificially boost the refresh rate, KWin will systematically swap too slowly/late and then has to wait for an entire frame. You'll end up with something between 30-60Hz (which is exactly what I get here for such a change).

You'll find the buffer swap one line UP from the patch position; add

```cpp
     m_swapProfiler.begin();
 }
+QElapsedTimer profile;
+static quint64 swaptime = 0;
+static uint swapcounter = 0;
+profile.start();
 glXSwapBuffers(display(), glxWindow);
+swaptime += profile.nsecsElapsed();
+if (++swapcounter == 100) {
+    // nanoseconds summed over 100 swaps, printed as a per-swap average in ms
+    qDebug() << "average swap time" << swaptime / 100000000.0 << "ms";
+    swaptime = swapcounter = 0;
+}
 if (gs_tripleBufferNeedsDetection) {
```

to print out how much time you spend waiting for the vertical retrace. If that number is very small, your swaps don't block, i.e. you _are_ triple buffering. If it's very large (> 1ms), you're losing frames for pretty much sure.

Please deactivate the FPS counter plugin and actually don't care too much about its output. It pollutes what it measures and is rather unmaintained.

(In reply to Thomas Lübking from comment #28)
> That patch makes *zero* sense.
I understand very little of what I'm doing; however, the patch works for me. KWIN says TB is not present and that I should export __GL_YIELD. As I do that (your case (b)), I get massive lag, so something is very wrong somewhere.

> You claim triple buffering to be disabled (thus swapping does actually block)
> In total, you're enforcing case (e)
It is indeed disabled. NVIDIA docs say it's off by default, but just in case, I have it disabled. Here is my /etc/X11/xorg.conf.d/20-nvidia.conf:

```
Section "Monitor"
    Identifier "HTHG100233"
    DisplaySize 600 340 # In millimeters
EndSection

Section "Device"
    Identifier "Default Nvidia Device"
    Option "NoLogo" "True"
    Option "TripleBuffer" "False"
    Option "UseEdidDpi" "False"
    Option "DPI" "108 x 108"
    Option "Monitor-DP-4" "HTHG100233"
EndSection
```

and KWIN says:
kwin_core: Triple buffering detection: "NOT available" - Mean block time: 8.47859 ms

> => Unless you artificially boost the refreshrate, KWin will systematically
> swap too slow/late and then has to wait for an entire frame.
> You'll end up w/ sth. between 30-60Hz (what is exactly what I get here for
> such change)
I have RefreshRate=60 in kwinrc:
kwin_core: Vertical Refresh rate 60 Hz

> [swap-time instrumentation quoted above]
Here is the output:

average swap time 0.0362707 ms
average swap time 0.0408241 ms
average swap time 0.0406007 ms
average swap time 0.0330792 ms
average swap time 0.0294599 ms
kwin_core: Triple buffering detection: "NOT available" - Mean block time: 8.47859 ms
average swap time 0.0622002 ms
average swap time 0.0973264 ms
average swap time 0.0755968 ms
average swap time 0.0808665 ms
average swap time 0.0717898 ms

Just to be clear, the output above is with my patch; you can actually see the moment BlocksForRetrace gets set to false. Here is vanilla 5.3.0 without triple buffering WITH __GL_YIELD=USLEEP:

average swap time 0.235458 ms
average swap time 0.23607 ms
average swap time 0.237488 ms
average swap time 0.24853 ms
average swap time 0.261053 ms
kwin_core: Triple buffering detection: "NOT available" - Mean block time: 6.06375 ms
average swap time 4.94924 ms
average swap time 4.88566 ms
average swap time 4.93876 ms
average swap time 5.32374 ms
average swap time 5.02611 ms

I got a second nvidia system (with a 960), and it exhibits exactly the same behaviour with a fresh install of 5.4.1 and nvidia 355.11, no custom configs or any options set. Setting BlocksForRetrace to false heals vsync when no triple buffering is available. I have no other nvidia system to test this, but I'm going to guess that something is buggy in kwin, a regression, or it never worked properly with the blob at all.

Whatever the problem is: it *can*not be worked around by your patch; the resolution will be some side effect. If you want to try, try scratching the usleep requirement (so you can have swap control enabled on double buffering w/o usleeping), and if you can, ping me a reminder at ~20:00 UTC to disable triple buffering here ;-)

I never meant my patch to be a proper workaround; it only leads to the same result as if the user hits OpenGL3->2->Apply->3->Apply. As a matter of fact, I'm too lazy to rebuild the package every time it changes, so I'm still doing it by hand on every boot.

> if you want to try, try scratching the usleep requirement (so you can have
> enabled swapcontrol on double buffering w/o usleeping)
I'm not sure what you mean; I totally don't know what happens inside KWIN, as I don't have time to study it properly. Are you talking about __GL_YIELD here? If so, then I'm not. The whole point here is that __GL_YIELD=USLEEP causes massive lag (see comment #31) but is supposed to be the official workaround. Sorry for the double posting.
Just to bring some clarity to the topic, this is what vanilla KWIN does by itself when vsync gets broken and I then change backend OpenGL versions:

$ qdbus org.kde.KWin /KWin supportInformation | grep block
Painting blocks for vertical retrace: yes

(change backend version and back)

$ qdbus org.kde.KWin /KWin supportInformation | grep block
Painting blocks for vertical retrace: no

If this is relevant, I use 'Re-use screen content'.

This patch:

```diff
diff --git a/eglonxbackend.cpp b/eglonxbackend.cpp
index 314bfb2..7f68424 100644
--- a/eglonxbackend.cpp
+++ b/eglonxbackend.cpp
@@ -344,7 +344,7 @@ void EglOnXBackend::present()
         gs_tripleBufferUndetected = gs_tripleBufferNeedsDetection = false;
         if (result == 'd' && GLPlatform::instance()->driver() == Driver_NVidia) {
             // TODO this is a workaround, we should get __GL_YIELD set before libGL checks it
-            if (qstrcmp(qgetenv("__GL_YIELD"), "USLEEP")) {
+            if (false && qstrcmp(qgetenv("__GL_YIELD"), "USLEEP")) {
                 options->setGlPreferBufferSwap(0);
                 eglSwapInterval(eglDisplay(), 0);
                 qCWarning(KWIN_CORE) << "\nIt seems you are using the nvidia driver without triple buffering\n"
diff --git a/glxbackend.cpp b/glxbackend.cpp
index 0abb1e3..acbd64a 100644
--- a/glxbackend.cpp
+++ b/glxbackend.cpp
@@ -635,7 +635,7 @@ void GlxBackend::present()
         gs_tripleBufferUndetected = gs_tripleBufferNeedsDetection = false;
         if (result == 'd' && GLPlatform::instance()->driver() == Driver_NVidia) {
             // TODO this is a workaround, we should get __GL_YIELD set before libGL checks it
-            if (qstrcmp(qgetenv("__GL_YIELD"), "USLEEP")) {
+            if (false && qstrcmp(qgetenv("__GL_YIELD"), "USLEEP")) {
                 options->setGlPreferBufferSwap(0);
                 setSwapInterval(0);
                 qCWarning(KWIN_CORE) << "\nIt seems you are using the nvidia driver without triple buffering\n"
```

What I meant is that the direct impact of your change can hardly improve things; thus I assume an indirect effect will do.

Hmmm - no problems with double-buffered compositing and __GL_YIELD=USLEEP here. Can you elaborate on your test case/scenario (CPU load, running glxgears, moving windows around, etc.)? Once again, and to be sure: hands off the FPS counter effect for this.

Alright, I have very good news. First, I am unable to reproduce comment #31 anymore. Second, I falsely assumed BlocksForRetrace=false to be the cure; instead, as you suggested, removing the whole if-block (if you trace back to my original patch, it also did this very thing), so that swaps are not disabled, is enough, at least for my machines:

if (result == 'd' && GLPlatform::instance()->driver() == Driver_NVidia)

The resulting kwin CPU usage is 0-2%.
You insist BlocksForRetrace=false is improper in such a case, but changing backend OpenGL versions changes this variable from YES (the starting value) to NO (comment #35), so you should probably look into that behaviour.

No, actually nothing's good in double buffering. After figuring out that usleep was actually not ultimately exported to kwin, fixing that, and fixing the redetection after suspend/resume, dragging windows feels like working in jelly.
The cause seems to be that very often waitTime in setCompositeTimer ends up being very small, so we basically trigger a swap every other event cycle (which will then block for quite some time).
Also, we seem to skew against the vblank rate, which causes regular frame skips (other bugs reported). I assume the cause is that the last swap time is included in the frame rendering time. We'll see whether that causes the skew as well.
Of course, blocking purely randomly (~30%) too long is better than blocking too long systematically.
I'll have a closer look and patches tonight. LOL.

This patch should
a) redetect double/triple buffering on suspend/resume cycles
b) properly hint whether painting currently blocks (says "no" if swap control is turned off)
c) improve double-buffered rendering "snappiness" (I think we still skew; gonna take another look)

The problem is that nvidia seems to manage to not block on double-buffered swapping (and we then flood the queue). Ultimately this makes it "safe" for nvidia users to enforce

export KWIN_TRIPLE_BUFFER=1

but that's of course non-reliable behavior. But good to know.

```diff
diff --git a/glxbackend.cpp b/glxbackend.cpp
index 0abb1e3..c767eef 100644
--- a/glxbackend.cpp
+++ b/glxbackend.cpp
@@ -119,6 +119,9 @@ GlxBackend::GlxBackend()
     init();
 }
 
+static bool gs_tripleBufferUndetected = true;
+static bool gs_tripleBufferNeedsDetection = false;
+
 GlxBackend::~GlxBackend()
 {
     if (isFailed()) {
@@ -129,6 +132,9 @@ GlxBackend::~GlxBackend()
     cleanupGL();
     doneCurrent();
 
+    gs_tripleBufferUndetected = true;
+    gs_tripleBufferNeedsDetection = false;
+
     if (ctx)
         glXDestroyContext(display(), ctx);
 
@@ -142,9 +148,6 @@ GlxBackend::~GlxBackend()
     delete m_overlayWindow;
 }
 
-static bool gs_tripleBufferUndetected = true;
-static bool gs_tripleBufferNeedsDetection = false;
-
 void GlxBackend::init()
 {
     initGLX();
@@ -629,8 +632,8 @@ void GlxBackend::present()
         m_swapProfiler.begin();
     }
     glXSwapBuffers(display(), glxWindow);
+    glXWaitGL();
     if (gs_tripleBufferNeedsDetection) {
-        glXWaitGL();
         if (char result = m_swapProfiler.end()) {
             gs_tripleBufferUndetected = gs_tripleBufferNeedsDetection = false;
             if (result == 'd' && GLPlatform::instance()->driver() == Driver_NVidia) {
@@ -638,6 +641,8 @@ void GlxBackend::present()
                 if (qstrcmp(qgetenv("__GL_YIELD"), "USLEEP")) {
                     options->setGlPreferBufferSwap(0);
                     setSwapInterval(0);
+                    // hint proper behavior
+                    result = 0;
                     qCWarning(KWIN_CORE) << "\nIt seems you are using the nvidia driver without triple buffering\n"
                         "You must export __GL_YIELD=\"USLEEP\" to prevent large CPU overhead on synced swaps\n"
                         "Preferably, enable the TripleBuffer Option in the xorg.conf Device\n"
```

Could you test the latest patch?

(In reply to Thomas Lübking from comment #41)
> Could you test the latest patch?
Yes, I started testing it immediately. As I understand it, KWIN_TRIPLE_BUFFER=1 makes it bypass that if(nvidia) block, so no more hacks are necessary to run double buffered out of the box. Anyway, I've been running with the variable exported, and it works well. Still, in my humble opinion, considering that TripleBuffer with nvidia is non-default driver behavior and requires the user to edit the xorg config, it is currently one workaround too many to get vsync working with nvidia out of the box. Since TripleBuffer is non-default, and KWIN is able to properly detect the buffering type used, couldn't KWIN behave as if double buffering were the default (as far as my hardware goes: 670, 980, 960, __GL_YIELD optional)?

How does the patched version behave for you (with *only* __GL_YIELD=USLEEP exported)?

> As I understand KWIN_TRIPLE_BUFFER=1 makes it bypass that if(nvidia) block,
> so no more hacks necessary to run double buffered out of the box.
You should not have to export KWIN_TRIPLE_BUFFER, and you should (usually; we're in a special condition atm) BY NO MEANS set it to one if you're actually double buffering. This *only* worked because nvidia didn't block on glXSwapBuffers, but with the patch it *will* block on glXWaitGL.

> Since TripleBuffering is non-default, as KWIN is able to properly detect the
> buffering type used, couldn't KWIN behave as if double buffering is the default
This makes no sense. Aside from there actually being no reliable way to detect the buffer count: if KWin can detect it, the default does absolutely not matter (except for the initial 500 frames during the detection phase, which is negligible).

Okay, here are the testing results:
-- __GL_YIELD=USLEEP works as well as KWIN_TRIPLE_BUFFER=1, seriously.
-- With no __GL_YIELD set and no TripleBuffer enabled, it is no longer possible to re-enable vsync as before (by switching GL versions). As I see it, this is intended by the patch. Now, to have vsync in the latter case, I commented out

// options->setGlPreferBufferSwap(0);
// setSwapInterval(0);

(In reply to Alexey Dyachenko from comment #44)
> no longer possible to re-enable vsync as before (by switching GL versons).
> As I see this is intended by patch.
Yes, this is intended; the former behavior was clearly a bug.

> Now, to have vsync in the latter case I commented out
> // options->setGlPreferBufferSwap(0);
> // setSwapInterval(0);
The altered patch allows you to enforce KWIN_TRIPLE_BUFFER=1 on double buffering (as *atm* the nvidia blob doesn't seem to block on swapping, which is the relevant aspect for kwin).

Git commit 8bea96d7018d02dff9462326ca9456f48e9fe9fb by Thomas Lübking.
Committed on 11/11/2015 at 21:18.
Pushed by luebking into branch 'master'.

wait for GL after swapping
otherwise at least on the nvidia blob the swapping doesn't block even for double buffering

REVIEW: 125659
Related: bug 351700
FIXED-IN: 5.5

M +4 -0 glxbackend.cpp

http://commits.kde.org/kwin/8bea96d7018d02dff9462326ca9456f48e9fe9fb
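The essence of the fix, reduced to the two calls that matter (a sketch; dpy and glxWindow stand in for kwin's actual variables):

```cpp
// Sketch of the fixed present path from commit 8bea96d. On the nvidia blob,
// glXSwapBuffers() could return immediately even for a vsynced double-buffered
// swap, which broke both the triple-buffering detection and the swap pacing.
glXSwapBuffers(dpy, glxWindow); // queues the swap; may not block at all
glXWaitGL();                    // blocks until the GL stream, including the
                                // swap, has completed; so with double
                                // buffering we now really wait for the retrace
```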