Bug 329821

Summary: no triple buffer detection on buffer_age support
Product: [Plasma] kwin Reporter: Michael Marley <michael>
Component: scene-opengl Assignee: KWin default assignee <kwin-bugs-null>
Severity: normal Flags: thomas.luebking: ReviewRequest+
Priority: NOR    
Version: 4.11.5   
Target Milestone: ---   
Platform: Ubuntu   
OS: Linux   
URL: https://git.reviewboard.kde.org/r/115306/
Latest Commit: Version Fixed In: 4.11.6
Attachments: KWin debugging output with KWIN_TRIPLE_BUFFER undefined

Description Michael Marley 2014-01-10 19:28:35 UTC
On my Kubuntu 14.04 (Trusty) system with an Nvidia graphics card and version 331.20 of the binary driver, I began having a problem with very jerky/juddery graphics after updating from kde-workspace 4.11.4 to 4.11.5.  (The rest of the system is on 4.12.)  About every half-second or so, everything moving or animating on the screen jerks very noticeably.  This occurs for both OpenGL things like glxgears and for 2D things like scrolling in Firefox.

Looking through the changelog for 4.11.5, it would seem that the issue probably would have been caused by this: https://git.reviewboard.kde.org/r/114162/.  However, using the KWIN_USE_BUFFER_AGE=0 environment variable does not eliminate the jerkiness.  It does, however, introduce tearing, unless I set "Tearing Prevention (VSync)" to "Full Scene Repaints."

Reproducible: Always

Steps to Reproduce:
1. Get a system with an Nvidia graphics card and install a GLX_EXT_BUFFER_AGE-supporting driver version.
2. Install/upgrade to kwin 4.11.5.
3. Launch glxgears
Actual Results:  
glxgears jerks noticeably about twice every second.

Expected Results:  
glxgears should run smoothly without dropping any frames.
Comment 1 Thomas Lübking 2014-01-10 19:32:56 UTC
please provide the output of "qdbus org.kde.kwin /KWin supportInformation", "env | grep GL" and "grep -i triple /var/log/Xorg.0.log"

you may also blindly try
export __GL_YIELD="USLEEP"
kwin --replace &
Comment 2 Michael Marley 2014-01-10 19:36:08 UTC
I already had __GL_YIELD set to USLEEP and triple buffering turned on in my xorg.conf.

When I tried to run your first command, I got "qdbus: could not find a Qt installation of ''", but I think this is an unrelated issue with my system that I am troubleshooting now.
Comment 3 Michael Marley 2014-01-10 19:36:58 UTC
I also tried all combinations of __GL_YIELD and triple-buffering settings, but none had any effect.
Comment 4 Thomas Lübking 2014-01-10 19:37:16 UTC
install the qtchooser package (also there's maybe "qdbus-qt4")
Comment 5 Michael Marley 2014-01-10 19:39:12 UTC
Thanks, that worked.  Here is the output: http://pastebin.kde.org/pgzbj8jy4
Comment 6 Thomas Lübking 2014-01-10 19:58:03 UTC
it says: "KWin version: 4.11.4"?
Also the determined tearing prevention is frontbuffer re-usage (likely from "auto")

what happens if you disable buffer_age, set tearing prevention to none (cheap, full repaints - *not* front buffer copying nor automatic) and restart "kwin --replace&"?
Comment 7 Michael Marley 2014-01-10 20:00:09 UTC
Yeah, sorry about that.  I downgraded back to 4.11.4 to get rid of the jerking.  If you want, I can upgrade back to 4.11.5 and run the command again.

On 4.11.5, if I disable buffer_age and set full repaints, I get smooth motion and no tearing.
Comment 8 Thomas Lübking 2014-01-10 20:04:35 UTC
what about jerkiness for no buffer_age and front buffer re-usage?
Comment 9 Michael Marley 2014-01-10 20:07:25 UTC
With that combination, I get both tearing and jerkiness.
Comment 10 Michael Marley 2014-01-10 20:10:18 UTC
Additionally, it seems that with those settings, the tearing and the jerkiness are "synchronized."  At the same time glxgears jerks, the line of tearing appears on the konsole window I am dragging above it.
Comment 11 Thomas Lübking 2014-01-10 20:12:54 UTC
-> buffer_age on and tearing prevention to none?
Comment 12 Michael Marley 2014-01-10 20:17:52 UTC
In that mode, 2D applications like Firefox are smooth but with lots of tearing.  glxgears tears heavily in the middle 1/3 of the screen.  It is jerky in the area that tears but smooth otherwise.
Comment 13 Thomas Lübking 2014-01-10 20:20:22 UTC
-> "only when cheap" (which should be the case for buffer_age)
Comment 14 Michael Marley 2014-01-10 20:22:09 UTC
That setting produces the same results as Automatic.  (Everything jerks twice a second, no tearing.)
Comment 15 Michael Marley 2014-01-13 17:48:44 UTC
I just tried with the newly-released Nvidia 331.38 driver and the bug still occurs.
Comment 16 Thomas Lübking 2014-01-13 20:13:28 UTC
Random guess:
1. ensure that triple buffering is really enabled:
  grep -i triple /var/log/Xorg.0.log
2. next convince kwin about it
   export KWIN_TRIPLE_BUFFER=1
   kwin --replace &

The problem here is that full scene repaints do not seem to cause a problem, so it cannot be the swapping by itself (though you should ensure that flipping is enabled in nvidia-settings, GL settings page); it has to be either frontbuffer reading or buffer_age.

What's even weirder is that you claim tearing for frontbuffer reading, which can only have two potential causes:
1. no flipping (see above)
2. tearing in the client (activate the "show paint" effect, it's worthless for buffer_age or full scene repaints, though)

But either case would also apply to full scene repaints.

One last resort: try to disable blurring.
Comment 17 Michael Marley 2014-01-13 21:03:51 UTC
Thanks!  The "export KWIN_TRIPLE_BUFFER=1" thing completely clears up the jerkiness!

I still consider this a bug though, because the jerkiness occurs even when I have triple buffering turned off in xorg.conf.  Perhaps kwin should be able to automatically detect when triple buffering is taking place?
Comment 18 Thomas Lübking 2014-01-13 21:40:57 UTC

It does try to detect whether triple buffering is enabled and that is (in a way) crucial to know.
Unfortunately there's no "legal" way to know this, so it's measured at runtime and that used to work nicely in the past (and still does here)

-> Can you please check how much time buffer swapping takes during the detection?

To do so, you'd have to run "kdebugdialog --fullmode", filter for kwin (1212) and redirect all output to some file, e.g. /tmp/kwin.dbg

The file will after a short time (500 screen updates) contain a "Triple buffering detection" line which will indicate whether triple buffering is assumed to be available and the mean blocking time of glSwapBuffers().
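
As a rough illustration of the measurement described above, the sketch below averages the blocking time of the swap call over a fixed number of screen updates and derives a verdict from the mean. It is not KWin's code: the name `detect_triple_buffering`, the 1 ms cutoff and the sample timings are assumptions made for illustration; only the 500-update window and the use of the mean blocking time come from this comment.

```python
from statistics import mean

FRAMES_REQUIRED = 500      # number of screen updates named in the comment
BLOCK_THRESHOLD_MS = 1.0   # assumed cutoff, not taken from KWin


def detect_triple_buffering(swap_times_ms):
    """Return (verdict, mean_ms); both are None until enough frames exist."""
    if len(swap_times_ms) < FRAMES_REQUIRED:
        return None, None
    window = swap_times_ms[:FRAMES_REQUIRED]
    mean_ms = mean(window)
    # A swap that returns almost immediately suggests a spare buffer was free;
    # a swap that blocks for much of the frame suggests double buffering.
    return mean_ms < BLOCK_THRESHOLD_MS, mean_ms


if __name__ == "__main__":
    # Made-up timings: one run with quick swaps, one with blocking swaps.
    print(detect_triple_buffering([0.2] * 500))
    print(detect_triple_buffering([12.0] * 500))
```

Because the verdict needs a full window of samples, a session that repaints rarely would stay undecided for a long time under this sketch.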

related bug #322060 and bug #329297
Comment 19 Michael Marley 2014-01-14 19:08:22 UTC
I tried this but I do not get any such message about triple buffering.
Comment 20 Thomas Lübking 2014-01-14 20:15:41 UTC
Sorry, I should have mentioned that you must *not* export KWIN_TRIPLE_BUFFER or the heuristic detection won't take place at all.
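
The interaction between the variable and the heuristic can be sketched as a three-way decision. The function name is hypothetical, and treating "0" as "off" is an assumption; the thread only confirms that exporting KWIN_TRIPLE_BUFFER=1 forces triple buffering and that an exported variable suppresses the runtime detection.

```python
import os


def triple_buffer_override(environ):
    """Return True/False if the variable forces a verdict, None to measure."""
    value = environ.get("KWIN_TRIPLE_BUFFER")
    if value is None or value == "":
        return None          # not exported: the runtime heuristic decides
    return value != "0"      # assumed convention: "0" disables, anything else enables


if __name__ == "__main__":
    verdict = triple_buffer_override(os.environ)
    if verdict is None:
        print("no override, detection would run")
    else:
        print("override in effect, detection skipped:", verdict)
```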
Comment 21 Michael Marley 2014-01-14 21:07:32 UTC
I tried again just to make sure, but even when I comment out the export from my .profile and reboot, I still don't get anything about triple buffering detection in the debug output.
Comment 22 Thomas Lübking 2014-01-14 22:24:09 UTC
it requires 500 full repaints - if you did not enable buffer_age or full scene repaints or frontbuffer copying as tearing prevention, this can last quite a while.

Can you attach the generated file?
Comment 23 Michael Marley 2014-01-14 22:26:38 UTC
I do have buffer_age enabled, and I waited at least 5 minutes before checking the file.  I am going to have to go away in just a minute, but I will test it again when I get a chance.
Comment 24 Michael Marley 2014-01-20 13:53:34 UTC
Created attachment 84748 [details]
KWin debugging output with KWIN_TRIPLE_BUFFER undefined

Sorry for the delay.  Here is the output.  I commented out the KWIN_TRIPLE_BUFFER in my .profile, rebooted, and ran glxgears in fullscreen for about 30 seconds to make sure it had rendered enough frames.
Comment 25 Michael Marley 2014-01-20 18:15:34 UTC
I have also noticed that after enabling KWIN_TRIPLE_BUFFER, sometimes I get more lag between the cursor and the window when dragging windows around the screen, especially if I drag the window in circles.  It isn't that bad, but I thought you should know anyway.
Comment 26 Thomas Lübking 2014-01-20 20:55:42 UTC
something is fishy here.

a) please provide
- /var/log/Xorg.0.log
- glxinfo > my.glxinfo
- nvidia-settings -q all > my.nvsettings
- cat /proc/`pidof kwin`/environ > my.kwinenv

b) did you really redirect *all* level outputs for 1212/kwin in "kdebugdialog --fullmode"?
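
A note on the environ dump requested above: /proc/<pid>/environ separates its entries with NUL bytes, so the raw file is hard to read. A minimal sketch for splitting such a dump and keeping only the variables discussed in this report follows; the function names, the prefix filter and the sample bytes are illustrative choices.

```python
def parse_environ(raw: bytes) -> dict:
    """Split a NUL-separated /proc/<pid>/environ dump into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry or b"=" not in entry:
            continue
        key, _, value = entry.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def relevant(env, prefixes=("KWIN_", "__GL_")):
    """Keep only variables whose names start with one of the prefixes."""
    return {k: v for k, v in env.items() if k.startswith(prefixes)}


if __name__ == "__main__":
    # Sample bytes for demonstration; a real dump would be read from the file.
    sample = b"HOME=/home/user\0__GL_YIELD=USLEEP\0KWIN_TRIPLE_BUFFER=1\0"
    for key, value in sorted(relevant(parse_environ(sample)).items()):
        print(f"{key}={value}")
```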
Comment 27 Michael Marley 2014-01-20 21:07:32 UTC
Created attachment 84755 [details]
Comment 28 Michael Marley 2014-01-20 21:08:38 UTC
Created attachment 84756 [details]
Comment 29 Michael Marley 2014-01-20 21:09:02 UTC
Created attachment 84757 [details]
Comment 30 Michael Marley 2014-01-20 21:09:34 UTC
Created attachment 84758 [details]
Comment 31 Michael Marley 2014-01-20 21:12:16 UTC
I did redirect all the output for "1212 kwin" to that file.
Comment 32 Thomas Lübking 2014-01-20 22:11:08 UTC
You're overriding FSAA (multisampling), it's not re-overridden by an environment variable in kwin.
So unless you've an application profile for kwin in nvidia settings that sets GLFSAAMode to 0x0, that will cause some significant GPU load.
Comment 33 Michael Marley 2014-01-20 22:33:25 UTC
I do in fact have an application profile configured for kwin that turns AA and AF off.  Sorry, I forgot to mention that.
Comment 34 Thomas Lübking 2014-01-24 21:27:41 UTC
grrrr... it's because the buffer_age patch exits the paint function early and completely bypasses the triple buffer detection.
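
The control-flow problem described here can be sketched in a few lines. The class and method names are hypothetical and the code is not taken from KWin; it only shows how an early return on one render path can starve a counter that sits further down the same function, and how a single shared path restores it.

```python
class Backend:
    FRAMES_REQUIRED = 500

    def __init__(self, supports_buffer_age):
        self.supports_buffer_age = supports_buffer_age
        self.profiled_frames = 0
        self.detection_done = False

    def _swap(self):
        pass  # stand-in for the real buffer swap

    def _profile_swap(self):
        # Count profiled frames; the verdict is only reached after enough of them.
        self.profiled_frames += 1
        if self.profiled_frames >= self.FRAMES_REQUIRED:
            self.detection_done = True

    def present_buggy(self):
        if self.supports_buffer_age:
            self._swap()
            return                # early exit: the profiling below never runs
        self._swap()
        self._profile_swap()

    def present_fixed(self):
        # One shared path: every swap is profiled regardless of buffer_age.
        self._swap()
        self._profile_swap()


if __name__ == "__main__":
    buggy, fixed = Backend(True), Backend(True)
    for _ in range(500):
        buggy.present_buggy()
        fixed.present_fixed()
    print(buggy.detection_done, fixed.detection_done)
```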
Can/Do you want to try a patch?
Comment 35 Michael Marley 2014-01-24 21:40:34 UTC
Sure, I can try a patch.
Comment 36 Thomas Lübking 2014-01-24 21:43:00 UTC
See here:

Sorry that it took so long (that's apparently why everyone says that early exits are evil ;-)
Comment 37 Michael Marley 2014-01-24 21:44:06 UTC
OK, compiling now.  This may take a while; my compile box is an old Core 2 Duo machine.
Comment 38 Michael Marley 2014-01-24 23:35:00 UTC
Thanks, this patch works!  After a few seconds, the log message indicates that kwin detected triple buffering and the jerkiness clears up.

However, I still am getting that lag I was talking about earlier when dragging windows.  That only started happening after buffer_age was introduced.  Should I file another bug for that?
Comment 39 Thomas Lübking 2014-01-25 14:22:11 UTC
Does the lagging ever occur during a session or only right after login?
Comment 40 Michael Marley 2014-01-25 14:31:44 UTC
It can happen anytime, but doesn't always happen.  A pretty reliable way to reproduce it for me is to drag a window around in large circles on the screen at a rate of about one circle per second.  When I do that, the window lags quite a ways behind the mouse cursor.  Curiously, I haven't noticed any lag when doing other things, such as dragging scrollbars or playing games.
Comment 41 Thomas Lübking 2014-01-25 14:42:21 UTC
Circular movement is just a good way to outpace systems, so that's not too special.

When this happens, does
- re-initiating a new drag
- restarting the compositor (Shift+Alt+F12 twice)
stop it?

Is there exceptionally high CPU load?
Comment 42 Michael Marley 2014-01-25 15:06:03 UTC
During the dragging, kwin is using about 8% of the CPU and Xorg is using about 4% (both as measured by htop.)  I don't even have to restart compositing to make the lag go away.  If I stop dragging the window, it goes away immediately.  Also, this didn't happen before buffer_age was introduced.
Comment 43 Thomas Lübking 2014-01-25 15:07:47 UTC
(In reply to comment #42)
> I don't even have to restart
> compositing to make the lag go away.  If I stop dragging the window, it goes
> away immediately.

Just to be absolutely certain about this:
that means a subsequent drag does not show this symptom?
Comment 44 Michael Marley 2014-01-25 15:08:56 UTC
Not unless I start dragging it in circles again.
Comment 45 Thomas Lübking 2014-01-25 15:15:38 UTC
That means it's permanent.
Try to toggle compositing off. Still laggy?
Toggle compositing on again. Still laggy?
Comment 46 Michael Marley 2014-01-25 15:19:13 UTC
With compositing off, there is no lag.  When I re-enable it, the lag comes back.
Comment 47 Thomas Lübking 2014-01-26 20:13:46 UTC
I guess it feels like moving through jelly, the window follows the mouse - it does not hang and then jump to the mouse position?

Dev note:
This would mean we load the swapbuffer with too many swaps, which can basically have two reasons:

1. misdetection/overridden refreshrate/MaxFPS
2. triplebuffer misdetection (assumed to be NOT available, while it indeed is)

According to the present debug output, the refreshrate is detected as 60Hz (which is supported by the nvidia-settings query) - so it had to be 3buf detection, just that comment #25 explicitly states that this occurred WITH KWIN_TRIPLE_BUFFER=1 ...

So there must be a third reason to overcommit frames (could be broken paint time calculation)
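
To put a number on the "too many swaps" idea: at 60Hz one frame lasts about 16.7 ms, so each frame queued ahead of the display would add roughly that much delay before the window catches up with the cursor. The helper below is only this arithmetic; its name and the queue depths are illustrative, not measured values.

```python
def added_latency_ms(queued_frames, refresh_hz=60.0):
    """Extra delay caused by frames waiting in the swap queue."""
    return queued_frames * 1000.0 / refresh_hz


if __name__ == "__main__":
    for depth in (0, 1, 2, 3):
        print(depth, round(added_latency_ms(depth), 1))
```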
Comment 48 Michael Marley 2014-01-26 20:24:34 UTC
My monitor does run at 60Hz and I do have KWIN_TRIPLE_BUFFER=1 set, so that sounds right.

And moving through jelly is a good description.  It is still smooth, but just is smooth farther behind the mouse cursor than it was before buffer_age was introduced.
Comment 49 Michael Marley 2014-01-29 02:55:59 UTC
I just discovered a reliable way to reproduce the lag at any time.  If I run an application that is playing a video or doing any kind of continuous animation, it makes all window dragging lag in exactly the same way I described with the circular dragging earlier.
Comment 50 Thomas Lübking 2014-01-29 19:22:36 UTC
Git commit bb9f76e1aede42fcd51edf298e4d8a0b942ff6ac by Thomas Lübking.
Committed on 24/01/2014 at 21:29.
Pushed by luebking into branch 'KDE/4.11'.

merge buffer_age render into general render code

avoiding the blocking swapinterval detection causes
issues in the timing strategy and prevents protection
against CPU overload on the nvidia blob
FIXED-IN: 4.11.6
REVIEW: 115306

M  +8    -10   kwin/eglonxbackend.cpp
M  +8    -10   kwin/glxbackend.cpp

Comment 51 Thomas Lübking 2014-02-05 14:05:02 UTC
see bug #330794