Bug 269816

Summary: Bad performance after re-enabling desktop effects
Product: [Plasma] kwin Reporter: Saygın Bakşi <sayginb>
Component: general Assignee: KWin default assignee <kwin-bugs-null>
Status: RESOLVED NOT A BUG    
Severity: normal    
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Ubuntu   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: sysprof outputs for various cases

Description Saygın Bakşi 2011-03-31 12:53:02 UTC
Version:           unspecified (using KDE 4.6.1) 
OS:                Linux

With KDE 4.6, compositing performance is so good that I have never had a problem with it. However, if I suspend desktop effects and re-enable them, compositing becomes very laggy, as if it had a really low fps, although the "show fps" plugin shows 48 fps minimum. The only way to get smooth compositing again is to enable effects and reboot. I hope there will be a solution soon.

Reproducible: Always

Steps to Reproduce:
1) Turn Off Desktop Effects.
2) Turn On Desktop Effects

Actual Results:  
After re-enabling, animations are not smooth anymore.

Expected Results:  
No matter how many times I disable/enable Desktop Effects, animations should end up as smooth as they were when first enabled after a reboot.
Comment 1 Thomas Lübking 2011-03-31 20:48:20 UTC
- please sharpen "animations".
  * kwin window effects (cover flow, present windows, etc.)
  * in client animations (oxygen animations, scrolling, just smooth scrolling as in bug #246084)
  * particular applications (flash)
  * resizing
  * other stuff...

- does restarting kwin "kwin --replace &" help?

- does flushing the pixmap cache* help? ("nvidia-settings -a PixmapCache=0; nvidia-settings -a PixmapCache=1", don't try being smart by setting it in a combined nvidia-settings call, that won't work)
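
The flush described above can be sketched as a small shell function. This is only an illustration: `flush_pixmap_cache` and the `NVIDIA_SETTINGS` override are hypothetical names, and the only thing taken from the comment is that the two assignments go into two separate nvidia-settings calls.

```shell
#!/bin/sh
# Flush the NVIDIA pixmap cache by switching it off and on again.
# The two assignments are issued as two separate nvidia-settings calls,
# because a combined call is said not to work.
# NVIDIA_SETTINGS is a hypothetical override, useful for dry runs.
flush_pixmap_cache() {
    cmd="${NVIDIA_SETTINGS:-nvidia-settings}"
    # $cmd is left unquoted on purpose so an override may carry arguments.
    $cmd -a PixmapCache=0 && $cmd -a PixmapCache=1
}
```

Calling `flush_pixmap_cache` runs both commands in order; setting `NVIDIA_SETTINGS="echo run"` first only prints them.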

* i assume we're talking about this machine?
Kubuntu 10.10
Kde 4.6
Nvidia 425M with 1GB VRam (270.29 Beta driver)
6GB DDR3 RAM
1.73GHz i7 Quad Core
Comment 2 Saygın Bakşi 2011-04-01 15:53:33 UTC
(In reply to comment #1)
> - please sharpen "animations".
>   * kwin window effects (cover flow, present windows, etc.)
>   * in client animations (oxygen animations, scrolling, just smooth scrolling
> as in bug #246084)
>   * particular applications (flash)
>    * resizing
>    * other stuff...

Before I get into the details, it seems like the choppiest effect (maybe the only one my eyes can see) is the minimize/maximize effect. So, I tested all the proposed techniques against this animation.

Animations I use with kwin are:
- Blur
- Fade
- Highlight Window
- Login
- Logout
- Minimize Animation
- Screenshot
- Shadow
- Slide
- Sliding popups
- Taskbar Thumbnails
- Translucency
- Startup Feedback
- Dialog Parent
- Dim Screen for Administration Mode
- Cover Switch
- Desktop Grid
- Present Windows

I use the bespin theme with its default configuration now; however, the result was the same with the oxygen theme at its defaults.

>
> - does restarting kwin "kwin --replace &" help?
-Nope!

>
> - does flushing the pixmap cache* help? ("nvidia-settings -a PixmapCache=0;
> nvidia-settings -a PixmapCache=1", don't try being smart by setting it in a
> combined nvidia-settings call, that won't work)
I ran "nvidia-settings -a PixmapCache=0" on its own, and after that "nvidia-settings -a PixmapCache=1", as you said. However, nothing has changed. With PixmapCache=0, the animation can hardly be seen at all, which is worse than before. With PixmapCache=1 again, it is the same as the choppy re-enabled version.

>
> * i assume we're talking about this machine?
> Kubuntu 10.10
> Kde 4.6
> Nvidia 425M with 1GB VRam (270.29 Beta driver)
> 6GB DDR3 RAM
> 1.73GHz i7 Quad Core

Yes, my machine is the one above.
Comment 3 Thomas Lübking 2011-04-01 18:42:12 UTC
> it seems like the most choppy effect ... is the minimize/maximize effect
- is "magic lamp" affected as well?
- does it happen if you minimize a window that does NOT have the focus? (you must use "click to focus" (default) or "focus follows mouse" to do this)

what do you do between suspending & resuming compositing? (run an OpenGL game etc. possibly related?)

> I use bespin theme
what about the window decoration?

> When the PixmapCache=0, the animation is almost can't be seen
expected; the nvidia driver doesn't like the cacheless mode. de- and re-activating the cache just implicitly flushes it.
Comment 4 Saygın Bakşi 2011-04-03 19:34:14 UTC
(In reply to comment #3)
> > it seems like the most choppy effect ... is the minimize/maximize effect
> - is "magic lamp" affected as well?
Yes, it's the same as minimize effect.

> - does it happen if you minimize a window that does NOT have the focus (you
> must use "click to focus" (default) or "focus follows mouse" to do this)
I use click to focus and it happens all the time, regardless of whether the window has focus or not.
> 
> what do you do between suspending & resuming compositing? (run an OpenGL game
> etc. possibly related?)
Nothing at all. I just press Alt+Shift+F12 twice and that's it.
> 
> > I use bespin theme
> what about the window decoration?
Bespin again, but as I said before, I used to use oxygen (both theme and window decoration) and it was the same. (Both themes and decorations use default settings)
> 
> > When the PixmapCache=0, the animation is almost can't be seen
> expectable, the nvidia driver doesn't like the cacheless mode. de- and
> re-activating it just implicitly flushes it.
Comment 5 Thomas Lübking 2011-04-03 21:11:58 UTC
(In reply to comment #4)
> Yes, it's the same as minimize effect.
> I use click to focus and it happens all the time regardless of the window has
> focus or not
 > Nothing indeed. I just use Alt+Shift+F12 double times and its just that.

hmmm... 
- What if you set  "Keep window thumbnails" to "Always" in the advanced tab of "kcmshell4 kwincompositing"?
- Can you check whether it happens with other minimization effects like eg this one:
http://kde-apps.org/content/show.php/BeDropped+%3B-)?content=120847
- Did you monitor cpu usage during these effects?
- Can you log the system load with sysprof (start it and then minimize a lot of windows)

> Bespin again but as I said before,
Thanks for defending Bespin, but that's not necessary ;-)
I would have rather guessed that you're using Aurorae and sth. with the plasma theme reload on suspend/resume borks the pixmap cache (and you'd have to render SVGs on every focus change in consequence) - but that does not seem to be the case anyway.
Comment 6 Saygın Bakşi 2011-04-04 00:01:30 UTC
(In reply to comment #5)
> - What if you set  "Keep window thumbnails" to "Always" in the advanced tab of
> "kcmshell4 kwincompositing"?
Well, when I change it from "Only for Shown Windows" to "Always", it is the same as suspending and resuming desktop effects. It gives the same laggy result.

> - Can you check whether it happens with other minimization effects like eg this
> one:
> http://kde-apps.org/content/show.php/BeDropped+%3B-)?content=120847
I don't have much time now, I'll do it ASAP.

> - Did you monitor cpu usage during these effects?
One of my CPU cores goes a little bit high but doesn't look suspicious.

> - Can you log the system load with sysprof (start it and then minimize a lot of
> windows)
I'll do it ASAP too.
> 
> > Bespin again but as I said before,
> Thanks for defending Bespin, but that's not necessary ;-)
> I would have rather guessed that you're using Arorae and sth. with the plasma
> theme reload on suspend/resume borks the pixmap cache (and you'd have to render
> SVGs on every focus change in consequence) - but that does not seem to be the
> case anyway.
Comment 7 Thomas Lübking 2011-04-04 22:35:26 UTC
There's one last wild guess i can provide:
shut down the plasma-desktop ("kquitapp plasma-desktop")

Reason:
Toggling compositing triggers a theme update in plasma. Assuming sth. goes wrong there and you're using a taskbar which changes on minimizing clients, it could cause quite some cpu load then....
Comment 8 Saygın Bakşi 2011-04-05 00:48:03 UTC
Created attachment 58582 [details]
sysprof outputs for various cases

> - Can you check whether it happens with other minimization effects like eg this
> one:
> http://kde-apps.org/content/show.php/BeDropped+%3B-)?content=120847
Didn't work better, sorry :(

> - Can you log the system load with sysprof (start it and then minimize a lot of
> windows)
I did several profiling runs and named each file after the case it covers. They are in the attachment.

also "kquitapp plasma-desktop" didn't have any effect.
Comment 9 Thomas Lübking 2011-04-06 00:01:16 UTC
The sysprofs do indicate high cpu drain (kwin + X11 together take 3/2 of the cpu amarok requires to play an mp3 - gstreamer cannot be /that/ bad ;-)
Can you confirm that the general cpu usage (eg in top) does not overly rise (sucking away an entire core or so) when the issue occurs?

Since it's a mobile chip, i'd bet on powersavings (ie. when you suspend compositing, the GPU clocks down to save power and when you resume it doesn't clock up)
add
Option      "Coolbits" "1"
to the Device section in /etc/X11/xorg.conf, restart X11 (sudo telinit 3; sudo telinit 5) and then launch nvidia-settings. Have a look at "PowerMizer", suspend/resume compositing and have another look.
Since you activated "Coolbits" you can now also manipulate the gpu clock (do NOT burn it - while the GPU can take a lot of temperature and take care of itself, it might melt down some transistors...)
Just have a look whether the PowerMizer state and gpu/memory clock return to where they were.
It's possible to manipulate/force powermizer settings, but the entire thing seems to be still just broken on some notebooks :-(
Comment 10 Saygın Bakşi 2011-04-06 09:57:17 UTC
(In reply to comment #9)
> The sysprofs do indicate high cpu drain (kwin + X11 together take 3/2 of the
> cpu amarok requires to play an mp3 - gstreamer cannot be /that/ bad ;-)
> Can you confirm that the general cpu usage (eg in top) doe not overly rise
> (sucks away an entire core or so) when the issue occurs?

well, kubuntu 10.10 uses the xine backend instead of gstreamer. But that's not important. This minimize effect does not eat that much cpu, and none of the cores has a huge load. I don't think that is the problem.

> Since it's a mobile chip, i'd bet on powersavings (ie. when you suspend
> compositing, the GPU clocks down to save power and when you resume it doesn't
> clock up)
> add
> Option      "Coolbits" "1"
> to the device setting in /etc/X11/xorg.conf restart X11 (sudo telinit 3; sudo
> telinit 5) and then launch nvidia-settings, have a look at "PowerMizer",
> suspend/resume compositing and have another look.
> Since you activate "Coolbits" you can now also manipulate the gpu clock (do NOT
> burn it - while the GPU can take a lot of temperature and care about itself, it
> might melt down some transistors...)
> Just have a look whether the Powermizer state and gpu/memory clock returns to
> where it has been.
> It's possible to manipulate/force powermizer settings, but the entire thing
> seems to be still just broken on some notebooks :-(

I haven't done anything you said above yet; if you want, I can do it. However, while skimming through nvidia-settings, I saw that I have Vertical Sync enabled in the OpenGL window. I disabled it, suspended and resumed kwin, and wow! The minimize effect performs well again. However, this time videos I play in any media program show tearing, so I have to enable it again. So I guess kwin can't work well if the nvidia driver has the vSync option enabled. Can that be true?
Comment 11 Thomas Lübking 2011-04-06 12:23:02 UTC
> I saw that I have Vertical Sync enabled in OpenGL window
That pretty much explains it.
Having the global syncing activated and kwin syncing activated will give you 2 syncs for every fullscreen effect frame (nvidia intercepts the glXSwapBuffers() call and adds a sync wait, but we already had waited once)
So you'll have (on a 50Hz display) 25 fps at max; if the geometry calculation or the MaxFPS cap crosses a frame, it drops to 16 fps, which is "visible" ;-)
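
The 25 and 16 fps figures follow from simple division, under the simplifying assumption that every sync wait costs one full refresh interval. A rough sketch (`max_fps` is a hypothetical helper, not anything kwin or the driver provides):

```shell
#!/bin/sh
# Upper bound on frames per second when each frame has to sit through
# `waits` vertical blanks on a display refreshing at `refresh_hz`.
# Assumption: one wait = one full refresh interval.
# Shell arithmetic is integer-only, so 50 / 3 is reported as 16.
max_fps() {
    refresh_hz=$1
    waits=$2
    echo $(( refresh_hz / waits ))
}
```

`max_fps 50 2` gives the doubled-sync cap of 25, and `max_fps 50 3` gives 16 for a frame that misses one more refresh.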

You should preferably deactivate nvidia's global sync (it's mostly for programs that can't sync themselves, and it cannot sync the ordinary kwin frame - there's no complete frame update, thus no buffer swap) and use the environment variable to set it before launching clients that really need it (or for which you want to use it, since it will perform slightly better than glXWaitSync, which can get you better vsync'ing in games if you /replace/ the game's internal syncing).

If this is reallyreallyreally no option at all you'd set
Option      "TripleBuffer" "true"
in the xorg.conf device section, which /might/ improve the experience - but this costs memory and will also slightly increase the latency (input -> visual result)
---------
Marking invalid for misconfiguration. Rethinking whether we actually need this and more control for the nvidia hack lib or launch kwin through a script... :S

--------------
Slight OT:
amarok has the highest CPU load in /usr/lib/gstreamer-0.10/libgstflump3dec.so and I do not see libxine at all...
Comment 12 Saygın Bakşi 2011-04-07 21:18:41 UTC
(In reply to comment #11)
> > I saw that I have Vertical Sync enabled in OpenGL window
> That pretty much explains it.
> Having the global syncing activated and kwin syncing activated will give you 2
> syncs for every fullscreen effect frame (nvidia intercepts the glXSwapBuffers()
> call and adds a sync wait, but we already had waited once)
> So you'll have (on a 50Hz display) 25 fps at max, if the geometry calculation
> or the MaxFPS cap crosses a frame, it drops to 16fps what is "visible" ;-)
> 
> You should preferably deactivate nvidias global sync (it's mostly for programs
> that can't sync themselves and it cannot sync the ordinary kwin frame - there's
> no complete frame update, thus buffer swap)  and use the environment variable
> to set it before launching clients that really need it (or that you want to use
> it since it will perform slightly better than glXWaitSync what can get you
> better vsync'ing in games if you /replace/ the games internal syncing.
> 
> If this is reallyreallyreally no option at all you'd set
> Option      "TripleBuffer" "true"
> in the xorg.conf device section what /might/ improve the experience - but this
> costs memory and will also slightly increase the latency (input -> visual
> result)
> ---------
> Marking invalid for misconfiguration. Rethinking whether we actually need this
> and more control for the nvidia hack lib or launch kwin through a script... :S
> 

Well, although I have major issues right now, I think it is still a bug. First of all, I NEED TO enable vertical sync in nvidia-settings because of tearing in videos, so disabling it is not a good option, I think. But after a restart, all effects work properly regardless of the vsync configuration in nvidia-settings. The problem only occurs after suspending/resuming.

> --------------
> Slight OT:
> amarok has the highest CPU load in /usr/lib/gstreamer-0.10/libgstflump3dec.so
> and I do not see libxine at all...
Well, I don't remember changing it from xine to gstreamer, but you are right, phonon uses gstreamer. (shame)
Comment 13 Thomas Lübking 2011-04-07 22:10:23 UTC
Assuming you'd actually /have/ to activate nvidias global synchronization this would lead to an unresolvable conflict to be only solved on your side by exporting __GL_SYNC_TO_VBLANK=0 on launching kwin.

I however doubt this is actually the case.
Unless you run a GL backend on your video player, this setting has exactly zipzeronull impact on video playback itself.
Aside from this, good video players offer syncing themselves ;-)

It also has no impact on (regular) compositing, since KWin will usually not make the call (glXSwapBuffers()) that nvidia intercepts in order to wait for the next vblank. You /have/ to use the kwin syncing to get sync'd compositing at all.

The nvidia driver can also sync to the adaptors (texture/blitter) of xv which is the default output of all relevant videoplayers (except when using xv, but afaik neither vdpau nor xvmc are syncable by the driver) - this is likely what you want to do and completely unrelated to GL syncing.

I've no idea why this does not happen before the first suspend/resume/restart but guess that kwin initially starts before the nvidia settings are applied.
Does logging out/in "fix" it?

---
If i'm totally wrong and don't understand what you're trying to achieve at all, i'd suggest to raise the MaxFPS rate to prevent additional frame dropping - but recheck above first ;-)
kwriteconfig -file kwinrc -group Compositing -key MaxFPS 200
Comment 14 Saygın Bakşi 2011-04-11 21:29:15 UTC
It is an nvidia driver OpenGL Vsync issue. After some additional investigation, whether the problem occurs depends on the situation:

1) If we enable Vsync in nvidia-settings from the OpenGL tab, then enable kwin compositing, the framerate for desktop effects is very low. Not smooth at all. However, movies have no tearing issues. (Tested with VLC, with the output set to many different options)

2) If we start kwin, then enable Vsync in nvidia-settings from the OpenGL tab, the effects are smooth again, and the videos have no issues either.

3) With no Vsync in nvidia-settings from the OpenGL tab, effects are smooth, but videos have tearing issues.

I think changes to the Vsync option in the nvidia-settings OpenGL tab only affect newly opened applications, so kwin has to be started before this option is enabled, I guess. I have no other ideas. So, from now on, I disable vsync in nvidia-settings from the OpenGL tab, then start kwin, then enable it if I want to play videos.
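
The ordering described here (vsync off, start kwin, vsync back on) could be scripted roughly as follows. This is a sketch under assumptions: `SyncToVBlank` is assumed to be the nvidia-settings attribute behind the OpenGL tab checkbox, the 5 second pause is an arbitrary guess at how long kwin needs to come up, and `vsync_workaround_commands` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the workaround as a list of commands, in the order that matters:
# driver vsync off, (re)start kwin, wait a moment, driver vsync on again.
# Assumptions: the SyncToVBlank attribute name and the 5 s delay.
vsync_workaround_commands() {
    cat <<'EOF'
nvidia-settings -a SyncToVBlank=0
kwin --replace &
sleep 5
nvidia-settings -a SyncToVBlank=1
EOF
}
```

Printing the commands keeps the sketch inspectable; `vsync_workaround_commands | sh` would actually run them.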
Comment 15 Thomas Lübking 2011-04-11 23:48:00 UTC
> I think the changes in the Vsync option in nvidia-settings from OpenGL tab only
> effects the newly opened applications
yes.

> So, from now on, I disable vsync in nvidia-settings from OpenGL tab then start kwin, then 
> enable it if I want to play videos.
Just start vlc with the VSYNC override exported "__GL_SYNC_TO_VBLANK=1 vlc" (alter the program menu entry or similar, "env" is your friend)
However, I still fail to believe that this can have any impact unless you use the GLX or SDL (in case SDL hooks on GLX) video output *shrug*
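
A per-program override along these lines can be wrapped in a tiny helper. Only the `__GL_SYNC_TO_VBLANK` variable and the use of `env` come from this thread; `with_gl_vsync` is a hypothetical name chosen for illustration.

```shell
#!/bin/sh
# Run one program with the nvidia OpenGL vsync override set to 1 (on)
# or 0 (off), leaving the global nvidia-settings value untouched.
# Usage: with_gl_vsync <0|1> <program> [args...]
with_gl_vsync() {
    value=$1
    shift
    env __GL_SYNC_TO_VBLANK="$value" "$@"
}
```

`with_gl_vsync 1 vlc` corresponds to the command above, and `with_gl_vsync 0 kwin --replace &` to the kwin launch suggested in comment 13.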
Comment 16 Saygın Bakşi 2011-04-14 00:11:48 UTC
Used the "__GL_SYNC_TO_VBLANK=1 vlc" command and it works like a charm, with one exception: if I go into fullscreen mode, the tearing comes back. If it is windowed (maximized or not), there is no tearing at all... Any suggestions? Just a thought: if we can enable sync before running an application, maybe we should enable Vsync globally but disable it while starting kwin? What do you think?
Comment 17 Thomas Lübking 2011-04-14 01:01:10 UTC
This /very/ strongly suggests that you're NOT using the OpenGL video output of vlc.

Reason:
while NOT in fullscreen, vlc is composited and KWin syncs the window to the screen refresh rate.
In fullscreen mode the window will (by default, this can be changed, see the advanced tab) be unredirected to improve performance. (It bypasses the compositor and paints directly to the screen)
At this point only the client's (vlc) or driver's (__GL_SYNC_TO_VBLANK) ability to sync will apply. But for __GL_SYNC_TO_VBLANK to have any impact, vlc must actually render to opengl.
If it renders to xv (default, i think) nvidia should also be able to sync xv or vdpau (but it maybe only works correctly with one of the xv adapters)
You however need to enable this sync'ing in nvidia-settings, "X Server XVideo Settings", /before/ running the application (i don't think there's an environment variable)
Comment 18 Saygın Bakşi 2011-04-14 21:17:52 UTC
VLC XV output gives the best result when the nvidia XVideo Vsync option is enabled.

If you are interested in more: with nvidia OpenGL Vsync disabled, I set vlc to the glx output (not the default this time) and used the __GL_SYNC_TO_VBLANK=1 parameter. This time the low fps problem (like with the minimize effect) occurs when windowed, while fullscreen is great. If I open vlc without the __GL_SYNC_TO_VBLANK option, fullscreen is again ok, but windowed has tearing rather than the low fps problem. Well, I guess XV output is good enough, and I am bored of bugging you with my problems, so I am going to stop sending messages to this bug report. However, I think that nvidia openGL Vsync should be the same as Kwins, they shouldn't cause problems to others. So, I hope if there is a better way, you will find it.
Comment 19 Thomas Lübking 2011-04-14 23:55:16 UTC
(In reply to comment #18)

> Well, I guess XV output is good enough
Actually it's the preferred output and afaik the only one for accelerated video playback (vdpau) - the opengl output won't make things any faster and actually is more expensive (unless you tweak settings as possible in eg. mplayer - no idea about vlc)

> However, I think that
> nvidia openGL Vsync should be the same as Kwins, they shouldn't cause problems
> to others. So, I hope if there is a better way, you will find it.

The problem here is that if you ask nvidia to globally sync AND kwin to sync, these are two sync calls - and a sync call is nothing but: waiting for the next vertical refresh trigger.

Ideally we would (but ONLY for nvidia since that doesn't work on any other GPU!) use the global sync env parameter but that's not that easy since:
a) you couldn't change the vsync setting w/o restarting kwin
b) you MUST be using glXSwapBuffers, which is a pretty bad idea because
b1) one will hardly shrink the desktop to XGA because one's GPU RAM is not fast enough to refresh WUXGA in time. (so this might require more powerful GPUs and likely suck more battery)
b2) one has to ensure the validity of the offscreen buffer, which means to either repaint the entire screen (expensive) or ensure the buffer is always copied (ie. no flipping...)

However the GLES backend will require swapping anyway - just that nvidia does so far not (officially) support GLES ;-)