Bug 253903 - kwin compositing really slow when several windows are opened
Summary: kwin compositing really slow when several windows are opened
Status: RESOLVED DUPLICATE of bug 183680
Alias: None
Product: kwin
Classification: Plasma
Component: compositing
Version: unspecified
Platform: Arch Linux Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KWin default assignee
Reported: 2010-10-12 00:21 UTC by Martin Stolpe
Modified: 2011-12-22 10:25 UTC
CC List: 4 users

Attachments
oprofile when dragging a windows (278.41 KB, image/svg+xml)
2010-10-12 00:21 UTC, Martin Stolpe
output of glxinfo (23.54 KB, text/plain)
2010-12-25 20:37 UTC, Martin Stolpe
xrestop output for OpenGL mode (3.82 KB, text/plain)
2010-12-25 20:43 UTC, Martin Stolpe
xrestop output for XRender mode (3.76 KB, text/plain)
2010-12-25 20:44 UTC, Martin Stolpe
Use GLPlatform to detect NPOT support (913 bytes, patch)
2010-12-26 19:13 UTC, Martin Flöser
two outputs of sysprof (446.68 KB, application/octet-stream)
2011-02-10 23:21 UTC, Olivier Lacroix
output of cachegrind when moving windows around (323.41 KB, application/octet-stream)
2011-02-10 23:29 UTC, Olivier Lacroix
memcheck output with enough windows opened to make kwin slooow (33.21 KB, application/octet-stream)
2011-02-12 16:13 UTC, Olivier Lacroix

Description Martin Stolpe 2010-10-12 00:21:17 UTC
Created attachment 52429 [details]
oprofile when dragging a windows

Version:           unspecified
OS:                Linux

When only one window is open, kwin works fine. But when I open several windows and then, for example, try to drag one of them, kwin becomes really slow. After switching from the OpenGL to the XRender backend it is fast (but there are graphical artifacts with drop-down scroll lists, which are painted in the background).

I've attached an oprofile log taken while dragging a window, but I can't see anything interesting there.

I'm using version 4.5.2 of the KDE SC. The card is a X1400 which uses the mesa gallium driver.

Reproducible: Always

Steps to Reproduce:
Open a few windows and drag one window.
Comment 1 Thomas Lübking 2010-10-12 00:52:36 UTC
do you use the blur plugin? wobbly windows?
Comment 2 Martin Stolpe 2010-10-12 09:14:54 UTC
All plugins are disabled. I use the following settings:
 -Improved window management, Shadows, Various animations are all disabled
 -Effect for window switching: No Effect
 -Effect for desktop switching: No Effect
 -Animation speed: Instant

 -Compositing type: OpenGL
 -Keep window thumbnails: Only for Shown Windows

 -OpenGL mode: Texture From Pixmap
 -Texture filter: Bilinear
 -Enable direct rendering: enabled
 -Use VSync: disabled

 -Smooth scaling (slower): disabled
Comment 3 Martin Stolpe 2010-10-12 09:17:43 UTC
As an additional info: The slowdown becomes really apparent when roughly 10 windows or more are opened.
Comment 4 Thomas Lübking 2010-10-12 15:33:56 UTC
does the cpu load raise accordingly?
how much memory does your GPU have?
does the window type matter? (say: konsole vs. xterm)
is it related to the used decoration/size? (try to use "only titlebar in eg. oxygen or bespin)
Comment 5 Martin Stolpe 2010-10-14 11:49:17 UTC
(In reply to comment #4)
> does the cpu load raise accordingly?
no, cpu load for kwin stays at roughly 10%

> how much memory does your GPU have?
128 MB

> does the window type matter? (say: konsole vs. xterm)
no, it doesn't matter. Dragging is lagging with only windows and it gets worse the more windows I have open.

> is it related to the used decoration/size? (try to use "only titlebar in eg.
> oxygen or bespin)
no, it doesn't seem to be related to the decoration. I've tried several decorations. The strange thing is that after playing with the decorations the effect isn't as drastic any more when I open several windows. But it's still too slow to be usable (almost a slide show).
Comment 6 Marius Bjørnstad 2010-12-09 11:56:32 UTC
I think I have this problem too. I have an NVidia NVS 3100 GPU with 256 MB. 

1) If I open 12 konsole windows, I can resize them, and everything is reasonably fast.
2) If I close those windows and open 12 new ones, then when I try to resize one of them to almost the full screen size, the screen locks up for about 10 seconds. The system is still running in the background; sound is playing fine.
3) If I repeat the process again, the same problem appears.
4) If I only have a few konsole windows open, the problem goes away.

I can only use the "nvidia" driver, "nouveau" and "nv" are unusable in different ways.

Hope someone can look into this ;) These kinds of things are why I moved away from Windows. Really annoying!
Comment 7 Marius Bjørnstad 2010-12-09 12:00:06 UTC
(In reply to comment #6)
> I think I have this problem too. I have an NVidia NVS 3100 GPU with 256 MB. 

Sorry, different problem, Xrender does not fix it for me.  Everything just becomes horribly slow. Please delete the comments, I will submit a different bug.
Comment 8 Martin Stolpe 2010-12-11 11:06:43 UTC
Possible duplicate of https://bugs.kde.org/show_bug.cgi?id=256654
Comment 9 Thomas Lübking 2010-12-11 17:31:43 UTC
from 256654:
"The problem goes away when downgrading kwin to 4.5.2 or reversing the commit(s)
at http://websvn.kde.org/?view=revision&revision=1189360"

while OP says:
"I'm using version 4.5.2 of the KDE SC. The card is a X1400 which uses the mesa
gallium driver."

-> unlikely a dupe :-(
comment #6 points to a leak (but on nvidia), and from my personal experience #7 is hard to believe (XRender is meanwhile partially faster than OpenGL here, on vanilla 4.5.4 and 260.19.21)
Comment 10 Martin Stolpe 2010-12-25 15:21:58 UTC
I just realized that when moving a window the size of the window has a huge impact on the performance.

When compositing is enabled and the window has a size of, say, 320*200, the movement of the window when dragging it around is instantaneous. When the size of the window gets bigger, say 600*1000, the movement of the window becomes really sluggish.
Comment 11 Thomas Lübking 2010-12-25 17:17:44 UTC
sorry, but size related -> data amount -> very likely hardware limitation :-(

you probably just run out of GPU RAM (yes, this can happen quite fast*, esp. with GL compositing (texture in the GL client + offscreen pixmap on the X11 server), pixmap-prone KDE/GNOME icons/UI styles, etc.) and the data needs to be transferred to the system RAM :-(

try to shutdown the desktop ("kquitapp plasma-desktop") and run "kcmshell4 style" to set another style (try "CDE", looks ugly - i know ;-) and check whether you can have more windows open before the issue occurs (maybe logout/in after the style change)

you can use "xrestop" to monitor the current amount of pixmap data on the desktop, and of course _every_ window (docks, the desktop, etc.) requires (24|32)*w*h/8 bytes on the GL texture.

for more info you'll have to contact the ati/gallium driver devs about how the memory is split between GL & X11 (there used to be static splits) and whether it can be monitored.
also they can *maybe* improve the mapping strategy.

you can probably spare pixmap memory by using the raster engine for Qt (the internal "pixmaps" are then allocated on the system RAM, won't hurt that much)
also set "keep window thumbnails" to "never" so they're discarded for minimized windows and such on other virtual desktops

*
128MB = 128*1024*1024*8 bits
assuming you've got a 1440x900px display, NO argb window (all plasma panels & kicker, krunner, etc. are...) and the driver doesn't allocate 32bit textures unconditionally
128*1024*1024*8/(24*1440*900) = 34.5
ie. you could have 34.5 fullscreen windows, BUT:
Gtk+ & Qt keep internal offscreen buffers for double buffering (at least window size) and you need 1 copy as offscreen pixmap for compositing and then the texture
-> 34.5/3 = 11 windows à 1440x900px would be a /very/ optimistic calculation to fit in your video RAM
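
The estimate above can be reproduced with a small sketch. It only restates the arithmetic from this comment; the function name and parameters are made up for illustration, and the factor of 3 copies per window (toolkit buffer, offscreen pixmap, GL texture) is the optimistic assumption stated above, not a measured value.

```python
def fullscreen_windows_in_vram(vram_mib, width, height,
                               bits_per_pixel=24, copies_per_window=3):
    """Return (single_copy_count, window_count) for the estimate above.

    copies_per_window=3 models the toolkit's offscreen buffer, the
    offscreen pixmap used for compositing, and the GL texture.
    """
    vram_bits = vram_mib * 1024 * 1024 * 8
    bits_per_copy = bits_per_pixel * width * height
    single_copy_count = vram_bits / bits_per_copy
    window_count = int(single_copy_count // copies_per_window)
    return single_copy_count, window_count


if __name__ == "__main__":
    copies, windows = fullscreen_windows_in_vram(128, 1440, 900)
    print(f"{copies:.1f} single copies, about {windows} windows")
    # Same display if the driver allocates 32-bit textures instead.
    print(fullscreen_windows_in_vram(128, 1440, 900, bits_per_pixel=32))
```

Passing bits_per_pixel=32 shows how quickly the budget shrinks if the driver allocates 32bit textures unconditionally.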

----

can you elaborate on the xrender artefacts? (eg. screenshot, does it happen randomly or always, etc.) - that's another bug, but if this is hardware limited, it cannot be "fixed" :-(
Comment 12 Martin Stolpe 2010-12-25 18:55:17 UTC
(In reply to comment #11)
> sorry, but size related -> data amount -> very likely hardware limitation :-(
Hm, I'm not so sure. Gallium got into a state where one can play with the driver a while ago. I have an ATI 3650 with 512 MB of RAM and I have the same problems there.


> try to shutdown the desktop ("kquitapp plasma-desktop") and run "kcmshell4
> style" to set another style (try "CDE", looks ugly - i know ;-) and check
> whether you can have more windows open before the issue occurs (maybe logout/in
> after the style change)
Unfortunately this doesn't help.

> you can use "xrestop" to monitor the current amount of pixmap data on the
> desktop, and of course _every_ window (docks, the desktop, etc.) requires
> (24|32)*w*h/8 bytes on the GL texture.
I have uploaded the outputs of xrestop when using CDE and when the "Compositing type" is set to XRender and to OpenGL

> you can probably spare pixmap memory by using the raster engine for Qt (the
> internal "pixmaps" are then allocated on the system RAM, won't hurt that much)
I'm already using the raster engine for Qt.

> also set "keep window thumbnails" to "never" so they're discarded for 
> minimized windows and such on other virtual desktops
This option doesn't have any impact on the perceived speed.

> can you elaborate on the xrender artefacts? (eg. screenshot, does it happen
> randomly or always, etc.) - that's another bug, but if this is hardware
> limited, it cannot be "fixed" :-(
XRender seems to be OK now. At least there were no obvious artefacts when I was using this mode just now.

XRender is still *a lot* faster than the OpenGL mode.
Comment 13 Thomas Lübking 2010-12-25 20:25:13 UTC
Ok, then my guess would be that the gallium/ati driver claims to support GL_ARB_texture_non_power_of_two (please post glxinfo) but, well, "supports" it in a more theoretical way =P

Can you test a window with a precise and fixed size of eg. 1024x1024 pixels?
(remove the deco and use a rule to fix the size)
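
As a rough illustration of why a 1024x1024 window is a useful test case: at power-of-two dimensions an NPOT and a POT texture are the same size, so any NPOT-specific slow path should not be triggered. The sketch below computes how much larger a texture gets if each dimension is rounded up to the next power of two. This is only one possible cost model; the thread does not say how the driver actually handles NPOT textures, and the function names are hypothetical.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("texture dimensions must be positive")
    return 1 << (n - 1).bit_length()


def padded_texture_overhead(width, height):
    """Return (padded_width, padded_height, size_ratio) for POT padding."""
    padded_w = next_power_of_two(width)
    padded_h = next_power_of_two(height)
    ratio = (padded_w * padded_h) / (width * height)
    return padded_w, padded_h, ratio


if __name__ == "__main__":
    # Window sizes mentioned in this report.
    for size in [(320, 200), (600, 1000), (1024, 1024), (1440, 900)]:
        print(size, padded_texture_overhead(*size))
```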
Comment 14 Martin Stolpe 2010-12-25 20:37:45 UTC
Created attachment 55237 [details]
output of glxinfo

I don't see a way to disable windows decorations in Systemsettings -> Workspace Appearance -> Window Decorations.
Can you give me instructions on how to disable window decorations and how to set a fixed window size?
Comment 15 Martin Stolpe 2010-12-25 20:43:31 UTC
Created attachment 55238 [details]
xrestop output for OpenGL mode
Comment 16 Martin Stolpe 2010-12-25 20:44:07 UTC
Created attachment 55239 [details]
xrestop output for XRender mode
Comment 17 Thomas Lübking 2010-12-25 20:48:58 UTC
driver claims nonpot support.

run "kcmshell4 kwinrules", click "new...", click "detect window properties", pick a window.
enter the "geometry" tab, check "size", select "force", enter "1024x1024"
enter the "preferences" tab, check "no border", select "force", check the trailing checkbox
press "ok" -> "apply" (in the main dialog); the window should change, if in doubt when it is opened the next time.

the window can be moved w/o titlebar by pressing "alt" and leftclicking somewhere in it - then just move ;-)

you can just delete the rule (and press "apply") after the test to get rid of such a rather nasty rule.
Comment 18 Martin Flöser 2010-12-25 21:16:56 UTC
We need to use GLPlatform for "driver supports nonpot". We know better even if 
the driver thinks to support it ;-)
Comment 19 Martin Flöser 2010-12-26 19:13:35 UTC
Created attachment 55263 [details]
Use GLPlatform to detect NPOT support

Please try this patch. It makes KWin use GLPlatform to decide whether NPOT is supported. This should make kwin behave correctly even if the driver is reporting incorrect data.
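
The idea of letting platform knowledge override what the driver advertises can be sketched as follows. This is not the attached patch and not the GLPlatform API; the function name, the deny-list, and its single entry are hypothetical and only illustrate the decision logic described in this comment.

```python
NPOT_EXTENSION = "GL_ARB_texture_non_power_of_two"

# Hypothetical deny-list of drivers whose advertised NPOT support is not
# trusted. The entry is illustrative, not taken from KWin.
UNTRUSTED_NPOT_DRIVERS = frozenset({"r300"})


def use_npot_textures(advertised_extensions, driver_name,
                      untrusted=UNTRUSTED_NPOT_DRIVERS):
    """Use NPOT textures only if advertised AND the driver is trusted."""
    advertised = NPOT_EXTENSION in advertised_extensions
    return advertised and driver_name not in untrusted


if __name__ == "__main__":
    extensions = {NPOT_EXTENSION}
    print(use_npot_textures(extensions, "r300"))      # advertised, untrusted
    print(use_npot_textures(extensions, "other"))     # advertised, trusted
    print(use_npot_textures(set(), "other"))          # not advertised
```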
Comment 20 Martin Stolpe 2010-12-26 21:59:54 UTC
This patch definitely improved the OpenGL performance. But the performance overall is still slow.

As a side note: I had to download kwinglplatform.cpp, kwinglplatform.h from SVN as they were missing in kdebase-workspace-4.5.4.tar.bz2 and otherwise compilation would fail with an error. I had also to download CMakeLists.txt so that the compilation ran successfully. Don't know why the packager of my distro didn't need those files.

Is there anything I should report to the mesa guys considering NPOT?
Comment 21 Martin Flöser 2010-12-26 22:10:35 UTC
> As a side note: I had to download kwinglplatform.cpp, kwinglplatform.h from
> SVN as they were missing in kdebase-workspace-4.5.4.tar.bz2 and otherwise
> compilation would fail with an error. I had also to download
> CMakeLists.txt so that the compilation ran successfully. Don't know why
> the packager of my distro didn't need those files.
The files are new in 4.6. So it is expected that they are not in 4.5.4. This 
probably also means that it is not executed correctly. The method 
GLPlatform::detect() needs to be executed. If possible try it again with 4.6 
and the patch.
> 
> Is there anything I should report to the mesa guys considering NPOT?
to not emulate things. If the hardware does not support a feature that's fine. 
We check for the extensions, but we have serious problems if they start to 
"support" the extension.
Comment 22 Martin Stolpe 2010-12-26 22:39:06 UTC
As it won't be too long before 4.6 will be released I'm going to wait for the final version and then try again and report back.
Comment 23 Martin Stolpe 2010-12-27 00:15:28 UTC
I've posted a question at phoronix as there are regularly some devs: http://phoronix.com/forums/showthread.php?p=163264#post163264

It could be that r300 generation hardware doesn't support mipmapped npot textures: http://phoronix.com/forums/showpost.php?p=163264&postcount=2
Comment 24 Martin Stolpe 2010-12-27 10:41:46 UTC
I have to correct myself: When using the compositing effects on my computer with the Radeon HD3650, OpenGL doesn't work at all (I'm using r600g, which isn't really mature yet); when I choose XRender as backend, compositing works fine (can't use the blur effect, but that's all I have to complain about so far).

Sorry for claiming that compositing is also slow on 3650.
Comment 25 Olivier Lacroix 2011-01-29 16:59:14 UTC
Hello,

I seem to be affected by a similar issue. At least the symptoms are the same.

Since the upgrade to KDE SC 4.6 from 4.5.5, Kwin with desktop effects got much slower.

I am running archlinux (kernel 2.6.37 and Gallium3D) on an ATI X600 card.

The more opened windows, the worse the performance. With no window opened, the framerate is about 60 fps, and it drops to about 5-10 fps as soon as 3 windows are opened.
Comment 26 Thomas Lübking 2011-01-30 01:06:02 UTC
maybe related to bug #256654 & commit http://websvn.kde.org/?view=revision&revision=1189360 resp. the "reversion" in http://websvn.kde.org/?revision=1215519&view=revision ?
Comment 27 Olivier Lacroix 2011-01-30 18:05:58 UTC
Thanks for your answer. 

Reverting 1189360 leads to no change at all. 

Removing the check against r300G as in 1215519 feels a bit better but freezes the computer in about 5 sec.

Is there any debugging procedure I should follow to help pinpoint the issue ?
Comment 28 Thomas Lübking 2011-01-30 22:22:42 UTC
yes - first of all, disable all effect plugins. see whether things remain slow.
if not, re-enable* them one by one to figure the culprit.

this should dump you a list of your active effects:
grep -iE 'kwin4_effect_.*Enabled=true' `kde4-config --path config | cut -d":" -f1`/kwinrc | sed -e 's/kwin4_effect_//g; s/Enabled=true//g'
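
For anyone who prefers to inspect the file by hand, the same filtering can be sketched in Python. It mirrors the grep/sed pattern above on kwinrc text that has already been read into a string; the function name is made up, and the sample content is invented for illustration rather than copied from a real kwinrc.

```python
import re

# Same pattern as the grep call above: kwin4_effect_<name>Enabled=true
_EFFECT_PATTERN = re.compile(r"kwin4_effect_(.*)Enabled=true", re.IGNORECASE)


def enabled_effects(kwinrc_text):
    """Return the names of effects marked Enabled=true in kwinrc text."""
    names = []
    for line in kwinrc_text.splitlines():
        match = _EFFECT_PATTERN.search(line)
        if match:
            names.append(match.group(1))
    return names


if __name__ == "__main__":
    # Invented sample; a real kwinrc will contain other keys as well.
    sample = """[Plugins]
kwin4_effect_blurEnabled=true
kwin4_effect_wobblywindowsEnabled=false
kwin4_effect_shadowEnabled=true
"""
    print(enabled_effects(sample))
```

An empty list means no effect plugin is enabled in the given text.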
Comment 29 Martin Stolpe 2011-01-30 23:51:14 UTC
I don't have the problem with the computer freezing, but even with the patch and KDE SC 4.6, kwin with compositing is still too slow to be usable. But I'm not giving up hope yet. Perhaps the switch to OpenGL ES will bring some improvements on the performance side.
Comment 30 Olivier Lacroix 2011-01-31 20:35:12 UTC
Thanks Thomas. Unfortunately, disabling all effects leads to the same slowness. Your oneliner confirmed that all effects were disabled.

Scrolling is also really slow, if that can help you.

Would profiling help you figure that out ? I saw in bug #256654 that the output of sysprof was asked for...

Thanks
Comment 31 Thomas Lübking 2011-01-31 22:26:18 UTC
maybe - the interesting part of those profiles seems to be that the claimed to be faster one has more CPU load on kwin, so the CPU is probably not the bottleneck (but the memory) so cachegrind would be the tool of choice (but running kwin in valgrind is painful...)

Attaching Fredrik, since he probably knows more about ati specific pitfalls  :-)
Comment 32 Olivier Lacroix 2011-02-10 19:55:09 UTC
Since I never really used valgrind & co, could you give me the exact command to get you the most useful information possible?
Comment 33 Thomas Lübking 2011-02-10 21:43:53 UTC
First of all you should use sysprof to verify that your CPU usage drops when things get slow.
Otherwise you experience an issue with the CPU power and can spare the not so funny valgrind run.

What you need to know about valgrind (open as many windows as you need to cross the magic slowness border before doing the below)
   
   valgrind kwin --replace
   #runs memcheck - do some slow stuff now

   valgrind --tool=cachegrind kwin --replace 
   #runs cachegrind - do some slow stuff now

NOTICE:
----------
a) no matter how fast your computer is, you can go and fetch a cup of coffee after calling each - also your system will run _really_ slow afterwards ;-)
b) pressing "ctrl+c" will leave you w/o WM, you better call "kwin --replace" from eg. krunner (this will kill the valgrind session, but doesn't harm) - however global shortcuts and some other stuff will likely fail. just re-run kwin once more then ;-)
c) the output goes to memcheck.out.<pid> resp. cachegrind.out.<pid> in the current directory, you better ensure to have write permission there - or just wasted some time =)
d) The resulting files will likely be LARGE, please compress them.
Comment 34 Olivier Lacroix 2011-02-10 23:21:36 UTC
Created attachment 57131 [details]
two outputs of sysprof

One output of sysprof when all windows are minimized ("fast", ie 30 fps) and one when the same windows are all maximized (slow, ie 3-4fps)
Comment 35 Olivier Lacroix 2011-02-10 23:29:07 UTC
Created attachment 57132 [details]
output of cachegrind when moving windows around
Comment 36 Olivier Lacroix 2011-02-10 23:36:55 UTC
Valgrind does not output any file. On launching "kwin --replace" the valgrind process ends with


==28410== 
==28410== HEAP SUMMARY:
==28410==     in use at exit: 3,159,462 bytes in 26,757 blocks
==28410==   total heap usage: 602,353 allocs, 575,596 frees, 140,463,280 bytes allocated
==28410== 
==28410== LEAK SUMMARY:
==28410==    definitely lost: 22,194 bytes in 276 blocks
==28410==    indirectly lost: 141,958 bytes in 793 blocks
==28410==      possibly lost: 126,812 bytes in 3,464 blocks
==28410==    still reachable: 2,868,498 bytes in 22,224 blocks
==28410==         suppressed: 0 bytes in 0 blocks
==28410== Rerun with --leak-check=full to see details of leaked memory
==28410== 
==28410== For counts of detected and suppressed errors, rerun with: -v
==28410== Use --track-origins=yes to see where uninitialised values come from
==28410== ERROR SUMMARY: 107 errors from 34 contexts (suppressed: 302 from 14)
Comment 37 Thomas Lübking 2011-02-11 00:11:13 UTC
sorry, memcheck actually prints to ~/kwin.memcheck by default - my bad :S
Comment 38 Olivier Lacroix 2011-02-11 19:58:20 UTC
I actually don't get that file out of valgrind either ...
Comment 39 Olivier Lacroix 2011-02-12 16:13:32 UTC
Created attachment 57188 [details]
memcheck output with enough windows opened to make kwin slooow

output of running valgrind --log-file=memcheck.out kwin --replace
Comment 40 Martin Flöser 2011-12-22 10:25:43 UTC

*** This bug has been marked as a duplicate of bug 183680 ***