Bug 329979 - server grabbing in ::manage() causes visible stalls in animated clients
Summary: server grabbing in ::manage() causes visible stalls in animated clients
Status: RESOLVED UPSTREAM
Alias: None
Product: kwin
Classification: Plasma
Component: core
Version: 4.11.5
Platform: Kubuntu Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KWin default assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-01-14 22:15 UTC by Michael Marley
Modified: 2016-11-03 12:57 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
Approach on how to improve it (2.67 KB, patch)
2015-01-15 13:36 UTC, Martin Flöser
Qt patch (5.40 KB, patch)
2015-01-16 20:11 UTC, Christoph Feck

Description Michael Marley 2014-01-14 22:15:32 UTC
On my Kubuntu 14.04 system (with KDE 4.12.0 and kde-workspace 4.11.5) with an Nvidia graphics card and version 331.38 of the binary blob, all animation drops a frame and jerks when a new window or tooltip appears in kwin.  The jerking includes OpenGL applications, 2D applications, and animations in kwin itself.  (For example, if I launch Konsole from Kickoff, the exit animation for the Kickoff menu jerks when the Konsole window appears.)

Reproducible: Always

Steps to Reproduce:
1. Turn on kwin compositing.
2. Open an application (such as glxgears) that has a smooth animation.
3. Launch another application or perform some other action that causes a new window (including tooltips, etc) to be mapped.
Actual Results:  
The animation drops a frame and jerks.

Expected Results:  
The animation should run smoothly.
Comment 1 Gunther Piez 2014-03-19 12:41:52 UTC
This is easily visible if glxgears is run in a console with __GL_SYNC_TO_VBLANK=1 set, because it then prints the exact number of frames rendered.
Every time a new window is mapped, glxgears drops exactly 1 frame.
Comment 2 Martin Flöser 2015-01-14 09:17:54 UTC
that sounds like mapping a window takes too long and we lose a frame because of that. The situation might have improved with KWin 5.x as we are using xcb and don't require that many roundtrips to the X server any more.

Overall it can only be perfectly fixed by using a dedicated compositing rendering thread. That's still quite some work and won't happen any time soon, unfortunately.

I'm not able to reproduce this problem when a new window is mapped, but I see it quite clearly when an application is launched from a launcher, at the moment one clicks on the item. This looks like something is blocking (DBus?). Will investigate.
Comment 3 Michael Marley 2015-01-14 10:33:53 UTC
Thanks for investigating!

I have tried with kwin 5, but I haven't been able to collect any accurate data because it seems quite jerky in general.  (If I run glxgears, it drops all sorts of frames even though I am not doing anything else with the system.)  Maybe the GPU on that system is just too old now (an Nvidia 8600m GT).
Comment 4 Martin Flöser 2015-01-14 10:59:47 UTC
Some observations: I can get a short freeze in glxgears whenever I start a Qt 5 application. The freeze doesn't happen if I start the same application as Qt 4 (tested with kwrite and qdbusviewer) and also not with gtk applications (tested with inkscape).

Furthermore the freeze can be observed with and without compositing and also on openbox.

As far as my investigations show there is no blocking DBus call involved, also I'm not seeing any xcb_grab_server - Qt performs a server grab during startup, but that's not done on a KDE setup.

I couldn't find any further XGrabServer or xcb_grab_server in Qt or KF5 but will continue investigating.
Comment 5 Thomas Lübking 2015-01-14 15:20:57 UTC
I see this with KWin 4 & KWin 5 - entirely regardless of compositing.
It does not happen with openbox.
It hardly matters what kind of window is opened; gtk+ actually seems to cause the biggest stall.

=> The cause is (quite obviously and I just gave it a quick test) the server grabbing in ::manage()

I do not know why that's there, but it has always been there (at least for as long as I can remember the KWin sources).
I could imagine that it has been to avoid flicker in mapping clients (back then, when every button was a single buffered X11 drawable...) but that's just a random guess.

=> We could disable that for master and see whether any bad things happen.
If not, guard it with an env for a release?
Comment 6 Martin Flöser 2015-01-14 15:30:30 UTC
ok I think we have multiple problems here ;-) What I see is completely independent of the window manager.

Actually I'm surprised that the grab in ::manage() is causing it. I had thought that with compositing it would only last one frame anyway.
Comment 7 Martin Flöser 2015-01-14 15:34:53 UTC
and yes, I think we could try removing the grab in ::manage
Comment 8 Thomas Lübking 2015-01-15 01:25:53 UTC
Another implication is - and that hits the compositor - that the grab causes the uncomposited stall because managing a client takes incredibly long.
getIcons() alone takes ~100 ms here, which is the equivalent of 6 frames!

We have to shrink those numbers or (because that's probably not entirely possible on X11) inject some compositor updates. Eeewww... :-(
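
For reference, the "6 frames" figure follows from the frame budget of the display. A minimal sketch of that arithmetic, assuming a 60 Hz refresh rate (the thread does not state the rate; the function names are made up for illustration):

```cpp
#include <cmath>

// Time available to produce one frame, in milliseconds.
// Roughly 16.7 ms at the assumed 60 Hz refresh rate.
double frame_budget_ms(double refresh_hz) {
    return 1000.0 / refresh_hz;
}

// Number of complete frame intervals covered by a stall of the given length.
// A 100 ms stall at 60 Hz covers 6 whole frame intervals.
int whole_frames_spanned(double stall_ms, double refresh_hz) {
    return static_cast<int>(std::floor(stall_ms * refresh_hz / 1000.0));
}
```

By the same arithmetic, anything in ::manage() that holds the server grab for longer than one frame budget is enough to make a vsynced client miss at least one frame.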
Comment 9 Martin Flöser 2015-01-15 08:30:18 UTC
> getIcons() alone lasts ~100ms here

whaaa? QtConcurrent to the rescue or do we consider threads on X11 as too dangerous?
Comment 10 Martin Flöser 2015-01-15 08:55:33 UTC
Just added a QElapsedTimer to ::manage() and for me it's only about 20 msec overall. Do you have a specific example of an application for which fetching the icons takes so long?
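
A per-stage measurement of this kind can be sketched as follows. This is not the instrumentation actually added to KWin: it uses std::chrono::steady_clock as a stand-in for QElapsedTimer so that it compiles without Qt, and the class name StageTimer and the stage names are hypothetical.

```cpp
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Records how long each named stage of a longer function took.
// Hypothetical helper for illustration; not part of KWin or Qt.
class StageTimer {
public:
    using Clock = std::chrono::steady_clock;

    StageTimer() : last_(Clock::now()) {}

    // Store the time elapsed since construction or the previous lap
    // under the given stage name, then restart the measurement.
    void lap(std::string name) {
        const Clock::time_point now = Clock::now();
        const double ms =
            std::chrono::duration<double, std::milli>(now - last_).count();
        laps_.emplace_back(std::move(name), ms);
        last_ = now;
    }

    // Sum of all recorded stages, in milliseconds.
    double total_ms() const {
        double sum = 0.0;
        for (const auto& entry : laps_) {
            sum += entry.second;
        }
        return sum;
    }

    const std::vector<std::pair<std::string, double>>& laps() const {
        return laps_;
    }

private:
    Clock::time_point last_;
    std::vector<std::pair<std::string, double>> laps_;
};
```

Calling lap("icons") right after the icon lookup and lap("rules") after the window rules are applied would yield a breakdown per stage rather than one overall number.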
Comment 11 Thomas Lübking 2015-01-15 10:58:40 UTC
gtk-chtheme (could try other gtk+ apps)
It doesn't carry icons in the window properties, nor does it have a bitmap in WM_HINTS.

I can confirm that "a simple Qt4/5 dialog" (aka "virtuality demo") takes considerably less time (though still > 16ms)
Comment 12 Martin Flöser 2015-01-15 11:09:00 UTC
> gtk-chtheme (could try other gtk+ apps)

confirmed, took 56 msec for icons, overall 94 msec in manage. It's not that surprising if it takes the fallback paths as that causes roundtrips. Maybe we can improve that...

I did some more measurements and some more investigation on what we do and there are still some roundtrips to X which could be eliminated. Each roundtrip takes around half an msec. So I see potential for maybe around 5 msec.
Comment 13 Martin Flöser 2015-01-15 13:36:41 UTC
Created attachment 90425
Approach on how to improve it

This is an example of how we could further improve ::manage.
Comment 14 Martin Flöser 2015-01-15 16:17:02 UTC
I now have ::manage down to about 9 to 12 msec on my system. Remaining big parts (aside from icons) are:
* Client::applyWindowRules() with about 3 msec
* Client::setupCompositing() with about 0.5 msec
* Client::createDecoration() with about 3 msec
* reading WinInfo with about 1.5 msec

There are still a few XLib calls to read properties from ::manage (sizehints, opaque region) which need to be ported as they trigger roundtrips.
Comment 15 Martin Flöser 2015-01-16 12:06:47 UTC
Some review requests:
* https://git.reviewboard.kde.org/r/122084/
* https://git.reviewboard.kde.org/r/122085/
* https://git.reviewboard.kde.org/r/122086/
* https://git.reviewboard.kde.org/r/122087/

With those integrated I get ::manage for the test case gtk-chtheme down to < 14 msec, with fetching the icons taking < 0.4 msec.
Comment 16 Christoph Feck 2015-01-16 13:10:41 UTC
> Some observations: I can get a short freeze in glxgears whenever I start a Qt 5 application. The freeze doesn't happen if I start the same application as Qt 4

I have seen this the first day I tried a Qt 5 application. Compile a minimal QGuiApplication which immediately returns. When you run it without arguments, the XCB platform plugin is used, and the "strace" output indicates that the X server blocks on two requests for quite a long time (around 100 ms on my slow system). If you use the "minimal" or "offscreen" --platform, then these short freezes are of course not happening.

By "X server freezing" I mean that even the mouse pointer does not move (KWin 4).
Comment 17 Martin Flöser 2015-01-16 13:50:47 UTC
Thanks Christoph for confirming my observation. I just ran an application through xtrace and found the following:

0.002 000:<:009e: 12: RANDR-Request(140,0): QueryVersion major-version=1 minor-version=4
0.002 000:>:009e:32: Reply to QueryVersion: major-version=1 minor-version=4
0.002 000:<:009f:  8: RANDR-Request(140,31): GetOutputPrimary window=0x000000b8
0.002 000:<:00a0:  8: RANDR-Request(140,8): GetScreenResources window=0x000000b8
0.002 000:>:009f:32: Reply to GetOutputPrimary: output=0x00000000
0.189 000:>:00a0:1268: Reply to GetScreenResources:

The request for RANDR::GetScreenResources is quite slow. Will play with Qt to check whether that is related.
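
Slow requests like this can be picked out of a timestamped trace automatically. The sketch below assumes exactly the line layout shown in the excerpt above (timestamp, then connection:direction:sequence, with "<" for requests and ">" for replies); other xtrace line types such as events are not handled, and the names ReplyLatency and reply_latencies are made up for illustration.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

struct ReplyLatency {
    std::string sequence;  // hex sequence number as printed in the trace
    double seconds;        // reply timestamp minus request timestamp
};

// Pair each reply line with the request line that carries the same
// sequence number and report the time between them.
std::vector<ReplyLatency> reply_latencies(const std::string& log) {
    std::map<std::string, double> sent_at;
    std::vector<ReplyLatency> result;
    std::istringstream lines(log);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        double timestamp = 0.0;
        std::string tag;  // e.g. "000:<:009e:" or "000:>:00a0:1268:"
        if (!(fields >> timestamp >> tag)) {
            continue;  // not a timestamped trace line
        }
        std::istringstream parts(tag);
        std::string connection, direction, sequence;
        if (!std::getline(parts, connection, ':') ||
            !std::getline(parts, direction, ':') ||
            !std::getline(parts, sequence, ':')) {
            continue;  // tag does not have the assumed layout
        }
        if (direction == "<") {
            sent_at[sequence] = timestamp;
        } else if (direction == ">") {
            const auto request = sent_at.find(sequence);
            if (request != sent_at.end()) {
                result.push_back({sequence, timestamp - request->second});
                sent_at.erase(request);
            }
        }
    }
    return result;
}
```

Fed the six lines above, this reports three replies, with the one for sequence 00a0 (GetScreenResources) taking about 0.187 seconds while the other two come back within the same millisecond.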
Comment 18 Martin Flöser 2015-01-16 13:55:10 UTC
yes it's clearly related: I can reproduce the same problem with just running xrandr (freezing glxgears) and it blocks for the same time in RANDR::GetScreenResources.
Comment 19 Martin Flöser 2015-01-16 14:15:58 UTC
confirmation from KDE's xrandr experts:
[15:04] <dvratil> mgraesslin: I believe it does, yes. It needs to actually query HW, which I'd say happens in blocking fashion

so for Qt5 apps blocking X on startup there's probably not much to do: it needs to get the information to populate QScreen.
Comment 20 Christoph Feck 2015-01-16 17:35:49 UTC
Comment #17 is https://bugreports.qt.io/browse/QTBUG-40207, which has all the information on how to resolve it.
Comment 21 Christoph Feck 2015-01-16 20:11:48 UTC
Created attachment 90458
Qt patch

Martin, sorry if this is off-topic.

I fixed qxcbconnection.cpp to use XRandR::GetScreenResourcesCurrent initially, i.e. unless it received an actual update request. Starting Qt5 apps is now 0.2 seconds faster.

But still it hangs for about 0.2 seconds in a second X request, and I cannot figure out how you made xtrace print timings. I used http://anonscm.debian.org/cgit/xtrace/xtrace.git and all output lines start with "000:>:XXX".
Comment 22 Christoph Feck 2015-01-16 20:31:26 UTC
Got it, "--help" did not mention "--timestamps".

The second costly request is GetScreenInfo, which unfortunately does not have a *Current variant.

The randrproto.txt has some descriptions about timestamps, which I cannot really follow, but it looks like GetScreenInfo is costly, because it does not know we are requesting "current" information, so effectively does a HW poll again.
Comment 23 Thomas Lübking 2015-01-16 20:39:47 UTC
Stupid question: how long do they take for you?
They're 0.6-0.8ms here each.
Comment 24 Christoph Feck 2015-01-16 20:54:44 UTC
GetScreenResources: 0.178 seconds
GetScreenInfo: 0.160 seconds

after patching:

GetScreenResourcesCurrent: 0.000 seconds
GetScreenInfo: 0.160 seconds

How do you get sub-milliseconds accuracy? The timestamps here only display three decimal places.
Comment 25 Thomas Lübking 2015-01-16 21:18:57 UTC
D'ohh - I don't.

I thought the stamps were uptime (but for that you need to pass --monotonic-timestamps) while they're just epoch. So the calls last 60-80 ms each.
Comment 26 Martin Flöser 2015-01-21 08:26:19 UTC
Git commit 14659a9907ce868af6dc39e39cecc120797e7a2d by Martin Gräßlin.
Committed on 15/01/2015 at 15:31.
Pushed by graesslin into branch 'master'.

Split Client::checkActivities into two parts

REVIEW: 122087

M  +21   -2    client.cpp
M  +2    -0    client.h
M  +2    -1    manage.cpp

http://commits.kde.org/kwin/14659a9907ce868af6dc39e39cecc120797e7a2d
Comment 27 Martin Flöser 2015-01-21 09:19:48 UTC
Git commit 7e4307b263ed6850c9a22424c8b58ec8f9422eee by Martin Gräßlin.
Committed on 21/01/2015 at 09:17.
Pushed by graesslin into branch 'master'.

Use new KWindowSystem::icon overload taking a NETWinInfo*

Removes roundtrips to the X-server when reading the icons.

Requires kwindowsystem.git as of 6f941a5 (version 5.7).

M  +5    -5    client.cpp
M  +2    -1    manage.cpp

http://commits.kde.org/kwin/7e4307b263ed6850c9a22424c8b58ec8f9422eee
Comment 28 Martin Flöser 2016-11-03 12:57:29 UTC
KWin's ::manage is quite fast nowadays and tries to not do any roundtrips. What remains is the problem of Qt querying xrandr. That's upstream, so setting the bug to upstream.