Summary: | server grabbing in ::manage() causes visible stalls in animated clients | ||
---|---|---|---|
Product: | [Plasma] kwin | Reporter: | Michael Marley <michael> |
Component: | core | Assignee: | KWin default assignee <kwin-bugs-null> |
Status: | RESOLVED UPSTREAM | ||
Severity: | normal | CC: | cfeck, gpiez |
Priority: | NOR | ||
Version: | 4.11.5 | ||
Target Milestone: | --- | ||
Platform: | Kubuntu | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: |
Approach on how to improve it
Qt patch |
Description
Michael Marley
2014-01-14 22:15:32 UTC
This is easily visible if glxgears is run in a console with __GL_SYNC_TO_VBLANK=1 set, because it then outputs the exact number of frames rendered. Every time a new window is mapped, glxgears drops exactly 1 frame.

That sounds like mapping a window takes too long and we lose a frame because of that. The situation might have improved with KWin 5.x as we are using xcb and don't require that many roundtrips to the X server any more. Overall it can only be perfectly fixed by using a dedicated compositing rendering thread. That's still quite some work and won't happen any time soon, unfortunately.

I'm not able to reproduce this problem when a new window is mapped, but I see it quite clearly when an application is launched from a launcher, the moment one clicks on the item. This looks like something is blocking (DBus?). Will investigate.

Thanks for investigating! I have tried with kwin 5, but I haven't been able to collect any accurate data because it seems quite jerky in general. (If I run glxgears, it drops all sorts of frames even though I am not doing anything else with the system.) Maybe the GPU on that system is just too old now (an Nvidia 8600m GT).

Some observations: I can get a short freeze in glxgears whenever I start a Qt 5 application. The freeze doesn't happen if I start the same application as Qt 4 (tested with kwrite and qdbusviewer) and also not with gtk applications (tested with inkscape). Furthermore the freeze can be observed with and without compositing and also on openbox.

As far as my investigations show there is no blocking DBus call involved, and I'm also not seeing any xcb_grab_server - Qt performs a server grab during startup, but that's not done on a KDE setup. I couldn't find any further XGrabServer or xcb_grab_server in Qt or KF5 but will continue investigating.

I see this with KWin 4 & KWin 5 - entirely regardless of compositing. It does not happen with openbox.
It hardly matters what kind of window is opened; actually gtk+ seems to cause the biggest stall.
=> The cause is (quite obviously, and I just gave it a quick test) the server grabbing in ::manage(). I do not know why that's there, but it has always been (at least as long as I can remember KWin sources). I could imagine that it was meant to avoid flicker in mapping clients (back then, when every button was a single buffered X11 drawable...) but that's just a random guess.
=> We could disable that for master and see whether any bad things happen. If not, guard it with an env for a release?

ok I think we have multiple problems here ;-) What I see is completely independent of the window manager. Actually I'm surprised that the grab in ::manage() is causing it. I had thought that with compositing it's only during one frame anyway. And yes, I think we could try removing the grab in ::manage.

Another implication is - and that hits the compositor - that the grab causes the uncomposited stall because managing a client takes incredibly long. getIcons() alone lasts ~100ms here, which is the equivalent of 6 frames! We have to shrink those numbers or (because that's probably not entirely possible on X11) inject some compositor updates. Eeewww... :-(

> getIcons() alone lasts ~100ms here
whaaa? QtConcurrent to the rescue or do we consider threads on X11 as too dangerous?
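For reference, the "6 frames" figure for a ~100ms getIcons() call can be reproduced with a few lines of arithmetic. This is only a sketch: the 60 Hz refresh rate is an assumption (the thread does not state it), and framesMissed is a hypothetical helper name, not anything from KWin.

```cpp
#include <cmath>

// Hypothetical helper: number of whole refresh intervals that pass while the
// X server (and therefore the compositor) is blocked for stallMs milliseconds.
// refreshHz defaults to 60, which is an assumption and not stated in the thread.
int framesMissed(double stallMs, double refreshHz = 60.0)
{
    // stallMs / (1000 / refreshHz), written as a product to avoid rounding
    // error from dividing by the inexact frame time 16.666...
    return static_cast<int>(std::floor(stallMs * refreshHz / 1000.0));
}
```

With these assumptions framesMissed(100.0) gives 6, while a ::manage() that takes about 20 msec still costs one frame because it exceeds the roughly 16.7 msec budget of a single 60 Hz frame.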
I just added a QElapsedTimer to manage and for me it's only about ~20 msec overall. Do you have a specific example of an application for which fetching the icons takes so long?

gtk-chtheme (could try other gtk+ apps)
It doesn't carry icons in the window properties, nor does it have a bitmap in WM_HINTS. I can confirm that "a simple Qt4/5 dialog" (aka "virtuality demo") takes considerably less time (though still > 16ms).

> gtk-chtheme (could try other gtk+ apps)
confirmed, took 56 msec for icons, overall 94 msec in manage. It's not that surprising if it takes the fallback paths as that causes roundtrips. Maybe we can improve that...
I did some more measurements and some more investigation on what we do and there are still some roundtrips to X which could be eliminated. Each roundtrip takes around half an msec. So I see potential for maybe around 5 msec.
Created attachment 90425 [details]
Approach on how to improve it
This is an example of how we could further improve ::manage.
I now have ::manage down to about 9 to 12 msec on my system. Remaining big parts (aside from icons) are:
* Client::applyWindowRules() with about 3 msec
* Client::setupCompositing() with about 0.5 msec
* Client::createDecoration() with about 3 msec
* reading WinInfo with about 1.5 msec

There are still a few XLib calls to read properties from ::manage (sizehints, opaque region) which need to be ported as they trigger roundtrips.

Some review requests:
* https://git.reviewboard.kde.org/r/122084/
* https://git.reviewboard.kde.org/r/122085/
* https://git.reviewboard.kde.org/r/122086/
* https://git.reviewboard.kde.org/r/122087/

With those integrated I get the ::manage for the test case gtk-chtheme to < 14 msec with getting icons < 0.4 msec

> Some observations: I can get a short freeze in glxgears whenever I start a Qt 5 application. The freeze doesn't happen if I start the same application as Qt 4
I have seen this the first day I tried a Qt 5 application. Compile a minimal QGuiApplication which immediately returns. When you run it without arguments, the XCB platform plugin is used, and the "strace" output indicates that the X server blocks on two requests for quite a long time (around 100 ms on my slow system). If you use the "minimal" or "offscreen" --platform, then these short freezes are of course not happening.
With "X server freezing" I mean that even the mouse pointer does not move (KWin 4).
Thanks Christoph for confirming my observation. I just ran an application through Xtrace and found the following:
0.002 000:<:009e: 12: RANDR-Request(140,0): QueryVersion major-version=1 minor-version=4
0.002 000:>:009e:32: Reply to QueryVersion: major-version=1 minor-version=4
0.002 000:<:009f: 8: RANDR-Request(140,31): GetOutputPrimary window=0x000000b8
0.002 000:<:00a0: 8: RANDR-Request(140,8): GetScreenResources window=0x000000b8
0.002 000:>:009f:32: Reply to GetOutputPrimary: output=0x00000000
0.189 000:>:00a0:1268: Reply to GetScreenResources:
The request for RANDR::GetScreenResources is quite slow. Will play with Qt to see whether that is related.

yes it's clearly related: I can reproduce the same problem by just running xrandr (freezing glxgears) and it blocks for the same time in RANDR::GetScreenResources.

confirmation from KDE's xrandr experts:
[15:04] <dvratil> mgraesslin: I believe it does, yes. It needs to actually query HW, which I'd say happens in blocking fashion
so for Qt5 apps blocking X on startup there's probably not much to do: it needs to get the information to populate QScreen.

Comment #17 is https://bugreports.qt.io/browse/QTBUG-40207, which has all the information on how to resolve it.

Created attachment 90458 [details]
Qt patch
Martin, sorry if this is off-topic. I fixed qxcbconnection.cpp to use XRandR::GetScreenResourcesCurrent initially, i.e. unless it received an actual update request. Starting Qt5 apps is now 0.2 seconds faster.
But it still hangs for about 0.2 seconds in a second X request, and I cannot figure out how you made xtrace print timings. I used http://anonscm.debian.org/cgit/xtrace/xtrace.git and all output lines start with "000:>:XXX".

Got it, "--help" did not mention "--timestamps".
The second costly request is GetScreenInfo, which unfortunately does not have a *Current variant.
The randrproto.txt has some descriptions about timestamps, which I cannot really follow, but it looks like GetScreenInfo is costly because it does not know we are requesting "current" information, so it effectively does a HW poll again.

Stupid question: how long do they take for you? They're 0.6-0.8ms here each.

GetScreenResources: 0.178 seconds
GetScreenInfo: 0.160 seconds
after patching:
GetScreenResourcesCurrent: 0.000 seconds
GetScreenInfo: 0.160 seconds
How do you get sub-millisecond accuracy? The timestamps here only display three decimal places.

D'ohh - I don't. I thought the stamps were uptime (but you need to pass --monotonic-timestamps) while they're just epoch. So the calls last 60-80ms each.

Git commit 14659a9907ce868af6dc39e39cecc120797e7a2d by Martin Gräßlin.
Committed on 15/01/2015 at 15:31.
Pushed by graesslin into branch 'master'.
Split Client::checkActivities into two parts
REVIEW: 122087
M +21 -2 client.cpp
M +2 -0 client.h
M +2 -1 manage.cpp
http://commits.kde.org/kwin/14659a9907ce868af6dc39e39cecc120797e7a2d

Git commit 7e4307b263ed6850c9a22424c8b58ec8f9422eee by Martin Gräßlin.
Committed on 21/01/2015 at 09:17.
Pushed by graesslin into branch 'master'.
Use new KWindowSystem::icon overload taking a NETWinInfo*
Removes roundtrips to the X-server when reading the icons. Requires kwindowsystem.git as of 6f941a5 (version 5.7).
M +5 -5 client.cpp
M +2 -1 manage.cpp
http://commits.kde.org/kwin/7e4307b263ed6850c9a22424c8b58ec8f9422eee

KWin's ::manage is quite fast nowadays and tries to not do any roundtrips. What remains is the problem of Qt querying xrandr. That's upstream, so setting the bug to upstream.