Bug 380865

Summary: kwin_x11 freezes with 100% CPU when using Desktop Grid with Present Windows and Fill Gaps enabled
Product: [Plasma] kwin
Reporter: Jacob Kauffmann <jacob>
Component: effects-desktop-grid
Assignee: KWin default assignee <kwin-bugs-null>
Status: RESOLVED FIXED
Severity: normal
CC: achilleas.k, kelvie, mabo, marcel.isolt
Priority: NOR
Version: unspecified
Target Milestone: ---
Platform: Arch Linux
OS: Linux
Latest Commit:
Version Fixed In: 5.15.0
Sentry Crash Report:
Attachments: two backtraces on all threads on kwin_x11 while it's frozen, I let it run for a few seconds before taking the second one

Description Jacob Kauffmann 2017-06-05 17:38:59 UTC
When I activate the Desktop Grid effect, ONLY if I have the "Use Present Windows effect to layout the windows" option enabled, there is a chance (maybe 1 in 10) that my system will freeze. I can move the mouse or CTRL-ALT-F2 into a terminal, but I can't click or interact with anything on the desktop. Running top shows that "kwin_x11" is at 100% CPU usage (usually, although on rare occasions it's Xorg that's at 100%.) Running "kill -9 kwin_x11" in the terminal and then using Krunner to run "kwin_x11 --replace" recovers the system, but on those rare occasions where it's Xorg at 100% and not kwin_x11, that trick doesn't work because I can't recover my open windows after killing Xorg.

If I don't kill kwin_x11, the system sometimes recovers itself, but it can take anywhere from 3 to 10 minutes, during which time my laptop's fans run slightly faster than usual.

This issue has been occurring since about November 2016, but I didn't narrow down the cause until fairly recently, when I saw someone else with the same issue on the Arch forums. I find the Present Windows effect to be an essential part of using Desktop Grid, so I'd really like to help resolve the problem.

I'm using a GeForce GTX 1080 (mobile) with the proprietary NVIDIA graphics driver. My processor is an i7-6700K, for what it's worth. I currently have Plasma 5.10.0, Frameworks 5.34.0, Qt 5.8.0, and the 4.11.3-1-ARCH kernel, although this problem has definitely been occurring for at least the past few releases of Plasma. I'm currently using OpenGL 3.1 for my rendering backend, although I've tried 2.0 and the problem still occurred. Scale method is currently at "Accurate", but the problem also occurs when "Smooth" is selected.

I'm happy to provide whatever info from my system would be helpful, just tell me what tools to use and where to get the information from. From my understanding, I may need to re-compile Plasma with debugging enabled due to Arch's packages having debugging disabled by default, so let me know if that is necessary and I will try to get it done.
Comment 1 Kelvie Wong 2017-06-15 00:58:00 UTC
Created attachment 106103 [details]
two backtraces on all threads on kwin_x11 while it's frozen, I let it run for a few seconds before taking the second one
Comment 2 Martin Flöser 2017-06-17 12:56:32 UTC
Unfortunately, there is not much to see in the debug output. It could indicate that we fail to calculate a layout.
Comment 3 Kelvie Wong 2017-06-23 17:26:17 UTC
OK, I did some further digging. Near the end of PresentWindowsEffect::calculateWindowTransformationNatural there is a do-while loop that runs only if "Fill Gaps" is enabled. It seems to keep iterating and does not exit for several minutes, which matches what Mr. Kauffmann described above.

I turned off "Fill Gaps" to fix this, but someone should take a look at the algorithm.
Comment 4 Martin Flöser 2017-07-03 05:11:20 UTC
*** Bug 381934 has been marked as a duplicate of this bug. ***
Comment 5 Martin Flöser 2018-02-19 20:29:29 UTC
*** Bug 390743 has been marked as a duplicate of this bug. ***
Comment 6 Vlad Zahorodnii 2018-10-20 15:40:53 UTC
Git commit 30ad58f559aa0cfc5dba649be387578481e8db32 by Vlad Zagorodniy, on behalf of Erik Kurzinger.
Committed on 20/10/2018 at 15:37.
Pushed by vladz into branch 'master'.

[effects/presentwindows] Avoid potential freeze during fill-gaps

Summary:
When using the natural layout algorithm with the fill-gaps option, a small
error (less than one) is introduced in windows' aspect ratio each time they are
enlarged due to floating-point roundoff.

Currently, the algorithm computes the width and height enlargement factors and
then attempts to enlarge in each of the four possible directions, repeating
until it can't enlarge any windows any further.  Hence, this aspect ratio error
can be multiplied by up to four. Especially for small, long, and narrow
windows, this can result in a total error greater than one by the end of
that loop iteration. If this occurs, on subsequent iterations the height
enlargement factor might then be computed as negative, violating some of the
core assumptions of the algorithm and resulting in the loop iterating endlessly
until one of the window dimensions overflows, freezing the program for up to
several minutes.

To fix this, the height enlargement factor should be re-computed based on the
new width each time the window is enlarged, ensuring the error introduced in
the aspect ratio never exceeds one.
Related: bug 364709, bug 368811

FIXED-IN: 5.15.0

Test Plan:
The most reliable way to reproduce the freeze seems to be to activate the
desktop-grid effect while a tool-tip window is fading in.
Ensure desktop-grid is configured to use present windows, and that present
windows is configured to use the natural layout algorithm with the fill gaps
option selected.

The freeze is still intermittent, but with this method it can usually be
triggered within about 10 tries without this fix.
With the fix applied, the freeze has not been observed.

Reviewers: #kwin, zzag

Reviewed By: #kwin, zzag

Subscribers: graesslin, kwin, zzag

Tags: #kwin

Differential Revision: https://phabricator.kde.org/D16278

M  +15   -3    effects/presentwindows/presentwindows.cpp

https://commits.kde.org/kwin/30ad58f559aa0cfc5dba649be387578481e8db32
Comment 7 Vlad Zahorodnii 2018-10-28 22:08:13 UTC
Git commit 4348cd56834cb17da5aa9d95d16ddc27bf39e0e6 by Vlad Zagorodniy, on behalf of Erik Kurzinger.
Committed on 28/10/2018 at 22:02.
Pushed by vladz into branch 'Plasma/5.12'.

[effects/presentwindows] Avoid potential freeze during fill-gaps

M  +15   -3    effects/presentwindows/presentwindows.cpp

https://commits.kde.org/kwin/4348cd56834cb17da5aa9d95d16ddc27bf39e0e6