Bug 318212 - KWin gles broken with today's commits (11. April)
Summary: KWin gles broken with today's commits (11. April)
Status: RESOLVED UPSTREAM
Alias: None
Product: kwin
Classification: Plasma
Component: compositing
Version: git master
Platform: Compiled Sources Linux
Importance: NOR major
Target Milestone: ---
Assignee: KWin default assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-04-11 21:50 UTC by Hrvoje Senjan
Modified: 2013-08-06 12:19 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
KDebug output (24.72 KB, application/octet-stream)
2013-04-11 21:50 UTC, Hrvoje Senjan
Screenshot1 (972.58 KB, image/png)
2013-04-11 21:51 UTC, Hrvoje Senjan
Screenshot2 (1.24 MB, image/png)
2013-04-11 21:52 UTC, Hrvoje Senjan
As per name (7.48 KB, application/octet-stream)
2013-04-11 22:18 UTC, Hrvoje Senjan

Description Hrvoje Senjan 2013-04-11 21:50:06 UTC
After updating to today's KWin, I found that there is a major issue with kwin_gles.
The issue does not happen with regular KWin (though it also appears slower), or with compositing turned off.
I can confirm that it is definitely a KWin regression, as I have a 2-3 day old install of KWin in /opt, and that works fine.

Reproducible: Always

Steps to Reproduce:
1. kwin_gles --replace &
Actual Results:  
Terrible screen flickering, windows disappearing, etc.

Expected Results:  
Should work normally

Will add kdebugs (nothing obvious there) and screenshots
Comment 1 Hrvoje Senjan 2013-04-11 21:50:56 UTC
Created attachment 78826 [details]
KDebug output
Comment 2 Hrvoje Senjan 2013-04-11 21:51:50 UTC
Created attachment 78827 [details]
Screenshot1
Comment 3 Hrvoje Senjan 2013-04-11 21:52:37 UTC
Created attachment 78828 [details]
Screenshot2
Comment 4 Thomas Lübking 2013-04-11 22:11:27 UTC
are you sure about the timeframe? last commit to eglonx has been 2013-03-26

Does the other version print:
kwin(10368) KWin::EglOnXBackend::init: EGL implementation and surface support eglPostSubBufferNV, let's use it
Comment 5 Hrvoje Senjan 2013-04-11 22:18:59 UTC
Created attachment 78830 [details]
As per name

(In reply to comment #4)
> are you sure about the timeframe? last commit to eglonx has been 2013-03-26
> 
> Does the other version print:
> kwin(10368) KWin::EglOnXBackend::init: EGL implementation and surface
> support eglPostSubBufferNV, let's use it

Yes and yes :-)
Attached the output of "good" kwin_gles
Also, noticed some LanczosFilter related commits, so tried disabling it, no difference.
Comment 6 Thomas Lübking 2013-04-11 22:26:11 UTC
Do you get
kwin(10368) KWin::checkGLError: GL error ( setupForOutput-clearErrors ):  "GL_INVALID_VALUE" 
lines on the "good" version either? (the attachment is shortcut)
Comment 7 Hrvoje Senjan 2013-04-11 22:32:15 UTC
(In reply to comment #6)
> Do you get
> kwin(10368) KWin::checkGLError: GL error ( setupForOutput-clearErrors ): 
> "GL_INVALID_VALUE" 
> lines on the "good" version either? (the attachment is shortcut)
Yes, and I have color correction off on both setups (at least, it is my understanding that the error comes from there).
Comment 8 Thomas Lübking 2013-04-11 22:46:55 UTC
Happy bisecting?
If this was on glx only, i'd point a021eac - but on egl??

sure you build the correct clone/branch?
do you apply any local patches?
Comment 9 Hrvoje Senjan 2013-04-12 01:38:05 UTC
(In reply to comment #8)
> Happy bisecting?
> If this was on glx only, i'd point a021eac - but on egl??
Last commit on a working KWin is bea2cb8

> sure you build the correct clone/branch?
> do you apply any local patches?
Same (openSUSE) patches are on both.

Will try the commits one by one to determine the faulty one.
Comment 10 Hrvoje Senjan 2013-04-12 08:57:06 UTC
Will need to do the procedure again, as I couldn't reproduce it with another user. Then I found the faulty combo: gles + VSync (I had exported vblank_mode=0 there). Removing the export triggers the issue, but now I need to bisect again.
Comment 11 Hrvoje Senjan 2013-04-12 09:58:40 UTC
I deeply apologise. The issue comes from Mesa and/or xserver. I was sure it was KWin, but that export fooled me (I guess it was a leftover). Will try to track it down there.

Apologies again.
Comment 12 Hrvoje Senjan 2013-04-12 10:43:05 UTC
Faulty Mesa commit,
3998f8c6b5da1a223926249755e54d8f701f81ab:
egl/x11: Fix initialisation of swap_interval
The EGLConfig attributes EGL_MIN/MAX_SWAP_INTERVAL were incorrectly set to
0 and 0. This prevented clients from setting the swap interval to a
reasonable value, like 1 or 2.

Swap interval worked correctly in Mesa 9.0. The commit below introduced
the bug.

    commit 7e9bd2b2ed35a440a96362417100a7e43715d606
    Author: Eric Anholt <eric@anholt.net>
    Date:   Tue Sep 25 14:05:30 2012 -0700
        egl: Add support for driconf control of swapinterval.

Note: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63078
[chadv: Wrote commit message]
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

Reverting it resolves the problem.
Comment 13 Thomas Lübking 2013-04-12 11:15:53 UTC
@Ralf:
please cross check the commit mentioned in comment #12 with http://quickgit.kde.org/?p=kde-workspace.git&a=commit&h=ee0d463a65e36477d107d8942efc1d300de28636
Comment 14 Ralf Jung 2013-04-12 11:50:11 UTC
It sounds to me as if KWin hits this one
https://bugs.freedesktop.org/show_bug.cgi?id=63435
The commit which fixes the swap interval only uncovers the issue. Previously, the swap interval would be clamped to 0, even if v-sync was enabled. Now it can be set to 1, which triggers the flickering.
Hrvoje, could you try explicitly disabling v-sync in the KWin configuration? That should fix the flickering.
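
To illustrate the clamping described above: per the EGL specification, eglSwapInterval silently clamps the requested interval to the config's EGL_MIN_SWAP_INTERVAL and EGL_MAX_SWAP_INTERVAL. The following is only a minimal sketch of that rule, not Mesa code. The helper name clamp_swap_interval is hypothetical, and the "fixed" maximum of 2 used in the usage note is an assumed value for illustration.

```c
/* Hypothetical helper (not part of Mesa or KWin): models how a requested
 * swap interval is clamped to the range advertised by the EGLConfig
 * (EGL_MIN_SWAP_INTERVAL .. EGL_MAX_SWAP_INTERVAL). */
int clamp_swap_interval(int requested, int min_interval, int max_interval)
{
    if (requested < min_interval)
        return min_interval;   /* below the advertised minimum */
    if (requested > max_interval)
        return max_interval;   /* above the advertised maximum */
    return requested;          /* already inside the range */
}
```

With the broken range of 0..0, clamp_swap_interval(1, 0, 0) yields 0, so v-sync is effectively off even when requested. With a wider range (assumed here to be 0..2), clamp_swap_interval(1, 0, 2) yields 1, which is the case that exposes the flickering.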
Comment 15 Hrvoje Senjan 2013-04-12 11:54:11 UTC
(In reply to comment #14)
> Hrvoje, could you try explicitly disabling v-sync in the Kwin configuration?
> That should fix the flickering.

Yes, as in comment #10, confirming that turning off VSync resolves the problem
Comment 16 Ralf Jung 2013-04-12 11:57:49 UTC
Sorry, I should have read the full backlog first.

Since you seem to compile mesa from source, could you try to do
git revert 1e7776ca2bc59a6978d9b933d23852d47078dfa8
on top of current mesa master, and then run kwin with v-sync again? That fixes the flickering in my test app, so it should also fix it for kwin.
Comment 17 Hrvoje Senjan 2013-04-12 12:18:02 UTC
(In reply to comment #16)
> Sorry, I should have read the full backlog first.
No worries :-)

> Since you seem to compile mesa from source, could you try to do
> git revert 1e7776ca2bc59a6978d9b933d23852d47078dfa8
> on top of current mesa master, and then run kwin with v-sync again? That
> fixes the flickering in my test app, so it should also fix it for kwin.
Confirming. Reverting the "egl: Remove bogus invalidate code" commit on top of plain Mesa master (without your commit reverted) also resolves the problem.
Comment 18 Ralf Jung 2013-04-12 14:11:08 UTC
(In reply to comment #17)
> Confirming. On top of plain Mesa master (without your commit reverted) and
> with reverting "egl: Remove bogus invalidate code" commit, also resolves the
> problem.
Thanks - now let's hope the mesa guys get this fixed before it hits a release.
(changed bug resolution to "upstream" as that's what mesa is for us)
Comment 19 Hrvoje Senjan 2013-05-24 21:39:51 UTC
Ralf, it looks like this ended up in Mesa 9.1.3. Do you know whether KWin from 4.10 is safe from this? (I'm still running Mesa trunk with that commit reverted.)
Comment 20 Ralf Jung 2013-05-25 10:50:42 UTC
Indeed it did :( And upstream doesn't seem to care at all.

This bug affects all EGL full-screen applications with a swap interval > 0. Luckily, KDE 4.10 does not try to enable v-sync for EGL, ignoring the user configuration. So this affects master only.
Comment 21 Alejandro Nova 2013-08-06 02:04:48 UTC
Upstream:

"Fixed in master by:

commit eed0a80137dfac641adfd39ce316938dbcf2be10
Author: Eric Anholt <eric@anholt.net>
Date:   Fri Jun 21 15:34:52 2013 -0700

    egl: Restore "bogus" DRI2 invalidate event code.
    
    I had removed it in commit 1e7776ca2bc59a6978d9b933d23852d47078dfa8
    because it was obviously wrong -- why do we care whether the server is a
    version that emits events, if we're not watching for the server's events,
    anyway?  And why would you only invalidate on a server that emits
    invalidate events, when the comment said to emit invalidates if the server
    *doesn't*?  Only, I missed that we otherwise don't flag that our buffers
    might have changed at swap time at all, so the driver was only checking
    for new buffers when triggered by the Viewport hack.  Of course you don't
    expect Viewport to be called after a swap.
    
    So, this is effectively a revert of the previous commit, except that I
    dropped the check for only emitting invalidates on a new server -- we
    *always* need to invalidate if we're doing a SwapBuffers.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63435
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
    Cc: "9.1 and 9.2" <mesa-stable@lists.freedesktop.org>

Hopefully Carl can merge it to the 9.2 and 9.1 branches soon."

Please, nag all of your distros!
Comment 22 Ralf Jung 2013-08-06 12:19:47 UTC
Mesa 9.1.6 actually also contains the fix already.