Bug 315114

Summary: KWin fails to initialize OpenGL ES on Mali-604T
Product: [Plasma] kwin Reporter: Vladimir <civil.over>
Component: scene-openglAssignee: KWin default assignee <kwin-bugs-null>
Status: RESOLVED FIXED    
Severity: normal    
Priority: NOR    
Version: 4.10.0   
Target Milestone: 4.10.1   
Platform: Gentoo Packages   
OS: Linux   
Latest Commit: Version Fixed In:
Attachments: kwin-4.10.0-egl.patch
kwin-4.10.0-egl.patch
Possible patch
kwin-4.10.0-egl.patch

Description Vladimir 2013-02-14 07:58:47 UTC
kwin fails to initialize OpenGL ES backed with "query surface failed" error. Kwin 4.9.5 works fine.

Reproducible: Always

Steps to Reproduce:
1. Start KDE on Mali 604T
Actual Results:  
No working composite avaliable

Expected Results:  
Working composite.

Problem seems to be in this code:
    if (eglQuerySurface(dpy, surface, EGL_POST_SUB_BUFFER_SUPPORTED_NV, &surfaceHasSubPost) == EGL_FALSE) {
        kError(1212) << "query surface failed";
        return false;
    }

If you look in http://www.khronos.org/registry/egl/extensions/NV/EGL_NV_post_sub_buffer.txt, EGL_FALSE means that mali604 does not support eglPostSubBufferNV and eglSwapBuffers should be used instead. KWIN 4.9 ignores return code, but in 4.10 behavior changed and now it's fatal error, tough kwin seems to support eglSwapBuffers. If you comment out this check, everything works (but segfaults inside libGLES, but it's problem of vendor's libGL, not of kwin).

Changes was introduced in kde-workspace commit 4606e3b82d4f4a3f2478390e26cc135854841f8f.
Comment 1 Vladimir 2013-02-14 08:00:15 UTC
Created attachment 77277 [details]
kwin-4.10.0-egl.patch

AFAIK if check for EGL_POST_SUB_BUFFER_SUPPORTED_NV fails, it shouldn't be fatal error, so this part of code should be reverted to it's state before commit.
Comment 2 Vladimir 2013-02-14 08:10:53 UTC
Created attachment 77279 [details]
kwin-4.10.0-egl.patch

Sorry, first patch was malformed
Comment 3 Martin Flöser 2013-02-14 08:15:21 UTC
(In reply to comment #0)
> If you look in
> http://www.khronos.org/registry/egl/extensions/NV/EGL_NV_post_sub_buffer.txt,
> EGL_FALSE means that mali604 does not support eglPostSubBufferNV and
> eglSwapBuffers should be used instead.
You missread this section. It states that you should add that paragraph to the spec on page 37. This page describes how the value passed to eglQuerySurface is updated. That means if it is not present value should be set to EGL_FALSE not the return value of this function.

For this most important is the last paragraph of chapter 3.5.6:

"eqlQuerySurface returns EGL_FALSE on failure and value is not updated. If attribute is not a valid EGL surface attribute, then an EGL_BAD_ATTRIBUTE error is generated. If surface is not a valid EGLSurface then an EGL_BAD_SURFACE error is generated."

Now assuming the driver does not know about the extension (we are pre-context creation so we cannot check for extensions) it will return EGL_FALSE and set an EGL_BAD_ATTRIBUTE error. Given that it sets an error we would block the compositor anyway as we break on error (see eglonxbackend.cpp line 143).

The proper solution must therefore to check whether the call generated a EGL_BAD_ATTRIBUTE and then set surfaceHasSubPost to false and continue. In case of other error we still need to break.

That said: I'm of course unable to test this as I'm lacking hardware.
Comment 4 Martin Flöser 2013-02-14 08:21:44 UTC
Created attachment 77281 [details]
Possible patch

Please try this patch whether it solves the problem. It includes the changes I described in the comment.
Comment 5 Vladimir 2013-02-14 09:12:00 UTC
Created attachment 77285 [details]
kwin-4.10.0-egl.patch

AFAIK patch is not correct.
1) surfaceHasSubPost is already set to 0 in constructor (line 32), so no need to explicitly set it to false
2) EGL_FALSE is 0, so why set it to EGL_FALSE? surfaceHasSubPost is used internally, to choose what function to use (eglSwapBuffer or eglPostSubBufferNV).
3) It'll be strange if eglQuerySurface returns EGL_FALSE, but eglGetError is EGL_SUCCESS. Is it possible at all?

I think it's better just to change if statmenet to check if eglGetError is not EGL_BAD_ATTRIBUTE.

So I've attached patch that I suggest. With it, composition works for me. I can try yours too, but later today, if it's needed.
Comment 6 Martin Flöser 2013-02-14 09:53:56 UTC
After working around driver bugs for 5 years I do not trust drivers any more. So I expect the worst. Reasoning behind the checks below.

(In reply to comment #5)
> AFAIK patch is not correct.
> 1) surfaceHasSubPost is already set to 0 in constructor (line 32), so no
> need to explicitly set it to false
maybe the driver changed the value (even if it is not supposed to do so)? Doing the explicit setting does not hurt.
> 3) It'll be strange if eglQuerySurface returns EGL_FALSE, but eglGetError is
> EGL_SUCCESS. Is it possible at all?
According to the spec: there should be an error set. But maybe the driver did not set it?
> So I've attached patch that I suggest. With it, composition works for me. I
> can try yours too, but later today, if it's needed.
Please give it a try. I prefer to have here a little bit more code and to assume the worst than to assume that it will work :-)
Comment 7 Vladimir 2013-02-14 19:53:55 UTC
Yes, it works for me.
Comment 8 Martin Flöser 2013-02-20 07:06:07 UTC
Git commit df22c649eb92aa4af8bccf0431bdff7a7152e2bd by Martin Gräßlin.
Committed on 20/02/2013 at 08:03.
Pushed by graesslin into branch 'KDE/4.10'.

Do not abort EGL backend creation if EGL_NV_post_sub_buffer isn't supported

In case the extension is not present eglQuerySurface returns EGL_FALSE
when querying for EGL_POST_SUB_BUFFER_SUPPORTED_NV and sets an
EGL_BAD_ATTRIBUTE error state. If this is the case it's not an error we
should abort on, but one we should ignore as it's the same as extension
not supported.
FIXED: 4.10.1

M  +7    -2    kwin/eglonxbackend.cpp

http://commits.kde.org/kde-workspace/df22c649eb92aa4af8bccf0431bdff7a7152e2bd