Bug 367766

Summary: Compositing won't start with Nvidia 370.23 drivers on Quadro card.
Product: [Plasma] kwin
Component: compositing
Version: 5.7.3
Platform: Archlinux
OS: Linux
Status: RESOLVED UPSTREAM
Severity: normal
Priority: NOR
Target Milestone: ---
Reporter: Dmitri <dkour>
Assignee: KWin default assignee <kwin-bugs-null>
CC: andrej, Jason, jchevarley
Flags: mgraesslin: NVIDIA+
URL: https://phabricator.kde.org/D2744
Latest Commit:
Version Fixed In:

Description Dmitri 2016-08-24 15:30:24 UTC
The latest update to the Nvidia drivers (370.23) seems to have broken compositing on my laptop with a Quadro card (NVIDIA Corporation GK104GLM [Quadro K4100M]). My desktop with a GTX1080 seems to be unaffected by this issue.

BIOS has been configured to only use the Nvidia card.

Note that at least one other Arch user has seen the same issue: https://bugs.archlinux.org/task/50500

Reproducible: Always

Steps to Reproduce:
1. Log in to KDE with an Nvidia Quadro card.
2. Note that the compositor is not running.
3. Use Alt+Shift+F12 to try to turn on the compositor. The screen sometimes visibly flickers, but compositing is not enabled.
4. Try to turn on compositing in System Settings. Compositing still does not enable if any of the OpenGL options are selected.

Actual Results:  
Compositing does not turn on if using any setting other than XRender

Expected Results:  
Compositing should work. Worked with previous driver (367.35).

When trying to enable compositing, the following is dumped into .xsession-errors:

OpenGL vendor string:                   NVIDIA Corporation
OpenGL renderer string:                 Quadro K4100M/PCIe/SSE2
OpenGL version string:                  3.1.0 NVIDIA 370.23
OpenGL shading language version string: 1.40 NVIDIA via Cg compiler
Driver:                                 NVIDIA
Driver version:                         370.23
GPU class:                              Unknown
OpenGL version:                         3.1
GLSL version:                           1.40
X server version:                       1.18.4
Linux kernel version:                   4.7.2
Requires strict binding:                no
GLSL shaders:                           yes
Texture NPOT support:                   yes
Virtual Machine:                        no
Comment 1 Joe 2016-08-24 15:34:23 UTC
I have the same issue with my Quadro M1000M. I created this bug in Arch: https://bugs.kde.org/show_bug.cgi?id=367766
Comment 2 Martin Flöser 2016-08-24 18:34:05 UTC
> Compositing should work. Worked with previous driver (367.35).

So do I understand correctly that this is a driver regression? Then you should report it to NVIDIA. Sorry, but if our code didn't change, it's hardly a problem we can solve.

From the detection it looks like everything is fine - everything is detected correctly.

Nevertheless, thanks for making us aware of the problem; we might need to tell people about it...
Comment 3 Martin Flöser 2016-08-25 06:42:56 UTC
Adding some useful debug output which teo just provided me:
% kwin_x11 --replace
QXcbConnection: XCB error: 3 (BadWindow), sequence: 3651, resource id: 10485848, major code: 18 (ChangeProperty), minor code: 0
OpenGL vendor string:                   NVIDIA Corporation
OpenGL renderer string:                 Quadro M2000M/PCIe/SSE2
OpenGL version string:                  3.1.0 NVIDIA 370.23
OpenGL shading language version string: 1.40 NVIDIA via Cg compiler
Driver:                                 NVIDIA
Driver version:                         370.23
GPU class:                              Unknown
OpenGL version:                         3.1
GLSL version:                           1.40
X server version:                       1.18.4
Linux kernel version:                   4.7.1
Requires strict binding:                no
GLSL shaders:                           yes
Texture NPOT support:                   yes
Virtual Machine:                        no
Pixel was QVector4D(0, 0, 0, 0) expected QVector4D(0, 1, 0, 1)
Pixel was QVector4D(0, 0, 0, 0) expected QVector4D(0, 1, 0, 1)
Pixel was QVector4D(0, 0, 0, 0) expected QVector4D(0, 1, 0, 1)
Pixel was QVector4D(0, 0, 0, 0) expected QVector4D(0, 1, 0, 1)
kwin_core: ShaderManager self test failed
kwin_core: Failed to initialize compositing, compositing disabled


We can see that it hits the shader self test failure.
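
For anyone triaging a similar report, the failure signature in the output above can be picked out of a log mechanically. The following is a minimal Python sketch and not part of KWin; the names `parse_vec` and `summarize_self_test` are made up for illustration, and the only thing assumed about the log format is the three kinds of lines quoted above.

```python
import re

# Matches lines such as:
#   Pixel was QVector4D(0, 0, 0, 0) expected QVector4D(0, 1, 0, 1)
PIXEL_RE = re.compile(
    r"Pixel was QVector4D\(([^)]*)\) expected QVector4D\(([^)]*)\)"
)


def parse_vec(text):
    """Turn '0, 1, 0, 1' into a tuple of floats."""
    return tuple(float(part) for part in text.split(","))


def summarize_self_test(log_text):
    """Collect the self test symptoms found in a chunk of log text."""
    mismatches = [
        (parse_vec(m.group(1)), parse_vec(m.group(2)))
        for m in PIXEL_RE.finditer(log_text)
    ]
    return {
        "self_test_failed": "ShaderManager self test failed" in log_text,
        "compositing_disabled": "Failed to initialize compositing" in log_text,
        "pixel_mismatches": mismatches,
    }


# Lines copied from the debug output in this comment.
sample = """\
Pixel was QVector4D(0, 0, 0, 0) expected QVector4D(0, 1, 0, 1)
Pixel was QVector4D(0, 0, 0, 0) expected QVector4D(0, 1, 0, 1)
kwin_core: ShaderManager self test failed
kwin_core: Failed to initialize compositing, compositing disabled
"""

summary = summarize_self_test(sample)
print(summary["self_test_failed"], len(summary["pixel_mismatches"]))
```

In the output above every read-back pixel is all zeros where green was expected, which is what the summary makes easy to spot across several reports.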
Comment 4 Martin Flöser 2016-09-12 09:17:54 UTC
Workaround for the issue at https://phabricator.kde.org/D2744
Comment 5 Martin Flöser 2016-09-12 11:13:28 UTC
Git commit e9e936b6c1bde338fb51ea3ada0897ad70055c44 by Martin Gräßlin.
Committed on 12/09/2016 at 11:13.
Pushed by graesslin into branch 'Plasma/5.7'.

[kwinglutils] Skip ShaderManager::selfTest for NVIDIA Quadro hardware

Summary:
The self test fails with NVIDIA 370.23 or newer on Quadro hardware.
Most likely there is a bug in our code as the same things work later on.
But without the hardware we are not able to reproduce and investigate
properly. Given that all we currently can do is to skip the self test.

We encourage users to investigate this properly and to help us to
identify the root issue, so that we can fix it.

Reviewers: #kwin

Subscribers: kwin

Tags: #kwin

Differential Revision: https://phabricator.kde.org/D2744

M  +4    -0    libkwineffects/kwinglutils.cpp

http://commits.kde.org/kwin/e9e936b6c1bde338fb51ea3ada0897ad70055c44
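
The change itself is four lines in libkwineffects/kwinglutils.cpp and is not reproduced here. As a rough illustration of the idea only, the decision described in the commit message might look like the Python sketch below. The function name `should_skip_self_test` and the substring checks are assumptions for illustration, not KWin's actual code or API; the inputs are the OpenGL vendor and renderer strings of the kind printed in the debug output earlier in this report.

```python
def should_skip_self_test(vendor: str, renderer: str) -> bool:
    """Return True when the startup self test would be bypassed.

    Assumption: NVIDIA is recognised from the vendor string and Quadro
    hardware from the substring "Quadro" in the renderer string.
    """
    is_nvidia = "nvidia" in vendor.lower()
    is_quadro = "quadro" in renderer.lower()
    return is_nvidia and is_quadro


# Renderer strings taken from the logs in this report.
print(should_skip_self_test("NVIDIA Corporation", "Quadro K4100M/PCIe/SSE2"))
print(should_skip_self_test("NVIDIA Corporation", "Quadro M2000M/PCIe/SSE2"))

# A hypothetical non-Quadro renderer string, for contrast.
print(should_skip_self_test("NVIDIA Corporation", "GeForce GTX 1080/PCIe/SSE2"))
```

Note that a check like this hides the symptom rather than fixing the cause, which matches the commit message's description of it as all that can currently be done.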
Comment 6 Jason A. Donenfeld 2016-09-12 11:23:37 UTC
> The self test fails with NVIDIA 370.23 or newer on Quadro hardware.
> Most likely there is a bug in our code as the same things work later on.
> But without the hardware we are not able to reproduce and investigate
> properly. Given that all we currently can do is to skip the self test.

I'm happy to help you fix this bug for real. Ping me on IRC -- #zx2c4 -- and we can run various tests.
Comment 7 Joe 2016-09-16 06:33:23 UTC
Just wanted to chime in - working now on 5.7.5 and the 5.8 beta. Also, let me know if you need any more help debugging for the "real" fix.
Comment 8 Martin Flöser 2016-09-16 07:18:39 UTC
> Also, let me know if you need any more help debugging for the "real" fix.

Sure, we would love to figure out what the real problem is. For that we need someone with an affected system and slight knowledge of OpenGL. Something is going wrong when rendering the first frame - some state being wrong, incorrect fencing, something like that. So it needs someone to go through the rendering of the first frame to figure that out.

Unfortunately, creating patches and passing them to someone else for testing doesn't get us anywhere - we tried that already.
Comment 9 Joe 2016-09-20 21:00:45 UTC
So, I have an affected system, but not the slightest knowledge of OpenGL, unfortunately - last time I came close to that was messing around with SDL 1.2 years ago.
Comment 10 Andrej 2016-10-10 00:35:46 UTC
In my case KWin doesn't even seem to recognize the presence of the nvidia driver.

I have an NVS 510 card with the 370.28 driver. The driver loads fine but KWin uses llvmpipe. This is happening with KF 5.8.0; it was fine before.

Not sure whether it's related to this bug or whether I should open another one. More detailed report at https://bbs.archlinux.org/viewtopic.php?id=218038
Comment 11 Martin Flöser 2016-10-10 05:52:00 UTC
@Andrej: if llvmpipe is loaded it means that the driver doesn't work. That's a setup problem and completely unrelated to the problem described in this bug report.
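
A quick way to tell the two situations apart is to look at the OpenGL renderer string, such as the "OpenGL renderer string" line in the logs above. The following is a minimal Python sketch, assuming the software rasterizer identifies itself with the substring "llvmpipe"; the function name `is_software_rendered` and the sample llvmpipe string are illustrative assumptions, not output taken from this report.

```python
def is_software_rendered(renderer: str) -> bool:
    """Return True if the renderer string points at the llvmpipe fallback."""
    return "llvmpipe" in renderer.lower()


# Renderer string from the logs in this report: the NVIDIA driver is in use.
print(is_software_rendered("Quadro K4100M/PCIe/SSE2"))

# Hypothetical renderer string of the kind a software fallback might report.
print(is_software_rendered("llvmpipe (LLVM 3.8, 256 bits)"))
```

If the second case applies, the proprietary driver's OpenGL libraries are not being picked up, which is a setup problem and not the self test failure tracked here.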
Comment 12 Andrej 2016-10-10 10:01:04 UTC
@Martin: You're right, it seems a package dependency didn't resolve properly during an upgrade. Thanks, and sorry for the false alarm.