Bug 386490 - Crash in Atomic DRM setting
Summary: Crash in Atomic DRM setting
Status: RESOLVED FIXED
Alias: None
Product: kwin
Classification: Plasma
Component: platform-drm (show other bugs)
Version: unspecified
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: KWin default assignee
URL: https://phabricator.kde.org/D8752
Keywords:
Depends on:
Blocks:
 
Reported: 2017-11-03 10:40 UTC by David Edmundson
Modified: 2017-11-14 16:11 UTC (History)
3 users (show)

See Also:
Latest Commit:
Version Fixed In:
kde: Wayland+
kde: X11-
mgraesslin: ReviewRequest+


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description David Edmundson 2017-11-03 10:40:41 UTC
(reported by Fabian, posting on bugzilla, as I need input from Roman)

#0  KWin::DrmPlane::atomicPopulate (this=0x5555558bb3b0, req=0x55555629f2e0) at /usr/src/debug/kwin-5.11.90git.20171102T195914~77b5c3caa/plugins/platforms/drm/drm_object_plane.cpp:133
#1  0x00007fffdce6ef63 in KWin::DrmOutput::doAtomicCommit (this=this@entry=0x5555558cd1a0, mode=mode@entry=KWin::DrmOutput::AtomicCommitMode::Test)
    at /usr/src/debug/kwin-5.11.90git.20171102T195914~77b5c3caa/plugins/platforms/drm/drm_output.cpp:955
#2  0x00007fffdce70609 in KWin::DrmOutput::presentAtomically (this=0x5555558cd1a0, buffer=<optimized out>) at /usr/src/debug/kwin-5.11.90git.20171102T195914~77b5c3caa/plugins/platforms/drm/drm_output.cpp:832
#3  0x00007fffdce62c5c in KWin::DrmBackend::present (this=0x5555557e9610, buffer=0x55555617fe40, output=<optimized out>) at /usr/src/debug/kwin-5.11.90git.20171102T195914~77b5c3caa/plugins/platforms/drm/drm_backend.cpp:576
#4  0x00007fffdce761bd in KWin::EglGbmBackend::presentOnOutput (this=0x5555558c5ec0, o=...) at /usr/src/debug/kwin-5.11.90git.20171102T195914~77b5c3caa/plugins/platforms/drm/egl_gbm_backend.cpp:268
#5  0x00007fffdce77458 in KWin::EglGbmBackend::endRenderingFrameForScreen (this=0x5555558c5ec0, screenId=0, renderedRegion=..., damagedRegion=...)
    at /usr/src/debug/kwin-5.11.90git.20171102T195914~77b5c3caa/plugins/platforms/drm/egl_gbm_backend.cpp:338
#6  0x00007fffd69dad99 in KWin::SceneOpenGL::paint (this=this@entry=0x5555558750d0, damage=..., toplevels=...) at /usr/src/debug/kwin-5.11.90git.20171102T195914~77b5c3caa/plugins/scenes/opengl/scene_opengl.cpp:681
#7  0x00007ffff7a57662 in KWin::Compositor::performCompositing (this=this@entry=0x55555587ab10) at /usr/src/debug/kwin-5.11.90git.20171102T195914~77b5c3caa/composite.cpp:740
#8  0x00007ffff7a57dee in KWin::Compositor::startupWithWorkspace (this=0x55555587ab10) at /usr/src/debug/kwin-5.11.90git.20171102T195914~77b5c3caa/composite.cpp:351
#9  0x00007ffff5a11fec in QMetaObject::activate(QObject*, int, int, void**) () from /usr/lib64/libQt5Core.so.5
#10 0x0000555555560447 in KWin::ApplicationWayland::continueStartupWithX (this=0x7fffffffdf40) at /usr/src/debug/kwin-5.11.90git.20171102T195914~77b5c3caa/main_wayland.cpp:258
#11 0x00007ffff5a128d2 in QObject::event(QEvent*) () from /usr/lib64/libQt5Core.so.5


Crash is (in hindsight) obvious.


drm_object.cpp:44
init_props()
    for (unsigned int i = 0; i < properties->count_props; ++i) {
drmModePropertyRes *prop = drmModeGetProperty(m_backend->fd(), properties->props[i]);
        if (!prop) {
            continue;
        }
        if (prop->name == m_propsNames[n]) {
            qCDebug(KWIN_DRM).nospace() << m_id << ": " << prop->name << "' (id " << prop->prop_id
                              << "): " << properties->prop_values[i];
            m_props[n] = new Property(prop, properties->prop_values[i], enumNames);
        }
}


we don't always populate m_prop[i] with a valid object if our names don't match.

yet remaining code does:

for() {
m_props[i]->something

everywhere. Both in the trace and in a bunch of other places.

I can guard them all for being null. Roman, is that the right approach?
Comment 1 Roman Gilg 2017-11-03 14:06:25 UTC
The problem seems to be, that not all drivers export the newly queried "rotation" property to user space in contrast to the atomic mode setting propertiers, which are expected to all get exported to my knowledge.

The "rotation" property isn't atomic, so it might make sense to instead of from now on guarding all atomic properties that we just don't use the Property class for rotation and instead introduce a local variable in DrmPlane for the rotation property.
Comment 2 David Edmundson 2017-11-03 14:53:11 UTC
That still leaves an inconsistency where you're guarding the insert in initProps but not guarding access.

If you're gonna do that, you should error on initProps if one of the other props doesn't exist.
Comment 3 Roman Gilg 2017-11-03 20:02:17 UTC
Yes, that's a good idea. We could then decide to not use this plane/crtc/connector.

That said, my assumption is that for an atomic mode property this should never happen. But since we would do this check only once at the beginning of runtime I'm fine with it. Or should we assert? I mean if my assumption is correct it would be a critical driver bug. Maybe don't hide it.
Comment 4 Martin Flöser 2017-11-03 22:15:56 UTC
I would say all access needs guarding. My interpretation of https://01.org/linuxgraphics/gfx-docs/drm/drm-kms-properties.html is that the properties are not mandatory. For our usage rotation is clearly optional so we don't need to assert on it or report it as an error. We should consider it optional.
Comment 5 Fabian Vogt 2017-11-05 10:53:29 UTC
I got kwin_wayland to work in openQA again, I had to revert the following commits:

    Revert "[drm] Implement changing of modes"
    This reverts commit 33a4cf405071f2c74977a68c5b332dde4cea5853.

    Revert "[platforms/drm] Fix typo in cleanup of eglSurface"
    This reverts commit 7739dea33c2233b1fbbd49f52fc6263bfc2e2f0c

    Revert "[platforms/drm] Add support for rotation property on the Plane"
    This reverts commit 77b5c3caa9534862a057ea76f26bd094aaaff56e

    Revert "[platforms/drm] Rotate screen if requested from KScreen"
    This reverts commit b2d8bbec8157f9f8fbe70ace473162437dee5676

    Revert "[platforms/drm] Restore previous mode if an atomic test fails"
    This reverts commit 26cdfd317fc87b024ebcd9c2adefc86cfd59504e

    Revert "[platforms/drm] Properly adjust cursor position on a rotated output"
    This reverts commit c06c234778f8c9ad3c5ba708097240a7a88d8a62

The revert of b2d8bbec8157f9f8fbe70ace473162437dee5676 is what fixes the crash originally reported here.
However, even with that revert it did not work:

The behaviour I experienced while trying to figure out what's wrong exactly:
- dbus-launch startplasmacompositor (or just using sddm) resulted in ksplashqml starting, but after a few seconds kwin_wayland crashed
- The crash is not related to the reason that broke it, kwin_wayland exits for an invisible reason (nothing in the log) and crashes in ~DrmOutput.
- Running kwin_wayland with konsole/kwrite/anything else worked fine

So my guess is that kscreen's requests for rotation somehow cause the DRM backend to give up.
Comment 6 Martin Flöser 2017-11-05 20:13:54 UTC
As a workaround you could also 
export KWIN_DRM_NO_AMS=1
Comment 7 Bhushan Shah 2017-11-09 11:18:49 UTC
A data point,

The KWIN_DRM_NO_AMS property is not totally useful, if I export it, the kwin "starts" but it is not functional : see the following qemu screenshot : https://i.imgur.com/Gjy2cy4.jpg

It basically doesn't get screen resolution correctly and I can't interact with the kwin/plasmashell.
Comment 8 Martin Flöser 2017-11-10 19:40:56 UTC
Possible patch at: https://phabricator.kde.org/D8752
Comment 9 Fabian Vogt 2017-11-11 10:49:59 UTC
(In reply to Martin Flöser from comment #8)
> Possible patch at: https://phabricator.kde.org/D8752

I tried this patch and kwin still crashes in QEMU, but at a different place now at least:

#0  KWin::DrmOutput::pixelSize (this=0x0) at /usr/src/debug/kwin-5.11.90git.20171108T174438~9df174483/plugins/platforms/drm/drm_output.cpp:162
#1  0x00007fffdcccfca9 in KWin::DrmCrtc::blank (this=0x555555889a10)
    at /usr/src/debug/kwin-5.11.90git.20171108T174438~9df174483/plugins/platforms/drm/drm_object_crtc.cpp:85
#2  0x00007fffdccd1398 in KWin::DrmOutput::~DrmOutput (this=0x555555843240, __in_chrg=<optimized out>)
    at /usr/src/debug/kwin-5.11.90git.20171108T174438~9df174483/plugins/platforms/drm/drm_output.cpp:68
#3  0x00007fffdccd1699 in KWin::DrmOutput::~DrmOutput (this=0x555555843240, __in_chrg=<optimized out>)
    at /usr/src/debug/kwin-5.11.90git.20171108T174438~9df174483/plugins/platforms/drm/drm_output.cpp:87
#4  0x00007fffdccc9912 in KWin::DrmBackend::updateOutputs (this=this@entry=0x5555557cb500)
    at /usr/src/debug/kwin-5.11.90git.20171108T174438~9df174483/plugins/platforms/drm/drm_backend.cpp:460
#5  0x00007fffdccca81d in KWin::DrmBackend::openDrm (this=0x5555557cb500)
    at /usr/src/debug/kwin-5.11.90git.20171108T174438~9df174483/plugins/platforms/drm/drm_backend.cpp:326
#6  0x00007ffff5a0ffec in QMetaObject::activate(QObject*, int, int, void**) () from /usr/lib64/libQt5Core.so.5
#7  0x00007ffff7b1c9e2 in KWin::LogindIntegration::hasSessionControlChanged (this=<optimized out>, _t1=<optimized out>, _t1@entry=true)
    at /usr/src/debug/kwin-5.11.90git.20171108T174438~9df174483/build/kwin_autogen/EWIEGA46WW/moc_logind.cpp:188

At this point `export KWIN_DRM_NO_AMS=1` helps though and the session starts up. Unfortunately kwin_wayland gives up just after these messages:

kscreen.kwayland: Loading Wayland backend.
kscreen.kded: PowerDevil SuspendSession action not available!
kscreen.kwayland: Loading Wayland backend.
kwin_wayland: Couldn't find current GLX or EGL context.

and crashes somewhere in the destructors of Qt's global objects. That's exactly the same state as Comment 5 though.
Comment 10 Martin Flöser 2017-11-11 11:36:39 UTC
Thanks for testing. Looks like init on DrmOutput fails. I'll have a look.
Comment 11 Martin Flöser 2017-11-13 20:20:00 UTC
Git commit c601e875cfa7d673f1567c71087036631e42d272 by Martin Flöser.
Committed on 13/11/2017 at 20:19.
Pushed by graesslin into branch 'master'.

[platforms/drm] At safety checks for the properties

Summary:
The AMS code accesses elements in a vector which might not be valid. This
change refactors the code to be more robust, especially the DrmPlane,
which started to crash after adding transformation support.

Reviewers: #kwin, #plasma, fvogt, subdiff

Subscribers: plasma-devel, kwin

Tags: #kwin

Differential Revision: https://phabricator.kde.org/D8752

M  +14   -5    plugins/platforms/drm/drm_object.cpp
M  +17   -14   plugins/platforms/drm/drm_object.h
M  +2    -2    plugins/platforms/drm/drm_object_connector.cpp
M  +5    -2    plugins/platforms/drm/drm_object_crtc.cpp
M  +24   -9    plugins/platforms/drm/drm_object_plane.cpp
M  +3    -0    plugins/platforms/drm/egl_gbm_backend.cpp

https://commits.kde.org/kwin/c601e875cfa7d673f1567c71087036631e42d272
Comment 12 Bhushan Shah 2017-11-14 06:25:20 UTC
Can I re-open this bug, the commit above fixes the issue of crash, but right after starting plasma kwin_wayland is hung, and there is no mouse pointer as well, so it's still impossible to interact with Plasma.. or should I report separate bug?
Comment 13 Martin Flöser 2017-11-14 16:11:50 UTC
(In reply to Bhushan Shah from comment #12)
> Can I re-open this bug, the commit above fixes the issue of crash, but right
> after starting plasma kwin_wayland is hung, and there is no mouse pointer as
> well, so it's still impossible to interact with Plasma.. or should I report
> separate bug?

Please use a separate bug. It's a different issue now, most likely related to the incorrect request we get from KScreen.