| Summary: | kwin_wayland SIGSEGV in KWin::WaylandCursorTheme::loadTheme | ||
|---|---|---|---|
| Product: | [Plasma] kwin | Reporter: | schrott3000 |
| Component: | wayland-generic | Assignee: | KWin default assignee <kwin-bugs-null> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | schrott3000 |
| Priority: | NOR | Flags: | schrott3000:
Wayland+
schrott3000: X11- mgraesslin: ReviewRequest+ |
| Version First Reported In: | 5.12.0 | ||
| Target Milestone: | --- | ||
| Platform: | openSUSE | ||
| OS: | Linux | ||
| URL: | https://phabricator.kde.org/D10549 | ||
| Latest Commit: | https://commits.kde.org/kwin/2ea5153e1c979760c56bd81f3d47f2fb373130c9 | Version Fixed/Implemented In: | 5.12.3 |
| Sentry Crash Report: | |||
| Attachments: |
Detailed gdb backtrace
Working startupconfig Not working startupconfig Support information Patch proposal for the recursion problem |
||
Do you know why it was 0? Did you modify that manually? I did not change it manually. But I can reproduce this by changing the cursor theme in kcmshell5, on relogin it sets it to 0. Some background info: I have tried kwin_wayland to see how stable it got. And I always got crashes with my normal account. Because I've heard that it should be pretty table by now I was wondering and checked with a newly created user account where everything worked fine. This is why I started to investigate. I've attached the debugger and found that it's an issue with the kwin_wayland cursor theme implementation. So I've checked for differences in the relevant config file between my main account and this newly created account and this is how I came up with this workaround for it. Thanks for the investigation. The cursor theme kcm changed recently I guess the value stored changed as well. 0 doesn't make much sense. Should be easy to guard KWin against it. now that I have the code open: 0 is a valid value. To understand what's going on I need some more information: * what's your cursor theme? * can you please add the output of: qdbus org.kde.KWin /KWin supportInformation What I need to understand is what the code does in your system and why your system goes into that bad path. Yeah, 0 should be display dependent cursor size that would match the setting in kcmshell were it shows "Auflösungsabhängig" and the calculation in wayland_cursor_theme.cpp. The cursor theme is 'Breeze' but it's not dependent on that it behaves the same way if it is set to 'DMZ'. I will attach the support info with the working settings now and then try to get one with the not working settings. An additional hint here: I have noticed that it does not fail immediately, but when the cursor theme needs to be changed the first time due to e.g. moving it over a window bar. Created attachment 110599 [details]
Working startupconfig
Created attachment 110600 [details]
Not working startupconfig
Created attachment 110601 [details]
Support information
Please note that the support information output is identical for the working and the non working configuration.
I have debugged a little bit further: In normal operation the loop does not occur because after the first loop m_theme in wayland_cursor_theme.cpp is loaded and the conditions cause that a different path is taken in the next loop through this functions. If size==0 something else happens: An automatic size gets calculated with in my case is a negative number. This is passed to wl_cursor_theme_load with fails and return NULL. So m_theme remains set to NULL and it tries to load this theme with the wrong size in an infinite loop. Okay, I have taken a look at the size calculation at assembler level and it does some correct but likely not expected things: At first for my pc: => output->pixelSize().height() returns 1080 (which is my screen height) => output->physicalSize().height() returns 0 (which is obviously a problem) What then happens is that the following is calculated for floating points in SIMD: qreal(1080) / (qreal(0) * 0.0393701) * 16.0 / 72.0 = qreal(1080) / 0.0 * 16.0 / 72.0 (Please not that div by zero is NaN in floating point and does not raise an exception) = NaN * 16.0 / 72.0 = NaN This NaN is than converted to int size in an implicit manner which results in: int size = -2147483648 (which actually is the NaN value 0x80000000 as int) This than causes wl_cursor_theme_load to fail and return NULL as it cannot cope with negative values. By the way I consider this recursive looping a very strange way of programming this thing. Is it understood in this code that emit themeChanged(); in WaylandCursorTheme::loadTheme() does not put the event in the Qt event loop but instead will call the connected handlers (like pointer_input.cpp, line 909) directly as normal functions as we don't have a queued signal connection? (referring to https://doc.qt.io/qt-5.10/signalsandslots.html#signals) thanks for the update. From what I understand we have two problems here: * we don't handle physical size being 0 * we emit the signal even if null is returned and it actually doesn't change Which brings us to the recursion: there should not be a recursion. It only is a recursion because the theme is null even after calling the method. Yeah that's correct. If m_them gets correctly set there will be no recursion because the get function in wayland_cursor_theme.cpp would not call loadTheme again. Writing a fix for this two problems should do it. Created attachment 110648 [details]
Patch proposal for the recursion problem
I have tested what will happen if the recursion is solved by not emitting the signal if wl_cursor_theme_load returns null. The result is that the session does not crash anymore but the cursor will disappear when loadTheme fails (which should be fixed if the physical monitor == 0 situation is fixed).
In my opinion this is an acceptable behaviour because if we don't pass a negative size to it wl_cursor_theme_load should only fail if it can't allocate the required memory (which should actually never happen). But I will leave it on you to decide that.
Yeah, that's what I had in mind as well. Full solution for both problems including a unit test exposing the problem at https://phabricator.kde.org/D10549 Works well for me. The default size matches the ones in X11 windows which I think is a good default. If you have further questions please ask, otherwise feel free to change this bug to RESOLVED. Git commit 2ea5153e1c979760c56bd81f3d47f2fb373130c9 by Martin Flöser. Committed on 04/03/2018 at 08:42. Pushed by graesslin into branch 'Plasma/5.12'. Don't crash if the cursor theme fails to create Summary: If the cursor theme failed to create KWin crashed due to an endless recursion. There are two reasons for this fault: 1) When the physical size does not exist we perform a division by 0 which results in an invalid size going into wl_cursor_theme_load 2) We emit the signal that the cursor theme changed even if it didn't change thus creating an endless recursion This change addresses both problems: it checks that the size is not 0 and changes the handling for theme update to only destroy the previous theme if the new theme could be created and only emits the signal if things change. FIXED-IN: 5.12.3 Test Plan: Added a new test case which crashed with old code Reviewers: #kwin, #plasma Subscribers: plasma-devel, kwin Tags: #plasma Differential Revision: https://phabricator.kde.org/D10549 M +1 -0 autotests/integration/CMakeLists.txt A +124 -0 autotests/integration/dont_crash_cursor_physical_size_empty.cpp [License: GPL (v2)] M +17 -9 wayland_cursor_theme.cpp https://commits.kde.org/kwin/2ea5153e1c979760c56bd81f3d47f2fb373130c9 |
Created attachment 110557 [details] Detailed gdb backtrace Symptoms: kwin_wayland session crashes when moving the cursor e.g. over another window. Cause: There is an endless recursion when calling KWin::CursorImage::loadThemeCursor. I will describe the scheme in short for details see the attached backtrace: step1: Kwin calls KWin::CursorImage::loadThemeCursor step2: Which then calls KWin::WaylandCursorTheme::get which call KWin::WaylandCursorTheme::loadTheme step3: KWin::WaylandCursorTheme::loadTheme emits the event themeChanged() (wayland_cursor_theme.cpp, line:70) step4: This is handled by the lambda expession at pointer_input.cpp line 908, which calls KWin::CursorImage::loadThemeCursor in line 911 which starts the whole thing again (=> goto step1) This cycle breaks in the moment when malloc fails to allocate and kwin crashes with a segmentation fault: Program terminated with signal SIGSEGV, Segmentation fault. #0 0x00007f35190c0919 in malloc () from /lib64/libc.so.6 #0 0x00007f35190c0919 in malloc () from /lib64/libc.so.6 No symbol table info available. #1 0x00007f351948d718 in operator new(unsigned long) () from /usr/lib64/libstdc++.so.6 No symbol table info available. #2 0x00007f3519c63f0d in QObjectPrivate::connectImpl(QObject const*, int, QObject const*, void**, QtPrivate::QSlotObjectBase*, Qt::ConnectionType, int const*, QMetaObject const*) () from /usr/lib64/libQt5Core.so.5 No symbol table info available. #3 0x00007f3519c64295 in QObject::connectImpl(QObject const*, void**, QObject const*, void**, QtPrivate::QSlotObjectBase*, Qt::ConnectionType, int const*, QMetaObject const*) () from /usr/lib64/libQt5Core.so.5 No symbol table info available. #4 0x00007f351bfcbded in QObject::connect<void (KWin::Cursor::*)(), void (KWin::WaylandCursorTheme::*)()> (type=Qt::AutoConnection, slot=(void (KWin::WaylandCursorTheme::*)(KWin::WaylandCursorTheme * const)) 0x7f351bfcbc70 <KWin::WaylandCursorTheme::loadTheme()>, receiver=0x563d373dfee0, signal=(void (KWin::Cursor::*)(KWin::Cursor * const)) 0x7f351c015600 <KWin::Cursor::themeChanged()>, sender=0x563d36a12350) at /usr/include/qt5/QtCore/qobject.h:259 types = 0x0 #5 KWin::WaylandCursorTheme::loadTheme (this=0x563d373dfee0) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/wayland_cursor_theme.cpp:55 size = <optimized out> this = 0x563d373dfee0 #6 0x00007f351bfcc0d0 in KWin::WaylandCursorTheme::get (this=this@entry=0x563d373dfee0, name=...) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/wayland_cursor_theme.cpp:90 c = <optimized out> #7 0x00007f351bfcc13d in KWin::WaylandCursorTheme::get (this=0x563d373dfee0, shape=<optimized out>) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/wayland_cursor_theme.cpp:84 No locals. #8 0x00007f351bedbd7c in KWin::CursorImage::loadThemeCursor<Qt::CursorShape> (this=0x563d37581b20, shape=shape@entry=@0x7ffd4e791354: Qt::ArrowCursor, cursors=..., image=0x563d37581bd0) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/pointer_input.cpp:1194 cursor = <optimized out> b = <optimized out> buffer = <optimized out> it = {i = 0x7f3519cdc360 <QHashData::shared_null>} #9 0x00007f351beda40d in KWin::CursorImage::loadThemeCursor (image=<optimized out>, shape=<optimized out>, this=<optimized out>) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/pointer_input.cpp:1175 No locals. #10 KWin::CursorImage::<lambda()>::operator() (__closure=0x563d374e45c0) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/pointer_input.cpp:911 this = 0x563d37581b20 #11 QtPrivate::FunctorCall<QtPrivate::IndexesList<>, QtPrivate::List<>, void, KWin::CursorImage::CursorImage(KWin::PointerInputRedirection*)::<lambda()> >::call (arg=<optimized out>, f=...) at /usr/include/qt5/QtCore/qobjectdefs_impl.h:130 No locals. #12 QtPrivate::Functor<KWin::CursorImage::CursorImage(KWin::PointerInputRedirection*)::<lambda()>, 0>::call<QtPrivate::List<>, void> (arg=<optimized out>, f=...) at /usr/include/qt5/QtCore/qobjectdefs_impl.h:240 No locals. #13 QtPrivate::QFunctorSlotObject<KWin::CursorImage::CursorImage(KWin::PointerInputRedirection*)::<lambda()>, 0, QtPrivate::List<>, void>::impl(int, QtPrivate::QSlotObjectBase *, QObject *, void **, bool *) (which=<optimized out>, this_=0x563d374e45b0, r=<optimized out>, a=<optimized out>, ret=<optimized out>) at /usr/include/qt5/QtCore/qobjectdefs_impl.h:423 No locals. #14 0x00007f3519c600cc in QMetaObject::activate(QObject*, int, int, void**) () from /usr/lib64/libQt5Core.so.5 No symbol table info available. #15 0x00007f351bfcbd5f in KWin::WaylandCursorTheme::loadTheme (this=0x563d373dfee0) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/wayland_cursor_theme.cpp:70 size = -2147483648 this = 0x563d373dfee0 ---- stripped here see attached log for detailed backtrace ----- Workaround: Setting kcminputrc_mouse_cursorsize='' in startupconfig prevents triggering this bug (NOTE: it was originally set to 0 in my account). Please note that the attached backtrace does not show the full backtrace, hte parts in the middle are just the same over and over again.