Created attachment 110557 [details] Detailed gdb backtrace Symptoms: kwin_wayland session crashes when moving the cursor e.g. over another window. Cause: There is an endless recursion when calling KWin::CursorImage::loadThemeCursor. I will describe the scheme in short for details see the attached backtrace: step1: Kwin calls KWin::CursorImage::loadThemeCursor step2: Which then calls KWin::WaylandCursorTheme::get which call KWin::WaylandCursorTheme::loadTheme step3: KWin::WaylandCursorTheme::loadTheme emits the event themeChanged() (wayland_cursor_theme.cpp, line:70) step4: This is handled by the lambda expession at pointer_input.cpp line 908, which calls KWin::CursorImage::loadThemeCursor in line 911 which starts the whole thing again (=> goto step1) This cycle breaks in the moment when malloc fails to allocate and kwin crashes with a segmentation fault: Program terminated with signal SIGSEGV, Segmentation fault. #0 0x00007f35190c0919 in malloc () from /lib64/libc.so.6 #0 0x00007f35190c0919 in malloc () from /lib64/libc.so.6 No symbol table info available. #1 0x00007f351948d718 in operator new(unsigned long) () from /usr/lib64/libstdc++.so.6 No symbol table info available. #2 0x00007f3519c63f0d in QObjectPrivate::connectImpl(QObject const*, int, QObject const*, void**, QtPrivate::QSlotObjectBase*, Qt::ConnectionType, int const*, QMetaObject const*) () from /usr/lib64/libQt5Core.so.5 No symbol table info available. #3 0x00007f3519c64295 in QObject::connectImpl(QObject const*, void**, QObject const*, void**, QtPrivate::QSlotObjectBase*, Qt::ConnectionType, int const*, QMetaObject const*) () from /usr/lib64/libQt5Core.so.5 No symbol table info available. #4 0x00007f351bfcbded in QObject::connect<void (KWin::Cursor::*)(), void (KWin::WaylandCursorTheme::*)()> (type=Qt::AutoConnection, slot=(void (KWin::WaylandCursorTheme::*)(KWin::WaylandCursorTheme * const)) 0x7f351bfcbc70 <KWin::WaylandCursorTheme::loadTheme()>, receiver=0x563d373dfee0, signal=(void (KWin::Cursor::*)(KWin::Cursor * const)) 0x7f351c015600 <KWin::Cursor::themeChanged()>, sender=0x563d36a12350) at /usr/include/qt5/QtCore/qobject.h:259 types = 0x0 #5 KWin::WaylandCursorTheme::loadTheme (this=0x563d373dfee0) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/wayland_cursor_theme.cpp:55 size = <optimized out> this = 0x563d373dfee0 #6 0x00007f351bfcc0d0 in KWin::WaylandCursorTheme::get (this=this@entry=0x563d373dfee0, name=...) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/wayland_cursor_theme.cpp:90 c = <optimized out> #7 0x00007f351bfcc13d in KWin::WaylandCursorTheme::get (this=0x563d373dfee0, shape=<optimized out>) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/wayland_cursor_theme.cpp:84 No locals. #8 0x00007f351bedbd7c in KWin::CursorImage::loadThemeCursor<Qt::CursorShape> (this=0x563d37581b20, shape=shape@entry=@0x7ffd4e791354: Qt::ArrowCursor, cursors=..., image=0x563d37581bd0) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/pointer_input.cpp:1194 cursor = <optimized out> b = <optimized out> buffer = <optimized out> it = {i = 0x7f3519cdc360 <QHashData::shared_null>} #9 0x00007f351beda40d in KWin::CursorImage::loadThemeCursor (image=<optimized out>, shape=<optimized out>, this=<optimized out>) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/pointer_input.cpp:1175 No locals. #10 KWin::CursorImage::<lambda()>::operator() (__closure=0x563d374e45c0) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/pointer_input.cpp:911 this = 0x563d37581b20 #11 QtPrivate::FunctorCall<QtPrivate::IndexesList<>, QtPrivate::List<>, void, KWin::CursorImage::CursorImage(KWin::PointerInputRedirection*)::<lambda()> >::call (arg=<optimized out>, f=...) at /usr/include/qt5/QtCore/qobjectdefs_impl.h:130 No locals. #12 QtPrivate::Functor<KWin::CursorImage::CursorImage(KWin::PointerInputRedirection*)::<lambda()>, 0>::call<QtPrivate::List<>, void> (arg=<optimized out>, f=...) at /usr/include/qt5/QtCore/qobjectdefs_impl.h:240 No locals. #13 QtPrivate::QFunctorSlotObject<KWin::CursorImage::CursorImage(KWin::PointerInputRedirection*)::<lambda()>, 0, QtPrivate::List<>, void>::impl(int, QtPrivate::QSlotObjectBase *, QObject *, void **, bool *) (which=<optimized out>, this_=0x563d374e45b0, r=<optimized out>, a=<optimized out>, ret=<optimized out>) at /usr/include/qt5/QtCore/qobjectdefs_impl.h:423 No locals. #14 0x00007f3519c600cc in QMetaObject::activate(QObject*, int, int, void**) () from /usr/lib64/libQt5Core.so.5 No symbol table info available. #15 0x00007f351bfcbd5f in KWin::WaylandCursorTheme::loadTheme (this=0x563d373dfee0) at /usr/src/debug/kwin5-5.12.0-1.1.x86_64/wayland_cursor_theme.cpp:70 size = -2147483648 this = 0x563d373dfee0 ---- stripped here see attached log for detailed backtrace ----- Workaround: Setting kcminputrc_mouse_cursorsize='' in startupconfig prevents triggering this bug (NOTE: it was originally set to 0 in my account). Please note that the attached backtrace does not show the full backtrace, hte parts in the middle are just the same over and over again.
Do you know why it was 0? Did you modify that manually?
I did not change it manually. But I can reproduce this by changing the cursor theme in kcmshell5, on relogin it sets it to 0. Some background info: I have tried kwin_wayland to see how stable it got. And I always got crashes with my normal account. Because I've heard that it should be pretty table by now I was wondering and checked with a newly created user account where everything worked fine. This is why I started to investigate. I've attached the debugger and found that it's an issue with the kwin_wayland cursor theme implementation. So I've checked for differences in the relevant config file between my main account and this newly created account and this is how I came up with this workaround for it.
Thanks for the investigation. The cursor theme kcm changed recently I guess the value stored changed as well. 0 doesn't make much sense. Should be easy to guard KWin against it.
now that I have the code open: 0 is a valid value.
To understand what's going on I need some more information: * what's your cursor theme? * can you please add the output of: qdbus org.kde.KWin /KWin supportInformation What I need to understand is what the code does in your system and why your system goes into that bad path.
Yeah, 0 should be display dependent cursor size that would match the setting in kcmshell were it shows "Auflösungsabhängig" and the calculation in wayland_cursor_theme.cpp. The cursor theme is 'Breeze' but it's not dependent on that it behaves the same way if it is set to 'DMZ'. I will attach the support info with the working settings now and then try to get one with the not working settings. An additional hint here: I have noticed that it does not fail immediately, but when the cursor theme needs to be changed the first time due to e.g. moving it over a window bar.
Created attachment 110599 [details] Working startupconfig
Created attachment 110600 [details] Not working startupconfig
Created attachment 110601 [details] Support information Please note that the support information output is identical for the working and the non working configuration.
I have debugged a little bit further: In normal operation the loop does not occur because after the first loop m_theme in wayland_cursor_theme.cpp is loaded and the conditions cause that a different path is taken in the next loop through this functions. If size==0 something else happens: An automatic size gets calculated with in my case is a negative number. This is passed to wl_cursor_theme_load with fails and return NULL. So m_theme remains set to NULL and it tries to load this theme with the wrong size in an infinite loop.
Okay, I have taken a look at the size calculation at assembler level and it does some correct but likely not expected things: At first for my pc: => output->pixelSize().height() returns 1080 (which is my screen height) => output->physicalSize().height() returns 0 (which is obviously a problem) What then happens is that the following is calculated for floating points in SIMD: qreal(1080) / (qreal(0) * 0.0393701) * 16.0 / 72.0 = qreal(1080) / 0.0 * 16.0 / 72.0 (Please not that div by zero is NaN in floating point and does not raise an exception) = NaN * 16.0 / 72.0 = NaN This NaN is than converted to int size in an implicit manner which results in: int size = -2147483648 (which actually is the NaN value 0x80000000 as int) This than causes wl_cursor_theme_load to fail and return NULL as it cannot cope with negative values.
By the way I consider this recursive looping a very strange way of programming this thing. Is it understood in this code that emit themeChanged(); in WaylandCursorTheme::loadTheme() does not put the event in the Qt event loop but instead will call the connected handlers (like pointer_input.cpp, line 909) directly as normal functions as we don't have a queued signal connection? (referring to https://doc.qt.io/qt-5.10/signalsandslots.html#signals)
thanks for the update. From what I understand we have two problems here: * we don't handle physical size being 0 * we emit the signal even if null is returned and it actually doesn't change Which brings us to the recursion: there should not be a recursion. It only is a recursion because the theme is null even after calling the method.
Yeah that's correct. If m_them gets correctly set there will be no recursion because the get function in wayland_cursor_theme.cpp would not call loadTheme again. Writing a fix for this two problems should do it.
Created attachment 110648 [details] Patch proposal for the recursion problem I have tested what will happen if the recursion is solved by not emitting the signal if wl_cursor_theme_load returns null. The result is that the session does not crash anymore but the cursor will disappear when loadTheme fails (which should be fixed if the physical monitor == 0 situation is fixed). In my opinion this is an acceptable behaviour because if we don't pass a negative size to it wl_cursor_theme_load should only fail if it can't allocate the required memory (which should actually never happen). But I will leave it on you to decide that.
Yeah, that's what I had in mind as well.
Full solution for both problems including a unit test exposing the problem at https://phabricator.kde.org/D10549
Works well for me. The default size matches the ones in X11 windows which I think is a good default. If you have further questions please ask, otherwise feel free to change this bug to RESOLVED.
Git commit 2ea5153e1c979760c56bd81f3d47f2fb373130c9 by Martin Flöser. Committed on 04/03/2018 at 08:42. Pushed by graesslin into branch 'Plasma/5.12'. Don't crash if the cursor theme fails to create Summary: If the cursor theme failed to create KWin crashed due to an endless recursion. There are two reasons for this fault: 1) When the physical size does not exist we perform a division by 0 which results in an invalid size going into wl_cursor_theme_load 2) We emit the signal that the cursor theme changed even if it didn't change thus creating an endless recursion This change addresses both problems: it checks that the size is not 0 and changes the handling for theme update to only destroy the previous theme if the new theme could be created and only emits the signal if things change. FIXED-IN: 5.12.3 Test Plan: Added a new test case which crashed with old code Reviewers: #kwin, #plasma Subscribers: plasma-devel, kwin Tags: #plasma Differential Revision: https://phabricator.kde.org/D10549 M +1 -0 autotests/integration/CMakeLists.txt A +124 -0 autotests/integration/dont_crash_cursor_physical_size_empty.cpp [License: GPL (v2)] M +17 -9 wayland_cursor_theme.cpp https://commits.kde.org/kwin/2ea5153e1c979760c56bd81f3d47f2fb373130c9