Bug 408553

Summary:	Segmentation faults and invalid reads/writes in powerdevil when logging out of Plasma 5.15.5 on Wayland in Fedora 30
Product:	[Frameworks and Libraries] kwayland	Reporter:	Matt Fagnani <matt.fagnani>
Component:	general	Assignee:	Martin Flöser <mgraesslin>
Status:	RESOLVED FIXED
Severity:	normal	CC:	rdieter
Priority:	NOR
Version First Reported In:	5.59.0
Target Milestone:	---
Platform:	Fedora RPMs
OS:	Linux
URL:	https://bugzilla.redhat.com/show_bug.cgi?id=1713467 https://bugzilla.redhat.com/show_bug.cgi?id=1727470
Latest Commit:	https://phabricator.kde.org/D27538	Version Fixed/Implemented In:	5.68
Sentry Crash Report:
Attachments:	valgrind --log-file=valgrind-powerdevil-3.txt /usr/libexec/org_kde_powerdevil & output with invalid reads/writes after logging out of Plasma on Wayland
	gdb output with full trace of all threads from segmentation fault of org_kde_powerdevil when logging out of Plasma on Wayland
	coredumpctl gdb output of segmentation fault in powerdevil when logging out of Plasma on Wayland

Description Matt Fagnani 2019-06-10 21:15:33 UTC

Created attachment 120765 [details]
valgrind --log-file=valgrind-powerdevil-3.txt /usr/libexec/org_kde_powerdevil & output with invalid reads/writes after logging out of Plasma on Wayland

SUMMARY

powerdevil has aborted 8 times today when I logged out of Plasma 5.15.5 on Wayland in Fedora 30 to sddm. Based on the journal messages below, powerdevil seemed to abort because it was still running after its connection to the Wayland compositor had been broken.

Jun 08 20:34:25 org_kde_powerdevil[6263]: The Wayland connection broke. Did the Wayland compositor die?
Jun 08 20:34:26 audit[12662]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=9 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=12662 comm="org_kde_powerde" exe="/usr/libexec/org_kde_powerdevil" sig=6 res=1
Jun 08 20:34:26 org_kde_powerdevil[12662]: Failed to create wl_display (No such file or directory)
Jun 08 20:34:26 org_kde_powerdevil[12662]: qt.qpa.plugin: Could not load the Qt platform plugin "wayland" in "" even though it was found.
Jun 08 20:34:26 org_kde_powerdevil[12662]: This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

An example of the abort and trace using coredumpctl debug / gdb is
Core was generated by `/usr/libexec/org_kde_powerdevil'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50        return ret;

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f767e65f895 in __GI_abort () at abort.c:79
#2  0x00007f767ea8caf5 in qt_message_fatal (context=..., message=<synthetic pointer>...)
    at global/qlogging.cpp:1901
#3  QMessageLogger::fatal (this=this@entry=0x7fffde03a030, msg=msg@entry=0x7f767f4176d7 "%s")
    at global/qlogging.cpp:887
#4  0x00007f767f0d829f in init_platform (argv=<optimized out>, argc=@0x7fffde03a25c: 1, 
    platformThemeName=..., platformPluginPath=..., pluginNamesWithArguments=...)
    at ../../include/QtCore/../../src/corelib/tools/qarraydata.h:208
#5  QGuiApplicationPrivate::createPlatformIntegration (this=0x5613163a7cc0)
    at kernel/qguiapplication.cpp:1384
#6  0x00007f767f0d8af8 in QGuiApplicationPrivate::createEventDispatcher (this=<optimized out>)
    at kernel/qguiapplication.cpp:1401
#7  0x00007f767ec6f975 in QCoreApplicationPrivate::init (this=this@entry=0x5613163a7cc0)
    at kernel/qcoreapplication.cpp:857
#8  0x00007f767f0da253 in QGuiApplicationPrivate::init (this=0x5613163a7cc0)
    at kernel/qguiapplication.cpp:1430
#9  0x00007f767f0db198 in QGuiApplication::QGuiApplication (this=0x7fffde03a270, 
    argc=@0x7fffde03a25c: 1, argv=0x7fffde03a3b8, flags=330753)
    at ../../include/QtCore/../../src/corelib/global/qglobal.h:1038
#10 0x00005613150d14f5 in PowerDevilApp::PowerDevilApp (argv=<optimized out>, 
    argc=@0x7fffde03a25c: 1, this=0x7fffde03a270)
    at /usr/src/debug/powerdevil-5.15.5-1.fc30.x86_64/daemon/powerdevilapp.cpp:202
#11 main (argc=<optimized out>, argv=<optimized out>)
    at /usr/src/debug/powerdevil-5.15.5-1.fc30.x86_64/daemon/powerdevilapp.cpp:202


coredumpctl info /usr/libexec/drkonqi showed 8 crashes, with traces like those reported here before, from when I logged out of Plasma on Wayland. Their command lines looked like
/usr/libexec/drkonqi -platform wayland --appname org_kde_powerdevil --apppath org_kde_powerdevil --apppath /usr/libexec --signal 11 --pid 6263 --startupid 0 --restarted
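
The `--signal` and `--pid` arguments in that command line identify the crash that drkonqi was started for. A minimal sketch of decoding them, assuming Linux signal numbering (the `option_value` helper is a hypothetical name used only for this illustration):

```python
import shlex
import signal

# Command line copied from the coredumpctl entry above.
cmdline = (
    "/usr/libexec/drkonqi -platform wayland --appname org_kde_powerdevil "
    "--apppath org_kde_powerdevil --apppath /usr/libexec --signal 11 "
    "--pid 6263 --startupid 0 --restarted"
)


def option_value(argv, name):
    """Return the argument that follows the first occurrence of `name`."""
    return argv[argv.index(name) + 1]


argv = shlex.split(cmdline)
# Assumption: the number uses the platform's signal numbering (Linux here).
crashed_signal = signal.Signals(int(option_value(argv, "--signal")))
crashed_pid = int(option_value(argv, "--pid"))

print(crashed_signal.name, crashed_pid)
```

The pid decoded here is the same one that logged "The Wayland connection broke" in the journal excerpt, while the abort (sig=6) was logged for a different pid, 12662.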

coredumpctl debug / gdb showed the following for the last of those aborts of drkonqi
Core was generated by `/usr/libexec/drkonqi -platform wayland --appname org_kde_powerdevil --apppath /'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50        return ret;

(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f1c06b9f895 in __GI_abort () at abort.c:79
#2  0x00007f1c06fccaf5 in qt_message_fatal (context=..., message=<synthetic pointer>...)
    at global/qlogging.cpp:1901
#3  QMessageLogger::fatal (this=this@entry=0x7ffe86c215c0, msg=msg@entry=0x7f1c078b76d7 "%s")
    at global/qlogging.cpp:887
#4  0x00007f1c0757829f in init_platform (argv=<optimized out>, argc=@0x7ffe86c2184c: 12, 
    platformThemeName=..., platformPluginPath=..., pluginNamesWithArguments=...)
    at ../../include/QtCore/../../src/corelib/tools/qarraydata.h:208
#5  QGuiApplicationPrivate::createPlatformIntegration (this=0x55708ca5fab0)
    at kernel/qguiapplication.cpp:1384
#6  0x00007f1c07578af8 in QGuiApplicationPrivate::createEventDispatcher (this=<optimized out>)
    at kernel/qguiapplication.cpp:1401
#7  0x00007f1c071af975 in QCoreApplicationPrivate::init (this=this@entry=0x55708ca5fab0)
    at kernel/qcoreapplication.cpp:857
#8  0x00007f1c0757a253 in QGuiApplicationPrivate::init (this=this@entry=0x55708ca5fab0)
    at kernel/qguiapplication.cpp:1430
#9  0x00007f1c07b121fd in QApplicationPrivate::init (this=0x55708ca5fab0)
    at kernel/qapplication.cpp:566
#10 0x000055708c1c255f in main (argc=<optimized out>, argv=<optimized out>)
    at /usr/src/debug/plasma-drkonqi-5.15.5-1.fc30.x86_64/src/main.cpp:63

I interpreted those command lines (--signal 11) as segmentation faults in org_kde_powerdevil when logging out of Plasma on Wayland. To catch one, I switched from Plasma on Wayland to another VT and logged in, ran gdb -p pid, where pid was the process id of org_kde_powerdevil, and continued the process with c in gdb. I then switched back to Plasma, logged out, and switched back to the VT. gdb showed a segmentation fault as follows.

Core was generated by `/usr/libexec/org_kde_powerdevil'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  _int_free (av=0x7f498b549c60 <main_arena>, p=0x5617ea20c540, have_lock=<optimized out>)
    at malloc.c:4330
4330        if (__glibc_unlikely (!prev_inuse(nextchunk)))
[Current thread is 1 (Thread 0x7f498b386880 (LWP 17015))]

nextchunk was a pointer to an inaccessible address, which would explain the segmentation fault. The bad pointer might have resulted from memory corruption.

(gdb) p nextchunk
$1 = (mchunkptr) 0xd56175efca50
(gdb) x 0xd56175efca50
0xd56175efca50: Cannot access memory at address 0xd56175efca50

The lines around malloc.c:4330 were

(gdb) list
4325        if (__builtin_expect (contiguous (av)
4326                              && (char *) nextchunk
4327                              >= ((char *) av->top + chunksize(av->top)), 0))
4328            malloc_printerr ("double free or corruption (out)");
4329        /* Or whether the block is actually not marked used.  */
4330        if (__glibc_unlikely (!prev_inuse(nextchunk)))
4331          malloc_printerr ("double free or corruption (!prev)");
4332
4333        nextsize = chunksize(nextchunk);
4334        if (__builtin_expect (chunksize_nomask (nextchunk) <= 2 * SIZE_SZ, 0)

I used gcore /programs/kde/powerdevil/gdb-powerdevil-segmentation-fault-1.core to save the core dump. The full trace of the crashing thread follows.
The size = 139953854940432 in the _int_free function of malloc.c in #0 looks too large to be valid. Qt functions in #1-#3 and kf5-kidletime functions in #4-#8 might have been involved.
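
The numbers in frame #0 can be cross-checked with a little arithmetic. In glibc's _int_free, nextchunk is computed as the chunk pointer plus the chunk size, so the bad nextchunk follows directly from the implausible size. The sketch below uses only values from the gdb output; the 48-bit user-space limit is an assumption about the machine (4-level paging on x86-64).

```python
# Values copied from the gdb output in this report.
p = 0x5617EA20C540          # chunk pointer passed to _int_free
size = 139953854940432      # 'size' local in frame #0
nextchunk = 0xD56175EFCA50  # 'nextchunk' local in frame #0

# Assumption: 48-bit virtual addresses (4-level paging on x86-64), where
# user-space addresses lie below 2**47.
USER_SPACE_LIMIT = 1 << 47


def is_user_space_address(addr):
    """True if addr could be a user-space address under the assumption above."""
    return 0 <= addr < USER_SPACE_LIMIT


computed_next = p + size  # what chunk_at_offset(p, size) evaluates to

print(hex(size))
print(hex(computed_next), computed_next == nextchunk)
print(is_user_space_address(p), is_user_space_address(nextchunk))
```

In hexadecimal the size is 0x7f498bcf0510, which numerically resembles the shared-library addresses elsewhere in the trace rather than a chunk size. That would be consistent with the chunk header having been overwritten, although it does not show what overwrote it.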

(gdb) bt full
#0  _int_free (av=0x7f498b549c60 <main_arena>, p=0x5617ea20c540, have_lock=<optimized out>)
    at malloc.c:4330
        size = 139953854940432
        fb = <optimized out>
        nextchunk = 0xd56175efca50
        nextsize = <optimized out>
        nextinuse = <optimized out>
        prevsize = <optimized out>
        bck = <optimized out>
        fwd = <optimized out>
        __PRETTY_FUNCTION__ = "_int_free"
#1  0x00007f498b85f017 in QHashData::free_helper (this=0x5617ea1be4a0, 
    node_delete=node_delete@entry=0x7f498aff0280 <QHash<int, int>::deleteNode2(QHashData::Node*)>)
    at tools/qhash.cpp:573
        next = 0x0
        cur = 0x5617ea20c550
        this_e = 0x5617ea1be4a0
        bucket = 0x5617ea1c00a0
        n = <optimized out>
#2  0x00007f498afee847 in QHash<int, int>::freeData (this=0x5617ea126c00, x=<optimized out>)
    at /usr/include/qt5/QtCore/qhash.h:585
No locals.
#3  QHash<int, int>::~QHash (this=0x5617ea126c00, __in_chrg=<optimized out>)
    at /usr/include/qt5/QtCore/qhash.h:254
No locals.
#4  KIdleTimePrivate::~KIdleTimePrivate (this=0x5617ea126be0, __in_chrg=<optimized out>) at /usr/src/debug/kf5-kidletime-5.58.0-1.fc30.x86_64/src/kidletime.cpp:57
No locals.
#5  KIdleTime::~KIdleTime (this=0x5617ea180880, __in_chrg=<optimized out>) at /usr/src/debug/kf5-kidletime-5.58.0-1.fc30.x86_64/src/kidletime.cpp:96
        d = <optimized out>
        d = <optimized out>
#6  0x00007f498afee88d in KIdleTime::~KIdleTime (this=0x5617ea180880, __in_chrg=<optimized out>) at /usr/src/debug/kf5-kidletime-5.58.0-1.fc30.x86_64/src/kidletime.cpp:92
        d = <optimized out>
#7  0x00007f498afee546 in KIdleTimeHelper::~KIdleTimeHelper (this=<optimized out>, __in_chrg=<optimized out>) at /usr/src/debug/kf5-kidletime-5.58.0-1.fc30.x86_64/src/kidletime.cpp:39
No locals.
#8  (anonymous namespace)::Q_QGS_s_globalKIdleTime::Holder::~Holder (this=<optimized out>, __in_chrg=<optimized out>) at /usr/src/debug/kf5-kidletime-5.58.0-1.fc30.x86_64/src/kidletime.cpp:46
No locals.
#9  0x00007f498b3c3670 in __run_exit_handlers (status=status@entry=1, listp=0x7f498b549738 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:108
        atfct = <optimized out>
        onfct = <optimized out>
        cxafct = <optimized out>
        f = <optimized out>
        new_exitfn_called = 245
        cur = 0x5617ea12b420
#10 0x00007f498b3c37b0 in __GI_exit (status=status@entry=1) at exit.c:139
No locals.
#11 0x00007f497a300f92 in QtWaylandClient::QWaylandDisplay::exitWithError (this=this@entry=0x5617ea0c78b0) at qwaylanddisplay.cpp:205
No locals.
#12 0x00007f497a300ff6 in QtWaylandClient::QWaylandDisplay::flushRequests (this=0x5617ea0c78b0) at qwaylanddisplay.cpp:188
No locals.
#13 0x00007f498b9df5ab in QMetaObject::activate (sender=0x5617ea0e3880, signalOffset=<optimized out>, local_signal_index=<optimized out>, argv=<optimized out>) at kernel/qobject.cpp:3789
        methodIndex = <optimized out>
        method_relative = <optimized out>
        callFunction = 0x7f497a3299c0 <QtWaylandClient::QWaylandDisplay::qt_static_metacall(QObject*, QMetaObject::Call, int, void**)>
        receiver = 0x5617ea0c78b0
        receiverInSameThread = <optimized out>
        sw = {receiver = 0x5617ea0c78b0, previousSender = 0x0, currentSender = {sender = 0x5617ea0e3880, signal = 3, ref = 1}, switched = true}
        c = 0x5617ea0d1e50
        last = 0x5617ea0d1e50
        locker = {val = 139953854403096}
        connectionLists = {connectionLists = 0x5617ea0eaf00}
        list = <optimized out>
        currentThreadId = 0x7f498b386880
        signal_index = 3
        empty_argv = {0x0}
#14 0x00007f498b9eb7fc in QSocketNotifier::activated (this=this@entry=0x5617ea0e3880, _t1=<optimized out>, _t2=...) at .moc/moc_qsocketnotifier.cpp:140
        _a = {0x0, 0x7ffe11edc9dc, 0x7ffe11edca10}
#15 0x00007f498b9ebb61 in QSocketNotifier::event (this=0x5617ea0e3880, e=0x7ffe11edcae0) at kernel/qsocketnotifier.cpp:266
        d = 0x5617ea0e29e0
#16 0x00007f498b9b5255 in doNotify (receiver=<optimized out>, event=<optimized out>) at ../../include/QtCore/../../src/corelib/kernel/qobject.h:142
No locals.
#17 0x00007f498b9b52e8 in QCoreApplication::notifyInternal2 (receiver=0x5617ea0e3880, event=0x7ffe11edcae0) at kernel/qcoreapplication.cpp:1060
        selfRequired = true
        result = false
        cbdata = {0x5617ea0e3880, 0x7ffe11edcae0, 0x7ffe11edca8f}
        d = <optimized out>
        threadData = 0x5617ea0d00a0
        scopeLevelCounter = {threadData = 0x5617ea0d00a0}
#18 0x00007f498ba0ada7 in socketNotifierSourceDispatch (source=source@entry=0x5617ea0d5010) at kernel/qeventdispatcher_glib.cpp:106
        p = <optimized out>
        i = 0
        event = {_vptr.QEvent = 0x7f498bc65c70 <vtable for QEvent+16>, static staticMetaObject = {d = {superdata = 0x0, stringdata = 0x7f498bb50140 <qt_meta_stringdata_QEvent>, data = 0x7f498bb4fb80 <qt_meta_data_QEvent>, static_metacall = 0x0, relatedMetaObjects = 0x0, extradata = 0x0}}, d = 0x0, t = 50, posted = 0, spont = 0, m_accept = 1, reserved = 573}
        src = <optimized out>
#19 0x00007f4989ec8edd in g_main_dispatch (context=0x5617ea0e4500) at ../glib/gmain.c:3189
        dispatch = <optimized out>
        prev_source = 0x0
        was_in_call = <optimized out>
        user_data = 0x0
        callback = 0x0
        cb_funcs = 0x0
        cb_data = 0x0
        need_destroy = <optimized out>
        source = 0x5617ea0d5010
        current = 0x5617ea11a8e0
        i = 0
        __FUNCTION__ = "g_main_dispatch"
#20 g_main_context_dispatch (context=context@entry=0x5617ea0e4500) at ../glib/gmain.c:3854
No locals.
#21 0x00007f4989ec9270 in g_main_context_iterate (context=context@entry=0x5617ea0e4500, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:3927
        max_priority = 2147483647
        timeout = 4692
        some_ready = 1
        nfds = <optimized out>
        allocated_nfds = <optimized out>
        fds = 0x7f4974014be0
#22 0x00007f4989ec9313 in g_main_context_iteration (context=0x5617ea0e4500, may_block=may_block@entry=1) at ../glib/gmain.c:3988
        retval = <optimized out>
#23 0x00007f498ba0a3f5 in QEventDispatcherGlib::processEvents (this=0x5617ea0e2bf0, flags=...) at kernel/qeventdispatcher_glib.cpp:422
        d = 0x5617ea0e3ce0
        canWait = true
        savedFlags = {i = 0}
        result = <optimized out>
#24 0x00007f498b9b42bb in QEventLoop::exec (this=this@entry=0x7ffe11edcd00, flags=..., flags@entry=...) at ../../include/QtCore/../../src/corelib/global/qflags.h:140
        d = 0x5617ea17b250
        locker = {val = 94660710957488}
        ref = {d = 0x5617ea17b250, locker = @0x7ffe11edcc88, exceptionCaught = true}
        app = <optimized out>
#25 0x00007f498b9bbfd6 in QCoreApplication::exec () at ../../include/QtCore/../../src/corelib/global/qflags.h:120
        threadData = 0x5617ea0d00a0
        eventLoop = {<QObject> = {_vptr.QObject = 0x7f498bc65a28 <vtable for QEventLoop+16>, static staticMetaObject = {d = {superdata = 0x0, stringdata = 0x7f498bb54e20 <qt_meta_stringdata_QObject>, data = 0x7f498bb54d00 <qt_meta_data_QObject>, static_metacall = 0x7f498b9e7810 <QObject::qt_static_metacall(QObject*, QMetaObject::Call, int, void**)>, relatedMetaObjects = 0x0, extradata = 0x0}}, d_ptr = {d = 0x5617ea17b250}, static staticQtMetaObject = {d = {superdata = 0x0, stringdata = 0x7f498bb57d40 <qt_meta_stringdata_Qt>, data = 0x7f498bb54f40 <qt_meta_data_Qt>, static_metacall = 0x0, relatedMetaObjects = 0x0, extradata = 0x0}}}, static staticMetaObject = {d = {superdata = 0x7f498bc5dfe0 <QObject::staticMetaObject>, stringdata = 0x7f498bb4f260 <qt_meta_stringdata_QEventLoop>, data = 0x7f498bb4f200 <qt_meta_data_QEventLoop>, static_metacall = 0x7f498b9b3fd0 <QEventLoop::qt_static_metacall(QObject*, QMetaObject::Call, int, void**)>, relatedMetaObjects = 0x0, extradata = 0x0}}}
        returnCode = <optimized out>
#26 0x00005617e9c9d5f4 in main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/powerdevil-5.15.5-1.fc30.x86_64/daemon/powerdevilapp.cpp:216
        app = {<QGuiApplication> = {<QCoreApplication> = {<QObject> = {_vptr.QObject = 0x5617e9cae358 <vtable for PowerDevilApp+16>, static staticMetaObject = {d = {superdata = 0x0, stringdata = 0x7f498bb54e20 <qt_meta_stringdata_QObject>, data = 0x7f498bb54d00 <qt_meta_data_QObject>, static_metacall = 0x7f498b9e7810 <QObject::qt_static_metacall(QObject*, QMetaObject::Call, int, void**)>, relatedMetaObjects = 0x0, extradata = 0x0}}, d_ptr = {d = 0x5617ea0cff70}, static staticQtMetaObject = {d = {superdata = 0x0, stringdata = 0x7f498bb57d40 <qt_meta_stringdata_Qt>, data = 0x7f498bb54f40 <qt_meta_data_Qt>, static_metacall = 0x0, relatedMetaObjects = 0x0, extradata = 0x0}}}, static staticMetaObject = {d = {superdata = 0x7f498bc5dfe0 <QObject::staticMetaObject>, stringdata = 0x7f498bb4f8a0 <qt_meta_stringdata_QCoreApplication>, data = 0x7f498bb4f780 <qt_meta_data_QCoreApplication>, static_metacall = 0x7f498b9b6d40 <QCoreApplication::qt_static_metacall(QObject*, QMetaObject::Call, int, void**)>, relatedMetaObjects = 0x0, extradata = 0x0}}, static self = 0x7ffe11edcd80}, static staticMetaObject = {d = {superdata = 0x7f498bc65bc0 <QCoreApplication::staticMetaObject>, stringdata = 0x7f498c153e60 <qt_meta_stringdata_QGuiApplication>, data = 0x7f498c153be0 <qt_meta_data_QGuiApplication>, static_metacall = 0x7f498be216f0 <QGuiApplication::qt_static_metacall(QObject*, QMetaObject::Call, int, void**)>, relatedMetaObjects = 0x0, extradata = 0x0}}}, static staticMetaObject = {d = {superdata = 0x7f498c22be00 <QGuiApplication::staticMetaObject>, stringdata = 0x5617e9ca85c0 <qt_meta_stringdata_PowerDevilApp>, data = 0x5617e9ca8560 <qt_meta_data_PowerDevilApp>, static_metacall = 0x5617e9c9d720 <PowerDevilApp::qt_static_metacall(QObject*, QMetaObject::Call, int, void**)>, relatedMetaObjects = 0x0, extradata = 0x0}}, m_core = 0x7f497400d280}
        disableSessionManagement = <optimized out>
        service = <incomplete type>


I terminated org_kde_powerdevil from ksysguard in another session of Plasma on Wayland, ran valgrind --log-file=valgrind-powerdevil-3.txt /usr/libexec/org_kde_powerdevil & in konsole, and logged out of Plasma. The valgrind log file had 21 invalid reads and writes, which appeared to be use-after-free errors since they all had lines like "Address 0x1934ea3c is 44 bytes inside a block of size 72 free'd". The first such invalid read was in wl_proxy_unref (wayland-client.c:229), which might be related to why the crashes have happened only with Plasma on Wayland and not with Plasma on X.

==4203== Invalid read of size 4
==4203==    at 0x172BFBB4: wl_proxy_unref (wayland-client.c:229)
==4203==    by 0x172BFCB3: destroy_queued_closure (wayland-client.c:291)
==4203==    by 0x172BFEC7: dispatch_event.isra.0 (wayland-client.c:1436)
==4203==    by 0x172C146B: dispatch_queue (wayland-client.c:1576)
==4203==    by 0x172C146B: wl_display_dispatch_queue_pending (wayland-client.c:1818)
==4203==    by 0x172C18AA: wl_display_roundtrip_queue (wayland-client.c:1241)
==4203==    by 0x194887C3: KWayland::Client::ConnectionThread::roundtrip() (connection_thread.cpp:290)
==4203==    by 0x171BB1A3: Poller::initWayland() (poller.cpp:93)
==4203==  Address 0x1934ea3c is 44 bytes inside a block of size 72 free'd
==4203==    at 0x4839A0C: free (vg_replace_malloc.c:540)
==4203==    by 0x1949F844: destroy (wayland_pointer_p.h:63)
==4203==    by 0x1949F844: KWayland::Client::Registry::Private::globalSync(void*, wl_callback*, unsigned int) (registry.cpp:539)
==4203==    by 0x485CB27: ffi_call_unix64 (in /usr/lib64/libffi.so.6.0.2)
==4203==    by 0x485C338: ffi_call (in /usr/lib64/libffi.so.6.0.2)
==4203==    by 0x172C3606: wl_closure_invoke (connection.c:1014)
==4203==    by 0x172BFF17: dispatch_event.isra.0 (wayland-client.c:1430)
==4203==    by 0x172C146B: dispatch_queue (wayland-client.c:1576)
==4203==    by 0x172C146B: wl_display_dispatch_queue_pending (wayland-client.c:1818)
==4203==    by 0x172C18AA: wl_display_roundtrip_queue (wayland-client.c:1241)
==4203==    by 0x194887C3: KWayland::Client::ConnectionThread::roundtrip() (connection_thread.cpp:290)
==4203==    by 0x171BB1A3: Poller::initWayland() (poller.cpp:93)
==4203==  Block was alloc'd at
==4203==    at 0x483AB1A: calloc (vg_replace_malloc.c:762)
==4203==    by 0x172BFD42: UnknownInlinedFun (wayland-private.h:236)
==4203==    by 0x172BFD42: proxy_create.isra.0 (wayland-client.c:421)
==4203==    by 0x172C042B: create_outgoing_proxy (wayland-client.c:650)
==4203==    by 0x172C042B: wl_proxy_marshal_array_constructor_versioned (wayland-client.c:735)
==4203==    by 0x172C0782: wl_proxy_marshal_constructor (wayland-client.c:824)
==4203==    by 0x1949FCED: wl_display_sync (wayland-client-protocol.h:958)
==4203==    by 0x1949FCED: KWayland::Client::Registry::create(wl_display*) (registry.cpp:470)
==4203==    by 0x1949FD6A: KWayland::Client::Registry::create(KWayland::Client::ConnectionThread*) (registry.cpp:479)
==4203==    by 0x171BB05B: Poller::initWayland() (poller.cpp:60)
==4203== 
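
Counting such entries by hand is tedious with 21 of them. A minimal sketch of how the log could be summarized, assuming the line shapes shown in the excerpt above (the `summarize` function is a hypothetical helper, not part of valgrind):

```python
import re

# Excerpt copied from the valgrind log quoted above.
log = """\
==4203== Invalid read of size 4
==4203==    at 0x172BFBB4: wl_proxy_unref (wayland-client.c:229)
==4203==  Address 0x1934ea3c is 44 bytes inside a block of size 72 free'd
==4203==    at 0x4839A0C: free (vg_replace_malloc.c:540)
"""

ACCESS = re.compile(r"^==\d+== Invalid (read|write) of size (\d+)")
FREED = re.compile(
    r"^==\d+==\s+Address (0x[0-9a-fA-F]+) is (\d+) bytes inside "
    r"a block of size (\d+) free'd"
)


def summarize(text):
    """Collect invalid accesses and the freed blocks they landed in."""
    accesses, freed_blocks = [], []
    for line in text.splitlines():
        m = ACCESS.match(line)
        if m:
            accesses.append((m.group(1), int(m.group(2))))
            continue
        m = FREED.match(line)
        if m:
            address, offset, block_size = m.groups()
            freed_blocks.append((int(address, 16), int(offset), int(block_size)))
    return accesses, freed_blocks


accesses, freed_blocks = summarize(log)
print(accesses, freed_blocks)
```

Running the same function over the whole attached log would give the total number of invalid accesses and show whether they all fall inside the same freed block.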

Memory corruption from those use-after-free errors, which involved wayland-client and various qt5 and kf5-kidletime packages, might have led to the segmentation faults. I'll attach the valgrind and gdb log files.

STEPS TO REPRODUCE
1. Install Fedora 30 KDE Plasma spin
2. Boot F30
3. Log in to Plasma on Wayland
4. sudo dnf upgrade --refresh --enablerepo=updates-testing
5. Reboot
6. Log in to Plasma on Wayland
7. Log out of Plasma
8. Log in to Plasma on Wayland
9. coredumpctl (in konsole)
10. coredumpctl debug 
11. switch from Plasma on Wayland to another VT and log in
12. gdb -p pid where pid is the process id of org_kde_powerdevil
13. c (in gdb)
14. switch back to Plasma 
15. log out
16. switch back to the VT with gdb
17. gcore /programs/kde/powerdevil/gdb-powerdevil-segmentation-fault-1.core
18. bt full
19. Log into Plasma on Wayland
20. sudo dnf debuginfo-install powerdevil qt5-qtbase* glibc glib2 kwayland-integration kf5-kwayland libwayland-client kf5-kidletime qt5-qtwayland
21. gdb /programs/kde/powerdevil/gdb-powerdevil-segmentation-fault-1.core
22. bt full (in gdb)
23. thread apply all bt full (in gdb)
24. q (in gdb)
25. terminate org_kde_powerdevil from ksysguard
26. valgrind --log-file=valgrind-powerdevil-3.txt /usr/libexec/org_kde_powerdevil & (in konsole)
27. log out of Plasma
28. log in to Plasma on Wayland
29. read valgrind-powerdevil-3.txt

OBSERVED RESULT
Aborts of powerdevil and drkonqi, and segmentation faults and invalid reads/writes in powerdevil when logging out of Plasma 5.15.5 on Wayland.

EXPECTED RESULT
No crashes of powerdevil.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Fedora 30, 5.1.7 kernel
(available in About System)
KDE Plasma Version: 5.15.5
KDE Frameworks Version: 5.58.0
Qt Version: 5.12.1

glib2-0:2.60.3-1.fc30.x86_64
glibc-0:2.29-15.fc30.x86_64
kf5-kidletime-0:5.58.0-1.fc30.x86_64
kf5-kwayland-0:5.58.0-1.fc30.x86_64
kwayland-integration-0:5.15.5-1.fc30.x86_64
libwayland-client-0:1.17.0-1.fc30.x86_64
powerdevil-0:5.15.5-1.fc30.x86_64
qt5-qtbase-0:5.12.1-7.fc30.x86_64
qt5-qtwayland-0:5.12.1-3.fc30.x86_64

ADDITIONAL INFORMATION

These crashes were reported at https://bugzilla.redhat.com/show_bug.cgi?id=1713467 
Rex Dieter suggested reporting them here.

Comment 1 Matt Fagnani 2019-06-10 21:16:41 UTC

Created attachment 120766 [details]
gdb output with full trace of all threads from segmentation fault of org_kde_powerdevil when logging out of Plasma on Wayland

Comment 2 Christoph Feck 2019-07-03 14:25:18 UTC

Crash in comment #1 is because of QtWaylandClient::QWaylandDisplay::exitWithError().

The description also says that "The Wayland connection broke. Did the Wayland compositor die?"

On logout, KWin of course "dies", but I have no idea how the Wayland protocol can inform clients that the compositor is no longer present, and how they could react.

Reassigning to KWin developers for inspection.

Comment 3 Matt Fagnani 2019-07-09 04:14:40 UTC

Created attachment 121409 [details]
coredumpctl gdb output of segmentation fault in powerdevil when logging out of Plasma on Wayland

Thanks Christoph. I think that if the segmentation faults in powerdevil were fixed, the aborts of drkonqi and of the restarted powerdevil after the Wayland compositor connection was broken wouldn't happen. I saw another segmentation fault in powerdevil when I logged out of Plasma 5.15.5 on Wayland. sddm didn't show up and the screen stayed blank, which I've seen many times before when logging out of Plasma on Wayland. I pressed sysrq+alt+e, then sysrq+alt+i, which terminated and then killed most of the userspace processes. sddm restarted after that.

This segmentation fault occurred at about the same time that the screen went blank. coredumpctl gdb showed that tc_victim->fd in _int_malloc at malloc.c:3623 was at an inaccessible address.
Core was generated by `/usr/libexec/org_kde_powerdevil'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f0d44dcadac in _int_malloc (av=av@entry=0x7f0d2c000020, bytes=bytes@entry=65)
    at malloc.c:3622
3622                          if (SINGLE_THREAD_P)
[Current thread is 1 (Thread 0x7f0d33b86700 (LWP 1559))]

(gdb) list
3617
3618                      /* While bin not empty and tcache not full, copy chunks.  */
3619                      while (tcache->counts[tc_idx] < mp_.tcache_count
3620                             && (tc_victim = *fb) != NULL)
3621                        {
3622                          if (SINGLE_THREAD_P)
3623                            *fb = tc_victim->fd;
3624                          else
3625                            {
3626                              REMOVE_FB (fb, pp, tc_victim);

(gdb) p tc_victim->fd
Cannot access memory at address 0xa10000556b
(gdb) p tc_victim
$2 = (mchunkptr) 0xa10000555b
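
The two addresses printed by gdb are consistent with each other: on a 64-bit system the fd member of glibc's struct malloc_chunk sits 16 bytes into the chunk, after the two size fields. The sketch below checks that, and also checks the pointer's alignment, assuming the default 16-byte malloc alignment on x86-64.

```python
# Values copied from the gdb session above.
tc_victim = 0xA10000555B
faulting_address = 0xA10000556B  # address gdb could not read for tc_victim->fd

# Layout of glibc's struct malloc_chunk on a 64-bit system (SIZE_SZ == 8):
# mchunk_prev_size at offset 0, mchunk_size at offset 8, fd at offset 16.
SIZE_SZ = 8
FD_OFFSET = 2 * SIZE_SZ
# Assumption: default malloc alignment on x86-64.
MALLOC_ALIGNMENT = 2 * SIZE_SZ

fd_field_address = tc_victim + FD_OFFSET
is_aligned = tc_victim % MALLOC_ALIGNMENT == 0

print(hex(fd_field_address), fd_field_address == faulting_address)
print(is_aligned)
```

Under that alignment assumption, tc_victim cannot be a chunk that malloc handed out, which suggests the fastbin entry it was read from had been overwritten.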

A signal indicating a crash appeared after #13 in tcache_get at malloc.c:2952.
KCrash::defaultCrashHandler in #11 showed errors like "Cannot access memory at address 0x7" which might indicate memory corruption. Qt string conversions involving "org.kde.kglobalaccel" happened at #16-19. I've seen many aborts of kglobalaccel5 when logging out of Plasma on Wayland and X as reported at https://bugzilla.redhat.com/show_bug.cgi?id=1701485

I've attached the coredumpctl gdb output of the crash with the full backtrace of all threads etc. I reported this crash in more detail at https://bugzilla.redhat.com/show_bug.cgi?id=1727470. Should I create a new report on bugs.kde.org since the trace is different?

Comment 4 Matt Fagnani 2019-07-09 04:41:00 UTC

The versions used in the crash I reported in comment 3 were
glib2-0:2.60.4-1.fc30.x86_64
glibc-0:2.29-15.fc30.x86_64
kf5-kwayland-0:5.59.0-2.fc30.x86_64
kwayland-integration-0:5.15.5-1.fc30.x86_64
kwin-wayland-0:5.15.5-2.fc30.x86_64
libwayland-client-0:1.17.0-1.fc30.x86_64
powerdevil-0:5.15.5-1.fc30.x86_64
qt5-qtbase-0:5.12.4-1.fc30.x86_64
qt5-qtwayland-5.12.4-2.fc30.x86_64

coredumpctl has 27 entries each for aborts of drkonqi due to powerdevil segmentation faults and for aborts of the restarted powerdevil. The segmentation faults of powerdevil often occurred at about the same time as the blank screens, which I reported at https://bugzilla.redhat.com/show_bug.cgi?id=1727482

The black screen problem seems to have been the one reported at https://bugs.kde.org/show_bug.cgi?id=372789 

A patch to fix this issue for kwayland-integration was written by David Edmundson for Plasma 5.16.3
https://cgit.kde.org/kwayland-integration.git/commit/?id=bfce3c6727cdc58a2b8ba33c933df05e21914876
https://bugs.kde.org/show_bug.cgi?id=372789#c46

Comment 5 Matt Fagnani 2019-07-09 06:53:30 UTC

I've noticed similarities between the first invalid read at wl_proxy_unref (wayland-client.c:229) that I reported and invalid reads starting at wayland-client.c:229 in
plasmashell https://bugs.kde.org/show_bug.cgi?id=409021#c1
konsole https://bugs.kde.org/show_bug.cgi?id=408971
kglobalaccel5 and akonadi_sendlater_agent

In each case, the stack that freed the block had the following functions and source lines in common, and the address was 44 bytes inside a block of size 72 free'd

==4203==  Address 0x1934ea3c is 44 bytes inside a block of size 72 free'd
==4203==    at 0x4839A0C: free (vg_replace_malloc.c:540)
==4203==    by 0x1949F844: destroy (wayland_pointer_p.h:63)
==4203==    by 0x1949F844: KWayland::Client::Registry::Private::globalSync(void*, wl_callback*, unsigned int) (registry.cpp:539)
==4203==    by 0x485CB27: ffi_call_unix64 (in /usr/lib64/libffi.so.6.0.2)
==4203==    by 0x485C338: ffi_call (in /usr/lib64/libffi.so.6.0.2)
==4203==    by 0x172C3606: wl_closure_invoke (connection.c:1014)
==4203==    by 0x172BFF17: dispatch_event.isra.0 (wayland-client.c:1430)
==4203==    by 0x172C146B: dispatch_queue (wayland-client.c:1576)
==4203==    by 0x172C146B: wl_display_dispatch_queue_pending (wayland-client.c:1818)
==4203==    by 0x172C18AA: wl_display_roundtrip_queue (wayland-client.c:1241)
==4203==    by 0x194887C3: KWayland::Client::ConnectionThread::roundtrip() (connection_thread.cpp:290)

Functions in those stacks might have freed the pointer before the other programs used it. KWayland::Client::Registry::Private::globalSync (registry.cpp:539) might be where the freeing was done too early.

(gdb) list registry.cpp:533,540
533     void Registry::Private::globalSync(void* data, wl_callback* callback, uint32_t serial)
534     {
535         Q_UNUSED(serial)
536         auto r = reinterpret_cast<Registry::Private*>(data);
537         Q_ASSERT(r->callback == callback);
538         r->handleGlobalSync();
539         r->callback.destroy();
540     }

Memory corruption due to the use-after-free errors might have led to the segmentation faults I saw. I'm reassigning this to frameworks-kwayland based on the above. kwayland-integration or libwayland-client are other possible packages involved.

Comment 6 Nate Graham 2021-01-13 18:44:52 UTC

#6  QMessageLogger::fatal (this=this@entry=0x7fffd70c5ba0, msg=msg@entry=0x7ff994ac00b8 "The Wayland connection broke. Did the Wayland compositor die?") at global/qlogging.cpp:893

This means that the compositor crashed. Due to a Qt issue, when this happens, the app using it will crash too. KDE developers submitted a fix, but sadly it was not merged. See https://codereview.qt-project.org/c/qt/qtwayland/+/308984.

Until we get better handling of this in Qt, the best we can do is debug why the compositor crashed in the first place. So can you please get a backtrace of the crash in kwin_wayland and then file a new bug report with it on kwin | wayland-generic? Thanks!

You may be able to use the `coredumpctl` utility to retrieve the backtrace. See https://community.kde.org/Guidelines_and_HOWTOs/Debugging/How_to_create_useful_crash_reports#Retrieving_a_backtrace_using_coredumpctl

Comment 7 Matt Fagnani 2021-01-13 21:31:55 UTC

(In reply to Nate Graham from comment #6)
> #6  QMessageLogger::fatal (this=this@entry=0x7fffd70c5ba0,
> msg=msg@entry=0x7ff994ac00b8 "The Wayland connection broke. Did the Wayland
> compositor die?") at global/qlogging.cpp:893
> 
> This means that the compositor crashed. Due to a Qt issue, when this
> happens, the app using it will crash too. KDE developers submitted a fix,
> but sadly it was not merged. See
> https://codereview.qt-project.org/c/qt/qtwayland/+/308984.
> 
> Until we get better handling of this in Qt, the best we can do is debug why
> the compositor crashed in the first place. So can you please get a backtrace
> of the crash in kwin_wayland and then file a new bug report with it on kwin
> | wayland-generic? Thanks!
> 
> You may be able to use the `coredumpctl` utility to retrieve the backtrace.
> See
> https://community.kde.org/Guidelines_and_HOWTOs/Debugging/
> How_to_create_useful_crash_reports#Retrieving_a_backtrace_using_coredumpctl

Nate, I think that kwin_wayland stopped normally during logout before powerdevil segmentation faulted. powerdevil then tried to restart and drkonqi aborted, which led to errors like "The Wayland connection broke. Did the Wayland compositor die?" I didn't mention any kwin_wayland crashes in my report. The first powerdevil segmentation faults were due to the use-after-free errors in wl_proxy_unref (wayland-client.c:229) in libwayland-client. I think those errors were fixed by Daniel Vrátil in kwayland 5.68. His commit message mentioned invalid read/write use-after-free errors in wl_proxy_unref (wayland-client.c:230), also involving KWayland::Client::Registry::Private::globalSync. The commit is https://phabricator.kde.org/R127:4ceb35672dfa3378776a926c452b9f83ffe2bc41

Registry: don't destroy the callback on globalsync

Summary:
Instead just unref it, because the wl_display_dispatch_queue_pending
will try to destroy the callback afterwards as well, leading to
invalid read/write.

Fixes Valgrind warnings when running KScreen tests:

==460922== Invalid read of size 4
==460922==    at 0x5CE5B34: wl_proxy_unref (wayland-client.c:230)
==460922==    by 0x5CE5C33: destroy_queued_closure (wayland-client.c:292)
==460922==    by 0x5CE74AB: dispatch_queue (wayland-client.c:1591)
==460922==    by 0x5CE74AB: wl_display_dispatch_queue_pending (wayland-client.c:1833)
==460922==    by 0x4E0240D: KWayland::Client::EventQueue::dispatch() (src/frameworks/kwayland/src/client/event_queue.cpp:96)
==460922==  Address 0x17233aac is 44 bytes inside a block of size 80 free'd
==460922==    at 0x483B9F5: free (vg_replace_malloc.c:540)
==460922==    by 0x4E15B60: destroy (src/frameworks/kwayland/src/client/wayland_pointer_p.h:63)
==460922==    by 0x4E15B60: KWayland::Client::Registry::Private::globalSync(void*, wl_callback*, unsigned int) (src/frameworks/kwayland/src/client/registry.cpp:548)
...
==460922==    by 0x5CE74AB: dispatch_queue (wayland-client.c:1591)
==460922==    by 0x5CE74AB: wl_display_dispatch_queue_pending (wayland-client.c:1833)
==460922==    by 0x4E0240D: KWayland::Client::EventQueue::dispatch() (src/frameworks/kwayland/src/client/event_queue.cpp:96)
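
The mechanism in the commit summary can be illustrated with a toy model. This is not libwayland or KWayland code: ToyProxy, dispatch and run are hypothetical names, and the assumption that the dispatcher holds its own reference across the callback is drawn only from the valgrind stacks, where wl_proxy_unref runs in destroy_queued_closure after the callback has returned.

```python
class FreedMemoryAccess(Exception):
    """Raised when the model touches a proxy whose memory was already freed."""


class ToyProxy:
    """Hypothetical stand-in for a reference-counted proxy object."""

    def __init__(self):
        self.refcount = 1  # reference owned by the client-side wrapper
        self.freed = False

    def _check(self):
        if self.freed:
            raise FreedMemoryAccess("access to freed proxy")

    def ref(self):
        self._check()
        self.refcount += 1

    def unref(self):
        self._check()
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True  # last reference gone, memory is released

    def free_outright(self):
        # Models calling free() on the object while other references exist.
        self.freed = True


def dispatch(proxy, callback):
    """Toy dispatcher that keeps its own reference while the callback runs."""
    proxy.ref()
    callback(proxy)
    proxy.unref()  # plays the role of the later wl_proxy_unref


def run(callback):
    proxy = ToyProxy()
    try:
        dispatch(proxy, callback)
    except FreedMemoryAccess:
        return "use-after-free"
    return "ok" if proxy.freed else "leaked"


print(run(ToyProxy.free_outright))  # callback frees the memory itself
print(run(ToyProxy.unref))          # callback only drops its own reference
```

In the model, freeing the object inside the callback makes the dispatcher's final unref touch freed memory, while dropping only the callback's own reference lets the object be released once the dispatcher is done with it.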

I haven't seen these powerdevil crashes, or those involving similar invalid read/write errors in plasmashell, konsole, etc. that I mentioned in comment 5, since KF 5.68.0.

The qtwayland fix you mentioned could resolve the aborts of KDE programs after kwin_wayland stopped when logging out. Alternatively, kwin_wayland could be made to wait until the other KDE programs have stopped before it is stopped, maybe using the systemd integration. Thanks.