Bug 472862 - plasmashell and kwin_wayland randomly crashed in KDirWatchPrivate::emitEvent() due to kate sessions
Summary: plasmashell and kwin_wayland randomly crashed in KDirWatchPrivate::emitEvent(...
Status: RESOLVED FIXED
Alias: None
Product: frameworks-kcoreaddons
Classification: Frameworks and Libraries
Component: general (show other bugs)
Version: unspecified
Platform: Other Linux
: NOR crash
Target Milestone: ---
Assignee: Michael Pyne
URL:
Keywords: wayland
Depends on:
Blocks:
 
Reported: 2023-08-01 00:07 UTC by Nate Graham
Modified: 2023-12-15 05:04 UTC (History)
4 users (show)

See Also:
Latest Commit:
Version Fixed In: 6.0
Sentry Crash Report:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Nate Graham 2023-08-01 00:07:43 UTC
kwin_wayland randomly crashed while I was scrolling upwards in Konsole.

Backtrace:

#0  0x00007f886e7cecf8 in std::__atomic_base<QThreadData*>::load(std::memory_order) const
    (__m=std::memory_order_relaxed, this=0x753d7465737261c0)
    at /usr/include/c++/13/bits/atomic_base.h:835
#1  std::atomic<QThreadData*>::load(std::memory_order) const
    (__m=std::memory_order_relaxed, this=0x753d7465737261c0) at /usr/include/c++/13/atomic:577
#2  QAtomicOps<QThreadData*>::loadRelaxed<QThreadData*>(std::atomic<QThreadData*> const&)
    (_q_value=<error reading variable: Cannot access memory at address 0x753d7465737261c0>)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/thread/qatomic_cxx11.h:201
#3  QBasicAtomicPointer<QThreadData>::loadRelaxed() const (this=0x753d7465737261c0)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/thread/qbasicatomic.h:174
#4  QObject::thread() const (this=this@entry=0x49fb460)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/kernel/qobject.cpp:1564
#5  0x00007f886e78aeb7 in QMetaObject::invokeMethodImpl(QObject*, QtPrivate::QSlotObjectBase*, Qt::ConnectionType, void*)
    (object=object@entry=0x49fb460, slot=0x5720d30, type=type@entry=Qt::QueuedConnection, ret=ret@entry=0x0) at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/kernel/qmetaobject.cpp:1601
#6  0x00007f8870560e37 in QMetaObject::invokeMethod<KDirWatchPrivate::emitEvent(Entry*, int, const QString&)::<lambda()> > (type=Qt::QueuedConnection, ret=0x0, function=..., context=0x49fb460)
    at /usr/include/qt6/QtCore/qobjectdefs.h:468
#7  KDirWatchPrivate::emitEvent(KDirWatchPrivate::Entry*, int, QString const&)
    (this=this@entry=0xea4d70, e=e@entry=0x66fd298, event=event@entry=1, fileName=...)
    at /home/nate/kde/src/kcoreaddons/src/lib/io/kdirwatch.cpp:1389
#8  0x00007f8870567dcb in KDirWatchPrivate::slotRescan() (this=0xea4d70)
    at /home/nate/kde/src/kcoreaddons/src/lib/io/kdirwatch.cpp:1513
#9  0x00007f886e7ddcd4 in QtPrivate::QSlotObjectBase::call(QObject*, void**)
    (a=0x7fff80c65ab0, r=0xea4d70, this=0xe9d0f0)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/kernel/qobjectdefs_impl.h:363
#10 doActivate<false>(QObject*, int, void**) (sender=0xea4dc0, signal_index=3, argv=0x7fff80c65ab0)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/kernel/qobject.cpp:3992
#11 0x00007f886e7d4757 in QMetaObject::activate(QObject*, QMetaObject const*, int, void**)
    (sender=<optimized out>, m=m@entry=0x7f886ec0a200, local_signal_index=local_signal_index@entry=0, argv=argv@entry=0x7fff80c65ab0)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/kernel/qobject.cpp:4052
#12 0x00007f886e7ef90d in QTimer::timeout(QTimer::QPrivateSignal) (this=<optimized out>, _t1=...)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/redhat-linux-build/src/corelib/Core_autogen/include/moc_qtimer.cpp:272
#13 0x00007f886e7cf70f in QObject::event(QEvent*) (this=0xea4dc0, e=0x7fff80c65c40)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/kernel/qobject.cpp:1413
#14 0x00007f886fbc0b08 in QApplicationPrivate::notify_helper(QObject*, QEvent*)
    (this=<optimized out>, receiver=0xea4dc0, e=0x7fff80c65c40)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/widgets/kernel/qapplication.cpp:3287
#15 0x00007f886e77c308 in QCoreApplication::notifyInternal2(QObject*, QEvent*)
    (receiver=0xea4dc0, event=0x7fff80c65c40)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/kernel/qcoreapplication.cpp:1115
#16 0x00007f886e77c50d in QCoreApplication::sendEvent(QObject*, QEvent*)
    (receiver=<optimized out>, event=<optimized out>)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/kernel/qcoreapplication.cpp:1533
#17 0x00007f886e9005b3 in QTimerInfoList::activateTimers() (this=this@entry=0xd85968)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/kernel/qtimerinfo_unix.cpp:613
#18 0x00007f886e8fb0b0 in QEventDispatcherUNIXPrivate::activateTimers() (this=this@entry=0xd85890)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/kernel/qeventdispatcher_unix.cpp:213
#19 0x00007f886e8fd556 in QEventDispatcherUNIX::processEvents(QFlags<QEventLoop::ProcessEventsFlag>)
    (this=<optimized out>, flags=...)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/kernel/qeventdispatcher_unix.cpp:482
#20 0x00007f886f5434c2 in QUnixEventDispatcherQPA::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) (this=<optimized out>, flags=...)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/gui/platform/unix/qunixeventdispatcher.cpp:27
#21 0x00007f886e788e93 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>)
    (this=this@entry=0x7fff80c65e10, flags=..., flags@entry=...)
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/global/qflags.h:34
#22 0x00007f886e784b3d in QCoreApplication::exec() ()
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/corelib/global/qflags.h:74
#23 0x00007f886eff85cd in QGuiApplication::exec() ()
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/gui/kernel/qguiapplication.cpp:1894
#24 0x00007f886fbc0a79 in QApplication::exec() ()
    at /usr/src/debug/qt6-qtbase-6.5.1-2.fc38.x86_64/src/widgets/kernel/qapplication.cpp:2566
#25 0x00000000004305e8 in main(int, char**) (argc=<optimized out>, argv=<optimized out>)
    at /home/nate/kde/src/kwin/src/main_wayland.cpp:613
Comment 1 David Edmundson 2023-08-01 15:08:03 UTC
Tracing back to where we use a KDirWatch

#0  0x00007ffff59b45b0 in KDirWatch::KDirWatch(QObject*)@plt () at /opt/kde6/lib/libKF6Service.so.6
#1  0x00007ffff59f1d02 in KSycocaPrivate::KSycocaPrivate(KSycoca*) (this=0x555556877d50, qq=0x55555588dfd0)
    at /home/david/projects/kde6/src/frameworks/kservice/src/sycoca/ksycoca.cpp:84
#2  0x00007ffff59f254f in KSycoca::KSycoca() (this=0x55555588dfd0)
    at /home/david/projects/kde6/src/frameworks/kservice/src/sycoca/ksycoca.cpp:211
#3  0x00007ffff59f5e93 in KSycocaSingleton::sycoca()
    (this=0x7ffff5a90dc4 <QGlobalStatic<QtGlobalStatic::Holder<(anonymous namespace)::Q_QGS_ksycocaInstance> >::instance()::holder>)
    at /home/david/projects/kde6/src/frameworks/kservice/src/sycoca/ksycoca.cpp:174
#4  0x00007ffff59f32a5 in KSycoca::self() () at /home/david/projects/kde6/src/frameworks/kservice/src/sycoca/ksycoca.cpp:367
#5  0x00007ffff59b5c6f in KApplicationTrader::query(std::function<bool (QExplicitlySharedDataPointer<KService> const&)>)
    (filterFunc=...) at /home/david/projects/kde6/src/frameworks/kservice/src/services/kapplicationtrader.cpp:65
#6  0x00007ffff7357c74 in KWin::fetchProcessServiceField(QString const&, QString const&) (executablePath=..., fieldName=...)
    at /home/david/projects/kde6/src/kde/workspace/kwin/src/utils/serviceutils.h:39
#7  0x00007ffff73615b4 in KWin::fetchRequestedInterfaces(QString const&) (executablePath=...)
    at /home/david/projects/kde6/src/kde/workspace/kwin/src/utils/serviceutils.h:55
Comment 2 Nate Graham 2023-09-08 18:16:55 UTC
I'm now periodically seeing Plasma crash with this backtrace in KDirWatchPrivate::slotRescan as well; maybe it's a KDirWatch regression?
Comment 3 Nate Graham 2023-10-05 15:30:20 UTC
Just found this in my Journal log. Maybe related?

Oct 05 09:28:47 Liberator plasmashell[338385]: inotify_add_watch(/var/lib/samba/usershares) failed: (Permission denied)
Comment 4 Harald Sitter 2023-10-05 16:40:02 UTC
It's an interesting data point at least. Means the KDirWatch is using QFilesystemwatch internally rather than inotify.
Comment 5 Harald Sitter 2023-10-06 11:46:13 UTC
Meven actually made some changes to KDirWatch after your report, it would be interesting to get a new trace.
Comment 6 Nate Graham 2023-10-08 02:49:23 UTC
Sure, here's one I just got today, with yesterday's git master:


#0  QThreadData::get2(QThread*) (thread=<optimized out>) at /usr/include/c++/13/bits/atomic_base.h:835
#1  QMetaObject::invokeMethodImpl(QObject*, QtPrivate::QSlotObjectBase*, Qt::ConnectionType, void*)
    (object=object@entry=0x1716220, slot=0x37a0cd0, type=type@entry=Qt::QueuedConnection, ret=ret@entry=0x0) at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/corelib/kernel/qmetaobject.cpp:1604
#2  0x00007f8406d78367 in QMetaObject::invokeMethod<KDirWatchPrivate::emitEvent(Entry*, int, const QString&)::<lambda()> > (type=Qt::QueuedConnection, ret=0x0, function=..., context=0x1716220)
    at /usr/include/qt6/QtCore/qobjectdefs.h:468
#3  KDirWatchPrivate::emitEvent(KDirWatchPrivate::Entry*, int, QString const&)
    (this=this@entry=0x106bd90, e=e@entry=0x506df48, event=event@entry=1, fileName=...)
    at /home/nate/kde/src/kcoreaddons/src/lib/io/kdirwatch.cpp:1441
#4  0x00007f8406d7eb4b in KDirWatchPrivate::slotRescan() (this=0x106bd90)
    at /home/nate/kde/src/kcoreaddons/src/lib/io/kdirwatch.cpp:1565
#5  0x00007f8404fde3a4 in QtPrivate::QSlotObjectBase::call(QObject*, void**)
    (a=0x7fff1f1c1110, r=0x106bd90, this=0x10654f0)
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/corelib/kernel/qobjectdefs_impl.h:363
#6  doActivate<false>(QObject*, int, void**) (sender=0x106bde0, signal_index=3, argv=0x7fff1f1c1110)
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/corelib/kernel/qobject.cpp:3992
#7  0x00007f8404fd4e27 in QMetaObject::activate(QObject*, QMetaObject const*, int, void**)
    (sender=<optimized out>, m=m@entry=0x7f840540a200, local_signal_index=local_signal_index@entry=0, argv=argv@entry=0x7fff1f1c1110)
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/corelib/kernel/qobject.cpp:4052
#8  0x00007f8404ff093d in QTimer::timeout(QTimer::QPrivateSignal) (this=<optimized out>, _t1=...)
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/redhat-linux-build/src/corelib/Core_autogen/include/moc_qtimer.cpp:272
#9  0x00007f8404fcfddf in QObject::event(QEvent*) (this=0x106bde0, e=0x7fff1f1c12a0)
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/corelib/kernel/qobject.cpp:1413
#10 0x00007f84063c0af8 in QApplicationPrivate::notify_helper(QObject*, QEvent*)
    (this=<optimized out>, receiver=0x106bde0, e=0x7fff1f1c12a0)
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/widgets/kernel/qapplication.cpp:3287
#11 0x00007f8404f7cdc8 in QCoreApplication::notifyInternal2(QObject*, QEvent*)
    (receiver=0x106bde0, event=0x7fff1f1c12a0)
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/corelib/kernel/qcoreapplication.cpp:1118
#12 0x00007f8404f7cfcd in QCoreApplication::sendEvent(QObject*, QEvent*)
    (receiver=<optimized out>, event=<optimized out>)
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/corelib/kernel/qcoreapplication.cpp:1536
#13 0x00007f8405101a93 in QTimerInfoList::activateTimers() (this=this@entry=0xd92ab8)
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/corelib/kernel/qtimerinfo_unix.cpp:613
#14 0x00007f84050fc5d0 in QEventDispatcherUNIXPrivate::activateTimers() (this=this@entry=0xd929e0)
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/corelib/kernel/qeventdispatcher_unix.cpp:213
#15 0x00007f84050fea56 in QEventDispatcherUNIX::processEvents(QFlags<QEventLoop::ProcessEventsFlag>)
    (this=<optimized out>, flags=...)
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/corelib/kernel/qeventdispatcher_unix.cpp:482
#16 0x00007f8405d40062 in QUnixEventDispatcherQPA::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) (this=<optimized out>, flags=...)
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/gui/platform/unix/qunixeventdispatcher.cpp:27
#17 0x00007f8404f89a03 in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>)
    (this=this@entry=0x7fff1f1c1470, flags=..., flags@entry=...)
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/corelib/global/qflags.h:34
#18 0x00007f8404f856ad in QCoreApplication::exec() ()
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/corelib/global/qflags.h:74
#19 0x00007f84057f914d in QGuiApplication::exec() ()
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/gui/kernel/qguiapplication.cpp:1908
#20 0x00007f84063c0a69 in QApplication::exec() ()
    at /usr/src/debug/qt6-qtbase-6.5.2-5.fc38.x86_64/src/widgets/kernel/qapplication.cpp:2566
#21 0x0000000000430838 in main(int, char**) (argc=<optimized out>, argv=<optimized out>)
    at /home/nate/kde/src/kwin/src/main_wayland.cpp:607
Comment 7 Harald Sitter 2023-10-08 04:21:44 UTC
Thx. I've walked the code a bit and didn't see anything that sticks out. The way I understand the crash we fall into a since-deleted kdirwatch instance,  trouble is the instances are refcounted so I find that hard to believe.... unless QThreadStorage has a bug in your Qt version, also unlikely.
All this makes me wonder why we aren't seeing the crash more though.

We'll probably have to sprinkle some debug output throughout kdirwatch to get a better sense of what happens with the instances on your system.

On a related note, the /var/lib/samba/usershares output was a red herring and is probably unrelated.

On a further related note we also install a kdirwatch via plasma theme_p.cpp for the task switcher and what not.
Comment 8 Harald Sitter 2023-10-08 16:02:35 UTC
You could enable verbose logging https://invent.kde.org/frameworks/kcoreaddons/-/blob/80b6893d85bc90fb5d82882c41caa171748c1c63/src/lib/io/kdirwatch.cpp#L82 so we see which actual file is causing this, it's not yet clear that this is indeed the kservice watcher that blows up.
Comment 9 Nate Graham 2023-10-11 16:04:00 UTC
Ok, will do.
Comment 10 Nate Graham 2023-10-11 18:20:14 UTC
Got a new warning as a result of that:

> kf.config.core: Watching absolute paths is not supported "/home/nate/kde/usr6/share/color-schemes/BreezeDark.colors"

Also found another slightly-less-related-looking issue that I reported as Bug 475467.
Comment 11 Harald Sitter 2023-10-11 18:38:09 UTC
Not related to this; rather to the new dbus path cleanup in kconfig. So, also a different bug, a serious one at that.
Comment 12 Nate Graham 2023-10-13 01:27:06 UTC
More data: at frame 7, filename is "/home/nate/.local/share/kate/sessions", which is actually a directory, not a file. Its contents are:

$ ls /home/nate/.local/share/kate/sessions/
Default.katesession  Home.katesession  KDE.katesession

And those files within are constantly changing as I open and close files throughout the day and the .katesession files get updated accordingly.
Comment 13 Harald Sitter 2023-10-13 06:31:34 UTC
Are you sure you've correctly enabled the logging and category? You should be getting walls more information

e.g.

Okt 13 08:29:53 ajax plasmashell[2262]: kf.coreaddons.kdirwatch: got event "CREATE" for entry "/home/me/.local/share/kate/sessions" [file] "foo.katesession.lock"
Okt 13 08:29:53 ajax plasmashell[2262]: kf.coreaddons.kdirwatch: -->got CREATE signal for "/home/me/.local/share/kate/sessions/foo.katesession.lock" sub_entry= 0x0
Okt 13 08:29:53 ajax plasmashell[2262]: kf.coreaddons.kdirwatch: got event "MODIFY" for entry "/home/me/.local/share/kate/sessions" [file] "foo.katesession.lock"
Comment 14 Harald Sitter 2023-10-13 08:47:27 UTC
I've found a way to make plasma crash like this:

- open krunner kcm
- enable kate session (this creates a kdirwatch)
- disable kate session (this deletes the kdirwatch)
- open kate and save a session
- crash

What appears to happen for plasma is incredibly stupid but at first glance hard to solve. KRunner (the framework) gives every runner a thread, it does this by creating the runner on the current thread and then moveToThread()s it to the target thread. Inside the konsoleprofile and katesession runner implementations we create a KDirWatch.

Now because of how kdirwatch works things get complicated.
Multiple "fronting" kdirwatch objects may point to the same internal private instance that actually encapsulates the watching logic. This is so we can install 300 kdirwatches on plasmarc but in the end only a single inotify watch is used to back all those fronting objects.
Things fall over because the backing objects are managed thread-local e.g. if I have two threads and each have watches on plasmarc it in fact uses two private instances and two inotify instances (one per thread). This then gets in the way of moveToThread because moving a KDirWatch from thread0 to thread1 means a KDirWatch living in thread1 now points to a KDirWatchPrivate instance in thread0.
Specifically the crash appears because of use-after-destruction...

The destructor has a check for dwp_self.hasLocalData() (that's the thread-local backing data)
https://invent.kde.org/frameworks/kcoreaddons/-/blob/80b6893d85bc90fb5d82882c41caa171748c1c63/src/lib/io/kdirwatch.cpp#L1895

this will evaluate to false because we never constructed a private on this thread (the private is on the original thread)
https://invent.kde.org/frameworks/kcoreaddons/-/blob/80b6893d85bc90fb5d82882c41caa171748c1c63/src/lib/io/kdirwatch.cpp#L85

this then means the refcount on thread0 is off because we don't unref in the destructor and the end result is that we have a "Client" record that points to an instance (that is the pubic KDirWatch) that no longer exists 
https://invent.kde.org/frameworks/kcoreaddons/-/blob/80b6893d85bc90fb5d82882c41caa171748c1c63/src/lib/io/kdirwatch.cpp#L1401

and there is our crash. When the thread-local private in thread0 now receives an event pertaining to the since-deleted KDirWatch that was moved to thread1 it will crash on a bogus pointer.

Supposedly we could just change the if in the destructor to always unref when there is a d regardless of thread-local data. I currently cannot conceive of a scenario where unrefing should be skipped when a d is available.

Mind you, I don't understand how kwin_wayland would end up in that particular hell, as I can only see it using KDirWatch::self which shouldn't have this problem.
Comment 15 Harald Sitter 2023-10-13 08:48:27 UTC
Moving to kcoreaddons since this probably needs some code there to safe guard things. Even though it is unclear how kwin_wayland factors into this.
Comment 16 Nate Graham 2023-10-13 19:44:48 UTC
Amazing detective work, as usual! I'll try disabling the "Kate Sessions" runner and see if the crashes go away.

As discussed this morning, KWin crashes because it has an instance of KRunner loaded into the Overview effect.
Comment 17 Bug Janitor Service 2023-10-17 09:57:23 UTC
A possibly relevant merge request was started @ https://invent.kde.org/frameworks/kcoreaddons/-/merge_requests/385
Comment 18 Bug Janitor Service 2023-10-17 11:53:54 UTC
A possibly relevant merge request was started @ https://invent.kde.org/plasma/kdeplasma-addons/-/merge_requests/479
Comment 19 Bug Janitor Service 2023-10-17 15:41:01 UTC
A possibly relevant merge request was started @ https://invent.kde.org/frameworks/kcoreaddons/-/merge_requests/386
Comment 20 Harald Sitter 2023-10-30 12:28:09 UTC
Git commit 436e38ad511b44432be1be31dc4ed27383e4338a by Harald Sitter.
Committed on 30/10/2023 at 12:27.
Pushed by sitter into branch 'master'.

kdirwatch: always unref d, and unset d from inside d

previously we could fall into a trap when the current thread for
whatever reason has no local data (e.g. the dtor is invoked from the
wrong thread). when that happened we would leave our d untouched and
never unref, which then causes particularly hard to track down problems
because we can crash at a much later point in time when the d tries to
send an event to the since-deleted KDirWatch instance.

instead let's revisit this a bit...

the reason the `dwp_self.hasLocalData()` check was added is because of
destruction order between QThreadStorage (the d) and QGlobalStatic (the
::self())

specifically because of this caveat from the QThreadStorage
documentation:

> QThreadStorage can be used to store data for the main() thread.
QThreadStorage deletes all data set for the main() thread when
QApplication is destroyed, regardless of whether or not the main()
thread has actually finished.

versus QGlobalStatic:

> If the object is created, it will be destroyed at exit-time, similar
to the C atexit function.

This effectively means that QThreadStorage (at least on the main()
thread) will destroy BEFORE QGlobalStatic.

To bypass this problem the dwp_self check was added to conditionally
skip unrefing when the QThreadStorage had already cleaned up.

The trouble is that this then hides the mentioned scenario where we have
a d but no backing data because of poor thread management.

Instead improve our management of the pimpl: In ~KDirWatchPrivate we now
unset the d pointer of all still monitoring KDirWatch. This allows us to
unconditionally check for d in the ~KDirWatch so it always
unrefs/removes itself from KDirWatchPrivate. In the shutdown scenario
cleanup runs the other way around... KDirWatchPrivate cleans up and
unsets itself as d pointer, thereby turning ~KDirWatch effectively no-op
so when it runs afterwards it won't access any Private memory anymore.

M  +10   -1    src/lib/io/kdirwatch.cpp
M  +1    -0    src/lib/io/kdirwatch.h

https://invent.kde.org/frameworks/kcoreaddons/-/commit/436e38ad511b44432be1be31dc4ed27383e4338a
Comment 21 Harald Sitter 2023-10-30 12:30:45 UTC
Git commit 8245dd474d60ea34aa4aa06d3f90b8c9922507cc by Harald Sitter.
Committed on 30/10/2023 at 13:27.
Pushed by sitter into branch 'master'.

katesessions/konsoleprofiles: do not hold profilesmodel on the stack

this causes crashes when accessing the kdirwatch since it doesn't get
moveToThread along with the runner. i.e. the runner thread calls into a
kdirwatch that lives on the gui thread. this is particularly problematic
since kdirwatch is really just a facade for thread-local backing
technology, so when the wrong thread calls into kdirwatch it may tap
into uninitialized memory (for that thread -- it is initialized on the
"correct" thread)

M  +2    -0    profiles/profilesmodel.cpp
M  +10   -5    runners/katesessions/katesessions.cpp
M  +2    -1    runners/katesessions/katesessions.h
M  +10   -5    runners/konsoleprofiles/konsoleprofiles.cpp
M  +2    -1    runners/konsoleprofiles/konsoleprofiles.h

https://invent.kde.org/plasma/kdeplasma-addons/-/commit/8245dd474d60ea34aa4aa06d3f90b8c9922507cc
Comment 22 Harald Sitter 2023-10-30 13:37:05 UTC
Git commit 4a4439a96a44584b781e8a18b8882ab914db5f12 by Harald Sitter.
Committed on 30/10/2023 at 13:30.
Pushed by sitter into branch 'Plasma/5.27'.

katesessions/konsoleprofiles: do not hold profilesmodel on the stack

this causes crashes when accessing the kdirwatch since it doesn't get
moveToThread along with the runner. i.e. the runner thread calls into a
kdirwatch that lives on the gui thread. this is particularly problematic
since kdirwatch is really just a facade for thread-local backing
technology, so when the wrong thread calls into kdirwatch it may tap
into uninitialized memory (for that thread -- it is initialized on the
"correct" thread)


(cherry picked from commit 8245dd474d60ea34aa4aa06d3f90b8c9922507cc)

M  +2    -0    profiles/profilesmodel.cpp
M  +10   -5    runners/katesessions/katesessions.cpp
M  +2    -1    runners/katesessions/katesessions.h
M  +10   -5    runners/konsoleprofiles/konsoleprofiles.cpp
M  +2    -1    runners/konsoleprofiles/konsoleprofiles.h

https://invent.kde.org/plasma/kdeplasma-addons/-/commit/4a4439a96a44584b781e8a18b8882ab914db5f12
Comment 23 Harald Sitter 2023-11-03 12:53:46 UTC
Git commit 2fc44a3c300324c614f2d3b86fb465f2c413c70c by Harald Sitter.
Committed on 03/11/2023 at 13:25.
Pushed by sitter into branch 'master'.

kdirwatch: don't crash after moving threads

if a user moves a public watch instance across threads we used to have a
d but now local data (thereby not running the dtor) and so eventually
the private watch in the old thread would try to emit data into a since
deleted public watch in the new thread

to prevent this problem we'll "brick" the watch upon moving threads.
specifically we'll fully unregister it with the old private and
create/apply a new private in the new thread.

since we currently have no way to reliably copy Entry as part of this we
cannot really replicate the correct watches on the new thread.
definitely something that could be made to work in the future though. in
the meantime critically log that moving threads does not do what you
expect it to do when used with kdirwatch.

M  +30   -0    autotests/kdirwatch_unittest.cpp
M  +30   -0    src/lib/io/kdirwatch.cpp
M  +5    -0    src/lib/io/kdirwatch.h

https://invent.kde.org/frameworks/kcoreaddons/-/commit/2fc44a3c300324c614f2d3b86fb465f2c413c70c
Comment 24 Harald Sitter 2023-11-13 13:08:15 UTC
Git commit 65b6b7537dd3e844cb1d4ff54347de28fc34d58d by Harald Sitter.
Committed on 13/11/2023 at 14:02.
Pushed by sitter into branch 'kf5'.

kdirwatch: always unref d, and unset d from inside d

previously we could fall into a trap when the current thread for
whatever reason has no local data (e.g. the dtor is invoked from the
wrong thread). when that happened we would leave our d untouched and
never unref, which then causes particularly hard to track down problems
because we can crash at a much later point in time when the d tries to
send an event to the since-deleted KDirWatch instance.

instead let's revisit this a bit...

the reason the `dwp_self.hasLocalData()` check was added is because of
destruction order between QThreadStorage (the d) and QGlobalStatic (the
::self())

specifically because of this caveat from the QThreadStorage
documentation:

> QThreadStorage can be used to store data for the main() thread.
QThreadStorage deletes all data set for the main() thread when
QApplication is destroyed, regardless of whether or not the main()
thread has actually finished.

versus QGlobalStatic:

> If the object is created, it will be destroyed at exit-time, similar
to the C atexit function.

This effectively means that QThreadStorage (at least on the main()
thread) will destroy BEFORE QGlobalStatic.

To bypass this problem the dwp_self check was added to conditionally
skip unrefing when the QThreadStorage had already cleaned up.

The trouble is that this then hides the mentioned scenario where we have
a d but no backing data because of poor thread management.

Instead improve our management of the pimpl: In ~KDirWatchPrivate we now
unset the d pointer of all still monitoring KDirWatch. This allows us to
unconditionally check for d in the ~KDirWatch so it always
unrefs/removes itself from KDirWatchPrivate. In the shutdown scenario
cleanup runs the other way around... KDirWatchPrivate cleans up and
unsets itself as d pointer, thereby turning ~KDirWatch effectively no-op
so when it runs afterwards it won't access any Private memory anymore.
(cherry picked from commit 436e38ad511b44432be1be31dc4ed27383e4338a)

M  +11   -1    src/lib/io/kdirwatch.cpp
M  +1    -0    src/lib/io/kdirwatch.h

https://invent.kde.org/frameworks/kcoreaddons/-/commit/65b6b7537dd3e844cb1d4ff54347de28fc34d58d
Comment 25 Harald Sitter 2023-11-13 13:08:23 UTC
Git commit e40eac96d2258b8b8dd8589aa3c0bbe4e8c64d05 by Harald Sitter.
Committed on 13/11/2023 at 14:02.
Pushed by sitter into branch 'kf5'.

kdirwatch: don't crash after moving threads

if a user moves a public watch instance across threads we used to have a
d but now local data (thereby not running the dtor) and so eventually
the private watch in the old thread would try to emit data into a since
deleted public watch in the new thread

to prevent this problem we'll "brick" the watch upon moving threads.
specifically we'll fully unregister it with the old private and
create/apply a new private in the new thread.

since we currently have no way to reliably copy Entry as part of this we
cannot really replicate the correct watches on the new thread.
definitely something that could be made to work in the future though. in
the meantime critically log that moving threads does not do what you
expect it to do when used with kdirwatch.
(cherry picked from commit 2fc44a3c300324c614f2d3b86fb465f2c413c70c)

M  +31   -0    autotests/kdirwatch_unittest.cpp
M  +30   -0    src/lib/io/kdirwatch.cpp
M  +5    -0    src/lib/io/kdirwatch.h

https://invent.kde.org/frameworks/kcoreaddons/-/commit/e40eac96d2258b8b8dd8589aa3c0bbe4e8c64d05
Comment 26 Anthony Fieroni 2023-12-15 05:04:11 UTC
(In reply to Harald Sitter from comment #14)
> Multiple "fronting" kdirwatch objects may point to the same internal private
> instance that actually encapsulates the watching logic. This is so we can
> install 300 kdirwatches on plasmarc but in the end only a single inotify
> watch is used to back all those fronting objects.

What's the reason behind that?