Bug 430615 - Crash (apparently cgroups-related) after some time minimized
Summary: Crash (apparently cgroups-related) after some time minimized
Status: RESOLVED FIXED
Alias: None
Product: plasma-systemmonitor
Classification: Applications
Component: general
Version: unspecified
Platform: Compiled Sources Linux
Importance: NOR crash
Target Milestone: ---
Assignee: KSysGuard Developers
URL:
Keywords: drkonqi
Depends on:
Blocks:
 
Reported: 2020-12-20 08:31 UTC by Bharadwaj Raju
Modified: 2021-03-04 14:36 UTC
CC List: 4 users

See Also:
Latest Commit:
Version Fixed In: 5.21.3


Description Bharadwaj Raju 2020-12-20 08:31:41 UTC
Application: plasma-systemmonitor (5.20.80)
 (Compiled from sources)
Qt Version: 5.15.2
Frameworks Version: 5.77.0
Operating System: Linux 5.9.14-arch1-1 x86_64
Windowing system: X11
Distribution: "Arch Linux"

-- Information about the crash:
- What I was doing when the application crashed:

All pages were loaded and the Overview page was open. I minimized the app and was doing something else.

Page settings are default, no changes.

System Monitor, KSysGuard and libksysguard were compiled from the latest git sources. The rest of the system is up-to-date Arch Linux.

Backtrace suggests something related to cgroups.

The crash can be reproduced every time.

-- Backtrace:
Application: System Monitor (plasma-systemmonitor), signal: Segmentation fault

[KCrash Handler]
#4  0x00007fb99b28d664 in pthread_mutex_lock () at /usr/lib/libpthread.so.0
#5  0x00007fb97c360f0c in __gthread_mutex_lock(__gthread_mutex_t*) (__mutex=0x50) at /usr/include/c++/10.2.0/x86_64-pc-linux-gnu/bits/gthr-default.h:749
#6  0x00007fb97c3616e6 in std::mutex::lock() (this=0x50) at /usr/include/c++/10.2.0/bits/std_mutex.h:100
#7  0x00007fb97c3626ed in std::unique_lock<std::mutex>::lock() (this=0x7ffde83a6f40) at /usr/include/c++/10.2.0/bits/unique_lock.h:138
#8  0x00007fb97c36218b in std::unique_lock<std::mutex>::unique_lock(std::mutex&) (this=0x7ffde83a6f40, __m=...) at /usr/include/c++/10.2.0/bits/unique_lock.h:68
#9  0x00007fb97c361b92 in KSysGuard::CGroupPrivate::ReadPidsRunnable::wait() (this=0x0) at /home/bharadwaj/kde/src/libksysguard/processcore/cgroup.cpp:95
#10 0x00007fb97c360210 in KSysGuard::CGroup::pids() const (this=0x55a376f306c0) at /home/bharadwaj/kde/src/libksysguard/processcore/cgroup.cpp:152
#11 0x00007fb97c3650ce in KSysGuard::CGroupDataModelPrivate::processesFor(KSysGuard::CGroup*) (this=0x55a376f18490, app=0x55a376f306c0) at /home/bharadwaj/kde/src/libksysguard/processcore/cgroup_data_model.cpp:447
#12 0x00007fb97c363ebf in KSysGuard::CGroupDataModel::data(QModelIndex const&, int) const (this=0x55a376f18350, index=..., role=256) at /home/bharadwaj/kde/src/libksysguard/processcore/cgroup_data_model.cpp:253
#13 0x00007fb99bb901ee in QAbstractProxyModel::data(QModelIndex const&, int) const () at /usr/lib/libQt5Core.so.5
#14 0x00007fb97c3db03f in ColumnDisplayModel::data(QModelIndex const&, int) const (this=0x55a376f17fe0, index=..., role=256) at /home/bharadwaj/kde/src/plasma-systemmonitor/src/table/ColumnDisplayModel.cpp:36
#15 0x00007fb99bb901ee in QAbstractProxyModel::data(QModelIndex const&, int) const () at /usr/lib/libQt5Core.so.5
#16 0x00007fb97c3e59d8 in ComponentCacheProxyModel::data(QModelIndex const&, int) const (this=0x55a376f17b00, proxyIndex=..., role=256) at /home/bharadwaj/kde/src/plasma-systemmonitor/src/table/ComponentCacheProxyModel.cpp:49
#17 0x00007fb99bb9c249 in QSortFilterProxyModel::lessThan(QModelIndex const&, QModelIndex const&) const () at /usr/lib/libQt5Core.so.5
#18 0x00007fb99bba25ae in  () at /usr/lib/libQt5Core.so.5
#19 0x00007fb99bba3ff2 in  () at /usr/lib/libQt5Core.so.5
#20 0x00007fb99bba9d8d in  () at /usr/lib/libQt5Core.so.5
#21 0x00007fb99bc0be10 in  () at /usr/lib/libQt5Core.so.5
#22 0x00007fb99bb6dfd6 in QAbstractItemModel::dataChanged(QModelIndex const&, QModelIndex const&, QVector<int> const&) () at /usr/lib/libQt5Core.so.5
#23 0x00007fb99bb99078 in  () at /usr/lib/libQt5Core.so.5
#24 0x00007fb99bc0be10 in  () at /usr/lib/libQt5Core.so.5
#25 0x00007fb99bb6dfd6 in QAbstractItemModel::dataChanged(QModelIndex const&, QModelIndex const&, QVector<int> const&) () at /usr/lib/libQt5Core.so.5
#26 0x00007fb99bb99078 in  () at /usr/lib/libQt5Core.so.5
#27 0x00007fb99bc0be10 in  () at /usr/lib/libQt5Core.so.5
#28 0x00007fb99bb6dfd6 in QAbstractItemModel::dataChanged(QModelIndex const&, QModelIndex const&, QVector<int> const&) () at /usr/lib/libQt5Core.so.5
#29 0x00007fb97c364ac9 in operator()() const (__closure=0x7fb9580049a0) at /home/bharadwaj/kde/src/libksysguard/processcore/cgroup_data_model.cpp:417
#30 0x00007fb97c365954 in std::__invoke_impl<void, KSysGuard::CGroupDataModel::update(KSysGuard::CGroup*)::<lambda()>&>(std::__invoke_other, struct {...} &) (__f=...) at /usr/include/c++/10.2.0/bits/invoke.h:60
#31 0x00007fb97c3657f0 in std::__invoke_r<void, KSysGuard::CGroupDataModel::update(KSysGuard::CGroup*)::<lambda()>&>(struct {...} &) (__fn=...) at /usr/include/c++/10.2.0/bits/invoke.h:153
#32 0x00007fb97c3655c1 in std::_Function_handler<void(), KSysGuard::CGroupDataModel::update(KSysGuard::CGroup*)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/10.2.0/bits/std_function.h:291
#33 0x00007fb97c362d94 in std::function<void ()>::operator()() const (this=0x7fb9580049a0) at /usr/include/c++/10.2.0/bits/std_function.h:622
#34 0x00007fb97c362d34 in QtPrivate::FunctorCall<QtPrivate::IndexesList<>, QtPrivate::List<>, void, std::function<void ()> >::call(std::function<void ()>&, void**) (f=..., arg=0x7fb9580032a8) at /usr/include/qt/QtCore/qobjectdefs_impl.h:146
#35 0x00007fb97c362c90 in QtPrivate::Functor<std::function<void ()>, 0>::call<QtPrivate::List<>, void>(std::function<void ()>&, void*, void**) (f=..., arg=0x7fb9580032a8) at /usr/include/qt/QtCore/qobjectdefs_impl.h:256
#36 0x00007fb97c362ad6 in QtPrivate::QFunctorSlotObject<std::function<void ()>, 0, QtPrivate::List<>, void>::impl(int, QtPrivate::QSlotObjectBase*, QObject*, void**, bool*) (which=1, this_=0x7fb958004990, r=0x55a376f18350, a=0x7fb9580032a8, ret=0x0) at /usr/include/qt/QtCore/qobjectdefs_impl.h:443
#37 0x00007fb99bc01582 in QObject::event(QEvent*) () at /usr/lib/libQt5Core.so.5
#38 0x00007fb99c722752 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib/libQt5Widgets.so.5
#39 0x00007fb99bbd4a7a in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib/libQt5Core.so.5
#40 0x00007fb99bbd7573 in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) () at /usr/lib/libQt5Core.so.5
#41 0x00007fb99bc2e0a4 in  () at /usr/lib/libQt5Core.so.5
#42 0x00007fb99a2da8f4 in g_main_context_dispatch () at /usr/lib/libglib-2.0.so.0
#43 0x00007fb99a32e821 in  () at /usr/lib/libglib-2.0.so.0
#44 0x00007fb99a2d9121 in g_main_context_iteration () at /usr/lib/libglib-2.0.so.0
#45 0x00007fb99bc2d6e1 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/libQt5Core.so.5
#46 0x00007fb99bbd33fc in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/libQt5Core.so.5
#47 0x00007fb99bbdb894 in QCoreApplication::exec() () at /usr/lib/libQt5Core.so.5
#48 0x000055a3755bf8d7 in main(int, char**) (argc=1, argv=0x7ffde83a84b8) at /home/bharadwaj/kde/src/plasma-systemmonitor/src/main.cpp:136
[Inferior 1 (process 46743) detached]

Reported using DrKonqi
Comment 1 Bharadwaj Raju 2020-12-20 09:39:24 UTC
> The crash can be reproduced every time.

Correction: I can no longer reproduce this reliably, and I can't reproduce it under GDB either.

Debugging the coredump in GDB:

(gdb) frame 8
#8  0x00007fb97c361b92 in KSysGuard::CGroupPrivate::ReadPidsRunnable::wait (this=0x0) at /home/bharadwaj/kde/src/libksysguard/processcore/cgroup.cpp:95
95              std::unique_lock<std::mutex> lock{m_lock};
(gdb) print lock
$3 = {_M_device = 0x50, _M_owns = false}
(gdb) print m_lock
Cannot access memory at address 0x50
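
A note on reading these addresses: `this=0x0` in `wait()` together with a mutex at `0x50` is what one would expect when a member is accessed through a null object pointer, since the member's address is the object address plus the member's offset. The sketch below illustrates that arithmetic. `FakeRunnable`, `lockOffset` and `memberAddress` are hypothetical names, and the 0x50 bytes of filler are an assumption chosen to match the backtrace; the real layout of `ReadPidsRunnable` is not shown in this report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the runnable; the real ReadPidsRunnable layout is
// not shown in this report. The 0x50 bytes of filler are an assumption chosen
// so that the mutex lands at the offset seen in the backtrace.
struct FakeRunnable {
    unsigned char precedingMembers[0x50];
    std::mutex m_lock;
};

// Offset of m_lock inside a live FakeRunnable, measured on a valid object so
// that no invalid pointer is ever dereferenced.
std::size_t lockOffset()
{
    FakeRunnable r;
    auto base = reinterpret_cast<std::uintptr_t>(&r);
    auto member = reinterpret_cast<std::uintptr_t>(&r.m_lock);
    return static_cast<std::size_t>(member - base);
}

// Address that a member access resolves to for a member at `offset` when the
// object pointer holds `thisAddress`. With thisAddress == 0 it is the offset.
std::uintptr_t memberAddress(std::uintptr_t thisAddress, std::size_t offset)
{
    return thisAddress + offset;
}
```

With a null object pointer, `memberAddress(0, lockOffset())` yields the member offset itself, which would explain why GDB cannot read `m_lock` at address 0x50.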
Comment 2 David Edmundson 2021-02-09 10:52:42 UTC
I think I've found it.

This is not thread safe:

```
    if (d->readPids) {
        d->readPids->wait();
    }
```

Because in another thread we do:

CGroupPrivate::ReadPidsRunnable::run()
{
....
         m_cgroupPrivate->readPids = nullptr;
}

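This could happen between the two lines in the first paste: the runnable can set readPids to nullptr after the null check but before wait() is called, which matches this=0x0 in the backtrace.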
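
For illustration, one generic way to avoid this kind of check-then-use race is to read the shared pointer once, under a lock, and then work only with the local copy. The sketch below uses only the C++ standard library. `Job`, `Owner`, `waitIfRunning` and `finish` are hypothetical stand-ins, not the libksysguard classes, and this is not necessarily the fix adopted for this bug.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the runnable; it only counts wait() calls.
struct Job {
    int waitCalls = 0;
    void wait() { ++waitCalls; }
};

// Hypothetical stand-in for the object that owns the pointer to the job.
struct Owner {
    std::mutex guard;          // protects `job`
    std::shared_ptr<Job> job;  // may be reset by another thread via finish()

    // Read the shared pointer exactly once while holding the lock, then use
    // the local copy. The copy keeps the Job alive and cannot become null
    // between the check and the call.
    bool waitIfRunning()
    {
        std::shared_ptr<Job> local;
        {
            std::lock_guard<std::mutex> lock(guard);
            local = job;
        }
        if (!local) {
            return false;
        }
        local->wait();
        return true;
    }

    // What the worker side would call when it is done.
    void finish()
    {
        std::lock_guard<std::mutex> lock(guard);
        job.reset();
    }
};
```

The racy shape in the paste reads `d->readPids` twice, so another thread can change it between the two reads. Here the second read is replaced by a local variable that no other thread can touch.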
Comment 3 Bug Janitor Service 2021-02-19 15:19:12 UTC
A possibly relevant merge request was started @ https://invent.kde.org/plasma/libksysguard/-/merge_requests/122
Comment 4 David Edmundson 2021-03-04 14:16:28 UTC
Git commit 18937f39935d3918f427bd8e6487244d0c6acdae by David Edmundson.
Committed on 04/03/2021 at 14:16.
Pushed by davidedmundson into branch 'master'.

Move CGroup pid fetching callback to the controller

The CGroup class started out as a dumb data store, that did some
fetching of relevant data.

It then gained a more complex async operation. The lifespan of the
CGroup object is managed by the model, so it could get deleted whilst the
runnable was running. QRunnables and non-QObjects lead to a lot of
potential problems. There was a complex mutex and a wait condition, yet
it still missed a case only solvable with yet more mutexes.

By moving the callback handling logic to the controller, we can guard
everything in a safer more Qt manner without any overhead and with
simpler code.

There is a behavioural change if you call pids whilst things are
loading, but given a signal is emitted when pids load that's fine.

This class is exported, but the header was never installed.
Whilst technically it is an ABI break, pragmatically it will have no
impact whatsoever.

M  +11   -39   processcore/cgroup.cpp
M  +8    -1    processcore/cgroup.h
M  +5    -2    processcore/cgroup_data_model.cpp

https://invent.kde.org/plasma/libksysguard/commit/18937f39935d3918f427bd8e6487244d0c6acdae
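
The idea in the commit message above, letting the controller own the completion callback and guard it against the target going away, can be sketched with the standard library alone. This is a minimal analogue under stated assumptions, not the code from the commit: `Group`, `Controller`, `requestPids` and `deliver` are hypothetical names, and `std::weak_ptr` stands in for whatever Qt-side guard the real change uses.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical plain data holder, in the spirit of a "dumb data store".
struct Group {
    std::vector<int> pids;
};

// Hypothetical controller that owns all completion callbacks.
class Controller
{
public:
    // Queue a completion for `group`. Only a weak reference is captured, so
    // the queued callback never extends or assumes the Group's lifetime.
    void requestPids(const std::shared_ptr<Group> &group, std::vector<int> result)
    {
        std::weak_ptr<Group> weak = group;
        m_pending.push_back([weak, result = std::move(result)]() {
            if (auto alive = weak.lock()) {
                alive->pids = result;  // target still exists: deliver
                return true;
            }
            return false;              // target was deleted: drop the result
        });
    }

    // Run every queued completion; returns how many reached a live Group.
    int deliver()
    {
        int delivered = 0;
        for (auto &callback : m_pending) {
            if (callback()) {
                ++delivered;
            }
        }
        m_pending.clear();
        return delivered;
    }

private:
    std::vector<std::function<bool()>> m_pending;
};
```

If a `Group` is destroyed before `deliver()` runs, its result is silently dropped instead of being written through a dangling pointer, and the data class itself needs no mutex or wait condition.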
Comment 5 David Edmundson 2021-03-04 14:17:19 UTC
Git commit 1ce85053e45c95ef43f9a0543e2710f33b81e10b by David Edmundson.
Committed on 04/03/2021 at 14:17.
Pushed by davidedmundson into branch 'Plasma/5.21'.

Move CGroup pid fetching callback to the controller

The CGroup class started out as a dumb data store, that did some
fetching of relevant data.

It then gained a more complex async operation. The lifespan of the
CGroup object is managed by the model, so it could get deleted whilst the
runnable was running. QRunnables and non-QObjects lead to a lot of
potential problems. There was a complex mutex and a wait condition, yet
it still missed a case only solvable with yet more mutexes.

By moving the callback handling logic to the controller, we can guard
everything in a safer more Qt manner without any overhead and with
simpler code.

There is a behavioural change if you call pids whilst things are
loading, but given a signal is emitted when pids load that's fine.

This class is exported, but the header was never installed.
Whilst technically it is an ABI break, pragmatically it will have no
impact whatsoever.


(cherry picked from commit 18937f39935d3918f427bd8e6487244d0c6acdae)

M  +11   -39   processcore/cgroup.cpp
M  +8    -1    processcore/cgroup.h
M  +5    -2    processcore/cgroup_data_model.cpp

https://invent.kde.org/plasma/libksysguard/commit/1ce85053e45c95ef43f9a0543e2710f33b81e10b