Bug 426387

Summary: klauncher crashes because of access to dangling pointers
Product: [Frameworks and Libraries] frameworks-kinit
Reporter: Ralf Habacker <ralf.habacker>
Component: general
Assignee: David Faure <faure>
Status: RESOLVED FIXED
Severity: crash
CC: kdelibs-bugs
Priority: NOR
Version: 5.65.0
Target Milestone: ---
Platform: Other
OS: Microsoft Windows
Latest Commit:
Version Fixed In: 5.85.0
Sentry Crash Report:
Bug Depends on:
Bug Blocks: 426400, 435581

Description Ralf Habacker 2020-09-10 22:11:01 UTC
SUMMARY

When klauncher runs together with a cross-compiled kmymoney5, klauncher5 always crashes with a segmentation fault after some time.

STEPS TO REPRODUCE
1. download portable package from https://build.opensuse.org/package/binaries/home:rhabacker:branches:windows:mingw:win32:kmymoney5-kf565/mingw32-kmymoney5:mingw32-kmymoney5-installer/openSUSE_Leap_15.1
2. start kmymoney
3. open file

OBSERVED RESULT
After some time klauncher5 crashes at

Thread 1 received signal SIGSEGV, Segmentation fault.
0x6cb835f9 in KIO::IdleSlave::protocol (this=0x1414b90) at /home/abuild/rpmbuild/BUILD/kio-5.65.0/src/core/idleslave.cpp:152
152         return d->mProtocol;
(gdb) bt
#0  0x6cb835f9 in KIO::IdleSlave::protocol (this=this@entry=0x1414b90) at /home/abuild/rpmbuild/BUILD/kio-5.65.0/src/core/idleslave.cpp:152
#1  0x004028cf in KLauncher::idleTimeout (this=this@entry=0x13f96a0) at /usr/i686-w64-mingw32/sys-root/mingw/include/qt5/QtCore/qstring.h:95
#2  0x0040a07a in KLauncher::qt_static_metacall (_o=_o@entry=0x13f96a0, _id=_id@entry=7, _a=_a@entry=0x28dba8, _c=QMetaObject::InvokeMetaMethod)
    at /home/abuild/rpmbuild/BUILD/kinit-5.65.0/build/src/klauncher/kdeinit_klauncher_autogen/EWIEGA46WW/moc_klauncher.cpp:117
#3  0x0040a0b0 in KLauncher::qt_static_metacall (_o=0x13f96a0, _c=QMetaObject::InvokeMetaMethod, _id=7, _a=0x28dba8)
    at /home/abuild/rpmbuild/BUILD/kinit-5.65.0/build/src/klauncher/kdeinit_klauncher_autogen/EWIEGA46WW/moc_klauncher.cpp:106
#4  0x6e3ef32a in libQt5Core!_ZN11QMetaObject8activateEP7QObjectiiPPv () from F:\Downloads\kmymoney\kmymoney5-5.1+git.2a911c86b\bin\libQt5Core.dll
#5  0x6e3fd4a8 in libQt5Core!_ZN6QTimer10timerEventEP11QTimerEvent () from F:\Downloads\kmymoney\kmymoney5-5.1+git.2a911c86b\bin\libQt5Core.dll

(gdb) p d
$1 = {
  d = 0xfeeefeee
}

EXPECTED RESULT
no crash


SOFTWARE/OS VERSIONS
Windows: 10
KDE Frameworks Version: 5.65.0
Qt Version: 5.11

ADDITIONAL INFORMATION
According to https://en.wikipedia.org/wiki/Magic_number_(programming)#Debug_values the pointer value shown above (0xfeeefeee) indicates an access to already freed memory.

At https://invent.kde.org/frameworks/kinit/-/blob/master/src/klauncher/klauncher.cpp#L116 KLauncher::idleTimeout() is connected as the slot for a timer event; it accesses the list of slaves at https://invent.kde.org/frameworks/kinit/-/blob/master/src/klauncher/klauncher.cpp#L1130. In the crash case the loop is entered and the list it iterates over contains dangling pointers. The list itself, which is defined at https://invent.kde.org/frameworks/kinit/-/blob/master/src/klauncher/klauncher.h#L218, is not guarded against dangling pointers.

Changing this definition to  

   QList<QPointer<IdleSlave>> mSlaveList;

or similar would fix this issue, in my opinion.
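
The guarded-list idea can be sketched in standard C++, so that it compiles without Qt. This is only an analogy under stated assumptions: std::weak_ptr stands in for QPointer (which becomes null when its QObject is destroyed), and Slave, GuardedList, countLive and liveFileSlavesAfterDestroy are hypothetical names, not klauncher code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for KIO::IdleSlave; not the real class.
struct Slave {
    std::string protocol;
};

// Guarded list: weak_ptr plays the role QPointer would play for QObjects.
using GuardedList = std::vector<std::weak_ptr<Slave>>;

// Count live entries with the given protocol. Expired entries are skipped
// instead of being dereferenced.
int countLive(const GuardedList &list, const std::string &protocol)
{
    int n = 0;
    for (const std::weak_ptr<Slave> &entry : list) {
        if (std::shared_ptr<Slave> slave = entry.lock()) { // null once destroyed
            if (slave->protocol == protocol) {
                ++n;
            }
        }
    }
    return n;
}

// Usage: destroy one object while the list still refers to it.
int liveFileSlavesAfterDestroy()
{
    auto file = std::make_shared<Slave>(Slave{"file"});
    auto http = std::make_shared<Slave>(Slave{"http"});
    GuardedList list{file, http};
    file.reset();                   // the "file" slave is destroyed here
    assert(list[0].expired());      // the guard notices the destruction
    return countLive(list, "file"); // the stale entry is never dereferenced
}
```

A guard of this kind only makes stale entries detectable. It does not by itself explain why a destroyed object is still in the list.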
Comment 2 Bug Janitor Service 2021-02-28 21:38:29 UTC
A possibly relevant merge request was started @ https://invent.kde.org/frameworks/kinit/-/merge_requests/6
Comment 3 Ralf Habacker 2021-03-01 10:07:39 UTC
A workaround is to avoid using klauncher by either

1. setting KDE_FORK_SLAVES=1 in the environment before starting an application, or
2. configuring KIO with cmake ... -DKIO_FORK_SLAVES=1 at https://build.opensuse.org/package/show/windows:mingw:win32/mingw32-kio and https://build.opensuse.org/package/show/windows:mingw:win64/mingw64-kio

Since klauncher is not used on the binary factory, this problem has probably not been detected there yet.

The advantage of using klauncher is that it is easier to terminate running slaves after a KDE session by terminating klauncher.

This is necessary, for example, for portable installations that are started from a USB stick or similar. Without terminating the background processes the USB stick cannot be removed again.
Comment 4 Ralf Habacker 2021-04-15 01:27:19 UTC
(In reply to Bug Janitor Service from comment #2)

It was mentioned at https://invent.kde.org/frameworks/kinit/-/merge_requests/6#note_195092 that the dangling pointer seems to be just an indication of a deeper problem with the signal/slot system and requires further troubleshooting.

Since there are currently no Qt5 pretty printers for gdb, this will probably be delayed until the GSoC project https://community.kde.org/GSoC/2021/Ideas#Project:_Add_gdb_pretty_printer_support_for_Qt5 is completed.
Comment 5 Ralf Habacker 2021-07-21 08:33:01 UTC
Git commit 2a5d047b49a866de7e478a632ef53ab1d711c273 by Ralf Habacker.
Committed on 21/07/2021 at 07:42.
Pushed by dfaure into branch 'master'.

Fixes crash in KLauncher::idleTimeout() caused by unblockable destruction of IdleSlave objects

According to the documentation of QObject::destroyed() at
https://doc.qt.io/qt-5/qobject.html#destroyed this signal cannot be
blocked.
As a result, when slotSlaveGone() removes an object from mSlaveList, the
contents of the list change in such a way that the next iteration
accesses the deleted object again, which triggers a segmentation fault.

See the following real world trace without this patch

"2021/07/14 12:57:55,782" idleTimeout
"2021/07/14 12:57:55,782" idleTimeout 0x4d29a60
"2021/07/14 12:57:55,782" idleTimeout 0x4d32778
"2021/07/14 12:57:55,782" idleTimeout killing KIO::IdleSlave(0x4d32778)
"2021/07/14 12:57:55,782" slotSlaveGone QObject(0x4d32778)
"2021/07/14 12:57:55,782" idleTimeout 0x4d54550
"2021/07/14 12:57:55,782" idleTimeout killing KIO::IdleSlave(0x4d54550)
"2021/07/14 12:57:55,782" slotSlaveGone QObject(0x4d54550)
"2021/07/14 12:57:55,782" idleTimeout 0x4d61460
"2021/07/14 12:57:55,782" idleTimeout killing KIO::IdleSlave(0x4d61460)
"2021/07/14 12:57:55,782" slotSlaveGone QObject(0x4d61460)
"2021/07/14 12:57:55,782" idleTimeout 0x4d61460
Thread 1 received signal SIGSEGV, Segmentation fault.

where the calls to slotSlaveGone() are intermixed with the iteration.

In contrast, after applying this patch the trace is

"2021/07/14 13:06:12,870" idleTimeout
"2021/07/14 13:06:12,870" idleTimeout 0x4d3a668
"2021/07/14 13:06:12,870" idleTimeout 0x4d6f8e8
"2021/07/14 13:06:12,870" idleTimeout killing KIO::IdleSlave(0x4d6f8e8)
"2021/07/14 13:06:12,870" idleTimeout 0x4d60540
"2021/07/14 13:06:12,870" idleTimeout killing KIO::IdleSlave(0x4d60540)
"2021/07/14 13:06:12,870" idleTimeout 0x4d6d400
"2021/07/14 13:06:12,870" idleTimeout killing KIO::IdleSlave(0x4d6d400)
"2021/07/14 13:06:12,870" idleTimeout 0x4da14a8
"2021/07/14 13:06:12,870" idleTimeout killing KIO::IdleSlave(0x4da14a8)
"2021/07/14 13:06:12,870" slotSlaveGone QObject(0x4d6f8e8)
"2021/07/14 13:06:12,870" slotSlaveGone QObject(0x4d60540)
"2021/07/14 13:06:12,870" slotSlaveGone QObject(0x4d6d400)
"2021/07/14 13:06:12,870" slotSlaveGone QObject(0x4da14a8)

which shows that the slaves are deleted only after idleTimeout() has returned.
FIXED-IN:5.85.0

M  +1    -1    src/klauncher/klauncher.cpp

https://invent.kde.org/frameworks/kinit/commit/2a5d047b49a866de7e478a632ef53ab1d711c273
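
The difference between the two traces in comment 5 can be modeled in plain standard C++. This is a sketch under assumptions, not the klauncher code and not the one-line change itself, which is not shown in this report. LauncherModel, slaveGone, m_pending and runModel are hypothetical names, and integer ids stand in for IdleSlave pointers. Because the model uses a bounds-checked index loop, the symptom of mutating the list during iteration is a skipped entry rather than a crash. In Qt, deferring destruction until control returns to the event loop is commonly done with QObject::deleteLater(); the model imitates that with a pending queue.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model; integer ids stand in for IdleSlave pointers.
class LauncherModel {
public:
    explicit LauncherModel(std::vector<int> ids) : m_list(std::move(ids)) {}

    // Visit every entry and kill all but the first one.
    // deferred == false: the "gone" handler runs inside the loop.
    // deferred == true:  kills are queued and handled after the loop.
    void idleTimeout(bool deferred)
    {
        assert(m_pending.empty()); // nothing left over from an earlier run
        m_log.push_back("idleTimeout");
        bool keepOne = true;
        for (std::size_t i = 0; i < m_list.size(); ++i) {
            const int id = m_list[i];
            m_log.push_back("visit " + std::to_string(id));
            if (keepOne) {
                keepOne = false;
                continue;
            }
            if (deferred) {
                m_pending.push_back(id);
            } else {
                slaveGone(id); // mutates m_list while it is being iterated
            }
        }
        // Stand-in for returning to the event loop.
        for (int id : m_pending) {
            slaveGone(id);
        }
        m_pending.clear();
    }

    const std::vector<std::string> &log() const { return m_log; }
    const std::vector<int> &list() const { return m_list; }

private:
    // Models slotSlaveGone(): log the event and drop the id from the list.
    void slaveGone(int id)
    {
        m_log.push_back("gone " + std::to_string(id));
        m_list.erase(std::remove(m_list.begin(), m_list.end(), id), m_list.end());
    }

    std::vector<int> m_list;
    std::vector<int> m_pending;
    std::vector<std::string> m_log;
};

// Run one timeout over ids 1..4 and return the event log.
std::vector<std::string> runModel(bool deferred)
{
    LauncherModel model(std::vector<int>{1, 2, 3, 4});
    model.idleTimeout(deferred);
    return model.log();
}
```

With deferred == false the "gone" events are interleaved with the visits and entry 3 is never visited, because the list shrinks underneath the loop. With deferred == true all four entries are visited first and the "gone" events follow, matching the ordering of the patched trace.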