Bug 127341 - ark crashes at exit (race condition ?)
Summary: ark crashes at exit (race condition ?)
Status: RESOLVED FIXED
Alias: None
Product: ark
Classification: Applications
Component: general
Version: unspecified
Platform: Slackware Linux
Importance: NOR crash
Target Milestone: ---
Assignee: Harald Hvaal
URL:
Keywords:
Duplicates: 130218 132527
Depends on:
Blocks:
 
Reported: 2006-05-15 03:25 UTC by jaguarwan
Modified: 2006-09-02 13:07 UTC

See Also:
Latest Commit:
Version Fixed In:


Description jaguarwan 2006-05-15 03:25:47 UTC
Version:            (using KDE 3.5.2)
Installed from:    Slackware Packages
Compiler:          gcc 3.4.6 
OS:                Linux

Hello,

Since I upgraded to KDE 3.5.2, ark tends to crash when I use the right click menu "Extract here".

More precisely, it extracts all the files just fine but crashes at exit. However, this behaviour is not triggered by all archives.

It seems archives containing several nested folders are more likely to make ark crash at exit.

I can quite reliably reproduce this crash by using the "Extract here" menu on the antiword 0.37 tarball, for example. Opening the same archive with ark does not crash, however.

Here is a link to the tarball:
http://www.winfield.demon.nl/linux/antiword-0.37.tar.gz

The backtrace is always the same:
(I shortened the long lists of boring (no debugging symbols found))

(no debugging symbols found)
Using host libthread_db library "/lib/tls/libthread_db.so.1".
(no debugging symbols found)
`system-supplied DSO at 0xffffe000' has disappeared; keeping its symbols.
(no debugging symbols found)
[...]
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread -1500821824 (LWP 7676)]
(no debugging symbols found)
[...]
(no debugging symbols found)
[KCrash handler]
#5  0xa693c847 in raise () from /lib/tls/libc.so.6
#6  0xa693e0d9 in abort () from /lib/tls/libc.so.6
#7  0xa6970616 in __libc_message () from /lib/tls/libc.so.6
#8  0xa6976630 in malloc_consolidate () from /lib/tls/libc.so.6
#9  0xa69773f2 in _int_malloc () from /lib/tls/libc.so.6
#10 0xa6979201 in malloc () from /lib/tls/libc.so.6
#11 0xa699bb12 in opendir () from /lib/tls/libc.so.6
#12 0xa780f50d in KTempDir::qDir () from /opt/kde/lib/libkdecore.so.4
#13 0xa780f6d9 in KTempDir::removeDir () from /opt/kde/lib/libkdecore.so.4
#14 0xa780f73b in KTempDir::unlink () from /opt/kde/lib/libkdecore.so.4
#15 0xa5c7d4a8 in ArkWidget::cleanArkTmpDir ()
   from /opt/kde/lib/kde3/libarkpart.so
#16 0xa5c8290e in ArkWidget::~ArkWidget () from /opt/kde/lib/kde3/libarkpart.so
#17 0xa7f2c65e in KParts::Part::~Part () from /opt/kde/lib/libkparts.so.2
#18 0xa7f2c92d in KParts::ReadOnlyPart::~ReadOnlyPart ()
   from /opt/kde/lib/libkparts.so.2
#19 0xa7f2ca8f in KParts::ReadWritePart::~ReadWritePart ()
   from /opt/kde/lib/libkparts.so.2
#20 0xa5c587e9 in ArkPart::~ArkPart () from /opt/kde/lib/kde3/libarkpart.so
#21 0xa676b318 in MainWindow::~MainWindow ()
   from /opt/kde/lib/libkdeinit_ark.so
#22 0xa70e41bc in QObject::event () from /usr/lib/qt/lib/libqt-mt.so.3
#23 0xa712050f in QWidget::event () from /usr/lib/qt/lib/libqt-mt.so.3
#24 0xa71e4bd2 in QMainWindow::event () from /usr/lib/qt/lib/libqt-mt.so.3
#25 0xa708222f in QApplication::internalNotify ()
   from /usr/lib/qt/lib/libqt-mt.so.3
#26 0xa70823cc in QApplication::notify () from /usr/lib/qt/lib/libqt-mt.so.3
#27 0xa76f1965 in KApplication::notify () from /opt/kde/lib/libkdecore.so.4
#28 0xa70832e0 in QApplication::sendPostedEvents ()
   from /usr/lib/qt/lib/libqt-mt.so.3
#29 0xa70989e6 in QEventLoop::enterLoop () from /usr/lib/qt/lib/libqt-mt.so.3
#30 0xa70988a6 in QEventLoop::exec () from /usr/lib/qt/lib/libqt-mt.so.3
#31 0xa708138f in QApplication::exec () from /usr/lib/qt/lib/libqt-mt.so.3
#32 0xa6768b82 in kdemain () from /opt/kde/lib/libkdeinit_ark.so
#33 0xa75ef7b4 in kdeinitmain () from /opt/kde/lib/kde3/ark.so
#34 0x0804e474 in ?? ()
#35 0x00000004 in ?? ()
#36 0x080ecbf0 in ?? ()
#37 0x00000001 in ?? ()
#38 0x00000000 in ?? ()

I use KDE 3.5.2 from Slackware GNU/Linux current, with a 2.6.14 SMP-enabled kernel (Pentium 4 HT).

I hope this report may be useful :)

Thank you for your attention and have a nice day.
Comment 1 Haris Kouzinopoulos 2006-05-15 03:44:53 UTC
Hi, thanks for your bug report! This is a duplicate bug in kdelibs though, and is already fixed.

*** This bug has been marked as a duplicate of 125642 ***
Comment 2 jaguarwan 2006-08-21 05:31:24 UTC
Hello,

Since I continued to encounter this bug in KDE 3.5.4 (with the kdelibs tmpdir problem fixed), I investigated a bit and found the cause of the bug.

The problem lies in Ark. Here is the valgrind output:
ark (kdeutils): Options were: -xkf
==27897== Invalid write of size 4
==27897==    at 0x40607C7: Arch::slotExtractExited(KProcess*) (arch.cpp:199)
==27897==    by 0x40609BF: Arch::qt_invoke(int, QUObject*) (arch.moc:225)
==27897==    by 0x405ACBB: TarArch::qt_invoke(int, QUObject*) (tar.moc:177)
==27897==    by 0x4D18D43: QObject::activate_signal(QConnectionList*, QUObject*) (in /usr/lib/qt-3.3.6/lib/libqt-mt.so.3.3.6)
==27897==    by 0x487399C: KProcess::processExited(KProcess*) (in /opt/kde/lib/libkdecore.so.4.2.0)
==27897==    by 0x4873A1B: KProcess::processHasExited(int) (in /opt/kde/lib/libkdecore.so.4.2.0)
==27897==    by 0x487729B: KProcessController::slotDoHousekeeping() (in /opt/kde/lib/libkdecore.so.4.2.0)
==27897==    by 0x4877307: KProcessController::qt_invoke(int, QUObject*) (in /opt/kde/lib/libkdecore.so.4.2.0)
==27897==    by 0x4D18D43: QObject::activate_signal(QConnectionList*, QUObject*) (in /usr/lib/qt-3.3.6/lib/libqt-mt.so.3.3.6)
==27897==    by 0x4D1936A: QObject::activate_signal(int, int) (in /usr/lib/qt-3.3.6/lib/libqt-mt.so.3.3.6)
==27897==    by 0x50596CF: QSocketNotifier::activated(int) (in /usr/lib/qt-3.3.6/lib/libqt-mt.so.3.3.6)
==27897==    by 0x4D35DAF: QSocketNotifier::event(QEvent*) (in /usr/lib/qt-3.3.6/lib/libqt-mt.so.3.3.6)
==27897==  Address 0x5B10F8C is 156 bytes inside a block of size 224 free'd
==27897==    at 0x401D66C: operator delete(void*) (in /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so)
==27897==    by 0x4058578: TarArch::~TarArch() (tar.cpp:134)
==27897==    by 0x406AB86: ArkWidget::closeArch() (arkwidget.cpp:175)
==27897==    by 0x406C76D: ArkWidget::file_close() (arkwidget.cpp:951)
==27897==    by 0x4051162: ArkPart::closeArchive() (ark_part.cpp:283)
==27897==    by 0x40511F8: ArkPart::closeURL() (ark_part.cpp:291)
==27897==    by 0x409FA16: MainWindow::file_close() (in /opt/kde/lib/libkdeinit_ark.so)
==27897==    by 0x409FE21: MainWindow::window_close() (in /opt/kde/lib/libkdeinit_ark.so)
==27897==    by 0x409FE6C: MainWindow::file_quit() (in /opt/kde/lib/libkdeinit_ark.so)
==27897==    by 0x40A23E0: MainWindow::qt_invoke(int, QUObject*) (in /opt/kde/lib/libkdeinit_ark.so)
==27897==    by 0x4D18D43: QObject::activate_signal(QConnectionList*, QUObject*) (in /usr/lib/qt-3.3.6/lib/libqt-mt.so.3.3.6)

With an instrumented libkdecore, I have seen this corruption happen between two directory removals, as here:

ark: RMDIR dir /tmp/kde-jaguarwan/ark1EFixU/temp_tarHJ9wmh/
==21657== Invalid write of size 4
==21657==    at 0x6280C66: Arch::slotExtractExited(KProcess*) (in /opt/kde/lib/kde3/libarkpart.so)
==21657==
etc...
==21657==    by 0x4DA74DB: QObject::activate_signal(int) (in /usr/lib/qt-3.3.6/lib/libqt-mt.so.3.3.6)
ark: [static bool KTempDir::removeDir(const QString&)]  /tmp/kde-jaguarwan/ark1EFixU/
entering removeDir()

The corruption happens at line 198 of arch.cpp, an innocuous-looking _kp = m_currentProcess = 0; statement. If I remove the m_currentProcess = 0 assignment, valgrind no longer detects any corruption.

While I don't fully understand why m_currentProcess is in free'd memory at this point of execution (TarArch deleted ?), removing m_currentProcess = 0 fixes the crash.

valgrind also detects a small 744-byte memory leak, by the way.

I would be very grateful if you could investigate further; the fix I found works but it hides the true problem: why is m_currentProcess in a free'd memory chunk at this point of execution ?

Have a nice day :)
Comment 3 jaguarwan 2006-08-21 05:34:11 UTC
Reopening the bug as it still exists in KDE 3.5.4 with KTempDir fixed. See the previous comment for the beginnings of a fix.
Comment 4 jaguarwan 2006-08-23 03:25:05 UTC
This bug really pesters me, so I investigated further and found a cleaner fix :)

Without my fix, here is the Valgrind output with an instrumented libarkpart:
ark (kdeutils): diskHasSpace() dir: /home/jaguarwan/Build/ Size: 1640044
ark (kdeutils): Options were: -xkf
Entering Arch::slotExtractExited()
this pointer = 0x596ce00
Entering TarArch::~TarArch()
~TarArch() this pointer = 0x596ce00
The TarArch object @0x596ce00 is freed here
Corrupting memory here ->
==32161== Invalid write of size 4
==32161==    at 0x40608CF: Arch::slotExtractExited(KProcess*) (arch.cpp:202)
==32161==    by 0x4060ACF: Arch::qt_invoke(int, QUObject*) (arch.moc:225)
(...)
==32161==  Address 0x596CE9C is 156 bytes inside a block of size 224 free'd
==32161==    at 0x401D66C: operator delete(void*) (in /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so)
==32161==    by 0x40585F7: TarArch::~TarArch() (tar.cpp:138)
==32161==    by 0x406AC96: ArkWidget::closeArch() (arkwidget.cpp:175)
(...)
Getting out of Arch::slotExtractExited()
==32161==
==32161== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 144 from 5)

As you can see, ~TarArch runs while we are still inside the slot, so when the slot code then sets the member variable, it writes into freshly free'd memory. This corruption is what later makes malloc fail in opendir() in the original backtrace I posted.

I checked why ~TarArch is called while we are still in the slot, and it appears to be triggered by the sigExtract signal, which is emitted *before* m_currentProcess is cleared.

So, to fix this bug properly, you only have to move emit sigExtract(success); after _kp = m_currentProcess = 0;
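
To illustrate the ordering issue outside of Ark, here is a minimal standalone sketch in plain C++ (no Qt). All names in it (Worker, onExtract, currentProcess, extractExited, runDemo) are hypothetical stand-ins and not the actual Ark code. It assumes a listener that destroys the notifying object from inside the notification, which is roughly what the valgrind trace above suggests happens with sigExtract.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for Arch: holds a "process" handle and notifies a
// listener when extraction has finished. This is NOT the real Ark class.
struct Worker {
    std::function<void(bool)> onExtract;  // plays the role of sigExtract
    int* currentProcess = nullptr;        // plays the role of m_currentProcess

    void extractExited(bool success) {
        // Finish every access to member variables first...
        delete currentProcess;
        currentProcess = nullptr;

        // ...and only then notify. The listener may destroy *this, so no
        // member may be touched after the call. The callback is copied into
        // a local first so that it outlives the object it belongs to.
        std::function<void(bool)> notify = onExtract;
        notify(success);
        // Nothing that uses 'this' is allowed from here on.
    }
};

// Runs the scenario once and reports whether the listener was called with
// the expected result and destroyed the worker.
bool runDemo() {
    bool destroyed = false;
    bool result = false;

    Worker* w = new Worker;
    w->currentProcess = new int(42);
    w->onExtract = [&destroyed, &result, w](bool ok) {
        result = ok;
        delete w;          // the listener frees the notifier, like closeArch()
        destroyed = true;
    };

    assert(w->currentProcess != nullptr);
    w->extractExited(true);  // w must not be used after this call
    return destroyed && result;
}
```

With the notification moved back above the member writes, the same scenario would write into the freed Worker, which is the "Invalid write of size 4" pattern reported above.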

Here is the valgrind output after the fix:
ark (kdeutils): diskHasSpace() dir: /home/jaguarwan/Build/ Size: 1640044
ark (kdeutils): Options were: -xkf
Entering Arch::slotExtractExited()
this pointer = 0x5ab9aa0
Corrupting memory here ->
Getting out of Arch::slotExtractExited()
Entering TarArch::~TarArch()
~TarArch() this pointer = 0x5ab9aa0
The TarArch object @0x5ab9aa0 is freed here
==2088==
==2088== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 143 from 4)

and ark no longer segfaults on me (antiword tarball is really an excellent test on my machine :) ).

I am on SMP; maybe that is why it occurred in the first place, as it looks like a thread synchronization problem.

Here is the diff against the KDE 3.5.4 ark source code:

--------------------------------8<---------------------------------

--- arch.cpp    2006-08-23 03:16:09.000000000 +0200
+++ arch.cpp.orig       2006-05-22 20:08:38.000000000 +0200
@@ -193,11 +193,9 @@
     }
   }
   m_password = "";
-
+  emit sigExtract( success );
   delete _kp;
   _kp = m_currentProcess = 0;
-
-  emit sigExtract( success );
 }

 void Arch::unarchFile( QStringList *fileList, const QString & destDir,


--------------------------------8<---------------------------------

Could you please apply it to your tree and check if this works for you too ?

I would really appreciate it if this bug were fixed in the next KDE release :)

Have a nice day.
Comment 5 jaguarwan 2006-08-23 03:30:51 UTC
Hmph, messed up the diff, sorry:

--------------------------------8<---------------------------------

--- ark/arch.cpp.orig   2006-05-22 20:08:38.000000000 +0200
+++ ark/arch.cpp        2006-08-23 03:16:09.000000000 +0200
@@ -193,9 +193,11 @@
     }
   }
   m_password = "";
-  emit sigExtract( success );
+
   delete _kp;
   _kp = m_currentProcess = 0;
+
+  emit sigExtract( success );
 }

 void Arch::unarchFile( QStringList *fileList, const QString & destDir,

--------------------------------8<---------------------------------

Seems I need to have some sleep x_x
Comment 6 Henrique Pinto 2006-08-23 03:35:47 UTC
SVN commit 576069 by henrique:

 * Fix for a race condition.
   Patch by jaguarwan <jaguarwan@yahoo.fr>. Thanks for the patch!

   BUG: 127341


 M  +1 -1      arch.cpp  
 M  +1 -1      main.cpp  


--- branches/KDE/3.5/kdeutils/ark/arch.cpp #576068:576069
@@ -193,9 +193,9 @@
     }
   }
   m_password = "";
-  emit sigExtract( success );
   delete _kp;
   _kp = m_currentProcess = 0;
+  emit sigExtract( success );
 }
 
 void Arch::unarchFile( QStringList *fileList, const QString & destDir,
--- branches/KDE/3.5/kdeutils/ark/main.cpp #576068:576069
@@ -65,7 +65,7 @@
 extern "C" KDE_EXPORT int kdemain( int argc, char *argv[]  )
 {
 	KAboutData aboutData( "ark", I18N_NOOP( "Ark" ),
-	                      "2.6.3", I18N_NOOP( "KDE Archiving tool" ),
+	                      "2.6.4", I18N_NOOP( "KDE Archiving tool" ),
 	                      KAboutData::License_GPL,
 	                      I18N_NOOP( "(c) 1997-2006, The Various Ark Developers" )
 	                    );
Comment 7 Haris Kouzinopoulos 2006-09-02 13:05:44 UTC
*** Bug 132527 has been marked as a duplicate of this bug. ***
Comment 8 Haris Kouzinopoulos 2006-09-02 13:07:09 UTC
*** Bug 130218 has been marked as a duplicate of this bug. ***