Bug 302264 - Dolphin crashes when browsing a git repo in a SSHFS-mount
Summary: Dolphin crashes when browsing a git repo in a SSHFS-mount
Status: RESOLVED FIXED
Alias: None
Product: dolphin
Classification: Applications
Component: general (show other bugs)
Version: 2.1
Platform: Ubuntu Linux
: NOR crash
Target Milestone: ---
Assignee: Dolphin Bug Assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-06-20 20:30 UTC by Wolfgang Wallner
Modified: 2013-01-15 16:12 UTC (History)
3 users (show)

See Also:
Latest Commit:
Version Fixed In: 4.10
Sentry Crash Report:


Attachments
New crash information added by DrKonqi (8.00 KB, text/plain)
2012-10-10 13:32 UTC, Alex Ball
Details
Patch to hopefully fix the crash (833 bytes, patch)
2012-10-10 19:03 UTC, Simeon Bird
Details
Patch with even more hope of fixing the crash (1.45 KB, patch)
2012-10-15 18:59 UTC, Simeon Bird
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Wolfgang Wallner 2012-06-20 20:30:31 UTC
Application: dolphin (2.0)
KDE Platform Version: 4.8.3 (4.8.3)
Qt Version: 4.8.1
Operating System: Linux 3.2.0-25-generic x86_64
Distribution: Ubuntu 12.04 LTS

-- Information about the crash:
- What I was doing when the application crashed:
Browsing a direrectory that is mounted via sshfs.

The crash is reproducable. When I enter the sshfs-mount and browse through the folders, it will crash after a few directories (not always at the same).
It happens in all view modes (icons, compact, details).

When browsing in details-viewmode, and enabling tree-view, i can open the sshfs-mount by unfolding the tree without a crash. Howeber, entering a directy again crashes dolphin.

The crash can be reproduced every time.

-- Backtrace:
Application: Dolphin (dolphin), signal: Segmentation fault
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:39
[Current thread is 1 (Thread 0x7f8b5deef780 (LWP 8763))]

Thread 4 (Thread 0x7f8b49c83700 (LWP 8764)):
#0  0x00007f8b5d7bdb03 in __GI___poll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00007f8b5579cff6 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007f8b5579d124 in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007f8b5a84e426 in QEventDispatcherGlib::processEvents (this=0x7f8b440008c0, flags=...) at kernel/qeventdispatcher_glib.cpp:426
#4  0x00007f8b5a81dc82 in QEventLoop::processEvents (this=<optimized out>, flags=...) at kernel/qeventloop.cpp:149
#5  0x00007f8b5a81ded7 in QEventLoop::exec (this=0x7f8b49c82dd0, flags=...) at kernel/qeventloop.cpp:204
#6  0x00007f8b5a71cfa7 in QThread::exec (this=<optimized out>) at thread/qthread.cpp:501
#7  0x00007f8b5a7fd9ff in QInotifyFileSystemWatcherEngine::run (this=0x113ede0) at io/qfilesystemwatcher_inotify.cpp:248
#8  0x00007f8b5a71ffcb in QThreadPrivate::start (arg=0x113ede0) at thread/qthread_unix.cpp:298
#9  0x00007f8b56061e9a in start_thread (arg=0x7f8b49c83700) at pthread_create.c:308
#10 0x00007f8b5d7c94bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#11 0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f8b48b8c700 (LWP 8765)):
#0  0x00007f8b5d7bdb03 in __GI___poll (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at ../sysdeps/unix/sysv/linux/poll.c:87
#1  0x00007f8b5579cff6 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#2  0x00007f8b5579d124 in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#3  0x00007f8b5a84e426 in QEventDispatcherGlib::processEvents (this=0x7f8b3c0008c0, flags=...) at kernel/qeventdispatcher_glib.cpp:426
#4  0x00007f8b5a81dc82 in QEventLoop::processEvents (this=<optimized out>, flags=...) at kernel/qeventloop.cpp:149
#5  0x00007f8b5a81ded7 in QEventLoop::exec (this=0x7f8b48b8bdd0, flags=...) at kernel/qeventloop.cpp:204
#6  0x00007f8b5a71cfa7 in QThread::exec (this=<optimized out>) at thread/qthread.cpp:501
#7  0x00007f8b5a7fd9ff in QInotifyFileSystemWatcherEngine::run (this=0x1460630) at io/qfilesystemwatcher_inotify.cpp:248
#8  0x00007f8b5a71ffcb in QThreadPrivate::start (arg=0x1460630) at thread/qthread_unix.cpp:298
#9  0x00007f8b56061e9a in start_thread (arg=0x7f8b48b8c700) at pthread_create.c:308
#10 0x00007f8b5d7c94bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#11 0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f8b425ea700 (LWP 8799)):
[KCrash Handler]
#6  0x00007f8b5cb36562 in lockInline (this=0x1a38be0) at /usr/include/qt4/QtCore/qmutex.h:187
#7  relock (this=<synthetic pointer>) at /usr/include/qt4/QtCore/qmutex.h:129
#8  UpdateItemStatesThread::run (this=0x1a38bc0) at ../../../dolphin/src/views/versioncontrol/updateitemstatesthread.cpp:67
#9  0x00007f8b5a71ffcb in QThreadPrivate::start (arg=0x1a38bc0) at thread/qthread_unix.cpp:298
#10 0x00007f8b56061e9a in start_thread (arg=0x7f8b425ea700) at pthread_create.c:308
#11 0x00007f8b5d7c94bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#12 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f8b5deef780 (LWP 8763)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:39
#1  0x00007f8b5a71ea9b in _q_futex (val2=0, addr2=0x0, timeout=0x0, val=2, op=0, addr=0x191bb10) at thread/qmutex_unix.cpp:99
#2  QMutexPrivate::wait (this=0x191bb10, timeout=<optimized out>) at thread/qmutex_unix.cpp:113
#3  0x00007f8b5a71a86d in QMutex::lockInternal (this=<optimized out>) at thread/qmutex.cpp:450
#4  0x00007f8b5cb3603d in lockInline (this=0x7f8b5cd62b18) at /usr/include/qt4/QtCore/qmutex.h:190
#5  QMutexLocker (m=0x7f8b5cd62b18, this=<synthetic pointer>) at /usr/include/qt4/QtCore/qmutex.h:109
#6  UpdateItemStatesThread::setData (this=0x1a71b40, plugin=0x1130bf0, itemStates=...) at ../../../dolphin/src/views/versioncontrol/updateitemstatesthread.cpp:51
#7  0x00007f8b5cb36af4 in updateItemStates (this=0x16bf3d0) at ../../../dolphin/src/views/versioncontrol/versioncontrolobserver.cpp:262
#8  VersionControlObserver::updateItemStates (this=0x16bf3d0) at ../../../dolphin/src/views/versioncontrol/versioncontrolobserver.cpp:228
#9  0x00007f8b5cb37a2a in VersionControlObserver::verifyDirectory (this=0x16bf3d0) at ../../../dolphin/src/views/versioncontrol/versioncontrolobserver.cpp:182
#10 0x00007f8b5a833281 in QMetaObject::activate (sender=0x16deb40, m=<optimized out>, local_signal_index=<optimized out>, argv=0x0) at kernel/qobject.cpp:3547
#11 0x00007f8b5a838179 in QObject::event (this=0x16deb40, e=<optimized out>) at kernel/qobject.cpp:1157
#12 0x00007f8b59924894 in notify_helper (e=0x7fff2d59c510, receiver=0x16deb40, this=0x109c9b0) at kernel/qapplication.cpp:4559
#13 QApplicationPrivate::notify_helper (this=0x109c9b0, receiver=0x16deb40, e=0x7fff2d59c510) at kernel/qapplication.cpp:4531
#14 0x00007f8b59929713 in QApplication::notify (this=0x7fff2d59c7f0, receiver=0x16deb40, e=0x7fff2d59c510) at kernel/qapplication.cpp:4420
#15 0x00007f8b5b27fbb6 in KApplication::notify (this=0x7fff2d59c7f0, receiver=0x16deb40, event=0x7fff2d59c510) at ../../kdeui/kernel/kapplication.cpp:311
#16 0x00007f8b5a81ee9c in QCoreApplication::notifyInternal (this=0x7fff2d59c7f0, receiver=0x16deb40, event=0x7fff2d59c510) at kernel/qcoreapplication.cpp:876
#17 0x00007f8b5a8501f2 in sendEvent (event=0x7fff2d59c510, receiver=<optimized out>) at ../../include/QtCore/../../src/corelib/kernel/qcoreapplication.h:231
#18 QTimerInfoList::activateTimers (this=0x109e070) at kernel/qeventdispatcher_unix.cpp:611
#19 0x00007f8b5a84dc0d in timerSourceDispatch (source=<optimized out>) at kernel/qeventdispatcher_glib.cpp:186
#20 timerSourceDispatch (source=<optimized out>) at kernel/qeventdispatcher_glib.cpp:180
#21 0x00007f8b5a84dc31 in idleTimerSourceDispatch (source=<optimized out>) at kernel/qeventdispatcher_glib.cpp:233
#22 0x00007f8b5579cc9a in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#23 0x00007f8b5579d060 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#24 0x00007f8b5579d124 in g_main_context_iteration () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#25 0x00007f8b5a84e3bf in QEventDispatcherGlib::processEvents (this=0x10794a0, flags=...) at kernel/qeventdispatcher_glib.cpp:424
#26 0x00007f8b599ccd5e in QGuiEventDispatcherGlib::processEvents (this=<optimized out>, flags=...) at kernel/qguieventdispatcher_glib.cpp:204
#27 0x00007f8b5a81dc82 in QEventLoop::processEvents (this=<optimized out>, flags=...) at kernel/qeventloop.cpp:149
#28 0x00007f8b5a81ded7 in QEventLoop::exec (this=0x7fff2d59c780, flags=...) at kernel/qeventloop.cpp:204
#29 0x00007f8b5a822f67 in QCoreApplication::exec () at kernel/qcoreapplication.cpp:1148
#30 0x00007f8b5dadc4c7 in kdemain (argc=1, argv=0x7fff2d59cd48) at ../../../dolphin/src/main.cpp:89
#31 0x00007f8b5d6f876d in __libc_start_main (main=0x400640 <main(int, char**)>, argc=1, ubp_av=0x7fff2d59cd48, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fff2d59cd38) at libc-start.c:226
#32 0x0000000000400671 in _start ()

Reported using DrKonqi
Comment 1 Wolfgang Wallner 2012-06-21 11:41:33 UTC
Browsing the SSHFS-mounted directory in a "Save file"-dialog does not crash an application.

I tried it creating a new file in Kwrite and saving it on the SSHS-mounted directory.
I can enter and leave different directoriess (a few levels deep).

Hope this helps :)
Comment 2 Mark 2012-07-04 21:03:23 UTC
(In reply to comment #1)
> Browsing the SSHFS-mounted directory in a "Save file"-dialog does not crash
> an application.
> 
> I tried it creating a new file in Kwrite and saving it on the SSHS-mounted
> directory.
> I can enter and leave different directoriess (a few levels deep).
> 
> Hope this helps :)

Hmm, from the looks of the backtrace it crashed in a folder that has version control in it. Could you verify that? So, it should crash on one of your folders with version control (cvs?) in it.

If it does, we need the folder to fix it :) You can remove everything out the folder till it's as small as possible and still crashing.
Comment 3 Wolfgang Wallner 2012-07-04 21:39:39 UTC
(In reply to comment #2)
> Hmm, from the looks of the backtrace it crashed in a folder that has version
> control in it.

Yes, it contains a working copy of a subversion repo.

> Could you verify that? So, it should crash on one of your
> folders with version control (cvs?) in it.

Additional details i know so far:

I tried to sshfs-mount a locale copy of the same repo.
When i browse this copy on localhost, dolphin does not crash.

I also tried to sshfs-mount another remote server, which contains some git-repos.
Againg, dolphin does not crash.

> If it does, we need the folder to fix it :)
> You can remove everything out
> the folder till it's as small as possible and still crashing.

ok :)
Comment 4 Mark 2012-07-04 22:26:05 UTC
Oke, so it occurs only on mounted SSHFS partitions with a version controlled folder.. ehhh, not everyone has a SSHFS mount which makes this a bit difficult to fix ;)
Comment 5 Wolfgang Wallner 2012-07-05 18:01:26 UTC
(In reply to comment #4)
> Oke, so it occurs only on mounted SSHFS partitions with a version controlled
> folder.. ehhh, not everyone has a SSHFS mount which makes this a bit
> difficult to fix ;)

I will try to put a minimal working ( = crashing) example on my server. If this works, i will come back and ask you for your public key so you can access the example and hopefully reproduce the bug.
Comment 6 Mark 2012-07-05 18:56:54 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > Oke, so it occurs only on mounted SSHFS partitions with a version controlled
> > folder.. ehhh, not everyone has a SSHFS mount which makes this a bit
> > difficult to fix ;)
> 
> I will try to put a minimal working ( = crashing) example on my server. If
> this works, i will come back and ask you for your public key so you can
> access the example and hopefully reproduce the bug.

Hi,

I appreciate that effort :) Must be quite a job to get that "working" (read crashing). However, i don't have anything i can mount as SSHFS... yet. I can see if i can mount my local server though i rather prefer something less critical since i'm going to attempt dolphin crashes.. Even though that will not crash my server, it "might" cause some data loss which i rather prevent.
Comment 7 Simeon Bird 2012-07-17 15:31:36 UTC
I'm seeing a crash that looks similar with 4.9rc2, when browsing an sshfs mounted folder in dolphin, containing a git repo. 

It only crashes if the dolphin git plugin is enabled, and it only crashes on first access; after the crash I can restart dolphin and list the folder fine.

I have the svn plugin turned off.

Does that agree with what you see?
Comment 8 Jeroen van Meeuwen (Kolab Systems) 2012-08-24 16:18:31 UTC
Resetting assignee to default as per bug #305719
Comment 9 Alex Ball 2012-10-10 13:32:45 UTC
Created attachment 74459 [details]
New crash information added by DrKonqi

dolphin (2.0) on KDE Platform 4.8.5 (4.8.5) using Qt 4.8.1

- What I was doing when the application crashed:
I had this crash in slightly different circumstances. I had Dolphin in split screen mode, with a local bzr-controlled directory on one side and a network directory (mounted using sshfs/autofs) on the other. It happened while I was compiling a LaTeX document in the former directory (so quite a few files were changing at once). I should mention I have the bzr vcs plugin running in Dolphin.

It does not happen all the time.

-- Backtrace (Reduced):
#6  0x00007fb03a1a8752 in lockInline (this=0x1628990) at /usr/include/qt4/QtCore/qmutex.h:187
#7  relock (this=<synthetic pointer>) at /usr/include/qt4/QtCore/qmutex.h:129
#8  UpdateItemStatesThread::run (this=0x1628970) at ../../../dolphin/src/views/versioncontrol/updateitemstatesthread.cpp:67
#9  0x00007fb037d8cfcb in QThreadPrivate::start (arg=0x1628970) at thread/qthread_unix.cpp:298
#10 0x00007fb0336cfe9a in start_thread (arg=0x7fb017c6d700) at pthread_create.c:308
Comment 10 Simeon Bird 2012-10-10 19:03:07 UTC
Created attachment 74465 [details]
Patch to hopefully fix the crash

I *think* the problem here is that the lock on the list of directory items being released when listing directories. Either due to a misplaced compiler optimisation, or something else, this pattern:

lock
..stuff..
const QString directory = getDirectory_from_data_which_may_be_changed_by_another_thread()
unlock
accessDirectory(directory)
relock

is being changed to:

lock
..stuff..
unlock
accessDirectory(getDirectory_from_data_which_may_be_changed_by_another_thread())
relock

and the other-thread data is being changed while the directory is being listed, causing the crash. 

The reason people are finding this with network filesystems is just that directory listing is much slower, so the window to hit the race is wider.

Solution is easy: just hold the lock over the call to accessdirectory. There's no real reason not to, especially since we are editing the data it is supposed to protect outside the lock. In most cases the overhead to release and get the lock compared with the directory listing time means this version is probably even faster. 

I haven't seen the crash with this patch, but it's always been intermittent, so if someone else could confirm, that would be good.
Comment 11 Frank Reininghaus 2012-10-10 21:06:56 UTC
Thanks for the analysis and the patch!

The backtraces look like there is a deadlock between both threads:

Thread 1 is in UpdateItemStatesThread::setData() in line 51 in updateitemstatesthread.cpp. This means that it has the lock on m_itemMutex and tries to get the lock on m_globalPluginMutex.

Thread 2 is in UpdateItemStatesThread::run() in line 67 in updateitemstatesthread.cpp. This means that it has the lock on m_globalPluginMutex and tries to get the lock on m_itemMutex.

Your patch will indeed fix that because it prevents that thread 1 locks m_itemMutex in the small window (which might not be so small for remote file systems) where thread 2 releases the lock.

The only thing that is not clear to me is why this window actually exists - maybe there is a reason to unlock and re-lock the mutex, but I don't see it. An alternative fix would be to change UpdateItemStatesThread::setData() and unlock the first mutex before locking the other one. What do you think? I don't really think that access to shared data in the window where the lock is released is the problem.
Comment 12 Simeon Bird 2012-10-10 23:40:00 UTC
Ah - yes, that makes more sense. It didn't occur to me that a deadlock could result in a segfault, actually. I thought it would just hang forever.

Releasing the lock earlier in setData would still (in principle) allow the formation of deadlocks, would it not? Because the locks are potentially taken in a different order. 

Suppose Thread A (in setData) gets itemMutex on line 50, and at the same time
Thread B gets PluginMutex on line 63. 

Then Thread B will try to re-lock itemMutex on line 65 and block because of thread A, while thread A tries to get PluginMutex on line 52 and blocks because of thread B. Deadlock!

I admit that I don't see why the window exists either though, which is troubling...

Thanks for the quick response
Comment 13 Frank Reininghaus 2012-10-15 12:36:50 UTC
Sorry for the late reply, I was away for a couple of days.

(In reply to comment #12)
> Ah - yes, that makes more sense. It didn't occur to me that a deadlock could
> result in a segfault, actually. I thought it would just hang forever.

Why it segfaults is not really clear to me either, but the potential deadlock is the only problem that I can see in the code.

> Releasing the lock earlier in setData would still (in principle) allow the
> formation of deadlocks, would it not? Because the locks are potentially
> taken in a different order. 

I don't think that the locks can be done in a different order. When the compiler reorders statements to optimise the code, it must make sure that the reordered code is semantically equivalent. This would not be the case if mutex locks were reordered - the compiler would generate artificial deadlocks all the time if that was possible.

> Suppose Thread A (in setData) gets itemMutex on line 50, and at the same time
> Thread B gets PluginMutex on line 63. 
> 
> Then Thread B will try to re-lock itemMutex on line 65 and block because of
> thread A, while thread A tries to get PluginMutex on line 52 and blocks
> because of thread B. Deadlock!

This would not be possible if the first mutex was unlocked in setData() before the second one is locked.

> I admit that I don't see why the window exists either though, which is
> troubling...

Probably this is done to make sure that m_itemMutex is unlocked while "m_plugin->beginRetrieval(directory)" is executed in run(), because this call could take quite some time? If m_itemMutex would be locked during this call, any access to UpdateItemStatesThread::itemStates() or UpdateItemStatesThread::retrievedItems() from thread 1 might block the GUI.
Comment 14 Simeon Bird 2012-10-15 18:56:50 UTC
> Sorry for the late reply, I was away for a couple of days.

That's alright - no need to apologise.

> Why it segfaults is not really clear to me either, but the potential
> deadlock is the only problem that I can see in the code.

One other thing I noticed - we have:

Q_ASSERT(!m_itemStates.isEmpty());
followed by:
m_itemStates.first()

but Q_ASSERT just prints a warning and continues if it is hit
(and in any case m_itemStates is not locked at the time),
so it doesn't seem impossible that m_itemStates is empty when .first() is called.

And actually, if m_plugin is null, we will give a warning and segfault, so maybe that's the problem.

> > Releasing the lock earlier in setData would still (in principle) allow the
> > formation of deadlocks, would it not? Because the locks are potentially
> > taken in a different order. 
> 
> I don't think that the locks can be done in a different order. When the
> compiler reorders statements to optimise the code, it must make sure that
> the reordered code is semantically equivalent. This would not be the case if
> mutex locks were reordered - the compiler would generate artificial
> deadlocks all the time if that was possible.

Sorry, I wasn't very clear - I just meant "sometimes itemMutex is taken first, and sometimes PluginMutex", not that they could be taken in a different order to that written in the code.

> > Suppose Thread A (in setData) gets itemMutex on line 50, and at the same time
> > Thread B gets PluginMutex on line 63. 
> > 
> > Then Thread B will try to re-lock itemMutex on line 65 and block because of
> > thread A, while thread A tries to get PluginMutex on line 52 and blocks
> > because of thread B. Deadlock!
> 
> This would not be possible if the first mutex was unlocked in setData()
> before the second one is locked.

You're quite right, that would fix it. I was worried that if run() were 
called after unlocking the first mutex in setData, but before locking the second, 
m_itemStates could get listings from the old plugin. 
We could fix that by just taking PluginMutex first in setData though.

> > I admit that I don't see why the window exists either though, which is
> > troubling...
> 
> Probably this is done to make sure that m_itemMutex is unlocked while
> "m_plugin->beginRetrieval(directory)" is executed in run(), because this
> call could take quite some time? If m_itemMutex would be locked during this
> call, any access to UpdateItemStatesThread::itemStates() or
> UpdateItemStatesThread::retrievedItems() from thread 1 might block the GUI.

Makes sense. So what do you think of the following patch?
Comment 15 Simeon Bird 2012-10-15 18:59:37 UTC
Created attachment 74568 [details]
Patch with even more hope of fixing the crash

This patch:
- Fixes potential deadlocks by taking the two mutexes in setData in the same order as in run(). 
- Fixes potential segfaults by checking explicitly under a mutex that m_plugin is not null and m_itemStates is not empty and returning if they are.

- Hopefully fixes the crash.
Comment 16 Frank Reininghaus 2012-10-18 10:52:27 UTC
Thanks for the updated patch! Can be pushed to 4.9 after considering the comments below. If you don't have a git account, let me know and I'll do it for you.

+    //The locks are taken in the same order as in run()
+    //to avoid potential deadlock.

Add a space after '//' - that is more consistent with existing code and improves readability.

+    QMutexLocker pluginLocker(m_globalPluginMutex);
     QMutexLocker itemLocker(&m_itemMutex);
-    m_itemStates = itemStates;
 
-    QMutexLocker pluginLocker(m_globalPluginMutex);
+    m_itemStates = itemStates;
     m_plugin = plugin;
 
Yes, that might be a good way to fix the crash. 
 
@@ -58,12 +60,15 @@ void UpdateItemStatesThread::run()
     Q_ASSERT(m_plugin);
 
     QMutexLocker itemLocker(&m_itemMutex);
+    if ( m_itemStates.isEmpty() )
+        return;

Not necessary - the Q_ASSERT() at the beginning of that function ensures that m_itemStates is not empty.

     const QString directory = m_itemStates.first().item.url().directory(KUrl::AppendTrailingSlash);
+    m_retrievedItems = false;
     itemLocker.unlock();

Right, m_retrievedItems should also be protected by the mutex, good catch!
 
-    if (m_plugin->beginRetrieval(directory)) {
+    if (m_plugin && m_plugin->beginRetrieval(directory)) {

m_plugin cannot be 0 (see Q_ASSERT() in that function).
Comment 17 Simeon Bird 2012-10-20 00:54:38 UTC
> Thanks for the updated patch! Can be pushed to 4.9 after considering the
> comments below. If you don't have a git account, let me know and I'll do it
> for you.

Thanks! I have a git account, and I'll go ahead and commit it to 4.9.

> Add a space after '//' - that is more consistent with existing code and
> improves readability.

Sure.

> Not necessary - the Q_ASSERT() at the beginning of that function ensures
> that m_itemStates is not empty.

Ok - I was just being paranoid.
Comment 18 Jekyll Wu 2012-10-20 01:28:19 UTC
Strange, why isn't there auto message for http://commits.kde.org/kde-baseapps/4bdf134cbd3543fbf0273ed49e2b13b3ce49744e  ?

commit 4bdf134cbd3543fbf0273ed49e2b13b3ce49744e
Author: Simeon Bird <spb@ias.edu>
Date:   22 分钟之前

    Fix race condition and deadlock in the version plugin
    when listing directories is slow.
    
    BUG: 302264
    FIXED-IN: 4.9.3
Comment 19 Simeon Bird 2012-10-20 03:55:01 UTC
Sorry, I committed from the wrong email address...
Comment 20 Simeon Bird 2012-12-10 03:08:27 UTC
Ok, actually, this isn't fixed at all (I was able to trigger it fairly reliably again).

So, simply not unlocking and relocking the mutex does indeed appear to fix the crash, but I think by accident; not unlocking the mutex means that calls to itemState block while UpdateStateThread::run is in progress, and thus slotThreadFinished is blocked. 

If slotThreadFinished is not blocked, once it is done deleteLater() is called on the UpdateStateThread. When I compile in debug mode I get a "QThread:destroyed while thread is still running" message before the crash. So I think that somehow this deletion event is getting delivered to the running UpdateStateThread, not the one that just finished. 

FYI, I can trigger this reliably if I make the itemMutex a ReadWriteLock.

The call to setData is a red herring: it's just sitting there blocking in another thread, not doing anything dangerous.
Comment 21 Simeon Bird 2013-01-15 16:12:33 UTC
Git commit a9d7ebbc8be0d03104f3b5c620f8da232cd52857 by Simeon Bird.
Committed on 13/01/2013 at 19:49.
Pushed by sbird into branch 'KDE/4.10'.

A crash occurs if updateItemStates runs between the
UpdateItemStatesThread finishing and the finished() signal being
delivered.

In this case, a new thread was not created, because the old thread
still existed. However, pendingItemStatesUpdate was not set, because the
thread was not running. Instead, the old thread was restarted.

This meant that the finished() signal from the first run could be delivered
while the thread was running for a second time, causing the thread to be
deleted while still running and thus a crash.

Solution: set pendingItemStatesUpdate if the thread is non-null,
even if it is not running, knowing that slotThreadFinished has not yet run,
and will call updateItemStates itself.
FIXED-IN: 4.10
REVIEW: 107656

M  +1    -1    dolphin/src/views/versioncontrol/versioncontrolobserver.cpp

http://commits.kde.org/kde-baseapps/a9d7ebbc8be0d03104f3b5c620f8da232cd52857