Summary: | crash in KIO scheduler searchIdleList | ||
---|---|---|---|
Product: | [Unmaintained] kio | Reporter: | Rosetzky Cedric <loacoon> |
Component: | general | Assignee: | kdelibs bugs <kdelibs-bugs> |
Status: | RESOLVED FIXED | ||
Severity: | crash | CC: | adawit, andresbajotierra, auxsvr, dailey.thomas.a, echidnaman, el_ca_pi_tan, faure, FitchKendall, frank78ac, kdedevel, marcus, mchugh19, mrbrianclem, mrfort, screew2, yabolus |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Compiled Sources | ||
OS: | Linux | ||
Latest Commit: | | Version Fixed In: | 4.5.0 |
Sentry Crash Report: | |||
Attachments: | Log of the crash with added debug info, showing a delete'd job being started. |
Description
Rosetzky Cedric
2008-06-03 23:06:39 UTC
*** Bug 164364 has been marked as a duplicate of this bug. ***

*** Bug 167077 has been marked as a duplicate of this bug. ***

*** Bug 168827 has been marked as a duplicate of this bug. ***

*** Bug 170864 has been marked as a duplicate of this bug. ***

To help us reproduce the problem, it would be helpful if the bug reporters could provide the following information:
- which view mode (icons, details, column) did you use?
- has the preview been turned on?
- does the crash occur very rarely or is it always reproducible by going to a particular directory?

Thanks!

*** Bug 169105 has been marked as a duplicate of this bug. ***

This is a bug in the KIO scheduler, so it is by nature difficult to reproduce exactly (depends on timing of kioslaves coming and going, etc.) However it is quite likely that this is the same crash as bug 165540. Which I thought was fixed, but a comment says it's not. It would be really helpful to know if anyone still gets this crash with the trunk version of KDE.

Yes, I still get this bug occasionally. View mode is "icons", preview has been turned on for pictures.
Version (dolphin-kde4) is: 4:4.1.2-0ubuntu1~hardy1~ppa1

Here's the recent bug report:

Application: Dolphin (dolphin), Signal SIGSEGV
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread 0xb5ef5720 (LWP 7841)]
[KCrash handler]
#6 QUrl::host (this=0x60) at io/qurl.cpp:4253
#7 0xb7d93f81 in searchIdleList (idleSlaves=@0x828305c, url=@0x60, protocol=@0xbfb9a9cc, exact=@0xbfb9aa6b) at /build/buildd/kde4libs-4.1.2/kio/kio/scheduler.cpp:609
#8 0xb7d9451e in KIO::SchedulerPrivate::findIdleSlave (this=0x8283008, job=0x872fa38, exact=@0xbfb9aa6b) at /build/buildd/kde4libs-4.1.2/kio/kio/scheduler.cpp:683
#9 0xb7d96b5a in KIO::SchedulerPrivate::startJobDirect (this=0x8283008) at /build/buildd/kde4libs-4.1.2/kio/kio/scheduler.cpp:588
#10 0xb7d96c38 in KIO::SchedulerPrivate::startStep (this=0x8283008) at /build/buildd/kde4libs-4.1.2/kio/kio/scheduler.cpp:433
#11 0xb7d96df7 in KIO::Scheduler::qt_metacall (this=0x827cd70, _c=QMetaObject::InvokeMetaMethod, _id=6, _a=0xbfb9ab48) at /build/buildd/kde4libs-4.1.2/obj-i486-linux-gnu/kio/scheduler.moc:101
#12 0xb765df79 in QMetaObject::activate (sender=0x828300c, from_signal_index=4, to_signal_index=4, argv=0x0) at kernel/qobject.cpp:3016
#13 0xb765e642 in QMetaObject::activate (sender=0x828300c, m=0xb773dae4, local_signal_index=0, argv=0x0) at kernel/qobject.cpp:3086
#14 0xb769b817 in QTimer::timeout (this=0x828300c) at .moc/release-shared/moc_qtimer.cpp:126
#15 0xb76650fe in QTimer::timerEvent (this=0x828300c, e=0xbfb9b038) at kernel/qtimer.cpp:263
#16 0xb76589fa in QObject::event (this=0x828300c, e=0xbfb9b038) at kernel/qobject.cpp:1105
#17 0xb6bc6f9c in QApplicationPrivate::notify_helper (this=0x80b8e80, receiver=0x828300c, e=0xbfb9b038) at kernel/qapplication.cpp:3800
#18 0xb6bcbbf9 in QApplication::notify (this=0xbfb9b2ac, receiver=0x828300c, e=0xbfb9b038) at kernel/qapplication.cpp:3392
#19 0xb7ade483 in KApplication::notify (this=0xbfb9b2ac, receiver=0x828300c, event=0xbfb9b038) at /build/buildd/kde4libs-4.1.2/kdeui/kernel/kapplication.cpp:311
#20 0xb76490b9 in QCoreApplication::notifyInternal (this=0xbfb9b2ac, receiver=0x828300c, event=0xbfb9b038) at kernel/qcoreapplication.cpp:591
#21 0xb7676c01 in QTimerInfoList::activateTimers (this=0x80bc1f4) at ../../include/QtCore/../../src/corelib/kernel/qcoreapplication.h:215
#22 0xb76744a0 in timerSourceDispatch (source=0x80bc1c0) at kernel/qeventdispatcher_glib.cpp:166
#23 0xb6293dd6 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#24 0xb6297193 in ?? () from /usr/lib/libglib-2.0.so.0
#25 0xb629774e in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#26 0xb76749f8 in QEventDispatcherGlib::processEvents (this=0x80b40d0, flags=@0xbfb9b198) at kernel/qeventdispatcher_glib.cpp:325
#27 0xb6c5aa25 in QGuiEventDispatcherGlib::processEvents (this=0x80b40d0, flags=@0xbfb9b1c8) at kernel/qguieventdispatcher_glib.cpp:204
#28 0xb764833d in QEventLoop::processEvents (this=0xbfb9b240, flags=@0xbfb9b204) at kernel/qeventloop.cpp:149
#29 0xb76484cd in QEventLoop::exec (this=0xbfb9b240, flags=@0xbfb9b248) at kernel/qeventloop.cpp:200
#30 0xb764a74d in QCoreApplication::exec () at kernel/qcoreapplication.cpp:849
#31 0xb6bc6897 in QApplication::exec () at kernel/qapplication.cpp:3330
#32 0x08080a89 in ?? ()
#33 0xb673a450 in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6
#34 0x080619d1 in _start ()
#0 0xb7f0b410 in __kernel_vsyscall ()

*** Bug 170115 has been marked as a duplicate of this bug. ***

*** Bug 166435 has been marked as a duplicate of this bug. ***

A rather difficult crash to debug, given that it comes from KIO scheduling so it must be related to timing of slaves going in and out, etc. I haven't seen precise instructions on how to trigger it yet (because I guess there isn't really a way to do that...)
I'm guessing that "job" is already deleted on the line
return searchIdleList(idleSlaves, job->url(), jobData.protocol, exact);
but I have no idea why and from where. It would be really helpful if you (anyone who has some chance of triggering this crash) could run the application in valgrind and attach the log (of the part related to the crash) here.

I was able to reproduce this 100% reliably when downloading a file. Here's the output of valgrind:

==9572== Memcheck, a memory error detector.
==9572== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==9572== Using LibVEX rev 1804, a library for dynamic binary translation.
==9572== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==9572== Using valgrind-3.3.0, a dynamic binary instrumentation framework.
==9572== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==9572== For more details, rerun with: -v
==9572==
==9572== My PID = 9572, parent PID = 5004. Prog and args are:
==9572== kget
==9572== --nofork
==9572==
==9572== Invalid read of size 4
==9572== at 0x4F607E6: Soprano::FilterModel::addStatement(Soprano::Statement const&) (filtermodel.cpp:92)
==9572== by 0x4F48208: Soprano::Model::addStatements(QList<Soprano::Statement> const&) (model.cpp:135)
==9572== by 0x5014D1C: Nepomuk::ResourceFilterModel::addStatements(QList<Soprano::Statement> const&) (resourcefiltermodel.cpp:245)
==9572== by 0x500BED3: Nepomuk::ResourceData::store() (resourcedata.cpp:285)
==9572== by 0x500C4BA: Nepomuk::ResourceData::setProperty(QUrl const&, Nepomuk::Variant const&) (resourcedata.cpp:370)
==9572== by 0x5027A81: Nepomuk::Resource::setProperty(QUrl const&, Nepomuk::Variant const&) (resource.cpp:227)
==9572== by 0x43C2CA1: NepomukHandler::saveFileProperties(Nepomuk::Resource const&) (in /usr/lib/libkgetcore.so.4.1.0)
==9572== by 0x43C2C37: NepomukHandler::saveFileProperties() (in /usr/lib/libkgetcore.so.4.1.0)
==9572== by 0x43B1331: Transfer::setStatus(Job::Status, QString const&, QPixmap const&) (in /usr/lib/libkgetcore.so.4.1.0)
==9572== by 0x43B24BE: Transfer::load(QDomElement const&) (in /usr/lib/libkgetcore.so.4.1.0)
==9572== by 0x43B2CA0: Transfer::Transfer(TransferGroup*, TransferFactory*, Scheduler*, KUrl const&, KUrl const&, QDomElement const*) (in /usr/lib/libkgetcore.so.4.1.0)
==9572== by 0x76F095A: (within /usr/lib/kde4/kget_kiofactory.so)
==9572== Address 0x694e878 is 0 bytes inside a block of size 24 free'd
==9572== at 0x402371A: operator delete(void*) (in /usr/lib/valgrind/x86-linux/vgpreload_memcheck.so)
==9572== by 0x500F722: Nepomuk::MainModel::~MainModel() (nepomukmainmodel.cpp:173)
==9572== by 0x500E667: Nepomuk::ResourceManager::init() (resourcemanager.cpp:88)
==9572== by 0x500E75F: Nepomuk::ResourceManager::mainModel() (resourcemanager.cpp:227)
==9572== by 0x500EA8E: Nepomuk::ResourceManager::generateUniqueUri() (resourcemanager.cpp:206)
==9572== by 0x5014C2A: Nepomuk::ResourceFilterModel::addStatements(QList<Soprano::Statement> const&) (resourcefiltermodel.cpp:236)
==9572== by 0x500BED3: Nepomuk::ResourceData::store() (resourcedata.cpp:285)
==9572== by 0x500C4BA: Nepomuk::ResourceData::setProperty(QUrl const&, Nepomuk::Variant const&) (resourcedata.cpp:370)
==9572== by 0x5027A81: Nepomuk::Resource::setProperty(QUrl const&, Nepomuk::Variant const&) (resource.cpp:227)
==9572== by 0x43C2CA1: NepomukHandler::saveFileProperties(Nepomuk::Resource const&) (in /usr/lib/libkgetcore.so.4.1.0)
==9572== by 0x43C2C37: NepomukHandler::saveFileProperties() (in /usr/lib/libkgetcore.so.4.1.0)
==9572== by 0x43B1331: Transfer::setStatus(Job::Status, QString const&, QPixmap const&) (in /usr/lib/libkgetcore.so.4.1.0)
==9572==
==9572== Jump to the invalid address stated on the next line
==9572== at 0x0: ???
==9572== by 0x4F48208: Soprano::Model::addStatements(QList<Soprano::Statement> const&) (model.cpp:135)
==9572== by 0x5014D1C: Nepomuk::ResourceFilterModel::addStatements(QList<Soprano::Statement> const&) (resourcefiltermodel.cpp:245)
==9572== by 0x500BED3: Nepomuk::ResourceData::store() (resourcedata.cpp:285)
==9572== by 0x500C4BA: Nepomuk::ResourceData::setProperty(QUrl const&, Nepomuk::Variant const&) (resourcedata.cpp:370)
==9572== by 0x5027A81: Nepomuk::Resource::setProperty(QUrl const&, Nepomuk::Variant const&) (resource.cpp:227)
==9572== by 0x43C2CA1: NepomukHandler::saveFileProperties(Nepomuk::Resource const&) (in /usr/lib/libkgetcore.so.4.1.0)
==9572== by 0x43C2C37: NepomukHandler::saveFileProperties() (in /usr/lib/libkgetcore.so.4.1.0)
==9572== by 0x43B1331: Transfer::setStatus(Job::Status, QString const&, QPixmap const&) (in /usr/lib/libkgetcore.so.4.1.0)
==9572== by 0x43B24BE: Transfer::load(QDomElement const&) (in /usr/lib/libkgetcore.so.4.1.0)
==9572== by 0x43B2CA0: Transfer::Transfer(TransferGroup*, TransferFactory*, Scheduler*, KUrl const&, KUrl const&, QDomElement const*) (in /usr/lib/libkgetcore.so.4.1.0)
==9572== by 0x76F095A: (within /usr/lib/kde4/kget_kiofactory.so)
==9572== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==9572==
==9572== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 68 from 4)
==9572== malloc/free: in use at exit: 2,520,345 bytes in 35,620 blocks.
==9572== malloc/free: 186,546 allocs, 150,926 frees, 109,236,341 bytes allocated.
==9572== For counts of detected errors, rerun with: -v
==9572== searching for pointers to 35,620 not-freed blocks.
==9572== checked 23,063,100 bytes.
==9572==
==9572== LEAK SUMMARY:
==9572== definitely lost: 9,589 bytes in 365 blocks.
==9572== possibly lost: 25,039 bytes in 957 blocks.
==9572== still reachable: 2,485,717 bytes in 34,298 blocks.
==9572== suppressed: 0 bytes in 0 blocks.
==9572== Rerun with --leak-check=full to see details of leaked memory.
I've got a local case here (my somethingawful thread archives) where I can reliably generate this, so please ask me any questions you have (I'm SSJ_GZ on IRC).

David's guess appears to be correct - a job is deleted just before it's scheduled to be started (I'll attach a log showing this in a sec). The item to look out for is "0x939efc8". The log is a complete run leading up to the crash.

What it looks like is that a job is registered via doJob, then deleted before it is actually run via startJobDirect. Since d->m_slave is 0 at the time of deletion, the job is not cancelJob'd and the subsequent attempted use of the deleted job causes the crash.

I hope this is of use to people; as mentioned, grab me on IRC if you want me to add some specific debug output :)

When I get time (hopefully tomorrow), I'll add some dummy stuff to SimpleJob so that a flag is set by doJob and cleared by startJobDirect, then add a breakpoint in ~SimpleJob which is triggered if this flag is not clear at the time of destruction.

Created attachment 28030
Log of the crash with added debug info, showing a delete'd job being started.
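
The sequence described in this comment (a job registered via doJob, deleted, then picked up by startJobDirect) can be modelled outside KIO. What follows is only a minimal sketch under stated assumptions, not the KIO code and not the content of any patch: MiniScheduler, MiniJob and startedAfterEarlyDelete are hypothetical names invented for illustration, and standard C++ containers stand in for the Qt types. It assumes the safe behaviour is for a job to remove itself from the scheduler's pending list in its destructor, whether or not a slave was ever assigned.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class MiniJob;

// Hypothetical stand-in for the scheduler: it keeps raw pointers to jobs
// that have been registered but not started yet.
class MiniScheduler {
public:
    void doJob(MiniJob *job) { m_pending.push_back(job); }

    // Forget a job that is going away. Returns true if it was still queued.
    bool cancelJob(MiniJob *job) {
        auto it = std::find(m_pending.begin(), m_pending.end(), job);
        if (it == m_pending.end())
            return false;
        m_pending.erase(it);
        return true;
    }

    // Start every queued job and return how many were started.
    int startStep();

    std::size_t pendingCount() const { return m_pending.size(); }

private:
    std::vector<MiniJob *> m_pending;
};

// Hypothetical stand-in for a job that registers itself on construction.
class MiniJob {
public:
    MiniJob(MiniScheduler &scheduler, std::string url)
        : m_scheduler(scheduler), m_url(std::move(url)) {
        m_scheduler.doJob(this);
    }

    // Unregister unconditionally. Skipping this when no slave is assigned
    // would leave a dangling pointer in the scheduler's pending list.
    ~MiniJob() { m_scheduler.cancelJob(this); }

    MiniJob(const MiniJob &) = delete;
    MiniJob &operator=(const MiniJob &) = delete;

    const std::string &url() const { return m_url; }

    void start() {
        assert(!m_started); // a job is started at most once
        m_started = true;
    }

private:
    MiniScheduler &m_scheduler;
    std::string m_url;
    bool m_started = false;
};

int MiniScheduler::startStep() {
    std::vector<MiniJob *> batch;
    batch.swap(m_pending); // take the queue, leaving it empty
    int started = 0;
    for (MiniJob *job : batch) {
        // Dereferencing the job here is only safe because deleted jobs
        // removed themselves from the list beforehand.
        const std::string &url = job->url();
        (void)url;
        job->start();
        ++started;
    }
    return started;
}

// Scenario: one job is deleted between registration and the scheduler step.
int startedAfterEarlyDelete() {
    MiniScheduler scheduler;
    auto *doomed = new MiniJob(scheduler, "http://example.org/a");
    MiniJob survivor(scheduler, "http://example.org/b");
    delete doomed; // deleted before the scheduler ever ran it
    return scheduler.startStep();
}
```

Calling startedAfterEarlyDelete() starts only the surviving job. If the destructor skipped cancelJob, startStep would dereference the deleted job, which is the kind of use-after-free the log above points to.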
auxsvr: yours is an unrelated nepomuk/soprano crash, please file a separate bug report for it so that we can assign it to the nepomuk/soprano people.

If I let it crash without valgrind, the backtrace is similar to that of the comments above (findIdleSlave etc.). Should I post it?

SVN commit 873972 by dfaure:

Fix crash when deleting a job before it starts. Note that applications should use kill rather than delete anyway, on kio jobs (deleting -after- it starts leads to a warning in the dtor, and, hmm, the code to kill the slave is even ifdefed out right now...).
CCBUG: 163171

M +1 -1 kio/job.cpp
M +24 -0 tests/jobtest.cpp
M +2 -0 tests/jobtest.h

WebSVN link: http://websvn.kde.org/?view=rev&revision=873972

Simon St James said that my commit seemed to fix the crash, but he wasn't able to find where a job would be deleted without its kill() method being called first. However my commit only changes something in the case kill() is not called; the unit test shows that it was already working fine when kill() is called. So I'm still a bit unsure about this bug, and whether the commit really fixes it, and whether any code is deleting jobs without calling kill on them. But since I don't know how to trigger the bug in the first place, I can't do more currently.

*** Bug 170288 has been marked as a duplicate of this bug. ***

*** Bug 170869 has been marked as a duplicate of this bug. ***

Just as an addendum to what David said: I actually completely lost the ability to reproduce the bug even after reverting David's patch (I originally managed to get a test case that would quite reliably trigger the bug, but it stopped working after a while - probably because it involved loading files over a network, which is notoriously non-deterministic), so I think that there is a high probability that this bug is indeed fixed.

*** Bug 173468 has been marked as a duplicate of this bug. ***

Any news on this? Thanks

I haven't seen this crash in months, it looks fixed to me.
Please disregard the previous comment, wrong bug report.

*** Bug 190120 has been marked as a duplicate of this bug. ***

*** Bug 199183 has been marked as a duplicate of this bug. ***

Backtrace from bug 199183: http://bugsfiles.kde.org/attachment.cgi?id=35124
Reporter is using KDE 4.2.2.

Is this still an issue in KDE v4.6 or newer? The KIO::Scheduler itself was rewritten for KDE 4.5.

Wow, it's an old one. No, I don't remember having seen this issue for quite a while now ;).

(In reply to comment #30)
> Wow, it's an old one. No, I don't remember having seen this issue for quite a
> while now ;).

Great. Feel free to reopen the ticket if you encounter the issue again.