Bug 132851

Summary: KResolver raises assert KResolverWorkerBase::acquireResolver(): Assertion `th != 0L' failed
Product: [Unmaintained] kdelibs
Reporter: Marten Seemann <martenseemann>
Component: knetwork
Assignee: Thiago Macieira <thiago>
Status: RESOLVED UNMAINTAINED
Severity: crash
CC: amarok-bugs-dist, cfeck, ian.monroe, iwilcox
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Ubuntu   
OS: Linux   
Attachments: crash information when trying to add a daap share in amarok-1.4.3
output of gdb amarokapp, after running "bt"
complete output of running gdb amarokapp
Backtrace with line numbers

Description Marten Seemann 2006-08-23 11:28:40 UTC
Version:           SVN (using KDE 3.5.4)
Installed from:    Ubuntu Packages
Compiler:          SVN with MTP, MusicBrainz, using sqlite
OS:                Linux

Newest SVN should search the network for DAAP music shares. I don't have a local network here so I opened a music share using Banshee. Amarok seems to find it but immediately crashes. Here is the console output:

amarok: BEGIN: void DaapClient::resolvedDaap(bool)
amarok:           Banshee Music Share martenlaptop.local local _daap._tcp
amarokapp: /tmp/buildd/kdelibs-3.5.4/./kdecore/network/kresolverworkerbase.cpp:136: void KNetwork::KResolverWorkerBase::acquireResolver(): Assertion `th != 0L' failed.
amarok: END__: virtual void ThreadWeaver::Thread::run() - Took 3.2s
Comment 1 Ian Monroe 2006-09-04 17:32:40 UTC
I've seen this bug reported before and it doesn't make much sense. We don't do anything weird with kresolver.
Comment 2 Rick W. Chen 2006-09-06 07:30:38 UTC
I got exactly the same crash as reported here. I've solved it now; I'm not sure what the actual problem was, but I can say it's not Amarok. I completely removed kdelibs and all related packages, then reinstalled the required packages. After that, Amarok no longer crashes like that and all the DAAP features work.

I don't know how helpful this is, but I am using the kde-latest repositories and I build Amarok from SVN. I am using the GNOME desktop, and quite a while ago I was messing with KDE apps and the repositories, so some problems might have come from there.
Comment 3 Ian Monroe 2006-09-06 07:34:37 UTC
Well, asserts only fire if kdelibs is compiled with debug on. So your new kdelibs was likely just compiled differently.
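
For context, the message in the report has the format printed by the standard `assert` macro, which expands to nothing when `NDEBUG` is defined at compile time. That would explain why the same source aborts in one build and runs silently in another. A minimal sketch, assuming plain standard C++; the function names `assertionsEnabled` and `checkedUse` are made up for illustration and are not part of kdelibs:

```cpp
#include <cassert>

// Reports whether assert() is active in this translation unit.
// With -DNDEBUG the macro expands to nothing, so a failing condition
// such as `th != 0L` is never evaluated and the program keeps running.
bool assertionsEnabled()
{
#ifdef NDEBUG
    return false;
#else
    return true;
#endif
}

// Mirrors the shape of the failing check (hypothetical helper):
// aborts on a null pointer only in assert-enabled builds.
// Returns true if the pointer was non-null.
bool checkedUse(const void *th)
{
    assert(th != nullptr);   // compiled out when NDEBUG is defined
    return th != nullptr;
}
```

Calling `checkedUse(nullptr)` aborts in a build without `NDEBUG` and simply returns `false` in a build with it.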
Comment 4 Antoine Latter 2006-09-09 22:08:28 UTC
I have this same bug, using the latest Kubuntu packages.

It goes away after I execute "sudo /etc/init.d/avahi-daemon stop", but I wouldn't really consider that a fix.
Comment 5 Ian Monroe 2006-09-10 01:35:25 UTC
You just have a normal kdelibs from Kubuntu?
Comment 6 jamese 2006-09-10 15:03:16 UTC
Can confirm this on Kubuntu Dapper, with kubuntu.org 3.5.4 packages.
Am using mt-daapd which starts fine, can browse for it using zeroconf:/ (appears as "WWW Server"). When trying to add the share, amarok crashes after step 5 in http://amarok.kde.org/wiki/Music_Sharing

will create an attachment showing the output from amarok when it crashes.

Cheers
James
Comment 7 jamese 2006-09-10 15:04:43 UTC
Created attachment 17690 [details]
crash information when trying to add a daap share in amarok-1.4.3
Comment 8 Ian Monroe 2006-09-10 21:39:55 UTC
Just talked to Riddell: they have kdelibs compiled with debug on, but stripped. He put removing asserts on his todo.

Of course, it would still be nice not to have this issue. I'll send this to kdelibs; they will be more knowledgeable.

Here's the relevant Amarok code that triggers the assert:
QString
DaapClient::resolve( const QString& hostname )
{
    KNetwork::KResolver resolver( hostname );
    resolver.start();
    if( resolver.wait( 5000 ) )
    {
        KNetwork::KResolverResults results = resolver.results();
        debug() << "Resolver error code (0 is no error): " << resolver.errorString( results.error() ) << ' ' << hostname << endl;
        if( !results.empty() )
        {
            QString ip = results[0].address().asInet().ipAddress().toString();
            debug() << "ip found is " << ip << endl;
            return ip;
        }
    }
    return "0"; //error condition
}
Comment 9 Thiago Macieira 2006-09-10 22:22:03 UTC
I need a backtrace of the failed assertion, with line-numbers.
Comment 10 Ian Monroe 2006-09-10 23:22:54 UTC
/tmp/buildd/kdelibs-3.5.4/./kdecore/network/kresolverworkerbase.cpp:136: void KNetwork::KResolverWorkerBase::acquireResolver(): Assertion `th != 0L' failed.

Thiago, this isn't enough?

If not, could someone get a backtrace? Directions are at:
http://amarok.kde.org/wiki/Debugging_HowTo
Comment 11 Thiago Macieira 2006-09-10 23:46:46 UTC
No, I need to know what called the acquireResolver function and in which context.

This is the first time in 3 years that that assert has been hit.
Comment 12 jamese 2006-09-11 13:21:49 UTC
Created attachment 17715 [details]
output of gdb amarokapp, after running "bt"
Comment 13 jamese 2006-09-11 13:23:47 UTC
Created attachment 17716 [details]
complete output of running gdb amarokapp
Comment 14 jamese 2006-09-11 13:27:30 UTC
Have added attachments showing gdb output. Let me know if I can help further.
Cheers
James
Comment 15 Thiago Macieira 2006-09-11 13:29:42 UTC
I need debugging symbols and line-numbers, even though I still don't know how this problem is possible.

th is never null at that point (inside run()).
Comment 16 Isaac Wilcox 2006-09-19 18:08:14 UTC
Some Kubuntu specifics that might be generally helpful...

I'm seeing this problem on my Kubuntu-Dapper-plus-hacks-laptop, and the Dapper on there is so hacked up it can't even build anything from source.  So I installed 1.3.3 on my pretty-much-vanilla-Dapper-desktop just to see.  With just 1.3.3 (and its dependencies from backports) installed, Amarok didn't crash.  (It didn't list my DAAP shares either, but that's for another Amarok bug, Ian :) )  On a hunch, I added KDE 3.5.4 sources (from http://kubuntu.org/packages/kde-354) and replaced:
 kdelibs4c2a 4:3.5.2-0ubuntu18.1
with:
 kdelibs4c2a 4:3.5.4-0ubuntu2~dapper1
(and the associated deps) and restarted Amarok.  It crashed.  So, reverting to stock Dapper kdelibs might be a usable workaround.

As I can't revert those kdelibs packages on my laptop, I'm currently building debug versions of the kdelibs* packages on the desktop so I can generate the full symbol-and-linenumbers backtrace Thiago needs.  I'll post more info when I have some.
Comment 17 Isaac Wilcox 2006-09-19 18:15:41 UTC
I meant 1.4.3.  D'oh.
Comment 18 Isaac Wilcox 2006-09-19 19:45:51 UTC
Created attachment 17840 [details]
Backtrace with line numbers

Finally managed to get Ubuntu's 3.5.4 package to build from source with debug. 
Here's a backtrace with line numbers.
Comment 19 Isaac Wilcox 2006-09-20 16:39:36 UTC
Think I know what the problem is, but not the best way to fix it.

1. A RequestData gets put on the work queue for some KResolverThread to do
2. A KResolverThread gets started, and runs its own run(), which calls requestData(), which calls findData().
3. findData() locates the RequestData structure in the queue.  The RequestData structure (called "data") already has a worker object (a *KGetAddrinfoWorker*...this is important!) and some data, and findData does "data->worker->th = <the current KResolverThread>", so th is initialised.  Control returns to KResolverThread::run().
4. KResolverThread decides to do the work, and calls data->worker->run(), which is really KGetAddrinfoWorker::run().
5. KGetAddrinfoWorker::run() creates a GetAddrInfoThread object and calls GetAddrInfoThread::run() on it.

So, we have a class hierarchy something like (forgive the ASCII):

      KResolverWorkerBase (the class that owns "th")
        /              \
GetAddrInfoThread   KStandardWorker
                        |
                    KGetAddrinfoWorker

findData initialized "th" in the KGetAddrinfoWorker.  That worker, when run, decided to create another worker, a GetAddrInfoThread, and hand the work to it - but in the GetAddrInfoThread worker "th" is uninitialized.  "th" is private in the base class, so only three classes can get at it: KResolverWorkerBase itself, plus KResolverManager and KResolverThread because they're "friend"s.
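
The ownership pattern described above can be modelled in a few lines of plain C++. This is only a sketch with simplified stand-in classes (`WorkerBase`, `Manager`, `InnerWorker`, `OuterWorker` are hypothetical names, and the hierarchy is flattened); it is not the kdelibs code, and it returns `false` where the real code asserts:

```cpp
#include <cassert>

// Simplified stand-ins for the classes named above; NOT the real kdelibs
// code, only the ownership pattern described in this comment.
class ResolverThread {};

class WorkerBase
{
public:
    WorkerBase() : th(nullptr) {}
    virtual ~WorkerBase() {}
    virtual bool run() = 0;

    // True when a manager has attached a thread to this worker.
    bool hasThread() const { return th != nullptr; }

private:
    ResolverThread *th;      // private: only friends can set it
    friend class Manager;
};

// Plays the role of findData(): attaches the current thread to a queued worker.
class Manager
{
public:
    static void attach(WorkerBase &worker, ResolverThread &thread)
    {
        worker.th = &thread;
    }
};

// Plays the role of GetAddrInfoThread: it needs `th` to do its job.
class InnerWorker : public WorkerBase
{
public:
    bool run() override { return hasThread(); }
};

// Plays the role of KGetAddrinfoWorker: delegates to a worker built on the
// stack, which never passes through Manager::attach().
class OuterWorker : public WorkerBase
{
public:
    bool run() override
    {
        InnerWorker inner;   // inner.th is still null here
        return inner.run();  // false: the real code asserts at this point
    }
};

// Returns whether the delegate saw a thread; with this pattern it does not.
bool delegateSeesThread()
{
    ResolverThread thread;
    OuterWorker outer;
    Manager::attach(outer, thread);   // only the outer worker gets `th`
    return outer.run();
}
```

In this model `delegateSeesThread()` returns `false` even though the outer worker was given a thread, which is the situation the assert catches.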

Solutions:
1. It's not really clear to me why there are two kinds of getaddrinfo() handler; if we had just one, this wouldn't be a problem.  So, combine the two into one class, avoiding the need to delegate.
2. Pass in an optional "th" to the GetAddrInfoThread constructor.
3. Make "th" protected instead of private, and do "newworker.th = th" in KGetAddrinfoWorker::run().
4. The dirtiest hack is simply to remove the assert.  The lookup works without it, and this might be because no data is actually ever accessed via 'th' (and this might explain why the problem has only just started cropping up - I'll bet it's just the first time the assert has been turned on in years).  I can't tell for sure because in my subtree I can't see where RES_INIT_THREADSAFE comes from, which affects things.

I think solution 1 is the cleanest, but Thiago probably knows best.
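
Solutions 2 and 3 might look roughly like the following. This is a sketch using simplified stand-in classes (`WorkerBase`, `InnerWorker`, `OuterWorker` are hypothetical names), not a patch against kdelibs: `th` is made protected, and the delegate takes an optional thread in its constructor.

```cpp
#include <cassert>

// Simplified stand-ins, NOT the real kdelibs classes.
class ResolverThread {};

class WorkerBase
{
public:
    explicit WorkerBase(ResolverThread *thread = nullptr) : th(thread) {}
    virtual ~WorkerBase() {}

    // True when this worker knows which thread it runs on.
    bool hasThread() const { return th != nullptr; }

protected:
    ResolverThread *th;   // protected so subclasses can hand it on (solution 3)
};

// Stand-in for GetAddrInfoThread.
class InnerWorker : public WorkerBase
{
public:
    // Solution 2: the delegate accepts an optional thread at construction.
    explicit InnerWorker(ResolverThread *thread = nullptr) : WorkerBase(thread) {}
    bool run() const { return hasThread(); }
};

// Stand-in for KGetAddrinfoWorker.
class OuterWorker : public WorkerBase
{
public:
    explicit OuterWorker(ResolverThread *thread) : WorkerBase(thread) {}
    bool run() const
    {
        InnerWorker inner(th);   // forward our own thread to the delegate
        return inner.run();
    }
};

// Returns true when the stack-allocated delegate ends up with a thread.
bool delegateSeesThreadAfterFix()
{
    ResolverThread thread;
    OuterWorker outer(&thread);
    return outer.run();
}
```

A delegate constructed without an argument still has a null `th`, so the default keeps existing call sites compiling while the delegating worker opts in explicitly.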
Comment 20 Thiago Macieira 2006-09-20 19:11:17 UTC
Thanks for the detailed explanation, but bear in mind that this code has existed for 3 years already and this is, so far, the only case where the assert is ever hit. So there has to be a race condition never before seen here.

I don't see how findData() can find a RequestData that already has a worker (step 3 in your list). "newRequests" contains only elements with th == 0 (because that's how they are created). Whenever "th" is set, it's also removed from "newRequests".
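
The invariant stated here (anything still in "newRequests" has th == 0, and setting "th" removes the entry from that list) can be sketched as below. This is a simplified model, not the kdelibs implementation: `Request`, `enqueue`, `pending`, `invariantHolds` and the `currentRequests` list are assumptions made for illustration, and `th` is placed directly on the request for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Simplified model, NOT the real kdelibs types.
struct ResolverThread {};
struct Request { ResolverThread *th = nullptr; };

class Manager
{
public:
    // New requests always enter the queue with no thread attached.
    void enqueue(Request *r)
    {
        r->th = nullptr;
        newRequests.push_back(r);
    }

    // Hands the oldest pending request to `thread`: setting th and removing
    // the entry from newRequests happen together.
    Request *findData(ResolverThread *thread)
    {
        if (newRequests.empty())
            return nullptr;
        Request *r = newRequests.front();
        newRequests.pop_front();
        r->th = thread;
        currentRequests.push_back(r);   // hypothetical "in progress" list
        return r;
    }

    // Checks that every request still waiting has th == 0.
    bool invariantHolds() const
    {
        for (const Request *r : newRequests)
            if (r->th != nullptr)
                return false;
        return true;
    }

    std::size_t pending() const { return newRequests.size(); }

private:
    std::deque<Request *> newRequests;
    std::vector<Request *> currentRequests;
};

// Exercises the model: two requests queued, one picked up by a thread.
bool demoInvariant()
{
    Manager m;
    Request a, b;
    ResolverThread t;
    m.enqueue(&a);
    m.enqueue(&b);
    Request *got = m.findData(&t);
    return got == &a && a.th == &t && b.th == nullptr
        && m.pending() == 1 && m.invariantHolds();
}
```

Under this model, a worker that never enters the queue never has its thread set by `findData()`.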

Maybe if one of you guys can show me this error live in Dublin, I can fix it.
Comment 21 Thiago Macieira 2006-09-20 19:23:29 UTC
Ok, I see the error. KGetAddrinfoWorker creates a worker on the stack and doesn't schedule it for running. Since it never goes through findData(), "th" is never set to anything.
Comment 22 Ian Monroe 2006-09-20 19:58:57 UTC
SVN commit 586828 by ianmonroe:

 r6793@wasabi:  ian | 2006-09-20 12:57:28 -0500
 Thiago's work around for a KResolver bug activated when kdelibs is compiled with
 asserts on (like is done in KUbuntu).
 CCBUG: 132851
 CCMAIL: imbrandon@kubuntu.org
 


 _M            . (directory)  
 M  +2 -0      ChangeLog  
 M  +1 -0      src/mediadevice/daap/daapclient.cpp  


--- trunk/extragear/multimedia/amarok/ChangeLog #586827:586828
@@ -29,6 +29,8 @@
     * Show a proper tag dialog when viewing information for DAAP music shares.
 
   BUGFIXES:
+    * The DAAP client would crash Amarok under certain conditions when 
+      kdelibs was compiled with asserts on. (BR 132851)
     * Configuring the toolbar would disable the stop button. Patch by 
       Markus Kaufhold <M.Kaufhold@gmx.de>. (BR 132477)
     * Xine-engine: Pausing during crossfade would not work properly. Patch by
--- trunk/extragear/multimedia/amarok/src/mediadevice/daap/daapclient.cpp #586827:586828
@@ -526,6 +526,7 @@
 DaapClient::resolve( const QString& hostname )
 {
     KNetwork::KResolver resolver( hostname );
+    resolver.setFamily( KNetwork::KResolver::KnownFamily ); //A drudic incantation from Thiago. Works around a KResolver bug #132851
     resolver.start();
     if( resolver.wait( 5000 ) )
     {
Comment 23 Christoph Feck 2011-07-28 14:40:46 UTC
KResolver is no longer maintained.