Version: 0.9 (using KDE KDE 3.3.0) Installed from: Gentoo Packages Compiler: gcc-3.4.1 OS: Linux When receiving an icq-message, gaim segfaults. I'm compiling kdenetwork with debug options, so I don't have a backtrace at the moment If I don't have a window open, i.e. a bubble appears at the systray with a new message, kopete wouldn't crash. But any other incoming icq-message make it crash. This bug is very severe and I haven't seen it beeing reported already, so I suspect it to be an x86_bug (May be casting a pointer or so) Note: I don't have _any_ aggressive Optimization flags (-O2 -fomit-frame-pointer -pipe), they are all considered perfectly stable.
Huch, I forgot to add: I'm on x86-64 (Gentoo-amd64), this might be the important thing here ;)
Is there a way to disable kcrash? That's all I get from KCrash, but I think I'd get more out of gdb. Can you help me with this? Using host libthread_db library "/lib/libthread_db.so.1". 0x00000038202e637e in waitpid () from /lib/libpthread.so.0 #0 0x00000038202e637e in waitpid () from /lib/libpthread.so.0 #1 0x000000381d9efb11 in KCrash::defaultCrashHandler () from /usr/kde/3.3/lib/libkdecore.so.4 #2 0x00000038202e5107 in __pthread_clock_settime () from /lib/libpthread.so.0 #3 0x000000382069c2a0 in killpg () from /lib/libc.so.6 #4 0x0000000000000000 in ?? () #5 0x0000000000000000 in ?? () #6 0x0000000000000000 in ?? () #7 0x0000003800000002 in ?? () #8 0x0000000000000000 in ?? () #9 0xfefefefefefefeff in ?? () #10 0x0000000000c7f4d0 in ?? () #11 0x0000000000000001 in ?? () #12 0x00000075a76a3030 in ?? () #13 0x00000075a76a2f70 in ?? () #14 0x0000000000977170 in ?? () #15 0x0000000000a308e0 in ?? () #16 0x0000000800abb588 in ?? () #17 0x00000075a76a25e0 in ?? () #18 0x00000075a76a2640 in ?? () #19 0x0000000000000000 in ?? () #20 0x0000000800abb588 in ?? () #21 0x0000000800abb588 in ?? () #22 0x0000000000c637b8 in ?? () #23 0x00000075a76a2630 in ?? () #24 0x0000003821d2b088 in QColor::blue (this=0xc3c9000000ff2508) at qcolor.h:196
Allright, I found a way to disable kcrash. But I think, the problem lies in the icq-plugin. I'm not getting a segfault with gdb, the program silently exits (Kcrash reportet a segfault). However, I got some debug-output from kopete, but I don't know how useful this is: kopete (oscar): [virtual void OscarSocket::slotRead()] SNAC(4,20), id=2264531911 kopete (oscar): [virtual void OscarSocket::slotRead()] SNAC(4,20), id=2264532126 kopete (oscar): [virtual void OscarSocket::slotRead()] SNAC(4,7), id=2264532287 kopete (oscar): [void OscarSocket::parseIM(Buffer&)] IM received on channel 2 from '165579559' kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] RECV TYPE-2 message kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] msgType=1, msgFlags=0, status=0, priority=33 kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] RECV TYPE-2 IM, normal/auto message After the last message, kopete died
Try to run it with valgrind ... if it works under x86-64. A good command with it, could be: valgrind --tool=addrcheck --num-callers=30 -- kopete --nofork This will report invalids read/write with a trace up to 30 lines.
Hmm... valgrind does not seem to work on amd64 yet. http://www.durables.org/software/valgrind/ - that's the only amd64 thing I found, and it can only handle 32bit executables.... so this is not an option. But I'm open to any suggestions. kopete is unusable like this...
Uhm... I think the problem most likely lies in the icq-plugin. Kopete-0.8.4 worked for me, so is there a way to get a diff of the icq-plugin between v-0.8.4 and 0.9? May be the diff is not huge....
the diff is huge. your debug output did help though. I'll take a look at it. What client sent you the message and did it have rich text formatting? (colors and stuff like that)
was the user on your contact list?
also, can you run kopete in gdb and get a decent backtrace? you'll need to run it with the '--nofork' argument. Hopefully this will be better than the first backtrace.
-Ok: I tried both with people on the contact list and not, I was able to crash with both, though it took longer with one that was not on my contact list. It was very odd - it took like 2-3 messages until kopete died (I'll try that again though). With somebody on the contact list, kopete instantly died. -There was no text-formatting -When setting --nofork in gdb, kopete would hang on startup. Kopete wouldn't listen to SIGTERM, but only to SIGKILL - even gdb became unresponsive...
Oh, and I just tried running kopete without gdb (Letting kcrash handling the segfault) with --nofork. Kopete wouldn't freeze, but the backtrace I got is the same I already posted
I truly hate spamming, but I just forgot another thing... ;) When running inside gdb with --nofork, the program stalled after the messages (There were a lot more above of course): kopete (jabber): [void JabberResourcePool::removeLock(const XMPP::Jid&)] Removing resource lock for config@jabber.org kopete (jabber): [void JabberResourcePool::removeLock(const XMPP::Jid&)] No locks found. kopete (jabber): [virtual void JabberByteStream::close()] Closing stream. kopete (jabber): [virtual void JabberConnector::connectToServer(const QString&)] Initiating connection to jabber.org kopete (jabber): [bool JabberByteStream::connect(QString, QString)] Connecting to jabber.org, service 5222
Wait a minute.... the debug output is indeed helpful ;) kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] RECV TYPE-2 message kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] msgType=1, msgFlags=0, status=0, priority=33 kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] RECV TYPE-2 IM, normal/auto message That's what I got right? parseAdvanceMessage uses kdebug twice... however, the second time parseAdvanceMessage gets called, The second kdebug won't show (with msgType=.... etc) - so the error must happen in oscarsocket.icq.cpp between lines 786 and 815 Could I be right about this?
ok, so, to make sure that i understand correctly: the first time parseAdvanceMessage gets called, you get the three lines of debug output from your last post. However, the second time parseAdvanceMessage gets called (with a new message) then you don't get the line of debug output that contains "msgType = ..." Is that correct?
Ok, this is really strange.....: You are basically understanding me correctly. I started up kopete, let a friend send me a message, and the last three lines of output I had were the three I pointed out, indicating that the second time, OscarSocket::parseAdvanceMessage got called didn't terminate anymore: (last three lines of kopete) kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] RECV TYPE-2 message kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] msgType=1, msgFlags=0, status=0, priority=33 kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] RECV TYPE-2 IM, normal/auto message However, I have to complicate things a little bit: but I just found somebody, with which it would not crash.... I got a very lengthy output, I have put it on the net: http://www.n.ethz.ch/student/bschindl/kopete The client, my friend was using (where I did get no crash): licq The client, another friend was using (where it did crash): icq (Normal windows version) The messages I received from friend 1(using licq), gave the following output: kopete (oscar): [void OscarSocket::parseSimpleIM(Buffer&, const UserInfo&)] RECV TYPE-1 IM from '42566108' kopete (oscar): [void OscarSocket::parseSimpleIM(Buffer&, const UserInfo&)] TLV(2) kopete (oscar): [void OscarSocket::parseMessage(const UserInfo&, OscarMessage&,BYTE, BYTE)] Got a normal message: 'hmm.. grins..' kopete (oscar): [void OscarAccount::slotReceivedMessage(const QString&, OscarMessage&)] account='57828646', type=0, sender='42566108' etc (I trust you not to misuse these icq-numbers ;) ) When receiving from friend 2 (using regular icq): kopete (oscar): [void OscarSocket::parseIM(Buffer&)] IM received on channel 2 from '294347960' kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] RECV TYPE-2 message kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] msgType=1, msgFlags=0, status=0, priority=33 kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] RECV TYPE-2 IM, normal/auto message KCrash: Application 'kopete' crashing... (Note again, the three times parseAdvanceMessage is listed, and per call, two entries have to be there) And btw... I tried it with somebody using miranda: same thing happens: crash
btw.... is it on purpose that parseAdvancedMessage gets called twice? If I look at the parseSimpleIM, I only have one call listed (2 Debug outputs per call as well)... why is that?
I'm terribly sorry.... I was wrong.. the third output came from later in the function call. So... forget about the calling twice and not terminating (Which I'm not sure abourt)
> btw.... is it on purpose that parseAdvancedMessage gets called twice? If I look at the parseSimpleIM, I only have one call listed (2 Debug outputs per call as well)... why is that? It actually doesn't get called twice, it's just three lines of debug output. :-)
I've seen that you have a lot of kdebugs commented out in the parseAdvanceMessage I'm installing now with uncommented kdebugs - hopefully this gives a clue where it's crashing
All right, I got the output from a crash: parseAdvancedMessage really does not terminage: kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] RECV TYPE-2 message kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] msgType=1, msgFlags=0, status=0, priority=33 kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] RECV TYPE-2 IM, normal/auto message kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] fg color=(0, 0, 0, 0) kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] bg color=(255, 255, 255, 0) kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] Peer announces message is RTF! kopete (oscar): [void OscarSocket::parseAdvanceMessage(Buffer&, UserInfo&, Buffer&)] messagetext='{\rtf1\ansi\ansicpg1252\deff0\deflang1031{\fonttbl{\f0\fswiss\fprq2\fcharset0 MS Sans Serif;}} {\colortbl ;\red21\green43\blue77;} \viewkind4\uc1\pard\cf1\f0\fs22 k\par } kopete (oscar): ' KCrash: Application 'kopete' crashing...
I just forgot to add: This means that it crashes in oscarsocket.icq.cpp between lines 886 and 895
The first line after the last successful kdDebug is this one: OscarContact *contact = static_cast<OscarContact*>(mAccount->contacts() [tocNormalize(user.sn)]); Maybe contact is NULL? Try adding: if( contact ) kdDebug << k_funcinfo << "contact was NULL!" << endl; after that line. See if the message gets printed.
I've looked at ServerToQString, and it looks safe in the case that rtf == true. So I guess that it's this line (line 890): oMsg.setText(msgText, OscarMessage::Rtf); That ends up calling some Flex-generated code. Without a backtrace, I'm not sure I can go any further. But you might want to try to regenerate rtf.cc from rtf.ll (instructions at the top of the latter file -- maybe an old version of Flex itself was generating some non-x86-64-compatible code?). As for your problem running Kopete in gdb (with --nofork), how long did you wait after you got the jabber-related debug messages above? It could be waiting for something to time out.
I'll have a look for it this, but I'm still in the middle of compiling with lots of debug-information. After this, I'll be able to tell you which function failed. I have learned one thing on amd64: NULL != 0 - gnometris crashed because it passed 0 as a NULL-Pointer. Just letting you know. As for --nofork: I tried with the --noconnect switch and then connected manually to icq (which works _very_ fast usually) and the program stalled as well
On Wednesday 18 August 2004 6:22 pm, config@gmx.ch wrote: > I have learned one thing on amd64: NULL != 0 - gnometris > crashed because it passed 0 as a NULL-Pointer. Just letting you know. Interesting. I guess the compiler in question has #define NULL ((void*)0) I read that the 0 -> pointer conversion is buggy on some versions of gcc targetting x86_64, so that could be the cause of that crash. But C++ programmers basically never use the NULL macro, and write 0 instead (mainly because ((void*)0) does not convert to other pointer types without a cast in C++).
I don't really know, but passing to a function 0, and then testing agains NULL will not work, that's why gnometris crashed. As far as I know, gcc is fine - it's just an arch-specific thing (Pointers are 64bit, and a 0 is 32bit, and the conversion does not result in the same thing) http://bugs.gentoo.org/show_bug.cgi?id=48948 - here is the description of the thing
btw... contact was not NULL or the function did not terminate. I forgot to include one debug statement, so I'm compiling again to figure out where it broke... *sigh*
Ouch - they were passing 0 to a varargs function, then trying to read a pointer from the argument list. I'm currently reviewing our code to make sure we've not done the same thing.
OscarContact *contact = static_cast<OscarContact*>(mAccount->contacts() [tocNormalize(user.sn)]); That's where it crashes. I'll look into it (probably not today anymore, since I'm really busy, but tomorrow (too bad this bug will be in release....)
Awww... I just overlooked another few lines... I was wrong again, it doesn't crash there, it doesn't crash where it's running ServerToQString (Which is good news ;) ) It seems to crash on: oMsg.setText(msgText, OscarMessage::Rtf); (I double checked, there is no line saying oMsg.setText finished... which I told it to do....)
Did this parser change over time? I'm asking because it could be the .ll code or flex which is broken. I've seen that there have been some updates to flex in july, so may something broke back then. I'll try it with an older version once... not today anymore though
Looking for this bug I found something else in oscartypes.h: typedef unsigned long DWORD; which is right for x86, but not for x86_64, as long on this platform is a 64bits. shouldn't it be instead: typedef unsigned DWORD; or typedef unsigned int DWORD;
Michel: better would be some automake magic, or uint32_t from inttypes.h. This definitely needs fixing, though it doesn't look to be the cause in this case (not that we can really tell without a backtrace). On the subject of backtraces, I've been told that "SIG32 events need to be ignored by gdb in order to keep kopete running when attempting to connect". I don't know how to set gdb to ignore them, but it sounds like something to try.
http://mythtv.org/pipermail/mythtv-dev/2004-March/020553.html handle SIG32 pass noprint nostop Like this, we can disable SIG32 events - I'll try it out and port results as soon as possibel
I'm sorry to disappoint you again, but kopete stalled again, but it got further this time: It requested the wallet-password, which it didn't before (without ignoring SIG32) That's the output I got when it stalled: kopete (oscar/icq): [const long unsigned int ICQAccount::fullStatus(long unsigned int)] ORing with show ip flag kopete (oscar/icq): [const long unsigned int ICQAccount::fullStatus(long unsigned int)] ORing with web aware flag kopete (oscar/icq): [virtual void ICQAccount::setStatus(long unsigned int, const QString&)] LOGGING IN; accountId='287665734', status=196608, awaymessage= libkopete: [void Kopete::WalletManager::slotWalletChangedStatus()] isOpen: true kopete (oscar): [void OscarSocket::doLogin(const QString&, int, const QString&,const QString&, const QString&, long unsigned int, const QString&)] Connecting to 'login.icq.com', port=5190 kopete (oscar): [void OscarAccount::slotOurStatusChanged(unsigned int)] Called;newStatus=10 kopete: [const Kopete::ContactPropertyTmpl& Kopete::Global::Properties::createProp(const QString&, const QString&, const QString&, bool) const] CREATING NEW ContactPropertyTmpl WITH key = onlineSince, label = Online Since, persisten = false kopete: [Kopete::ContactPropertyTmpl::ContactPropertyTmpl(const QString&, constQString&, const QString&, bool, bool, bool)] Creating new template for key = 'onlineSince' After this, kopete stalled (Still with the disconnected symbol in the systray) I'll try out the WORD thing - but this will take time to compile - again... If you've got any good idea to get a backtrace... please.... ;)
All right: I tried two things: I regenerated the flex-parser with my version generated by my version of flex (which differed a lot from the original version) and I replaced this unsigned long with unsigned int Unfortunatelly, kopete still crashed :(
Hei - I figured a few things - I'm not there yet, but it's a start ;) In QString RTF2HTML::Parse(const char *rtf, const char *_encoding), I added a few kdDebugs.... and indeed, in this function it crashes. The good: It does not crash in yylex - so it seems flex is ok Where does it crash? Well, here is my output: kopete (oscar): [QString RTF2HTML::Parse(const char*, const char*)] Finished yylex - the value of res is: 3 kopete (oscar): [QString RTF2HTML::Parse(const char*, const char*)] Start a for-iteration kopete (oscar): [QString RTF2HTML::Parse(const char*, const char*)] Finished yylex - the value of res is: 3 kopete (oscar): [QString RTF2HTML::Parse(const char*, const char*)] Start a for-iteration kopete (oscar): [QString RTF2HTML::Parse(const char*, const char*)] Finished yylex - the value of res is: 3 kopete (oscar): [QString RTF2HTML::Parse(const char*, const char*)] Start a for-iteration kopete (oscar): [QString RTF2HTML::Parse(const char*, const char*)] Finished yylex - the value of res is: 3 kopete (oscar): [QString RTF2HTML::Parse(const char*, const char*)] Start a for-iteration kopete (oscar): [QString RTF2HTML::Parse(const char*, const char*)] Finished yylex - the value of res is: 3 kopete (oscar): [QString RTF2HTML::Parse(const char*, const char*)] Start a for-iteration kopete (oscar): [QString RTF2HTML::Parse(const char*, const char*)] Finished yylex - the value of res is: 3 kopete (oscar): [QString RTF2HTML::Parse(const char*, const char*)] Start a for-iteration kopete (oscar): [QString RTF2HTML::Parse(const char*, const char*)] Finished yylex - the value of res is: 4 (There were a lot more "Start a for-iteration" etc messages, but I just posted the last few). When res is 4, this means it goes into case TXT, which makes: case TXT: cur_level.setText(yytext); break; I'm now going to trace further, but hey - it's a start :D
Oh, and btw - the reason why it didn't work with one friend is, that the message came in plain format (void OscarMessage::setText(const QString &txt, MessageFormat format)) - which works fine here too.
All right - I've traced it as far I can pretty much. There is still something to check, which I'll do tomorrow (I don't have any friends online anymore which have clients that make me crash) So far, what I have gathered... I'll try to reconstruct the backtrace: void Level::resetTag(TagEnum tag) <-- iterates through the for-loop many times. At the time it crashed, res is 4 void Level::setText(const char *str) <-- m_bcolors is false, as is m_bFontTbl, so it goes into the else part void RTF2HTML::FlushOutTags() <-- The for loop iterates two times. The second time, it crashes. When it crashes, t.tag is TAG_FONT_COLOR (if you look at the original backtrace, this is quite logical). It crashes here: QColor &c = colors[t.param-1]; I wanted to output the size of the colors vector and the value of t.param-1, but now all friends are gone, so I'm on my own for now.
Ok, I'm getting closer now... I got the following odd output of my run: kopete (oscar): [void RTF2HTML::FlushOutTags()] t.param > colors.size() was false... index is: 4294967295 Size of the color vector is: 1 And index is the value: t.params - 1 - this is for sure a very odd value. Where do these values get set? I don't know kopetes code very well
Ok, I have tracked this down. The problem is that t.param is an unsigned int, and if you subtract one and param is 0, you get an overflow and you get that well known upper-bound of int I'm seeing here. I'm attaching a patch which fixes the problem for me
Created attachment 7234 [details] Fixes the crash when receiving rtf/utf messages from icq Doing a boundary check for t.param
wow, thanks for the excellent debugging work and the patch. I should be able to review this by the end of the week and have it in for KDE 3.3.1. RL has been quite busy.
Just a little side-note... I was obvously the first to discover this bug. I suspect, that the bug is amd64 specific. This would mean, that t.param is not supposed to become 0. (Otherwise, it would trigger on 32bit arches too) I did not look further to figure this out, since it would require good knowledge of the code.
CVS commit by mattr: Fix byg 87390 with the patch provided by the reporter. Thanks again for the excellent debugging and patch. Should be in KDE 3.3.1 and will be forward ported shortly CCMAIL: 87390-done@bugs.kde.org M +1 -1 rtf.cc 1.3.2.1 M +1 -1 rtf.ll 1.4.2.1 --- kdenetwork/kopete/protocols/oscar/oscarsocket/rtf.cc #1.3:1.3.2.1 @@ -1791,5 +1791,5 @@ void RTF2HTML::FlushOutTags() { // RTF colors are 1-based; colors[] is a 0-based array. - if (t.param > colors.size()) + if (t.param > colors.size() || t.param == 0) break; QColor &c = colors[t.param-1]; --- kdenetwork/kopete/protocols/oscar/oscarsocket/rtf.ll #1.4:1.4.2.1 @@ -126,5 +126,5 @@ { // RTF colors are 1-based; colors[] is a 0-based array. - if (t.param > colors.size()) + if (t.param > colors.size() || t.param == 0) break; QColor &c = colors[t.param-1];
*** Bug 88748 has been marked as a duplicate of this bug. ***