Summary: | gcc bug! appletproxy crashed (SIGSEGV) when changing to clock analog | | |
---|---|---|---|
Product: | [Unmaintained] configure | Reporter: | Roger Larsson <roger.larsson> |
Component: | general | Assignee: | Christian Gebauer <gebauer> |
Status: | RESOLVED NOT A BUG | | |
Severity: | crash | | |
Priority: | VHI | | |
Version: | CVS | | |
Target Milestone: | --- | | |
Platform: | openSUSE | | |
OS: | Linux | | |
Attachments: | Backtrace of crash when changing to analog clock with seconds | | |
Description
Roger Larsson
2003-12-17 08:27:32 UTC
Created attachment 3737 [details]
Backtrace of crash when changing to analog clock with seconds
The crash when resetting Date is not 100% reproducible;
switching among the different clock types can also crash.
But Analog with seconds always does. A race?
I cannot reproduce this with KDE CVS as of December 16th 2003. I start appletproxy as "appletproxy clockapplet" and then keep switching between different configurations of the Analog clock: no crash.

Please paste backtraces instead of attaching them:

[New Thread 16384 (LWP 1626)]
0x41260a86 in waitpid () from /lib/i686/libpthread.so.0
#0 0x41260a86 in waitpid () from /lib/i686/libpthread.so.0
#1 0x406f2a20 in KCrash::defaultCrashHandler(int) () from /opt/kdecvs/lib/libkdecore.so.4
#2 0x4125f96c in __pthread_sighandler () from /lib/i686/libpthread.so.0
#3 <signal handler called>
#4 0x4166dc2d in AnalogClock::paintEvent(QPaintEvent*) (this=0x8294fe0) at clock.cpp:546
#5 0x40b54f56 in QWidget::event(QEvent*) (this=0x8294fe0, e=0xbfffea00) at kernel/qwidget.cpp:4529
#6 0x40aba6fb in QApplication::internalNotify(QObject*, QEvent*) (this=0xbffff0bc, receiver=0x8294fe0, e=0xbfffea00) at kernel/qapplication.cpp:2582
#7 0x40aba32b in QApplication::notify(QObject*, QEvent*) (this=0xbffff0bc, receiver=0x8294fe0, e=0xbfffea00) at kernel/qapplication.cpp:2470
#8 0x406759f3 in KApplication::notify(QObject*, QEvent*) () from /opt/kdecvs/lib/libkdecore.so.4
#9 0x40a5177f in QApplication::sendEvent(QObject*, QEvent*) (receiver=0x8294fe0, event=0xbfffea00) at qapplication.h:492
#10 0x40a84a69 in QWidget::repaint(int, int, int, int, bool) (this=0x8294fe0, x=0, y=0, w=54, h=54, erase=false) at kernel/qwidget_x11.cpp:1489
#11 0x40b569b2 in QWidget::repaint(QRect const&, bool) (this=0x8294fe0, r=@0xbfffeaa0, erase=false) at qwidget.h:813
#12 0x40b56534 in QWidget::repaint(bool) (this=0x8294fe0, erase=false) at kernel/qwidget.cpp:5803
#13 0x4166d673 in AnalogClock::updateClock() (this=0x8294fe0) at clock.cpp:450
#14 0x41670b9e in ClockApplet::slotUpdate() (this=0x817a250) at clock.cpp:1020
#15 0x41672cbc in ClockApplet::qt_invoke(int, QUObject*) (this=0x817a250, _id=46, _o=0xbfffebac) at clock.moc:573
#16 0x40b1d4bd in QObject::activate_signal(QConnectionList*, QUObject*) (this=0x8192b90, clist=0x81b2468, o=0xbfffebac) at kernel/qobject.cpp:2333
#17 0x40b1d35c in QObject::activate_signal(int) (this=0x8192b90, signal=2) at kernel/qobject.cpp:2302
#18 0x40e61cda in QTimer::timeout() (this=0x8192b90) at .moc/debug-shared-mt/moc_qtimer.cpp:82
#19 0x40b41aeb in QTimer::event(QEvent*) (this=0x8192b90, e=0xbfffedfc) at kernel/qtimer.cpp:219
#20 0x40aba6fb in QApplication::internalNotify(QObject*, QEvent*) (this=0xbffff0bc, receiver=0x8192b90, e=0xbfffedfc) at kernel/qapplication.cpp:2582
#21 0x40ab9bb8 in QApplication::notify(QObject*, QEvent*) (this=0xbffff0bc, receiver=0x8192b90, e=0xbfffedfc) at kernel/qapplication.cpp:2305
#22 0x406759f3 in KApplication::notify(QObject*, QEvent*) () from /opt/kdecvs/lib/libkdecore.so.4
#23 0x40a5177f in QApplication::sendEvent(QObject*, QEvent*) (receiver=0x8192b90, event=0xbfffedfc) at qapplication.h:492
#24 0x40aa87c8 in QEventLoop::activateTimers() (this=0x8127b78) at kernel/qeventloop_unix.cpp:557
#25 0x40a63096 in QEventLoop::processEvents(unsigned) (this=0x8127b78, flags=4) at kernel/qeventloop_x11.cpp:346
#26 0x40ad01d6 in QEventLoop::enterLoop() (this=0x8127b78) at kernel/qeventloop.cpp:198
#27 0x40ad00f2 in QEventLoop::exec() (this=0x8127b78) at kernel/qeventloop.cpp:145
#28 0x40aba87b in QApplication::exec() (this=0xbffff0bc) at kernel/qapplication.cpp:2705
#29 0x407d780c in kdemain (argc=8, argv=0x80808b8) at appletproxy.cpp:119
#30 0x0804d58c in launch(int, char const*, char const*, char const*, int, char const*, bool, char const*, bool, char const*) ()
#31 0x0804e28e in handle_launcher_request(int) ()
#32 0x0804e753 in handle_requests(int) ()
#33 0x0804f6c5 in main ()

Recompiled (2004-01-11), still crashes... bad...
qt-x11-free-3.2.3 from qt-copy
Is it only me? If not, it is an ugly one that you do not want reviewers to run into.

This works in SLAX 3.0.25. How come?
My KDE was compiled with -march=pentium3 -Os. I guess SLAX was not...
If I compile with -march=pentium2 -O1 it works!
My assumption is that gcc might add an instruction that I do not have, or makes an erroneous conversion, or optimizes to get into a race (valgrind does not complain - much).
Suspect instructions are: cvtsi2ss, movss

gcc --version
gcc (GCC) 3.3.1 (SuSE Linux)

(I do have a pentium3! From /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 8
model name : Pentium III (Coppermine)
stepping : 6
cpu MHz : 935.001
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips : 1867.77
)

When compiling clock.cpp for -march=pentium3 -O1 you will get cvtsi2ss instructions... But not all pentium3 systems have these instructions! The OS decides! (There are some security issues - so Linux does not enable them = no xmm)

objdump -d kdebase/kicker/applets/clock/.libs/clock.o | grep cvtsi2ss
2bc2: f3 0f 2a 85 b4 fc ff cvtsi2ss 0xfffffcb4(%ebp),%xmm0
2c54: f3 0f 2a c6 cvtsi2ss %esi,%xmm0
2d4d: f3 0f 2a c0 cvtsi2ss %eax,%xmm0
2e56: f3 0f 2a c0 cvtsi2ss %eax,%xmm0
2f96: f3 0f 2a 85 b4 fc ff cvtsi2ss 0xfffffcb4(%ebp),%xmm0

If I check my /opt/kdecvs/bin I find the following files that have this instruction, too many to believe but... I report it here now and check it further tomorrow...
(Some might need -Os to get the instruction, they might be placed in a rare spot)

artsbuilder artscat artscontrol artsrec artstracker fsview gwenview k3b kasteroids kbackgammon kblackbox kblob.kss kbruch kcachegrind kdat kdf keuphoria.kss kfax kfiresaver3d kflux.kss kfontinst kfouleggs kfountain.kss kgoldrunner kgravity.kss khexedit kiten kjumpingcube klickety kmailcvt kmines kmplot kooka kpaint kpat kppp kppplogview krdc kreatecd kreversi krfb ksirtet ksnapshot ksolarwinds.kss kspace.kss kstars ksysguard ktouch kvoctrain kwave.kss kwikdisk mpeglibartsplay quanta rosegarden sfconvert sfinfo umbrello

A possible workaround is to compile with
-mcpu=pentium3 -march=pentium2

According to Intel's Pentium III Instruction Set Reference manual (ftp://download.intel.com/design/Pentium4/manuals/24547112.pdf), the cvtsi2ss instruction can generate the Undefined Instruction exception if the kernel didn't activate support for the fxsave and restore instructions. I know Linux supports those, but not in every system. Was your kernel compiled for Pentium III?

There are also a couple of other conditions (bits TS or EM in CR0; floating-point exceptions), but I don't believe those to be the case. And I can't test because my Athlon processor doesn't support that instruction.

In any event, this is NOT a KDE bug. You want to report here: http://gcc.gnu.org/bugzilla/

- Not a KDE bug.
- Not a gcc bug. The Pentium3 as such has this feature.
- Not a Linux kernel bug. It protects against a minor, rare security problem.

But it will hurt KDE - applications crash all over...
It should be mentioned in the release documentation!
"Do not build for the Pentium3 architecture"

I was just asking for more information to see if it was a kernel bug or a gcc bug.

Can someone with a Pentium III processor (a Coppermine if possible) and a Pentium III-targeted Linux kernel please try the following program?
$ cat test.s
.global _start
_start:
        xorl %eax,%eax
        cvtsi2ss %eax,%xmm0
        movl %eax, 1
        int $0x80
$ gcc -nostdlib -o test test.s

If it crashes, http://bugzilla.kernel.org is where you should report.

-----------
(gdb) disassemble _start
Dump of assembler code for function _start:
0x08048074 <_start+0>: xor %eax,%eax
0x08048076 <_start+2>: cvtsi2ss %eax,%xmm0
0x0804807a <_start+6>: mov %eax,0x1
0x0804807f <_start+11>: int $0x80
End of assembler dump.
(gdb) n
The program is not being run.
(gdb) run
Starting program: /home/roger/test

Program received signal SIGSEGV, Segmentation fault.
0x0804807a in _start ()
(gdb) cont
Continuing.

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
----------------
The strange thing is that it says nothing when running from command line directly...
Oh... test is a bad name, there is a test in bash...
./test crashes!

(gdb) r
Starting program: /tmp/test

Program received signal SIGILL, Illegal instruction.

The crash is different then.

Subject: Re: gcc/linux bug! appletproxy crashed when changing to clock analog

kclockapplet does not crash on exactly that line but soon after...
And I can not get kasteroids to crash... hmm...

Ahh.. Bugs in your test program, it should look like this...

.global _start
_start:
        xorl %eax, %eax
        cvtsi2ss %eax,%xmm0
        movl $1, %eax
        int $0x80

Then it will run on my Coppermine (without xmm in flags... so Linux does not support it. Store / restore...)

So either it is a race in kclockapplet or it is Linux that does not save (it says so...)

Tried another thing...
Modified the kdebase/kicker/applets/clock/Makefile to have
-mcpu=pentium3 -march=pentium2 -O3
to get maximum optimization for my CPU without getting the xmm instructions.
# touch clock.cpp
# make install
This works as expected.

Then change -march to pentium3
-mcpu=pentium3 -march=pentium3 -O3
this enables the xmm instructions.
# touch clock.cpp
# make install
This crashes kickers analog clock applet!
Earlier I got it to crash with -march=pentium3 -O1

Anyone can try to compile this file with maximum optimization for their processor to see if it is a race - any takers?

Ah, thanks for the fix. I had never noticed it because I get a SIGILL in the instruction.

Now, I didn't quite understand: you said the program runs fine in your system, is that so? If so, why do you claim Linux doesn't support it?

Also note that OS support is required for XMM instructions to be enabled (since fnsave won't save everything, but fxsave will). Your kernel must be compiled with proper support, try checking that.

As for checking the exact crash locus, in GDB run:
(gdb) x/a $pc
for a dump of the registers:
(gdb) info reg
(gdb) info all-reg
(or simply i r)

It looks like the 'xmm' flag has been renamed 'sse'. So it has nothing to do with these instructions...

- Linux kernel is probably correct.
- GCC might generate erroneous code for clock.cpp with -march=pentium3
- There might be a race in the code that only shows with full optimization. (contraindication: -march=pentium3 -O0 is buggy, while -march=pentium2 -mcpu=pentium3 -O3 is not)

It would be interesting to try this with other 'sse' capable architectures.

Argh...
-march=pentium3 -O0 is OK, it does not generate 'sse' instructions
-march=pentium3 -O1 is NOK, generates 'sse', but should(?) be slower than
-march=pentium2 -mcpu=pentium3 -O3 that is OK, no 'sse'

gcc generates an unaligned 128 bit store => crash (this is a gcc bug!)

New test code.
# cat > test2.s
.global _start
_start:
        xorl %eax, %eax
        cvtsi2ss %eax,%xmm0
        movl %esp,%ebp
        movaps %xmm0, -16(%ebp)
        movaps %xmm0, -8(%ebp) # simplified case, uncomment
        movaps %xmm0, 0xfffffc78(%ebp) # gets used in clock
        movl $1, %eax
        int $0x80
# gcc -Wall -g -nostdlib -o test2 test2.s
# gdb ./test2

A processor with sse (Pentium3 and forward) should not crash on the first movaps. Pentium3 crashes on either of the following. What about others?
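The alignment arithmetic behind test2.s can be worked through numerically. The following is only an illustrative Python sketch, not something from the original report: the helper names signed32 and is_movaps_aligned and the two sample %ebp values are made up for the example. It relies on the fact that movaps requires its memory operand to sit on a 16-byte boundary, so -16(%ebp) and -8(%ebp) can never both be aligned for the same %ebp.

```python
def signed32(value):
    """Interpret a 32-bit displacement such as 0xfffffc78 as a signed offset."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def is_movaps_aligned(ebp, displacement):
    """True if displacement(%ebp) lands on a 16-byte boundary."""
    return ((ebp + signed32(displacement)) & 0xFFFFFFFF) % 16 == 0


# Hypothetical frame pointers: one 16-byte aligned, one offset by 8.
for ebp in (0xBFFFF0C0, 0xBFFFF0C8):
    for disp in (-16, -8, 0xFFFFFC78):
        print(hex(ebp), signed32(disp), is_movaps_aligned(ebp, disp))
```

The displacement 0xfffffc78 used in clock is -904, which is 8 modulo 16, so that store is only aligned when %ebp itself is 8 past a 16-byte boundary.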
Forgot gcc bug number:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13685

If you get a SIGSEGV in an application you can try:
# gdb application
(gdb) run --nocrashhandler [--nofork]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16384 (LWP 18506)]
0x41a84d47 in KivioArrowHead::paintArrowTriangle(KivioArrowHeadData*, bool) (this=0x83500d0, d=0xbfffe7a0, solid=false) at kivio_arrowhead.cpp:296
296 nvecX = - vecX / length;
(gdb) disassemble 0x41a84d47
- - -
0x41a84d45 <KivioArrowHead::paintArrowTriangle+147>: fdiv %st(1),%st
0x41a84d47 <...+149>: movaps %xmm0,0xffffff80(%ebp)
0x41a84d4b <...+153>: fstps 0xffffff98(%ebp)
- - -
[Note: text inside <> is mangled.
# echo _ZN14KivioArrowHead9paintForkEP18KivioArrowHeadData | c++filt
KivioArrowHead::paintFork(KivioArrowHeadData*)]

If the instruction on the crashed line ends with 'ps' or is cvtps2pi or cvtss2si, then you have found yet another DUP.

The bug is that gcc forgets to align (16 bytes) data on the stack, so if the sum of 0xffffff80 + %ebp is not 0x???????0 it will crash!

*** Bug 72672 has been marked as a duplicate of this bug. ***

Found another working option combination today:
-march=pentium3 -mno-sse

According to the gcc texinfo, it would seem that the 387 fpmath unit is chosen in all but the Athlon64 compiler. See:
info:/gcc/i386 and x86-64 Options

That is true for math (add, sub, mul, div) but not for movement to/from stack. And it is those instructions that are problematic! (gcc 3.1.1 SuSE)

You can check it yourself.
Modify Makefile in kdebase/kicker/applets/clock to include "-march=pentium3 -O1"
# touch clock.cpp
# make
# objdump -d .libs/clock.o | grep xmm
   2b4b: f3 0f 10 83 00 00 00 movss 0x0(%ebx),%xmm0
=> 2b53: 0f 29 85 78 fc ff ff movaps %xmm0,0xfffffc78(%ebp)
   2bc2: f3 0f 2a 85 b4 fc ff cvtsi2ss 0xfffffcb4(%ebp),%xmm0
   2bca: f3 0f 11 85 44 fc ff movss %xmm0,0xfffffc44(%ebp)
   2c54: f3 0f 2a c6 cvtsi2ss %esi,%xmm0
   2c58: f3 0f 11 85 9c fc ff movss %xmm0,0xfffffc9c(%ebp)
   2d4d: f3 0f 2a c0 cvtsi2ss %eax,%xmm0
   2d51: f3 0f 11 85 74 fc ff movss %xmm0,0xfffffc74(%ebp)
   2e56: f3 0f 2a c0 cvtsi2ss %eax,%xmm0
   2e5a: f3 0f 11 85 70 fc ff movss %xmm0,0xfffffc70(%ebp)
   2f96: f3 0f 2a 85 b4 fc ff cvtsi2ss 0xfffffcb4(%ebp),%xmm0
   2f9e: f3 0f 11 85 44 fc ff movss %xmm0,0xfffffc44(%ebp)

The one with an arrow is the problematic one, since it is a 128 bit memory movement to an unaligned location...

Boys, it might be fun playing with it, but you really should find another forum for playing with gcc options.

Coolo's right. Let's stop it here. It's a gcc bug and a major one at that. Your gcc bug ticket seems best:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13685
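For anyone who wants to repeat the objdump check on other object files, it can be scripted. This is a rough Python sketch that assumes input lines in the `objdump -d` format quoted in this report; find_movaps_stores is a hypothetical helper, not an existing tool. It only reports the %ebp displacement of each movaps store, because whether a given store actually faults depends on the run-time value of %ebp.

```python
import re

# Matches e.g. "2b53: 0f 29 85 78 fc ff ff movaps %xmm0,0xfffffc78(%ebp)"
MOVAPS_STORE = re.compile(
    r"^\s*([0-9a-f]+):.*\bmovaps\s+%xmm\d,\s*(-?(?:0x[0-9a-f]+|\d+))\(%ebp\)"
)


def find_movaps_stores(lines):
    """Return (code_offset, signed_displacement) for each movaps store to disp(%ebp)."""
    hits = []
    for line in lines:
        m = MOVAPS_STORE.match(line)
        if not m:
            continue
        disp = int(m.group(2), 0)      # accepts "0x..." and plain decimal
        if disp >= 0x80000000:         # objdump prints negatives as 32-bit hex
            disp -= 0x100000000
        hits.append((int(m.group(1), 16), disp))
    return hits


# Sample lines copied from the listing in this report.
sample = [
    "    2b4b: f3 0f 10 83 00 00 00 movss 0x0(%ebx),%xmm0",
    "    2b53: 0f 29 85 78 fc ff ff movaps %xmm0,0xfffffc78(%ebp)",
    "    2bca: f3 0f 11 85 44 fc ff movss %xmm0,0xfffffc44(%ebp)",
]
for offset, disp in find_movaps_stores(sample):
    print(hex(offset), disp, disp % 16)
```

The movss lines are skipped on purpose: they move 32 bits and carry no 16-byte alignment requirement.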