Bug 70655 - gcc bug! appletproxy crashed (SIGSEGV) when changing to clock analog
Status: RESOLVED NOT A BUG
Alias: None
Product: configure
Classification: Unmaintained
Component: general
Version: CVS
Platform: openSUSE Linux
Importance: VHI crash
Target Milestone: ---
Assignee: Christian Gebauer
Duplicates: 72672
Reported: 2003-12-17 08:27 UTC by Roger Larsson
Modified: 2004-01-16 16:45 UTC (History)

Attachments
Backtrace of crash when changing to analog clock with seconds (3.44 KB, text/plain)
2003-12-17 08:32 UTC, Roger Larsson

Description Roger Larsson 2003-12-17 08:27:32 UTC
Version:           unknown (using KDE 3.1.94 (CVS >= 20031206), compiled sources)
Compiler:          gcc version 3.3.1 (SuSE Linux)
OS:          Linux (i686) release 2.4.21-144-default

When trying to change to "Analog clock" with "Seconds"
appletproxy crashes.
Changing to other clock types works, including other Analog types.

But changing back from Analog with Date (no seconds)
to plain Analog (no Display options) also crashes.
Comment 1 Roger Larsson 2003-12-17 08:32:43 UTC
Created attachment 3737 [details]
Backtrace of crash when changing to analog clock with seconds

The crash when resetting Date is not 100%
reproducible - switching among the different types
can give crashes. But Analog with seconds always
crashes - a race?
Comment 2 Frerich Raabe 2003-12-17 14:51:07 UTC
I cannot reproduce this with KDE CVS as of December 16th 2003. I start appletproxy as "appletproxy clockapplet" and then keep switching between different configurations of the Analog clock - no crash.
Comment 3 Thiago Macieira 2003-12-17 17:10:22 UTC
Please paste backtraces instead of attaching them:

[New Thread 16384 (LWP 1626)]
0x41260a86 in waitpid () from /lib/i686/libpthread.so.0
#0  0x41260a86 in waitpid () from /lib/i686/libpthread.so.0
#1  0x406f2a20 in KCrash::defaultCrashHandler(int) ()
   from /opt/kdecvs/lib/libkdecore.so.4
#2  0x4125f96c in __pthread_sighandler () from /lib/i686/libpthread.so.0
#3  <signal handler called>
#4  0x4166dc2d in AnalogClock::paintEvent(QPaintEvent*) (this=0x8294fe0)
    at clock.cpp:546
#5  0x40b54f56 in QWidget::event(QEvent*) (this=0x8294fe0, e=0xbfffea00)
    at kernel/qwidget.cpp:4529
#6  0x40aba6fb in QApplication::internalNotify(QObject*, QEvent*) (
    this=0xbffff0bc, receiver=0x8294fe0, e=0xbfffea00)
    at kernel/qapplication.cpp:2582
#7  0x40aba32b in QApplication::notify(QObject*, QEvent*) (this=0xbffff0bc, 
    receiver=0x8294fe0, e=0xbfffea00) at kernel/qapplication.cpp:2470
#8  0x406759f3 in KApplication::notify(QObject*, QEvent*) ()
   from /opt/kdecvs/lib/libkdecore.so.4
#9  0x40a5177f in QApplication::sendEvent(QObject*, QEvent*) (
    receiver=0x8294fe0, event=0xbfffea00) at qapplication.h:492
#10 0x40a84a69 in QWidget::repaint(int, int, int, int, bool) (this=0x8294fe0, 
    x=0, y=0, w=54, h=54, erase=false) at kernel/qwidget_x11.cpp:1489
#11 0x40b569b2 in QWidget::repaint(QRect const&, bool) (this=0x8294fe0, 
    r=@0xbfffeaa0, erase=false) at qwidget.h:813
#12 0x40b56534 in QWidget::repaint(bool) (this=0x8294fe0, erase=false)
    at kernel/qwidget.cpp:5803
#13 0x4166d673 in AnalogClock::updateClock() (this=0x8294fe0) at clock.cpp:450
#14 0x41670b9e in ClockApplet::slotUpdate() (this=0x817a250) at clock.cpp:1020
#15 0x41672cbc in ClockApplet::qt_invoke(int, QUObject*) (this=0x817a250, 
    _id=46, _o=0xbfffebac) at clock.moc:573
#16 0x40b1d4bd in QObject::activate_signal(QConnectionList*, QUObject*) (
    this=0x8192b90, clist=0x81b2468, o=0xbfffebac) at kernel/qobject.cpp:2333
#17 0x40b1d35c in QObject::activate_signal(int) (this=0x8192b90, signal=2)
    at kernel/qobject.cpp:2302
#18 0x40e61cda in QTimer::timeout() (this=0x8192b90)
    at .moc/debug-shared-mt/moc_qtimer.cpp:82
#19 0x40b41aeb in QTimer::event(QEvent*) (this=0x8192b90, e=0xbfffedfc)
    at kernel/qtimer.cpp:219
#20 0x40aba6fb in QApplication::internalNotify(QObject*, QEvent*) (
    this=0xbffff0bc, receiver=0x8192b90, e=0xbfffedfc)
    at kernel/qapplication.cpp:2582
#21 0x40ab9bb8 in QApplication::notify(QObject*, QEvent*) (this=0xbffff0bc, 
    receiver=0x8192b90, e=0xbfffedfc) at kernel/qapplication.cpp:2305
#22 0x406759f3 in KApplication::notify(QObject*, QEvent*) ()
   from /opt/kdecvs/lib/libkdecore.so.4
#23 0x40a5177f in QApplication::sendEvent(QObject*, QEvent*) (
    receiver=0x8192b90, event=0xbfffedfc) at qapplication.h:492
#24 0x40aa87c8 in QEventLoop::activateTimers() (this=0x8127b78)
    at kernel/qeventloop_unix.cpp:557
#25 0x40a63096 in QEventLoop::processEvents(unsigned) (this=0x8127b78, flags=4)
    at kernel/qeventloop_x11.cpp:346
#26 0x40ad01d6 in QEventLoop::enterLoop() (this=0x8127b78)
    at kernel/qeventloop.cpp:198
#27 0x40ad00f2 in QEventLoop::exec() (this=0x8127b78)
    at kernel/qeventloop.cpp:145
#28 0x40aba87b in QApplication::exec() (this=0xbffff0bc)
    at kernel/qapplication.cpp:2705
#29 0x407d780c in kdemain (argc=8, argv=0x80808b8) at appletproxy.cpp:119
#30 0x0804d58c in launch(int, char const*, char const*, char const*, int, char const*, bool, char const*, bool, char const*) ()
#31 0x0804e28e in handle_launcher_request(int) ()
#32 0x0804e753 in handle_requests(int) ()
#33 0x0804f6c5 in main ()
Comment 4 Roger Larsson 2004-01-12 09:31:24 UTC
Recompiled (2004-01-11) still crashes... bad...
qt-x11-free-3.2.3 from qt-copy
Is it only me? If not, it is an ugly one that you would not like reviewers to
run into.
Comment 5 Roger Larsson 2004-01-13 00:01:25 UTC
This works in SLAX 3.0.25 - how come...
Comment 6 Roger Larsson 2004-01-14 01:15:58 UTC
My KDE was compiled with -march=pentium3 -Os; I guess SLAX was not...
If I compile with -march=pentium2 -O1 it works!

My assumption is that gcc might add an instruction that I do not have,
or makes an erroneous conversion, or optimizes to get into a race
(valgrind does not complain - much)

Suspect instructions are: cvtsi2ss, movss

gcc --version
gcc (GCC) 3.3.1 (SuSE Linux)

(I do have a pentium3! From /proc/cpuinfo

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Pentium III (Coppermine)
stepping        : 6
cpu MHz         : 935.001
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 1867.77
)
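
A minimal sketch (Python, not part of the original report) of checking a /proc/cpuinfo dump like the one above for SSE. The helper names `cpu_flags` and `has_sse` are hypothetical. Accepting "xmm" as an older spelling of the same flag is an assumption based on the rename mentioned later in this thread.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_sse(cpuinfo_text):
    """True if the flags line lists SSE under either spelling (assumption: 'xmm' == 'sse')."""
    flags = cpu_flags(cpuinfo_text)
    return "sse" in flags or "xmm" in flags
```

On a Linux system this could be fed the contents of /proc/cpuinfo. Note that the flag only says the CPU and kernel expose SSE; it says nothing about whether compiled code uses it correctly.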
Comment 7 Roger Larsson 2004-01-14 02:16:03 UTC
When compiling clock.cpp with -march=pentium3 -O1 you will get cvtsi2ss instructions...


But not all pentium3 systems have these instructions! The OS decides!
(There are some security issues - so Linux does not enable them = no xmm)

objdump -d kdebase/kicker/applets/clock/.libs/clock.o | grep cvtsi2ss
    2bc2:       f3 0f 2a 85 b4 fc ff    cvtsi2ss 0xfffffcb4(%ebp),%xmm0
    2c54:       f3 0f 2a c6             cvtsi2ss %esi,%xmm0
    2d4d:       f3 0f 2a c0             cvtsi2ss %eax,%xmm0
    2e56:       f3 0f 2a c0             cvtsi2ss %eax,%xmm0
    2f96:       f3 0f 2a 85 b4 fc ff    cvtsi2ss 0xfffffcb4(%ebp),%xmm0

If I check my /opt/kdecvs/bin I find the following files that have this
instruction, too many to believe but... I report it here now and will check it
further tomorrow...
(Some might need -Os to get the instruction, they might be placed in
a rare spot)

artsbuilder
artscat
artscontrol
artsrec
artstracker
fsview
gwenview
k3b
kasteroids
kbackgammon
kblackbox
kblob.kss
kbruch
kcachegrind
kdat
kdf
keuphoria.kss
kfax
kfiresaver3d
kflux.kss
kfontinst
kfouleggs
kfountain.kss
kgoldrunner
kgravity.kss
khexedit
kiten
kjumpingcube
klickety
kmailcvt
kmines
kmplot
kooka
kpaint
kpat
kppp
kppplogview
krdc
kreatecd
kreversi
krfb
ksirtet
ksnapshot
ksolarwinds.kss
kspace.kss
kstars
ksysguard
ktouch
kvoctrain
kwave.kss
kwikdisk
mpeglibartsplay
quanta
rosegarden
sfconvert
sfinfo
umbrello

A possible workaround is to compile with
-mcpu=pentium3 -march=pentium2
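
A rough sketch (Python, not from the original report) of the kind of scan described above: run `objdump -d` on a file and keep the lines that contain one of the suspect mnemonics. The function names `sse_lines` and `scan_binary` are hypothetical, and the default mnemonic list is only the two suspects named earlier in this thread.

```python
import subprocess


def sse_lines(disassembly, mnemonics=("cvtsi2ss", "movss")):
    """Return the disassembly lines that contain one of the given mnemonics."""
    hits = []
    for line in disassembly.splitlines():
        # Split on whitespace so 'movss' does not match inside a longer word.
        if any(word in mnemonics for word in line.split()):
            hits.append(line.strip())
    return hits


def scan_binary(path):
    """Disassemble 'path' with objdump -d (must be installed) and filter the output."""
    result = subprocess.run(
        ["objdump", "-d", path], capture_output=True, text=True, check=True
    )
    return sse_lines(result.stdout)
```

A hit only shows that the instruction is present in the file, not that it is ever executed or that it is the cause of a crash.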
Comment 8 Thiago Macieira 2004-01-14 06:22:25 UTC
According to Intel's Pentium III Instruction Set Reference manual (ftp://download.intel.com/design/Pentium4/manuals/24547112.pdf), the cvtsi2ss instruction can generate the Undefined Instruction exception if the kernel didn't activate support for fxsave and restore instructions. 

I know Linux supports those, but not in every system. Was your kernel compiled for Pentium III? 

There are also a couple other conditions (bits TS or EM in CR0; floating-point exceptions), but I don't believe those to be the case. And I can't test because my Athlon processor doesn't support that instruction.

In any event, this is NOT a KDE bug.
Comment 9 Stephan Kulow 2004-01-14 09:26:33 UTC
you want to report here: http://gcc.gnu.org/bugzilla/

Comment 10 Roger Larsson 2004-01-14 14:08:52 UTC
- Not a KDE bug.
- Not a gcc bug, since the Pentium3 as such has this feature.
- Not a Linux kernel bug. It protects against a minor, rare security problem.

But it will hurt KDE - applications crash all over...

It should be mentioned in the release documentation!
"Do not build for the Pentium3 architecture"
Comment 11 Thiago Macieira 2004-01-14 14:57:49 UTC
I was just asking for more information to see if it was a kernel bug or a gcc bug. Can someone with a Pentium III processor (a Coppermine if possible) and a Pentium III-targeted Linux kernel please try the following program?

$ cat test.s
.global _start
_start:
        xorl    %eax,%eax
        cvtsi2ss %eax,%xmm0

        movl    %eax, 1
        int     $0x80
$ gcc -nostdlib -o test test.s

If it crashes, http://bugzilla.kernel.org is where you should report.
Comment 12 Roger Larsson 2004-01-14 15:17:32 UTC
-----------
(gdb) disassemble _start
Dump of assembler code for function _start:
0x08048074 <_start+0>:  xor    %eax,%eax
0x08048076 <_start+2>:  cvtsi2ss %eax,%xmm0
0x0804807a <_start+6>:  mov    %eax,0x1
0x0804807f <_start+11>: int    $0x80
End of assembler dump.
(gdb) n
The program is not being run.
(gdb) run
Starting program: /home/roger/test

Program received signal SIGSEGV, Segmentation fault.
0x0804807a in _start ()
(gdb) cont
Continuing.

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
----------------
The strange thing is that it says nothing when run from the command line
directly... Oh... test is a bad name, there is a test in bash...
./test crashes!
Comment 13 Thiago Macieira 2004-01-14 15:36:48 UTC
(gdb) r
Starting program: /tmp/test

Program received signal SIGILL, Illegal instruction.

The crash is different then.
Comment 14 Roger Larsson 2004-01-14 16:22:32 UTC
Subject: Re:  gcc/linux bug! appletproxy crashed when changing to clock analog

kclockapplet does not crash on exactly that line but soon after...
And I can not get kasteroids to crash... hmm...

Ahh.. Bugs in your test program, it should look like this...

.global _start
_start:
        xorl	%eax, %eax
        cvtsi2ss %eax,%xmm0

        movl    $1, %eax
        int     $0x80

Then it will run on my Coppermine (without xmm in flags... so Linux does not 
support it. Store / restore...)

So either it is a race in kclockapplet or it is Linux that does not save (it 
says so...)

Comment 15 Roger Larsson 2004-01-14 16:52:54 UTC
Tried another thing...

Modified the kdebase/kicker/applets/clock/Makefile to have
-mcpu=pentium3 -march=pentium2 -O3
to get maximum optimization for my CPU without getting the xmm
instructions.

# touch clock.cpp
# make install
This works as expected.

Then change -march to pentium3
-mcpu=pentium3 -march=pentium3 -O3
this enables the xmm instructions.
# touch clock.cpp
# make install
This crashes kicker's analog clock applet!

Earlier I got it to crash with -march=pentium3 -O1
Anyone can try to compile this file with maximum optimization for their processor to see if it is a race - any takers?
Comment 16 Thiago Macieira 2004-01-14 17:36:50 UTC
Ah, thanks for the fix. I had never noticed it because I get a SIGILL in the instruction. Now, I didn't quite understand: you said the program runs fine in your system, is that so? If so, why do you claim Linux doesn't support it?

Also note that OS support is required for XMM instructions to be enabled (since fnsave won't save everything, but fxsave will). Your kernel must be compiled with proper support, try checking that.

As for checking the exact crash locus, in GDB run:
(gdb) x/a $pc
for a dump of the registers:
(gdb) info reg
(gdb) info all-reg
(or simply i r)
Comment 17 Roger Larsson 2004-01-14 18:46:17 UTC
It looks like the 'xmm' flag has been renamed 'sse'.
So it has nothing to do with these instructions...

- Linux kernel is probably correct.
- GCC might generate erroneous code for clock.cpp with -march=pentium3
- There might be a race in the code that only shows with full optimization.
	(contraindication: -march=pentium3 -O0 is buggy,
	while -march=pentium2 -mcpu=pentium3 -O3 is not)

It would be interesting to try this with other 'sse' capable architectures.
Comment 18 Roger Larsson 2004-01-14 19:17:45 UTC
Argh...
-march=pentium3 -O0 is OK  it does not generate 'sse' instructions
-march=pentium3 -O1 is NOK generates 'sse', but should(?) be slower than
-march=pentium2 -mcpu=pentium3 -O3 that is OK no 'sse'
Comment 19 Roger Larsson 2004-01-14 20:56:51 UTC
gcc generates an unaligned 128 bit store => crash (this is a gcc bug!)

New test code.

# cat > test2.s
.global _start
_start:
        xorl	%eax, %eax
        cvtsi2ss %eax,%xmm0
	movl	%esp,%ebp
	movaps %xmm0, -16(%ebp)
	movaps %xmm0, -8(%ebp)			# simplified case, uncomment
	movaps %xmm0, 0xfffffc78(%ebp)		# gets used in clock
        movl    $1, %eax
        int     $0x80

# gcc -Wall -g -nostdlib -o test2 test2.s
# gdb ./test2

A processor with sse (Pentium3 and forward) should not crash on the first movaps. Pentium3 crashes on either of the following.
What about others?
Comment 20 Roger Larsson 2004-01-14 20:58:33 UTC
Forgot gcc bug number:

  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13685
Comment 21 Roger Larsson 2004-01-15 00:23:06 UTC
If you get a SIGSEGV in an application you can try:

# gdb application
(gdb) run --nocrashhandler [--nofork]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 16384 (LWP 18506)]
0x41a84d47 in KivioArrowHead::paintArrowTriangle(KivioArrowHeadData*, bool) (this=0x83500d0, d=0xbfffe7a0, solid=false)
    at kivio_arrowhead.cpp:296
296         nvecX = - vecX / length;
(gdb) disassemble 0x41a84d47
- - -
0x41a84d45 <KivioArrowHead::paintArrowTriangle+147>:        fdiv   %st(1),%st
0x41a84d47 <...+149>:        movaps %xmm0,0xffffff80(%ebp)
0x41a84d4b <...+153>:        fstps  0xffffff98(%ebp)
- - -
[Note: text inside <> is mangled.
# echo _ZN14KivioArrowHead9paintForkEP18KivioArrowHeadData | c++filt
KivioArrowHead::paintFork(KivioArrowHeadData*)]

If the instruction on the crashed line ends with 'ps' or is cvtps2pi or cvtss2si
then you have found yet another DUP.

The bug is that gcc forgets to align (16 bytes) data on stack so if the sum of
0xffffff80+ %ebp is not 0x???????0 it will crash!
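
The alignment rule can be checked by hand or with a small sketch like this one (Python, added for illustration; the function names and the example %ebp values in the usage note are hypothetical, not taken from a real session):

```python
def effective_address(ebp, disp):
    """32-bit effective address of disp(%ebp), with wraparound as on i386."""
    return (ebp + disp) & 0xFFFFFFFF


def is_16_byte_aligned(addr):
    """movaps needs its memory operand on a 16-byte boundary (last hex digit 0)."""
    return (addr & 0xF) == 0
```

For example, with %ebp = 0xbfffe7a8 the operand 0xffffff80(%ebp) lands on 0xbfffe728, which is not aligned, while %ebp = 0xbfffe7a0 gives 0xbfffe720, which is.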
Comment 22 Roger Larsson 2004-01-15 00:48:39 UTC
*** Bug 72672 has been marked as a duplicate of this bug. ***
Comment 23 Roger Larsson 2004-01-15 23:13:57 UTC
Found another working option combination today:
 -march=pentium3 -mno-sse
Comment 24 Thiago Macieira 2004-01-16 05:50:37 UTC
According to the gcc texinfo, it would seem that the 387 fpmath unit is chosen in all but the Athlon64 compiler.

See: info:/gcc/i386 and x86-64 Options
Comment 25 Roger Larsson 2004-01-16 09:17:56 UTC
That is true for math (add, sub, mul, div) but not for movement to/from stack.
And it is those instructions that are problematic!
(gcc 3.3.1 SuSE)

You can check it yourself.
Modify Makefile in kdebase/kicker/applets/clock
to include "-march=pentium3 -O1"
# touch clock.cpp
# make
# objdump -d .libs/clock.o | grep xmm
    2b4b:       f3 0f 10 83 00 00 00    movss  0x0(%ebx),%xmm0
=>  2b53:       0f 29 85 78 fc ff ff    movaps %xmm0,0xfffffc78(%ebp)
    2bc2:       f3 0f 2a 85 b4 fc ff    cvtsi2ss 0xfffffcb4(%ebp),%xmm0
    2bca:       f3 0f 11 85 44 fc ff    movss  %xmm0,0xfffffc44(%ebp)
    2c54:       f3 0f 2a c6             cvtsi2ss %esi,%xmm0
    2c58:       f3 0f 11 85 9c fc ff    movss  %xmm0,0xfffffc9c(%ebp)
    2d4d:       f3 0f 2a c0             cvtsi2ss %eax,%xmm0
    2d51:       f3 0f 11 85 74 fc ff    movss  %xmm0,0xfffffc74(%ebp)
    2e56:       f3 0f 2a c0             cvtsi2ss %eax,%xmm0
    2e5a:       f3 0f 11 85 70 fc ff    movss  %xmm0,0xfffffc70(%ebp)
    2f96:       f3 0f 2a 85 b4 fc ff    cvtsi2ss 0xfffffcb4(%ebp),%xmm0
    2f9e:       f3 0f 11 85 44 fc ff    movss  %xmm0,0xfffffc44(%ebp)

The one with an arrow is the problematic one since it is a 128 bit memory
movement to an unaligned location...
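
As a sketch of how such stores could be picked out of the objdump output automatically (Python, added for illustration; the regular expression and the name `required_ebp_residue` are assumptions, and only stores relative to %ebp are handled): for each `movaps` store it reports what %ebp modulo 16 would have to be for the store to be aligned.

```python
import re

# Matches e.g. "movaps %xmm0,0xfffffc78(%ebp)" or the signed form "-0x388(%ebp)".
MOVAPS_STORE = re.compile(r"movaps\s+%xmm\d,\s*(-?0x[0-9a-f]+)\(%ebp\)")


def required_ebp_residue(line):
    """Return the value %ebp % 16 must have for this movaps store to be aligned.

    Returns None if the line is not a movaps store relative to %ebp.
    """
    match = MOVAPS_STORE.search(line)
    if match is None:
        return None
    disp = int(match.group(1), 16)
    # (ebp + disp) % 16 == 0  <=>  ebp % 16 == (-disp) % 16
    return (-disp) & 0xF
```

For the arrowed line the displacement 0xfffffc78 ends in 8, so the store is only aligned when %ebp is 8 modulo 16. Nothing in the generated code appears to guarantee that, which matches the crash seen here.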
Comment 26 Stephan Kulow 2004-01-16 10:07:07 UTC
boys, it might be fun playing with it, but you really should find another forum for playing with gcc options.
Comment 27 Thiago Macieira 2004-01-16 16:45:46 UTC
Coolo's right. Let's stop it here. It's a gcc bug and a major one at that.

Your gcc bug ticket seems best:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13685