Bug 210264 - SSE code gives different output when running under valgrind
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: 3.6 SVN
Platform: unspecified
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
 
Reported: 2009-10-12 01:16 UTC by Vitor Sessak
Modified: 2010-02-21 23:04 UTC (History)

Attachments
Source code of the problematic file (5.62 KB, text/plain)
2009-10-12 01:16 UTC, Vitor Sessak

Description Vitor Sessak 2009-10-12 01:16:51 UTC
Created attachment 37527
Source code of the problematic file

Hello,

As part of my effort to make the FFmpeg regression test suite pass with no valgrind errors or warnings, I stumbled upon what I believe is a valgrind bug.

The attached testcase (which uses a lot of SSE asm) gives different output depending on whether I run it under valgrind or directly on the CPU. Valgrind reports no errors.

vitor@vitor-laptop:/tmp$ ./a.out 
11085493066641058.000000
vitor@vitor-laptop:/tmp$ valgrind ./a.out 
==7720== Memcheck, a memory error detector.
==7720== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==7720== Using LibVEX rev 1804, a library for dynamic binary translation.
==7720== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==7720== Using valgrind-3.3.0-Debian, a dynamic binary instrumentation framework.
==7720== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==7720== For more details, rerun with: -v
==7720== 
nan
==7720== 
==7720== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 11 from 1)
==7720== malloc/free: in use at exit: 4,096 bytes in 2 blocks.
==7720== malloc/free: 2 allocs, 0 frees, 4,096 bytes allocated.
==7720== For counts of detected errors, rerun with: -v
==7720== searching for pointers to 2 not-freed blocks.
==7720== checked 60,188 bytes.
==7720== 
==7720== LEAK SUMMARY:
==7720==    definitely lost: 4,096 bytes in 2 blocks.
==7720==      possibly lost: 0 bytes in 0 blocks.
==7720==    still reachable: 0 bytes in 0 blocks.
==7720==         suppressed: 0 bytes in 0 blocks.
==7720== Rerun with --leak-check=full to see details of leaked memory.

vitor@vitor-laptop:/tmp$ cat /proc/cpuinfo 
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 Duo CPU     T7250  @ 2.00GHz
stepping	: 13
cpu MHz		: 800.000
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm ida
bogomips	: 3990.27
clflush size	: 64

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 Duo CPU     T7250  @ 2.00GHz
stepping	: 13
cpu MHz		: 800.000
cache size	: 2048 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm ida
bogomips	: 3989.97
clflush size	: 64
Comment 1 Julian Seward 2010-02-21 20:37:59 UTC
Valgrind mishandles "cvtpi2pd m128, xmm".  It inadvertently switches
the x87 FPU to MMX mode, which makes the call to printf fail.  Fix in
progress.
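
For illustration only, here is a minimal sketch of the instruction pattern described above. It is not the attached testcase. It converts two packed 32-bit integers to doubles with the memory-operand form of cvtpi2pd and then formats the result the way a printf call would. The names int_pair, convert_pair and format_pair are hypothetical, and the inline asm assumes GCC or Clang targeting x86 with SSE2 enabled; any other target falls back to plain C conversion.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: two packed 32-bit integers (64 bits total). */
struct int_pair { int32_t v[2]; };

/* Convert both integers in *src to doubles and store them in dst. */
static void convert_pair(const struct int_pair *src, double dst[2])
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__SSE2__)
    /* Memory-operand form: the 64-bit source is read straight from
       memory, so no MMX register is named by the instruction. */
    __asm__ volatile(
        "cvtpi2pd %1, %%xmm0\n\t"
        "movupd   %%xmm0, %0"
        : "=m"(*(double (*)[2])dst)
        : "m"(*src)
        : "xmm0");
#else
    /* Assumed fallback for non-x86 or non-SSE2 builds: plain C. */
    dst[0] = (double)src->v[0];
    dst[1] = (double)src->v[1];
#endif
}

/* Format the pair with the same %f conversion the testcase output uses. */
static int format_pair(char *buf, size_t len, const double d[2])
{
    return snprintf(buf, len, "%f %f", d[0], d[1]);
}
```

Calling convert_pair on {3, -7} and printing the string from format_pair should give "3.000000 -7.000000" on real hardware. Whether this reduced sketch produces different output on an affected valgrind build has not been checked.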
Comment 2 Julian Seward 2010-02-21 21:41:55 UTC
Fixed (vex r1961).  Please verify.

Is it OK to add the test program to valgrind's test suite
(with a GPL2+ license) ?
Comment 3 Vitor Sessak 2010-02-21 23:04:16 UTC
Tested rev. 1961 and I confirm it is fixed. I took the liberty of marking it as "RESOLVED, FIXED".

The code is originally licensed "Copyright (c) 2007 Loren Merritt" under the LGPL v. 2.1 or later, which AFAIK is compatible with the GPL, so you can put it in the test suite.

In case you need to know, the original code is in http://git.ffmpeg.org/?p=ffmpeg;a=blob;f=libavcodec/x86/lpc_mmx.c;hb=HEAD .

Thanks!