Bug 310845

Summary: Application being checked generates illegal instruction
Product: [Developer tools] valgrind Reporter: tim.gallagher
Component: memcheckAssignee: Julian Seward <jseward>
Status: REPORTED ---    
Severity: critical CC: alan, cpigat242, lopresti, tim.gallagher, tom
Priority: NOR    
Version First Reported In: 3.8.0   
Target Milestone: ---   
Platform: openSUSE   
OS: Linux   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:
Attachments: Full output with trace signals on
Added imm8 codes
adds imm code 0x72 0x42 and 0x62 for pcmpistri / pcmpistrm

Description tim.gallagher 2012-11-28 21:48:56 UTC
Valgrind version is 3.8.1 but that option doesn't exist in the dropdown. 

I'm running an application compiled with the Intel compiler suite (and the shared libraries are also compiled with the same suite) and the application runs (but has memory errors) without Valgrind, but crashes with the Intel runtime library saying it generated an Illegal Instruction. The error report is:

forrtl: severe (168): Program Exception - illegal instruction
Image              PC                Routine            Line        Source             
libirc.so          000000000405ED3C  Unknown               Unknown  Unknown
libhdf5.so.7       00000000061338E2  Unknown               Unknown  Unknown
libhdf5.so.7       00000000061A73CE  Unknown               Unknown  Unknown
libhdf5.so.7       00000000061A9D6F  Unknown               Unknown  Unknown
libhdf5_fortran.s  0000000006B23233  Unknown               Unknown  Unknown
libhdf5_fortran.s  0000000006B19FF9  Unknown               Unknown  Unknown
les3d.x            00000000006D815C  Unknown               Unknown  Unknown
les3d.x            00000000006A983D  Unknown               Unknown  Unknown
les3d.x            00000000006DB0F4  Unknown               Unknown  Unknown
les3d.x            00000000006F353D  Unknown               Unknown  Unknown
les3d.x            0000000000404CCC  Unknown               Unknown  Unknown
libc.so.6          0000000008F5F23D  Unknown               Unknown  Unknown
les3d.x            0000000000404BC9  Unknown               Unknown  Unknown

The compile options for the library are only -O2 and the main executable is -O2 -xhost -g -traceback (which is also odd that it doesn't generate any traceback information for at least the les3d.x stack). 


Reproducible: Always

Steps to Reproduce:
1. Any of my applications linked to the HDF5 libraries do this




My CPU information is (it's a "12" core, only first core information is shown):

processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 44
model name	: Intel(R) Core(TM) i7 CPU         000  @ 3.33GHz
stepping	: 2
cpu MHz		: 1600.000
cache size	: 12288 KB
physical id	: 0
siblings	: 12
core id		: 0
cpu cores	: 6
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt aes lahf_lm ida arat dts tpr_shadow vnmi flexpriority ept vpid
bogomips	: 6674.04
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:
Comment 1 Tom Hughes 2012-11-28 22:32:58 UTC
Can you try running with --trace-signals=yes and post the output so we can see whether this is a real SIGILL from the kernel, or one valgrind has synthesised?
Comment 2 tim.gallagher 2012-11-28 22:38:46 UTC
Created attachment 75523 [details]
Full output with trace signals on
Comment 3 tim.gallagher 2012-11-28 22:40:12 UTC
With the command valgrind --trace-signals=yes ./les3d.x 2> valgrindOutput.txt and then grep for sigaction:

--24629-- sys_sigaction: sigNo 32, new 0x7fefff398, old 0x0, new flags 0x4000004
--24629-- sys_sigaction: sigNo 33, new 0x7fefff398, old 0x0, new flags 0x14000004
--24629-- sys_sigaction: sigNo 8, new 0x7fefff1d8, old 0x0, new flags 0x54000004
--24629-- sys_sigaction: sigNo 4, new 0x7fefff1d8, old 0x0, new flags 0x54000004
--24629-- sys_sigaction: sigNo 11, new 0x7fefff1d8, old 0x0, new flags 0x54000004
--24629-- sys_sigaction: sigNo 6, new 0x7fefff1d8, old 0x0, new flags 0x54000004
--24629-- sys_sigaction: sigNo 15, new 0x7fefff1d8, old 0x0, new flags 0x54000004
--24629-- sys_sigaction: sigNo 3, new 0x7fefff1d8, old 0x7fefff278, new flags 0x54000004
--24629-- sys_sigaction: sigNo 2, new 0x7fefff1d8, old 0x7fefff278, new flags 0x54000004
--24629-- sys_sigaction: sigNo 13, new 0x7feffec38, old 0x7feffecd8, new flags 0x14000000
--24629-- sys_sigaction: sigNo 10, new 0x7feffed58, old 0x7feffedf8, new flags 0x14000000
--24629-- sys_sigaction: sigNo 10, new 0x7feffec78, old 0x7feffed18, new flags 0x14000000
--24629-- sys_sigaction: sigNo 14, new 0x7feffec78, old 0x7feffed18, new flags 0x14000000
--24629-- sys_sigaction: sigNo 15, new 0x7feffec78, old 0x7feffed18, new flags 0x14000000
--24629-- sys_sigaction: sigNo 12, new 0x7feffec78, old 0x7feffed18, new flags 0x14000000
--24629-- sys_sigaction: sigNo 11, new 0x7feffbec0, old 0x7feffbf60, new flags 0x44000004
--24629-- sys_sigaction: sigNo 4, new 0x7feffbec0, old 0x7feffbf60, new flags 0x44000004
--24629-- sys_sigaction: sigNo 7, new 0x7feffbec0, old 0x7feffbf60, new flags 0x44000004
--24629-- sys_sigaction: sigNo 8, new 0x7feffbec0, old 0x7feffbf60, new flags 0x44000004
--24629-- sys_sigaction: sigNo 11, new 0x7feffbec0, old 0x0, new flags 0x54000004
--24629-- sys_sigaction: sigNo 4, new 0x7feffbec0, old 0x0, new flags 0x54000004
--24629-- sys_sigaction: sigNo 7, new 0x7feffbec0, old 0x0, new flags 0x4000000
--24629-- sys_sigaction: sigNo 8, new 0x7feffbec0, old 0x0, new flags 0x54000004
--24629-- sys_sigaction: sigNo 11, new 0x402bdf1e0, old 0x0, new flags 0x0
--24629-- sys_sigaction: sigNo 7, new 0x402bdf1e0, old 0x0, new flags 0x0
--24629-- sys_sigaction: sigNo 4, new 0x402bdf1e0, old 0x0, new flags 0x0
--24629-- sys_sigaction: sigNo 8, new 0x402bdf1e0, old 0x0, new flags 0x0

I attached the full output in case that's needed too, but it's full of the conditional jump uninitialized value and invalid read errors that come from using the Intel compiler suite.
Comment 4 Tom Hughes 2012-11-28 22:51:56 UTC
Well it doesn't look like there was a SIGILL from the kernel so I think it must be the instruction decoder in the JIT which is triggering it, which is unusual as the JIT normally generates it's own error if it can't handle the instruction.

The problem instruction appears to be at 0x405ED3C in __intel_sse4_strrchr in libirc.so - any chance you could disassemble that function and show us a few instructions around that point, given VEX hasn't reported the problem instruction in the normal way...
Comment 5 tim.gallagher 2012-11-29 00:10:18 UTC
I've never disassembled anything before, so I'm sorry if this isn't what you needed (let me know and I'll adjust it):

> objdump -t /opt/intel/2011.sp1.1/composer_xe_2011_sp1.7.256/compiler/lib/intel64/libirc.so | grep intel_sse4_strrchr
0000000000018d30 g     F .text	0000000000000040              __intel_sse4_strrchr

> objdump -d -f --start-address=0x0000000000018d30 --stop-address=0x0000000000018d70 /opt/intel/2011.sp1.1/composer_xe_2011_sp1.7.256/compiler/lib/intel64/libirc.so

/opt/intel/2011.sp1.1/composer_xe_2011_sp1.7.256/compiler/lib/intel64/libirc.so:     file format elf64-x86-64
architecture: i386:x86-64, flags 0x00000150:
HAS_SYMS, DYNAMIC, D_PAGED
start address 0x0000000000007040


Disassembly of section .text:

0000000000018d30 <__intel_sse4_strrchr>:
   18d30:	56                   	push   %rsi
   18d31:	66 0f 38 17 c9       	ptest  %xmm1,%xmm1
   18d36:	74 21                	je     18d59 <__intel_sse4_strrchr+0x29>
   18d38:	48 83 c2 10          	add    $0x10,%rdx
   18d3c:	66 0f 3a 63 0a 42    	pcmpistri $0x42,(%rdx),%xmm1
   18d42:	74 09                	je     18d4d <__intel_sse4_strrchr+0x1d>
   18d44:	73 f2                	jae    18d38 <__intel_sse4_strrchr+0x8>
   18d46:	89 c8                	mov    %ecx,%eax
   18d48:	48 03 c2             	add    %rdx,%rax
   18d4b:	eb eb                	jmp    18d38 <__intel_sse4_strrchr+0x8>
   18d4d:	73 05                	jae    18d54 <__intel_sse4_strrchr+0x24>
   18d4f:	89 c8                	mov    %ecx,%eax
   18d51:	48 03 c2             	add    %rdx,%rax
   18d54:	48 83 c4 08          	add    $0x8,%rsp
   18d58:	c3                   	retq   
   18d59:	48 89 d0             	mov    %rdx,%rax
   18d5c:	e8 67 db fe ff       	callq  68c8 <__intel_sse4_strend@plt>
   18d61:	48 83 c4 08          	add    $0x8,%rsp
   18d65:	c3                   	retq   
   18d66:	48 89 f6             	mov    %rsi,%rsi
   18d69:	48 8d bf 00 00 00 00 	lea    0x0(%rdi),%rdi
Comment 6 Tom Hughes 2012-11-29 08:51:47 UTC
No, that's fine - the problem is the pcmpistri instruction which appears to be a variant we don't handle yet. Julian can probably say more.
Comment 7 tim.gallagher 2012-11-30 19:59:14 UTC
It's unlikely, but is there a workaround and/or a development/unstable version that might have this fixed in it already or soon? 

I'm trying to track down a memory problem and the 6 other tools I've tried to use as a replacement due to this are completely useless but I'm hoping valgrind will find it. It's never let me down before.
Comment 8 Julian Seward 2012-11-30 22:43:33 UTC
Patrick LoPresti has another patch on some bug somewhere, which makes 
Memcheck work more sanely with optimised icc code.  BTW, you should
use the trunk and make sure you give the flag --partial-loads-ok=yes.

Re the 0x42 pcmpistri case, that might just be a case of allowing that
particular immediate value to be handled by the existing code.  But to
be sure we'd have to test it with none/tests/amd64/pcmpstr64.c/
Comment 9 Tom Hughes 2012-11-30 22:48:29 UTC
I've just been discussing it with Julian on IRC and we think that just adding 0x42 to the list of cases in the switch statement at line 17069 of VEX/priv/guest_amd64_toIR.c may be all that is needed if you want to give that a try.
Comment 10 tim.gallagher 2012-11-30 23:11:58 UTC
I tried what I think you suggested and ran with /data3/valgrindInstall/bin/valgrind --partial-loads-ok=yes ./les3d.x:

> svn diff VEX/priv/guest_amd64_toIR.c 
Index: VEX/priv/guest_amd64_toIR.c
===================================================================
--- VEX/priv/guest_amd64_toIR.c	(revision 2574)
+++ VEX/priv/guest_amd64_toIR.c	(working copy)
@@ -17070,7 +17070,7 @@
       case 0x00:
       case 0x02: case 0x08: case 0x0A: case 0x0C: case 0x12:
       case 0x1A: case 0x38: case 0x3A: case 0x44: case 0x4A:
-      case 0x46:
+      case 0x46: case 0x42:
          break;
       case 0x01: // the 16-bit character versions of the above
       case 0x03: case 0x09: case 0x0B: case 0x0D: case 0x13:

And it resulted in:

vex: priv/guest_amd64_helpers.c:3266 (amd64g_dirtyhelper_PCMPxSTRx): Assertion `ok' failed.
vex storage: T total 829967584 bytes allocated
vex storage: P total 640 bytes allocated

valgrind: the 'impossible' happened:
   LibVEX called failure_exit().
==20981==    at 0x38056EB6: report_and_quit (m_libcassert.c:235)
==20981==    by 0x38056F13: panic (m_libcassert.c:319)
==20981==    by 0x380570C8: vgPlain_core_panic_at (m_libcassert.c:324)
==20981==    by 0x380570DA: vgPlain_core_panic (m_libcassert.c:329)
==20981==    by 0x3806EC22: failure_exit (m_translate.c:713)
==20981==    by 0x380FD1A8: vex_assert_fail (main_util.c:219)
==20981==    by 0x3813B6F4: amd64g_dirtyhelper_PCMPxSTRx (guest_amd64_helpers.c:3266)
==20981==    by 0x403B1245D: ???
==20981==    by 0x402ABEF4F: ???

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==20981==    at 0x7D6F9EB: __intel_sse4_strrchr (in /opt/intel/2013.initial/composer_xe_2013.0.079/compiler/lib/intel64/libirc.so)
==20981==    by 0x612EC11: H5_build_extpath (in /opt/hdf5/intel/2013.initial/lib64/libhdf5.so.7.0.2)
==20981==    by 0x61A25BB: H5F_open (in /opt/hdf5/intel/2013.initial/lib64/libhdf5.so.7.0.2)
==20981==    by 0x61A4F51: H5Fopen (in /opt/hdf5/intel/2013.initial/lib64/libhdf5.so.7.0.2)
==20981==    by 0x6B245C2: h5fopen_c_ (in /opt/hdf5/intel/2013.initial/lib64/libhdf5_fortran.so.7.0.2)
==20981==    by 0x6B1BFA4: h5f_mp_h5fopen_f_ (in /opt/hdf5/intel/2013.initial/lib64/libhdf5_fortran.so.7.0.2)
==20981==    by 0x6CD9C9: gridoperations_m_mp_readgridhdf5_ (in /data1/randomPack/bin_randPackMSDM_van/les3d.x)
==20981==    by 0x69F967: gridoperations_m_mp_readgrid_ (in /data1/randomPack/bin_randPackMSDM_van/les3d.x)
==20981==    by 0x6D0953: simulationsetup_m_mp_readinputsandallocatedata_ (in /data1/randomPack/bin_randPackMSDM_van/les3d.x)
==20981==    by 0x6E8194: MAIN__ (in /data1/randomPack/bin_randPackMSDM_van/les3d.x)
==20981==    by 0x404BDB: main (in /data1/randomPack/bin_randPackMSDM_van/les3d.x)

I have joined the IRC channel in the event there are other quick things to test out. I've never looked at the source before, but I'm willing to put in any work I can to help resolve it.
Comment 11 Patrick J. LoPresti 2012-11-30 23:51:22 UTC
I think you need a similar change to the switch statement in guest_generic_x87.c:compute_PCMPxSTRx()

Also you definitely want to add this test case to none/tests/amd64/pcmpstr64.c.

(Actually getting the validity bit tracking to work properly on this one will be...  Fun.  I will take a crack at it eventually.)
Comment 12 tim.gallagher 2012-11-30 23:55:16 UTC
(In reply to comment #11)
> I think you need a similar change to the switch statement in
> guest_generic_x87.c:compute_PCMPxSTRx()
> 
> Also you definitely want to add this test case to
> none/tests/amd64/pcmpstr64.c.
> 
> (Actually getting the validity bit tracking to work properly on this one
> will be...  Fun.  I will take a crack at it eventually.)

Yes, we discussed that in the IRC channel. Adding it there does in fact make it run, sort of. I'm getting another illegal instruction in another function (I'm sure it's the same issue, different instruction) but I can't figure out how to translate the instruction into the ID's in the code. Ie. how pcmpistri = 0x42. Once I can figure that out, I can identify the problem instruction.
Comment 13 tim.gallagher 2012-12-01 00:04:21 UTC
From my searching through guest_x86_toIR.c, I think the problem instruction is pcmpistrm. That's the only one that I don't find any reference to in guest_x86_toIR.c.
Comment 14 tim.gallagher 2012-12-01 00:49:23 UTC
Created attachment 75557 [details]
Added imm8 codes

Two imm8 codes needed to be added, one for pcmpistri (0x42) and one for pcmpistrm (0x62).
Comment 15 tim.gallagher 2012-12-01 00:50:42 UTC
Patch added that makes my case run. Testing not included in the patch, but I could take a crack at it once I finish fixing my own code that started all this! But that may be awhile.
Comment 16 Alan Wild 2013-12-16 20:43:38 UTC
I just hit this same bug today in valgrind 3.9.  I added the same imm8 codes and that got me past the __intel_sse4_strrchr call only to have it die in __intel_sse4_strspn.  Using the same objdump trick I found calls to pcmpistrm with a value of 0x72.

[arwild1@hplcslsp1 ~]$ /apps_hpc/GCC/gcc-4.7.2/bin/objdump -t /apps_hpc/intel/composer_xe_2013.3.163/compiler/lib/intel64/libintlc.so.5 | egrep __intel_sse4_strspn
00000000000209e0 g     F .text  0000000000000310 __intel_sse4_strspn
[arwild1@hplcslsp1 ~]$ /apps_hpc/GCC/gcc-4.7.2/bin/objdump -d -f --start-address=0x00000000000209e0 --stop-address=0x0000000000020cf0 /apps_hpc/intel/composer_xe_2013.3.163/compiler/lib/intel64/libintlc.so.5 | egrep pcmp
   20a7b:       66 0f 74 c8             pcmpeqb %xmm0,%xmm1
   20c41:       66 0f 3a 62 d1 72       pcmpistrm $0x72,%xmm1,%xmm2
   20c70:       66 0f 3a 62 e1 72       pcmpistrm $0x72,%xmm1,%xmm4
   20c99:       66 0f 3a 63 c9 3a       pcmpistri $0x3a,%xmm1,%xmm1

(NOTE: 0x3a is already handled in the above source files).

After adding 0x72 to the same two files I was able to run without error.  I'll attach my diff against 3.9.0 below.
Comment 17 Alan Wild 2013-12-16 20:47:05 UTC
Created attachment 84130 [details]
adds imm code 0x72 0x42 and 0x62 for pcmpistri / pcmpistrm
Comment 18 Patrick J. LoPresti 2013-12-16 21:45:02 UTC
At the risk of repeating myself...

Fixing the bug is good, of course. But if you want it to stay fixed, please create a test case and add it to none/tests/amd64/pcmpstr64.c. The test case should have the property that it fails without your patch and succeeds with it.

It is impossible to overstate how important this is. Sooner or later, some compiler will start emitting these instructions inline, at which point Valgrind will need to expand them into IR code that memcheck can actually instrument. (A few memcheck suppression rules for a handful of string library routines is feasible. Hundreds or thousands of them for inline operations is not.)

Taking a few minutes now to add a test case or two could save somebody else hours of time in the future.