Bug 359952

Summary: Unrecognised PCMPESTRM variants
Product: [Developer tools] valgrind
Reporter: christoficostas
Component: general
Assignee: Julian Seward <jseward>
Status: RESOLVED FIXED
Severity: normal
CC: mark
Priority: NOR
Version First Reported In: 3.11.0
Target Milestone: ---
Platform: Compiled Sources
OS: Linux

Description christoficostas 2016-03-01 13:53:51 UTC
I have an application that uses SSE4.2 string instructions to speed up some text parsing. Unfortunately, it causes valgrind to raise SIGILL:

vex amd64->IR: unhandled instruction bytes: 0xC4 0xE3 0x79 0x60 0xD1 0x70 0xC5 0xF9
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=1 VEX.L=0 VEX.nVVVV=0x0 ESC=0F3A
vex amd64->IR:   PFX.66=1 PFX.F2=0 PFX.F3=0

  40255f:	c4 e3 79 60 d1 70    	vpcmpestrm $0x70,%xmm1,%xmm2

The same happens if I compile with -mno-avx:

vex amd64->IR: unhandled instruction bytes: 0x66 0xF 0x3A 0x60 0xD1 0x70 0x66 0xF
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F3A
vex amd64->IR:   PFX.66=1 PFX.F2=0 PFX.F3=0

  4024df:	66 0f 3a 60 d1 70    	pcmpestrm $0x70,%xmm1,%xmm2

For completeness, the culprit is the following bit of code (g++ 4.8.4):

#include <nmmintrin.h>
...
__m128i mask = _mm_cmpestrm(separators, 2, data, 16,
	_SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ANY |
	_SIDD_MASKED_NEGATIVE_POLARITY | _SIDD_UNIT_MASK);
Comment 1 Mark Wielaard 2016-07-19 09:16:50 UTC
Another unhandled SSE4.2 vpcmpestri variant (imm8 0x19) is generated by the latest OpenJDK HotSpot.

Imm8 control byte == 0x19. If I am reading the instruction manual correctly, 0x19 (binary 0011001) means:
- bits 1:0: source data format is unsigned words (01)
- bits 3:2: comparison type is match, i.e. "equal each" (10)
- bits 5:4: post-processing option is NOT CmprSumm, i.e. negative polarity (01)
- bit 6: indexed output option selection is the least significant set bit (0)
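
The breakdown above can be checked mechanically. The following is a minimal sketch, not code from valgrind or VEX: the names imm8_fields, decode_imm8 and print_imm8 are made up for illustration, and the field layout is the one described in the Intel manual for the PCMPxSTRx imm8 control byte. It splits a control byte into its four fields, which is enough to confirm both 0x19 and the 0x70 from the original report.

```c
#include <stdio.h>

/* Hypothetical helper type: the four fields of a PCMPxSTRx imm8 byte. */
struct imm8_fields {
    unsigned format;    /* bits 1:0, source data format          */
    unsigned aggregate; /* bits 3:2, comparison (aggregation)    */
    unsigned polarity;  /* bits 5:4, post-processing (polarity)  */
    unsigned output;    /* bit 6, output selection               */
};

/* Split the control byte into its fields; bit 7 is reserved and ignored. */
struct imm8_fields decode_imm8(unsigned char imm8)
{
    struct imm8_fields f;
    f.format    = imm8 & 0x3;
    f.aggregate = (imm8 >> 2) & 0x3;
    f.polarity  = (imm8 >> 4) & 0x3;
    f.output    = (imm8 >> 6) & 0x1;
    return f;
}

/* Print a human-readable description of a control byte. */
void print_imm8(unsigned char imm8)
{
    static const char *const format_names[] = {
        "unsigned bytes", "unsigned words", "signed bytes", "signed words"
    };
    static const char *const aggregate_names[] = {
        "equal any", "ranges", "equal each", "equal ordered"
    };
    static const char *const polarity_names[] = {
        "positive", "negative", "masked positive", "masked negative"
    };
    struct imm8_fields f = decode_imm8(imm8);

    printf("imm8 0x%02X\n", (unsigned)imm8);
    printf("  format:    %s\n", format_names[f.format]);
    printf("  compare:   %s\n", aggregate_names[f.aggregate]);
    printf("  polarity:  %s\n", polarity_names[f.polarity]);
    /* Bit 6 means "least/most significant index" for the *STRI forms
       and "bit mask/unit mask" for the *STRM forms. */
    printf("  output:    bit 6 = %u\n", f.output);
}
```

Calling print_imm8(0x19) and print_imm8(0x70) from a small main() gives the two decodings discussed in this bug: unsigned words, equal each, negative polarity, least significant index for 0x19; unsigned bytes, equal any, masked negative polarity, unit mask for 0x70.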

A workaround is to run java with -XX:-UseSSE42Intrinsics, which disables the string comparison optimisations, when running under valgrind.
Comment 2 Julian Seward 2016-07-20 16:37:57 UTC
Fixed (both 0x70 and 0x19), vex r3228, tests valgrind r15910.