Bug 323713

Summary: Support mmxext (integer sse) subset on i386 (athlon)
Product: [Developer tools] valgrind Reporter: Mark Wielaard <mark>
Component: vex Assignee: Julian Seward <jseward>
Status: RESOLVED FIXED    
Severity: normal    
Priority: NOR    
Version: 3.9.0.SVN   
Target Milestone: ---   
Platform: unspecified   
OS: Linux   
Latest Commit: Version Fixed In:
Attachments: VEX part of the fix to support mmxext subset.
valgrind part of the fix to support mmxext subset.
Alternative VEX patch to support mmxext subset.

Description Mark Wielaard 2013-08-19 11:53:15 UTC
Some processors, such as the AMD Athlon "Classic", support mmxext, an sse1 subset. VEX does not properly detect this subset, which uses the same encoding as the corresponding sse1 instructions.

The subset is described at:
      http://support.amd.com/us/Embedded_TechDocs/22466.pdf
      https://en.wikipedia.org/wiki/3DNow!#3DNow.21_extensions

The mmxext instructions are MASKMOVQ MOVNTQ PAVGB PAVGW PMAXSW
PMAXUB PMINSW PMINUB PMULHUW PSADBW PSHUFW PEXTRW PINSRW PMOVMSKB
PREFETCHNTA PREFETCHT0 PREFETCHT1 PREFETCHT2 SFENCE

There is already a testcase for this subset:
memcheck/tests/x86/insn_mmxext
none/tests/x86/insn_mmxext

The prereq is slightly wrong, so the test is not run on Intel processors with full sse1 support. Fixing the prereq will make the test run and pass on those processors. These tests currently fail on AMD processors that have mmxext but not full sse1.

Reproducible: Always
Comment 1 Mark Wielaard 2013-08-19 12:01:44 UTC
Created attachment 81782 [details]
VEX part of the fix to support mmxext subset.

This introduces a new VEX_HWCAPS_X86_MMXEXT that sits between the baseline (0) and VEX_HWCAPS_X86_SSE1. There is also a new x86g_dirtyhelper_CPUID_mmxext to mimic an Athlon "Classic" (Model 2, K75 "Pluto/Orion"). To impact the instruction parser as little as possible, the patch doesn't change the order of instruction parsing except when we have just mmxext. In that case it uses gotos to jump through the mmxext subset. Luckily the mmxext subset is somewhat grouped together. Since this subset also provides sfence, the code is updated slightly to take advantage of that when handling mfence.
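
For illustration, the cumulative ordering described here could be checked roughly as follows. This is only a sketch: the HW_* names and bit values are hypothetical stand-ins (the real VEX_HWCAPS_X86_* definitions live in the VEX headers and may differ), and hwcaps_is_valid is not a VEX function. It assumes each level implies the ones below it, with mmxext slotted between the baseline and sse1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits, for illustration only. The real
   VEX_HWCAPS_X86_* values are defined in the VEX headers. */
#define HW_MMXEXT (1u << 1)  /* integer sse subset (Athlon "Classic") */
#define HW_SSE1   (1u << 2)
#define HW_SSE2   (1u << 3)

/* Assumption: a capability set is valid only if it is one of the
   cumulative levels baseline (0), mmxext, mmxext+sse1 or
   mmxext+sse1+sse2. A set that claims sse1 without mmxext is
   rejected, because sse1 includes the mmxext instructions. */
static bool hwcaps_is_valid(uint32_t caps)
{
    switch (caps) {
    case 0:
    case HW_MMXEXT:
    case HW_MMXEXT | HW_SSE1:
    case HW_MMXEXT | HW_SSE1 | HW_SSE2:
        return true;
    default:
        return false;
    }
}
```

With these assumed values, hwcaps_is_valid(HW_MMXEXT) is accepted while hwcaps_is_valid(HW_SSE1) is rejected, which reflects mmxext being the level immediately below sse1.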
Comment 2 Mark Wielaard 2013-08-19 12:04:42 UTC
Created attachment 81783 [details]
valgrind part of the fix to support mmxext subset.

Detects mmxext subset from cpuid information (and enables it when full sse1 is found). Also fixes the prereq of none/tests/x86/insn_mmxext.vgtest so that it also runs when full sse1 (and not just the mmxext subset) is found. It already passed on such configurations. With the VEX patch it also passes with just the mmxext subset.
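
A minimal sketch of that detection logic, working on already-fetched CPUID register values, might look like this. The x86_features struct and detect_features function are hypothetical and not taken from the patch. The bit positions are the documented ones: SSE is bit 25 of EDX for leaf 1, and AMD's MMX extensions flag is bit 22 of EDX for leaf 0x80000001. Restricting the extended bit to AMD parts is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID1_EDX_SSE      (1u << 25)  /* leaf 1, EDX: SSE */
#define CPUIDEXT_EDX_MMXEXT (1u << 22)  /* leaf 0x80000001, EDX: AMD MMX extensions */

/* Hypothetical result type; not a Valgrind structure. */
typedef struct {
    bool mmxext;
    bool sse1;
} x86_features;

/* leaf1_edx: EDX from CPUID leaf 1.
   ext_edx:   EDX from CPUID leaf 0x80000001 (0 if the leaf is absent).
   is_amd:    true if the vendor string identifies an AMD part. */
static x86_features detect_features(uint32_t leaf1_edx, uint32_t ext_edx,
                                    bool is_amd)
{
    x86_features f;
    f.sse1 = (leaf1_edx & CPUID1_EDX_SSE) != 0;
    /* Full sse1 implies the mmxext subset; otherwise rely on the
       AMD extended feature bit. */
    f.mmxext = f.sse1 || (is_amd && (ext_edx & CPUIDEXT_EDX_MMXEXT) != 0);
    return f;
}
```

Taking register values as parameters keeps the sketch independent of how CPUID is executed, so the same logic can be exercised with values recorded from an Athlon "Classic" on any host.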
Comment 3 Mark Wielaard 2013-08-27 09:53:04 UTC
Created attachment 81961 [details]
Alternative VEX patch to support mmxext subset.

Alternative VEX patch that, instead of using gotos to jump through the sse1 instruction subset in the parser, groups all mmxext instructions together in one block.
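
To give a rough idea of what "one block" covers, the sketch below recognises the mmxext instructions from the description by opcode. It is not code from the patch: is_mmxext_insn is a hypothetical helper, it only handles the MMX-register forms (two-byte 0F xx opcodes with no 66/F2/F3 prefix), and it assumes the caller has already consumed any prefix bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if `code` starts with one of the mmxext instructions in
   its MMX-register form. Operand-form restrictions (for example MOVNTQ
   needing a memory operand) are not checked. Hypothetical helper. */
static bool is_mmxext_insn(const uint8_t *code, size_t len)
{
    if (len < 3 || code[0] != 0x0F)
        return false;

    uint8_t modrm = code[2];
    unsigned mod = modrm >> 6;        /* addressing mode */
    unsigned reg = (modrm >> 3) & 7;  /* opcode extension for 0F 18 / 0F AE */

    switch (code[1]) {
    case 0x70:  /* PSHUFW   */
    case 0xC4:  /* PINSRW   */
    case 0xC5:  /* PEXTRW   */
    case 0xD7:  /* PMOVMSKB */
    case 0xDA:  /* PMINUB   */
    case 0xDE:  /* PMAXUB   */
    case 0xE0:  /* PAVGB    */
    case 0xE3:  /* PAVGW    */
    case 0xE4:  /* PMULHUW  */
    case 0xE7:  /* MOVNTQ   */
    case 0xEA:  /* PMINSW   */
    case 0xEE:  /* PMAXSW   */
    case 0xF6:  /* PSADBW   */
    case 0xF7:  /* MASKMOVQ */
        return true;
    case 0x18:  /* PREFETCHNTA/T0/T1/T2: memory operand, /0../3 */
        return mod != 3 && reg <= 3;
    case 0xAE:  /* SFENCE: 0F AE /7 with mod == 3 */
        return mod == 3 && reg == 7;
    default:
        return false;
    }
}
```

A decoder structured this way can test for the subset once and then handle every mmxext instruction in a single place, which is the idea behind grouping them rather than jumping between scattered sse1 cases.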
Comment 4 Mark Wielaard 2013-08-27 10:24:47 UTC
Used the second VEX patch after review from Julian.

VEX: r2745
valgrind: r13515