Bug 323713 - Support mmxext (integer sse) subset on i386 (athlon)
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex
Version: 3.9.0.SVN
Platform: unspecified Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
 
Reported: 2013-08-19 11:53 UTC by Mark Wielaard
Modified: 2013-08-27 10:24 UTC (History)

Attachments
VEX part of the fix to support mmxext subset. (17.93 KB, patch)
2013-08-19 12:01 UTC, Mark Wielaard
valgrind part of the fix to support mmxext subset. (2.66 KB, patch)
2013-08-19 12:04 UTC, Mark Wielaard
Alternative VEX patch to support mmxext subset. (33.07 KB, patch)
2013-08-27 09:53 UTC, Mark Wielaard

Description Mark Wielaard 2013-08-19 11:53:15 UTC
Some processors, such as the AMD Athlon "Classic", support mmxext, a subset of sse1. VEX does not detect this subset properly. The subset uses the same encoding as the corresponding sse1 instructions.

The subset is described at:
      http://support.amd.com/us/Embedded_TechDocs/22466.pdf
      https://en.wikipedia.org/wiki/3DNow!#3DNow.21_extensions

The mmxext instructions are MASKMOVQ MOVNTQ PAVGB PAVGW PMAXSW
PMAXUB PMINSW PMINUB PMULHUW PSADBW PSHUFW PEXTRW PINSRW PMOVMSKB
PREFETCHNTA PREFETCHT0 PREFETCHT1 PREFETCHT2 SFENCE

There is already a testcase for this subset:
memcheck/tests/x86/insn_mmxext
none/tests/x86/insn_mmxext

The prereq is slightly wrong, so the test is not run on Intel processors with full sse1 support. Fixing the prereq will make the test run (and pass) on those processors. These tests currently fail on AMD processors that have mmxext but not full sse1.

Reproducible: Always
Comment 1 Mark Wielaard 2013-08-19 12:01:44 UTC
Created attachment 81782
VEX part of the fix to support mmxext subset.

This introduces a new VEX_HWCAPS_X86_MMXEXT that sits between the baseline (0) and VEX_HWCAPS_X86_SSE1. There is also a new x86g_dirtyhelper_CPUID_mmxext that mimics an Athlon "Classic" (Model 2, K75 "Pluto/Orion"). To impact the instruction parser as little as possible, the patch doesn't change the order of instruction parsing except when only mmxext is available. In that case it uses gotos to jump through the mmxext subset. Luckily the mmxext subset is somewhat grouped together. Since this subset also provides sfence, the code is updated slightly to take advantage of that when handling mfence.
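
The ordering of capability levels described above can be pictured with a small consistency check. This is only a sketch, not code from the patch: the `DEMO_HWCAPS_*` bit values and the helper `demo_hwcaps_valid` are hypothetical names chosen for illustration. The only thing taken from the description is that mmxext sits between the baseline (0) and sse1.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define DEMO_HWCAPS_MMXEXT (1u << 1)
#define DEMO_HWCAPS_SSE1   (1u << 2)

/* A capability set is consistent when each level implies the levels
   below it: sse1 is a superset of mmxext, so SSE1 without MMXEXT is
   rejected, while MMXEXT alone (the Athlon "Classic" case) is fine. */
static bool demo_hwcaps_valid(uint32_t caps)
{
    if (caps & ~(DEMO_HWCAPS_MMXEXT | DEMO_HWCAPS_SSE1))
        return false;   /* bits this sketch does not know about */
    if ((caps & DEMO_HWCAPS_SSE1) && !(caps & DEMO_HWCAPS_MMXEXT))
        return false;   /* sse1 must carry mmxext with it */
    return true;
}
```

With this layout the accepted combinations are 0 (baseline), MMXEXT alone, and MMXEXT together with SSE1.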
Comment 2 Mark Wielaard 2013-08-19 12:04:42 UTC
Created attachment 81783
valgrind part of the fix to support mmxext subset.

Detects the mmxext subset from cpuid information (and also enables it when full sse1 is found). Also fixes the prereq of none/tests/x86/insn_mmxext.vgtest so that the test runs when full sse1 (and not just the mmxext subset) is found. It already passed on such configurations. With the VEX patch it also passes with just the mmxext subset.
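
A minimal sketch of the detection logic, assuming the usual CPUID bit positions (SSE in leaf 1 EDX bit 25, AMD's MMX extensions in leaf 0x80000001 EDX bit 22). The function and macro names are hypothetical and the actual patch may structure this differently. The EDX values are passed in as parameters so the logic can be exercised without executing cpuid.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CPUID1_EDX_SSE       (1u << 25) /* leaf 1, EDX bit 25 */
#define DEMO_CPUID_EXT_EDX_MMXEXT (1u << 22) /* leaf 0x80000001, EDX bit 22 (AMD) */

/* Decide whether the mmxext subset is usable.
   is_amd    : vendor string was "AuthenticAMD"
   leaf1_edx : EDX returned by cpuid leaf 1
   ext_edx   : EDX returned by cpuid leaf 0x80000001 (0 if unsupported) */
static bool demo_have_mmxext(bool is_amd, uint32_t leaf1_edx, uint32_t ext_edx)
{
    if (leaf1_edx & DEMO_CPUID1_EDX_SSE)
        return true;    /* full sse1 includes the mmxext subset */
    /* Otherwise only trust the AMD-specific extended feature bit. */
    return is_amd && (ext_edx & DEMO_CPUID_EXT_EDX_MMXEXT) != 0;
}
```

The vendor check matters because the extended-leaf bit is only defined by AMD. On other vendors it is treated here as meaningless.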
Comment 3 Mark Wielaard 2013-08-27 09:53:04 UTC
Created attachment 81961
Alternative VEX patch to support mmxext subset.

Alternative VEX patch that, instead of using gotos to jump through the sse1 instruction subset in the parser, groups all mmxext instructions together in one block.
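
One way to picture the "one block" approach is a single predicate that recognises the whole subset, which a decoder could consult when only mmxext is available. This is a sketch under stated assumptions, not the code from the attachment: `demo_is_mmxext_insn` is a hypothetical name, and the opcode bytes are the unprefixed (MMX register) encodings from the x86 opcode map.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the bytes following a 0x0F escape (with no
   0x66/0xF2/0xF3 prefix) encode an instruction from the mmxext
   subset. Only PREFETCHx and SFENCE inspect the ModRM byte; operand
   constraints of the other opcodes are not checked in this sketch. */
static bool demo_is_mmxext_insn(uint8_t opc, uint8_t modrm)
{
    uint8_t mod = modrm >> 6;
    uint8_t reg = (modrm >> 3) & 7;

    switch (opc) {
    case 0x18: return mod != 3 && reg <= 3; /* PREFETCHNTA/T0/T1/T2 */
    case 0xAE: return mod == 3 && reg == 7; /* SFENCE */
    case 0x70: /* PSHUFW   */
    case 0xC4: /* PINSRW   */
    case 0xC5: /* PEXTRW   */
    case 0xD7: /* PMOVMSKB */
    case 0xDA: /* PMINUB   */
    case 0xDE: /* PMAXUB   */
    case 0xE0: /* PAVGB    */
    case 0xE3: /* PAVGW    */
    case 0xE4: /* PMULHUW  */
    case 0xE7: /* MOVNTQ   */
    case 0xEA: /* PMINSW   */
    case 0xEE: /* PMAXSW   */
    case 0xF6: /* PSADBW   */
    case 0xF7: /* MASKMOVQ */
        return true;
    default:
        return false;
    }
}
```

Keeping the subset in one place means the rest of the sse1 decoder can stay behind a plain "has sse1" check, with no jumps threaded through it.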
Comment 4 Mark Wielaard 2013-08-27 10:24:47 UTC
Used the second VEX patch after review from Julian.

VEX: r2745
valgrind: r13515