Summary: | x86: SSE cvtpi2ps with memory source does transition to MMX state | ||
---|---|---|---|
Product: | [Developer tools] valgrind | Reporter: | Janne Grunau <janne-kde> |
Component: | vex | Assignee: | Julian Seward <jseward> |
Status: | RESOLVED FIXED | ||
Severity: | normal | ||
Priority: | NOR | ||
Version: | 3.10.0 | ||
Target Milestone: | --- | ||
Platform: | Gentoo Packages | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: | simplified test case from libav's checkasm |
Description
Janne Grunau
2015-12-22 17:29:23 UTC
Created attachment 96265
simplified test case from libav's checkasm
I'm not sure your test program is correct. The tag word is 16 bits at byte offsets 8 and 9, but the program tests fenv[9] and [10].

That said .. even after changing the 9 and 10 to 8 and 9, it still gives different results natively vs on V. So something's up here.

Is this just a curiosity, or is it causing a problem for you?

If I had to guess, I would say that the Sept 2015 Intel docs are wrong, and that this instruction (cvtpi2ps) should behave the same way as cvtpi2pd does -- that is, a transition to MMX state happens only if the source is an MMX register, not when it is a memory operand. Unfortunately the AMD docs I have don't say anything at all about it.

(In reply to Julian Seward from comment #2)
> I'm not sure your test program is correct. The tag word is 16 bits
> at byte offsets 8 and 9, but the program tests fenv[9] and [10].
>
> That said .. even after changing the 9 and 10 to 8 and 9, it still
> gives different results natively vs on V. So something's up here.

Oops, yes, the sample program is wrong, but the real check uses the correct offset:
https://git.libav.org/?p=libav.git;a=blob;f=tests/checkasm/x86/checkasm.asm;h=55212fc24b3be71f25eb3e9f8066bd2cee1c5eef;hb=HEAD#l227

> Is this just a curiosity, or is it causing a problem for you?

It's more than curiosity. We added tests for handwritten asm in libav (see tests/checkasm/). They also check that the asm follows the calling convention, i.e. that it restores callee-saved registers, makes no assumption about the upper half of int arguments on 64-bit targets, and restores the FPU state properly. The latter check failed under valgrind on a function using cvtpi2ps with a memory operand and no other MMX usage. It only affects a function targeting SSE, which will only be used if SSE2 is not available, so I added the emms in
https://git.libav.org/?p=libav.git;a=commitdiff;h=8563f9887194b07c972c3475d6b51592d77f73f7

So it's not really a problem for us, although there is still the issue that valgrind's behaviour differs from all CPUs I tested.

Fixed as described in comment #3. Janne, thanks for spotting this.
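
For readers following the offset discussion, here is a minimal sketch of the tag-word check in C. It is not the attached test case or the checkasm code. It assumes the 28-byte environment layout that fnstenv writes in 32-bit protected mode (also used in 64-bit mode), with the tag word at byte offsets 8 and 9 as stated above. The names X87_ENV_SIZE, x87_tag_word and x87_all_empty are hypothetical, and the buffer is assumed to have been filled by fnstenv elsewhere.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed size of the x87 environment image stored by fnstenv
 * (32-bit protected-mode format). Hypothetical name. */
#define X87_ENV_SIZE 28

/* Read the 16-bit tag word, stored little-endian at byte offsets 8 and 9. */
static uint16_t x87_tag_word(const uint8_t env[X87_ENV_SIZE])
{
    return (uint16_t)(env[8] | ((uint16_t)env[9] << 8));
}

/* Each x87 register has a 2-bit tag, and 11b means "empty".
 * 0xFFFF therefore means all eight registers are empty, which is the
 * state expected when no MMX transition has happened (or after emms). */
static bool x87_all_empty(const uint8_t env[X87_ENV_SIZE])
{
    return x87_tag_word(env) == 0xFFFF;
}
```

Testing bytes 9 and 10 instead would mix the high byte of the tag word with a reserved byte, which is the mistake pointed out in the sample program.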