397089 – [PATCH] Incorrect decoding of three-register vmovss/vmovsd opcode 11h

Bug 397089 - [PATCH] Incorrect decoding of three-register vmovss/vmovsd opcode 11h

Summary: [PATCH] Incorrect decoding of three-register vmovss/vmovsd opcode 11h

Status:	RESOLVED FIXED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	vex (other bugs)
Version First Reported In:	3.13.0
Platform:	Compiled Sources Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Julian Seward

URL:
Keywords:

Depends on:
Blocks:

Reported:	2018-08-02 12:45 UTC by Tomas Trnka
Modified:	2018-09-14 11:26 UTC (History)
CC List:	0 users

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:

Attachments
Proposed patch (2.23 KB, patch) 2018-08-02 12:45 UTC, Tomas Trnka	Details
View All Add an attachment

Note You need to log in before you can comment on or make changes to this bug.

Description Tomas Trnka 2018-08-02 12:45:29 UTC

Created attachment 114267 [details]
Proposed patch

vmovss/vmovsd xmm1, xmm2, xmm3 using opcode 11h was decoded exactly the same as opcode 10h, swapping the xmm1 and xmm3 arguments. This leads to all kinds of floating point software going crazy under Valgrind because of assignments suddenly working in the wrong direction.

Comment 1 Julian Seward 2018-09-10 10:08:38 UTC

The really worrying thing here is not that there is a bug, but that
the test suite failed to detect it.  Investigating.

Comment 2 Julian Seward 2018-09-10 10:47:44 UTC

Tomas: the test suite (none/tests/amd64/avx-1.c) contains only this

GEN_test_Ronly(VMOVSS_REG_XMM, "vmovss %%xmm9, %%xmm7, %%xmm8")

and I don't know whether that is the 10h or 11h instruction since
the Intel docs give the same assembly syntax for both :-/  Obviously
it can't be both, with the effect that the Vex implementation of one
or the other must always have been un-tested.

Do you know if there is a way to write both variants, using the
GNU as syntax?  Do you maybe have a .o or .so file that uses both,
that you can objdump -d, to see if there is a difference in the
assembly syntax?

Comment 3 Tomas Trnka 2018-09-10 11:22:26 UTC

I don't know if there's any special assembly syntax to force one or the other. I wouldn't be surprised if there's not, because these three-register variants are functionally equivalent (except for the direction).

I have found that my GNU as (2.29.1) reliably picks opcode 11h when the first operand is xmm8+ and the third operand is xmm[0-7]. All other combinations seem to use opcode 10h. (No idea why. I just bruteforced all combinations). This is probably the reason why the three-register opcode 11h version is somewhat less common (and thus why most of the FP code out there seems to mostly work).

   0:   c5 6b 10 c1             vmovsd %xmm1,%xmm2,%xmm8

   0:   c5 6b 11 c1             vmovsd %xmm8,%xmm2,%xmm1

Comment 4 Julian Seward 2018-09-14 11:22:47 UTC

Landed as 27194eb985b22a66b439925e821a41800bcb881e.  Thanks for the patch.

Comment 5 Julian Seward 2018-09-14 11:26:31 UTC

Test cases landed as 529436ff596203a451cf525824b3538d006de0bb.