359524 – bt, btc, btr and bts instruction improperly translated by VEX on x86-64

Bug 359524 - bt, btc, btr and bts instruction improperly translated by VEX on x86-64

Summary: bt, btc, btr and bts instruction improperly translated by VEX on x86-64

Status:	RESOLVED NOT A BUG

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	vex (show other bugs)
Version:	unspecified
Platform:	Ubuntu Linux

Importance:	NOR major
Target Milestone:	---
Assignee:	Julian Seward

URL:
Keywords:

Depends on:
Blocks:

Reported:	2016-02-18 10:24 UTC by lmrs2
Modified:	2016-07-05 08:23 UTC (History)
CC List:	1 user (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Attachments
Add an attachment

Note You need to log in before you can comment on or make changes to this bug.

Description lmrs2 2016-02-18 10:24:57 UTC

Platform: Ubuntu 12.04, AMD64, 3.2.0-53-generic
See "Steps to reproduce" for valgrind's output.

Reproducible: Always

Steps to Reproduce:
1. Consider the code below:
----------------- code ------------------------
.text 

.globl _start 

_start:

    movq $8, %rbx
    movq $12, %rdx
    btq %rbx, %rdx
    #btc %rbx, %rdx   #uncomment to reproduce
    #btr %rbx, %rdx   #uncomment to reproduce
    #bts %rbx, %rdx   #uncomment to reproduce
    # exit
    movq $60, %rax  # use the _exit syscall
    movq $0, %rdi   # error code 0
    syscall         # make syscall
----------------- end code -------------------------------

2. Compile the code as follows:
    $as -o test.o test.s
    $ld -o test test.o

3. Run Valgrind as follows:
   $valgrind --vex-guest-chase-thresh=0 --trace-flags=10000000 --trace-notabove=999 ./test

The out is as follows. 
------------------------ Output ------------------------

	0x4000B0:  movq $8, %rbx

              ------ IMark(0x4000B0, 7, 0) ------
              PUT(40) = 0x8:I64
              PUT(184) = 0x4000B7:I64

	0x4000B7:  movq $12, %rdx

              ------ IMark(0x4000B7, 7, 0) ------
              PUT(32) = 0xC:I64
              PUT(184) = 0x4000BE:I64

	0x4000BE:  btq %rbx, %rdx

              ------ IMark(0x4000BE, 4, 0) ------
              t2 = GET:I64(40)
              t6 = Sub64(GET:I64(48),0x120:I64)
              PUT(48) = t6
              STle(t6) = GET:I64(32)
              t7 = t6
              t3 = And64(t2,0x3F:I64)
              t5 = Add64(t7,Sar64(t3,0x3:I8))
              t4 = 64to8(And64(t3,0x7:I64))
              t0 = LDle:I8(t5)
              PUT(144) = 0x0:I64
              PUT(160) = 0x0:I64
              PUT(152) = And64(Shr64(8Uto64(t0),t4),0x1:I64)
              PUT(168) = 0x0:I64
              PUT(48) = Add64(t6,0x120:I64)
              PUT(184) = 0x4000C2:I64
	[...]
---------------------- end output -----------------------

Actual Results:  
The VEX translation of bt, btc, btr, and bts seem incorrect. See below.

x86-64 documetation says: bt, btc, btr and bts instructions test one bit of their first operand, whose index is given by the second operand, and store the value of that bit into the carry flag. 

However, as can be seen in valgrind output, 2 things don't add up:
1- line t6 = Sub64(GET:I64(48),0x120:I64) means it's reading register 48. However this register is never set prior to reading it. Maybe this is assumed to be 0 on startup?

2- the code is doing some erroneous calculations on certain registers that it interprets as memory addresses. Subsequently, these memory locations are used to read (t0 = LDle:I8(t5)) 
and store (STle(t6) = GET:I64(32)). This appears to be wrong because the bt instruction should not update its operands, let alone memory locations. 

Instructions btc, btr and bts update their operands. However, we found that when translated by VEX, their resulting IR accesses memory locations even if their operands are registers with constant values (see our test.s by un-commenting commented lines). When using memory address as operands for btc, btr and bts instruction, memory accesses also appear to be incorrect.

Comment 1 Julian Seward 2016-07-05 06:23:44 UTC

What you're seeing is the result of a kludge, in which btq for a register operand
is implemented by pushing the argument on the (guest) stack temporarily,
and then executing the same IR as for btq with a memory operand.  Have a look
at the relevant bits of guest_amd64_toIR.c.  I'm sure it's documented there.

Comment 2 Julian Seward 2016-07-05 06:25:07 UTC

Honestly .. do you think any large program would actually run properly on Valgrind
if these instructions had really been misimplemented?

Comment 3 lmrs2 2016-07-05 08:23:54 UTC

Thx for your reply. The implementation surely works, but the workaround adds side effects that the original program would not have, eg when doing taint tracking -- which is what we were doing...