Bug 371491

Summary: handleAddrOverrides() is truncating the segment base address when ASO prefix is used
Product: [Developer tools] valgrind Reporter: Michael Daniels <mdaniels>
Component: vexAssignee: Julian Seward <jseward>
Status: RESOLVED FIXED    
Severity: normal    
Priority: NOR    
Version First Reported In: unspecified   
Target Milestone: ---   
Platform: Other   
OS: Other   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:
Attachments: Proposed patch

Description Michael Daniels 2016-10-22 18:35:07 UTC
When running Valgrind on amd64, I believe handleAddrOverrides() is
incorrectly truncating the segment base addresses when the address-size
override prefix is used.

The way I stumbled upon this was having a value over the 4GB boundary
in the fs register and hitting this instruction:

0x0000000000052105 <+5>: 64 67 8b 00 mov %fs:(%eax),%eax

The final address is truncated, which ends up reading from the wrong
location.

Intel Developers Manual (Address Calculations in 64-Bit Mode) says that
the base is added after the effective address is truncated, and that
the base itself does not get truncated. The AMD64 Programmers Manual
Vol2 says something similar.

Reproducible: Always

Steps to Reproduce:
I do not have a good way to test / reproduce the problem.

One possible way would be to have your kernel put its per-cpu data in high memory so that the fs segment is over the 4GB boundary, then use an instruction like the one in the description.

Actual Results:  
The segment base address gets truncated and the read happens from the wrong address.

Expected Results:  
The segment base address is not touched and the read happens from the right address.
Comment 1 Michael Daniels 2016-10-22 18:36:35 UTC
Created attachment 101707 [details]
Proposed patch
Comment 2 Julian Seward 2016-10-24 05:59:28 UTC
Sounds plausible, and it's nice that it's easy to fix.  But I'm a bit concerned
about the untestability of this.  Is there no easy way to test this?
Comment 3 Michael Daniels 2016-10-24 15:49:50 UTC
The testing that I had done, which I realize I was not made clear in the description, was specific to our OS (QNX Neutrino). Our kernel is putting the per-cpu data above the 4GB boundary and sets the fs base address to that value. Since the fs base address is always over 4GB, I can reproduce the behaviour quite easily locally.

I thought of different ways this might be testable, but to my knowledge fs is not writable in user-mode / valgrind and at least for Linux it looks like it's 0. So I am not aware of a reliable / portable way to force fs + 32bit addr over 4GB to validate the truncation.

I am open to suggestions if anyone else has ideas.
Comment 4 Julian Seward 2017-05-11 14:29:39 UTC
Committed, vex r3364.  Thank you for the patch.