When running Valgrind on amd64, I believe handleAddrOverrides() is incorrectly truncating the segment base addresses when the address-size override prefix is used. The way I stumbled upon this was having a value over the 4GB boundary in the fs register and hitting this instruction: 0x0000000000052105 <+5>: 64 67 8b 00 mov %fs:(%eax),%eax The final address is truncated, which ends up reading from the wrong location. Intel Developers Manual (Address Calculations in 64-Bit Mode) says that the base is added after the effective address is truncated, and that the base itself does not get truncated. The AMD64 Programmers Manual Vol2 says something similar. Reproducible: Always Steps to Reproduce: I do not have a good way to test / reproduce the problem. One possible way would be to have your kernel put its per-cpu data in high memory so that the fs segment is over the 4GB boundary, then use an instruction like the one in the description. Actual Results: The segment base address gets truncated and the read happens from the wrong address. Expected Results: The segment base address is not touched and the read happens from the right address.
Created attachment 101707 [details] Proposed patch
Sounds plausible, and it's nice that it's easy to fix. But I'm a bit concerned about the untestability of this. Is there no easy way to test this?
The testing that I had done, which I realize I was not made clear in the description, was specific to our OS (QNX Neutrino). Our kernel is putting the per-cpu data above the 4GB boundary and sets the fs base address to that value. Since the fs base address is always over 4GB, I can reproduce the behaviour quite easily locally. I thought of different ways this might be testable, but to my knowledge fs is not writable in user-mode / valgrind and at least for Linux it looks like it's 0. So I am not aware of a reliable / portable way to force fs + 32bit addr over 4GB to validate the truncation. I am open to suggestions if anyone else has ideas.
Committed, vex r3364. Thank you for the patch.