Bug 403123 - vex amd64->IR: unhandled instruction bytes: 0xF3 0x48 0xF 0xAE 0xD3 (wrfsbase)
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex
Version: 3.15 SVN
Platform: Compiled Sources FreeBSD
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-01-12 01:23 UTC by Roman Bogorodskiy
Modified: 2019-03-14 15:22 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
Suppress FSGSBASE feature flag for AVX2 capable CPUs (738 bytes, patch)
2019-03-10 12:11 UTC, Tom Hughes

Description Roman Bogorodskiy 2019-01-12 01:23:43 UTC
I'm running a FreeBSD fork based on 3.15; however, it looks like this issue is not related to FreeBSD specifics.

Running any application (using uname in this example as a very simple one) results in:

==2934== Memcheck, a memory error detector
==2934== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2934== Using Valgrind-3.15.0.GIT and LibVEX; rerun with -h for copyright info
==2934== Command: uname
==2934==
vex amd64->IR: unhandled instruction bytes: 0xF3 0x48 0xF 0xAE 0xD3 0x48 0x83 0xC4 0x8 0x5B
vex amd64->IR:   REX=1 REX.W=1 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=1
==2934== valgrind: Unrecognised instruction at address 0x400899c.
==2934==    at 0x400899C: ??? (in /libexec/ld-elf.so.1)
==2934==    by 0x4009D0F: ??? (in /libexec/ld-elf.so.1)
==2934==    by 0x4008018: ??? (in /libexec/ld-elf.so.1)
==2934== Your program just tried to execute an instruction that Valgrind
==2934== did not recognise.  There are two possible reasons for this.
==2934== 1. Your program has a bug and erroneously jumped to a non-code
==2934==    location.  If you are running Memcheck and you just saw a
==2934==    warning about a bad jump, it's probably your program's fault.
==2934== 2. The instruction is legitimate but Valgrind doesn't handle it,
==2934==    i.e. it's Valgrind's fault.  If you think this is the case or
==2934==    you are not sure, please let us know and we'll try to fix it.
==2934== Either way, Valgrind will now raise a SIGILL signal which will
==2934== probably kill your program.
==2934==
==2934== Process terminating with default action of signal 4 (SIGILL): dumping core
==2934==  Illegal opcode at address 0x400899C
==2934==    at 0x400899C: ??? (in /libexec/ld-elf.so.1)
==2934==    by 0x4009D0F: ??? (in /libexec/ld-elf.so.1)
==2934==    by 0x4008018: ??? (in /libexec/ld-elf.so.1)
==2934==
==2934== HEAP SUMMARY:
==2934==     in use at exit: 0 bytes in 0 blocks
==2934==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==2934==
==2934== All heap blocks were freed -- no leaks are possible
==2934==
==2934== For lists of detected and suppressed errors, rerun with: -s
==2934== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)
Illegal instruction

$ objdump -S /libexec/ld-elf.so.1|grep -i 899c:
    899c:       f3 48 0f ae d3          wrfsbase %rbx
$
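
For readers who want to check the byte sequence by hand, here is a minimal sketch of how `f3 48 0f ae d3` breaks down: a mandatory F3 prefix, a REX prefix with W=1, the `0F AE` opcode, and a ModRM byte whose reg field (/2) selects WRFSBASE and whose rm field names the operand register. The helper name `decode_wrfsbase64` is hypothetical; this is not VEX code, only an illustration of the encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Valgrind/VEX.
   Returns the operand register number (0-15) if buf starts with a
   64-bit register-form WRFSBASE, or -1 otherwise. */
static int decode_wrfsbase64(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 5) return -1;
    if (buf[0] != 0xF3) return -1;               /* mandatory F3 prefix   */
    uint8_t rex = buf[1];
    if ((rex & 0xF8) != 0x48) return -1;         /* REX prefix with W=1   */
    if (buf[2] != 0x0F || buf[3] != 0xAE) return -1; /* two-byte opcode   */
    uint8_t modrm = buf[4];
    if ((modrm >> 6) != 3) return -1;            /* mod == 11: register   */
    if (((modrm >> 3) & 7) != 2) return -1;      /* reg == /2: WRFSBASE   */
    return (modrm & 7) | ((rex & 1) << 3);       /* rm, extended by REX.B */
}
```

For the bytes above, ModRM 0xD3 has mod=11, reg=010, rm=011, so the operand is register 3 (%rbx), which agrees with the objdump output.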

System compiler version is:
FreeBSD clang version 7.0.1 (tags/RELEASE_701/final 349250) (based on LLVM 7.0.1)

Linker:
LLD 7.0.1 (FreeBSD 349250-1300001) (compatible with GNU linkers)
Comment 1 Julian Seward 2019-03-10 09:49:54 UTC
Is there any fix for this?  The FreeBSD people give the impression
that V more-or-less works on FreeBSD, so I'm a bit surprised this fails
for you every time.
Comment 2 Roman Bogorodskiy 2019-03-10 10:23:09 UTC
(In reply to Julian Seward from comment #1)
> Is there any fix for this?  The FreeBSD people give the impression
> that V more-or-less works on FreeBSD, so I'm a bit surprised this fails
> for you every time.

I'm not aware of any fix for that.
V more-or-less works on FreeBSD; however, I'm running -CURRENT, which has some important changes, e.g. the linker changed and wrfsbase was added.
Comment 3 Tom Hughes 2019-03-10 10:28:17 UTC
Well this is just an unimplemented instruction so it will depend on what compiler flags were used when building - it's not some general FreeBSD problem unless that is coming from assembly code?
Comment 4 Roman Bogorodskiy 2019-03-10 11:54:36 UTC
(In reply to Tom Hughes from comment #3)
> Well this is just an unimplemented instruction so it will depend on what
> compiler flags were used when building - it's not some general FreeBSD
> problem unless that is coming from assembly code?

As far as I understand, it's a general problem because it's coming from ld-elf.so.1. It seems that the call was introduced here: https://svnweb.freebsd.org/base/head/libexec/rtld-elf/amd64/reloc.c?r1=339897&r2=339896&pathrev=339897

FWIW, the test app (uname) was built using the following cflags:

(15:40) novel@romashka:/usr/src/usr.bin/uname %> make -V CFLAGS
-O2 -pipe   -g  -std=gnu99 -fstack-protector-strong -Wsystem-headers -Werror -Wall -Wno-format-y2k -W -Wno-unused-parameter -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Wreturn-type -Wcast-qual -Wwrite-strings -Wswitch -Wshadow -Wunused-parameter -Wcast-align -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -Wold-style-definition -Wno-pointer-sign -Wmissing-variable-declarations -Wthread-safety -Wno-empty-body -Wno-string-plus-int -Wno-unused-const-variable  -Qunused-arguments 
(15:40) novel@romashka:/usr/src/usr.bin/uname %>
Comment 5 Tom Hughes 2019-03-10 12:04:05 UTC
Right, but that function is presumably just assembly that invokes the instruction explicitly, so it falls into the "unless that is coming from assembly code" part of my comment.
Comment 6 Tom Hughes 2019-03-10 12:05:19 UTC
Also, it looks like the issue is that we are advertising a CPU capability in our feature mask that we don't actually support (the "fsgsbase" feature), so one quick fix would be for us to mask that out, which will cause that code to take the else branch.
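
To illustrate the idea (this is a sketch, not the attached patch): FSGSBASE is reported in CPUID leaf 7, subleaf 0, EBX bit 0, and AVX2 in EBX bit 5. Clearing bit 0 in the value handed to the guest is enough for a feature check to fail and fall back to the non-wrfsbase path. The function and macro names below are hypothetical and do not correspond to identifiers in the Valgrind sources.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX feature bits. */
#define CPUID7_EBX_FSGSBASE (1u << 0)
#define CPUID7_EBX_AVX2     (1u << 5)

/* Hypothetical emulator-side helper: hide FSGSBASE from the guest
   because the instructions behind it are not implemented. */
static uint32_t mask_unsupported_leaf7_ebx(uint32_t host_ebx)
{
    return host_ebx & ~CPUID7_EBX_FSGSBASE;
}

/* Hypothetical guest-side check, standing in for the runtime test
   that decides whether wrfsbase may be executed. */
static int guest_may_use_wrfsbase(uint32_t ebx)
{
    return (ebx & CPUID7_EBX_FSGSBASE) != 0;
}

/* Usage: the masked value keeps AVX2 but no longer advertises FSGSBASE. */
static void example(void)
{
    uint32_t host_ebx  = CPUID7_EBX_FSGSBASE | CPUID7_EBX_AVX2;
    uint32_t guest_ebx = mask_unsupported_leaf7_ebx(host_ebx);
    assert(!guest_may_use_wrfsbase(guest_ebx));
    assert((guest_ebx & CPUID7_EBX_AVX2) != 0);
}
```

The trade-off is that guests lose a feature the host really has, but they stop executing an instruction the translator cannot handle.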
Comment 7 Tom Hughes 2019-03-10 12:11:38 UTC
Created attachment 118681 [details]
Suppress FSGSBASE feature flag for AVX2 capable CPUs

Try this patch - it should hopefully suppress the FSGSBASE feature for those CPUs (currently only AVX2 capable ones) where we were reporting it.
Comment 8 Roman Bogorodskiy 2019-03-10 12:38:22 UTC
(In reply to Tom Hughes from comment #5)
> Right but that function is presumably just assembly that invokes the
> instruction explicitly, so it falls into the "unless that is coming from
> assembly code" part of my comment.

Ah, yeah, you're right, wrfsbase() is

(In reply to Tom Hughes from comment #7)
> Created attachment 118681 [details]
> Suppress FSGSBASE feature flag for AVX2 capable CPUs
> 
> Try this patch - it should hopefully suppress the FSGSBASE feature for those
> CPUs (currently only AVX2 capable ones) where we were reporting it.

Thanks, looks like this helps. At least it doesn't crash and report leaks on a basic test app.
Comment 9 Roman Bogorodskiy 2019-03-14 14:32:42 UTC
Any plans to get that into master?
Comment 10 Tom Hughes 2019-03-14 14:35:08 UTC
Well I didn't commit it because I wasn't sure if Julian would prefer to implement the instruction instead.

Patches are normally reviewed for inclusion before a release in any case.
Comment 11 Julian Seward 2019-03-14 14:38:22 UTC
(In reply to Tom Hughes from comment #10)
> Well I didn't commit it because [..]

Oh! I wasn't aware of that.  Land it; if there's borkage (which I would
find highly surprising), we can just back it out.
Comment 12 Tom Hughes 2019-03-14 15:21:23 UTC
Landed as 09566120e705d8831aaa7076b439d3ad90b78773.