Bug 197259

Summary: Unsupported arch_prtctl PR_SET_GS option
Product: [Developer tools] valgrind Reporter: cournape
Component: general Assignee: Julian Seward <jseward>
Status: RESOLVED FIXED    
Severity: crash CC: austinenglish, dimitry, joe, marc.bessieres, njn, peter.maydell, philippe.waroquiers, tom
Priority: NOR    
Version: unspecified   
Target Milestone: wanted3.6.0   
Platform: Compiled Sources   
OS: Linux   
Attachments: Log of unsupported arch_prtctl option
test.cpp
set/get_gs hack similar to set/get_fs hack

Description cournape 2009-06-20 13:42:05 UTC
Version:            (using Devel)
Compiler:          stock gcc from Ubuntu 9.04 64 (gcc 4.3.3) 
OS:                Linux
Installed from:    Compiled sources

I tried using valgrind to debug a 64-bit Windows application under wine, but this fails with "valgrind: the 'impossible' happened: Unsupported arch_prtctl option". The tested application is 64-bit python for Windows, but I guess the problem is in wine. The command line was:

  wine --trace-children=yes python -c ""

Info:
  Ubuntu 64 bits 9.04
  valgrind 3.4.1 (ubuntu package)

The complete output:

==3803== Memcheck, a memory error detector.
==3803== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
==3803== Using LibVEX rev 1884, a library for dynamic binary translation.
==3803== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
==3803== Using valgrind-3.4.1-Debian, a dynamic binary instrumentation framework.
==3803== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
==3803== For more details, rerun with: -v
==3803== 
==3804== Memcheck, a memory error detector.
==3804== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
==3804== Using LibVEX rev 1884, a library for dynamic binary translation.
==3804== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
==3804== Using valgrind-3.4.1-Debian, a dynamic binary instrumentation framework.
==3804== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
==3804== For more details, rerun with: -v
==3804== 
==3804== 
==3804== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
==3804== malloc/free: in use at exit: 0 bytes in 0 blocks.
==3804== malloc/free: 1,387 allocs, 1,387 frees, 76,062 bytes allocated.
==3804== For counts of detected errors, rerun with: -v
==3804== All heap blocks were freed -- no leaks are possible.
==3805== 
==3805== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
==3805== malloc/free: in use at exit: 10,018 bytes in 56 blocks.
==3805== malloc/free: 68 allocs, 12 frees, 15,368 bytes allocated.
==3805== For counts of detected errors, rerun with: -v
==3805== searching for pointers to 56 not-freed blocks.
==3805== checked 82,896 bytes.
==3805== 
==3805== LEAK SUMMARY:
==3805==    definitely lost: 0 bytes in 0 blocks.
==3805==      possibly lost: 0 bytes in 0 blocks.
==3805==    still reachable: 10,018 bytes in 56 blocks.
==3805==         suppressed: 0 bytes in 0 blocks.
==3805== Rerun with --leak-check=full to see details of leaked memory.
==3806== Memcheck, a memory error detector.
==3806== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
==3806== Using LibVEX rev 1884, a library for dynamic binary translation.
==3806== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
==3806== Using valgrind-3.4.1-Debian, a dynamic binary instrumentation framework.
==3806== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
==3806== For more details, rerun with: -v
==3806== 
==3806== 
==3806== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
==3806== malloc/free: in use at exit: 0 bytes in 0 blocks.
==3806== malloc/free: 1,387 allocs, 1,387 frees, 76,062 bytes allocated.
==3806== For counts of detected errors, rerun with: -v
==3806== All heap blocks were freed -- no leaks are possible.
==3803== Memcheck, a memory error detector.
==3803== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
==3803== Using LibVEX rev 1884, a library for dynamic binary translation.
==3803== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
==3803== Using valgrind-3.4.1-Debian, a dynamic binary instrumentation framework.
==3803== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
==3803== For more details, rerun with: -v
==3803== 

valgrind: the 'impossible' happened:
   Unsupported arch_prtctl option
==3803==    at 0x3802A7AC: report_and_quit (m_libcassert.c:140)
==3803==    by 0x3802A8C4: panic (m_libcassert.c:215)
==3803==    by 0x3802A932: vgPlain_core_panic_at (m_libcassert.c:220)
==3803==    by 0x3802A951: vgPlain_core_panic (m_libcassert.c:225)
==3803==    by 0x380961E9: vgSysWrap_amd64_linux_sys_arch_prctl_before (syswrap-amd64-linux.c:531)
==3803==    by 0x380501A0: vgPlain_client_syscall (syswrap-main.c:942)
==3803==    by 0x3804D672: handle_syscall (scheduler.c:824)
==3803==    by 0x3804E676: vgPlain_scheduler (scheduler.c:1018)
==3803==    by 0x38060CB0: run_a_thread_NORETURN (syswrap-linux.c:89)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==3803==    at 0x4C7C367: arch_prctl (in /lib/libc-2.9.so)
==3803==    by 0x55987D6: thread_init (thread.c:293)
==3803==    by 0x5565989: __wine_process_init (loader.c:2657)
==3803==    by 0x4633FE3: wine_init (loader.c:657)
==3803==    by 0x7BF01435: main (main.c:221)
Comment 1 Tom Hughes 2009-07-02 15:43:23 UTC
I suspect that this is PR_SET_GS, although the stack trace quoted doesn't quite match my copy of wine.
Comment 2 Tom Hughes 2009-07-02 15:52:45 UTC
That should read ARCH_SET_GS of course. Fixing this looks like it will be tricky, as it will need to signal the GS value to VEX as it currently does for FS when ARCH_SET_FS is used.

That may be as simple as setting guest_GS_0x60 in the guest state, as it currently does for guest_FS_ZERO when setting FS, but Julian probably needs to look at this and work out the best way to handle it.
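
As a rough illustration of the idea above, here is a toy model in C. It is only a sketch, not valgrind's actual code: the struct, the `toy_` functions and the `TOY_` constants are hypothetical, and only the field names guest_FS_ZERO and guest_GS_0x60 come from the discussion. It assumes the wrapper just records the requested base, and that a segment-relative access is then resolved as base + offset.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the amd64 guest state (hypothetical; only the two
   fields named in the discussion are modelled). */
typedef struct {
    uint64_t guest_FS_ZERO;  /* base used for %fs-relative accesses */
    uint64_t guest_GS_0x60;  /* base used for %gs-relative accesses */
} ToyGuestState;

/* Option values as in the Linux asm/prctl.h header, redefined locally
   under sketch-only names. */
enum { TOY_ARCH_SET_GS = 0x1001, TOY_ARCH_SET_FS = 0x1002 };

/* Record the requested segment base in the guest state.
   Returns 0 on success, -1 for an option this sketch does not handle. */
static int toy_arch_prctl(ToyGuestState *gst, long option, uint64_t base)
{
    switch (option) {
    case TOY_ARCH_SET_FS: gst->guest_FS_ZERO = base; return 0;
    case TOY_ARCH_SET_GS: gst->guest_GS_0x60 = base; return 0;
    default:              return -1;
    }
}

/* Resolve a %gs-relative access: recorded base plus offset. */
static uint64_t toy_gs_address(const ToyGuestState *gst, uint64_t offset)
{
    return gst->guest_GS_0x60 + offset;
}
```

In this model, handling the GS case is symmetric to the FS case, which is the point being made; whether the real fix is that simple is for the VEX side to decide.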
Comment 3 Joe Drew 2012-08-08 20:06:15 UTC
I have exactly this problem when running a 32-bit valgrind to debug a 32-bit wine (which itself is running a 32-bit Firefox). However, all this is running on a 64-bit host; I presume that shouldn't matter.
Comment 4 sworddragon2 2014-05-19 18:38:16 UTC
Created attachment 86713 [details]
Log of unsupported arch_prtctl option

The bug still exists in the development version 3.10 (2014-04-11) as the attachment shows. The operating system is Ubuntu 14.10 dev (x86_64) and Wine was compiled as x86_64 too.
Comment 5 Austin English 2014-05-20 22:36:42 UTC
I'm seeing this as well, when running the 32-bit wine unit tests under a 64-bit wine. With a 32-bit wine, it does not occur.

An example is the hlink/hlink test.

wine-1.7.19-27-gabea10f, valgrind 3.9.0, gentoo64, gcc 4.8.2.
Comment 6 Peter Maydell 2014-11-11 10:31:53 UTC
This bug (lack of support for arch_prctl(ARCH_SET_GS, ...)) also prevents running valgrind on QEMU's linux-user binaries on x86-64 hosts:

$ valgrind --smc-check=all ./build/x86/arm-linux-user/qemu-arm ~/linaro/qemu-misc-tests/auxv-a32 
==24273== Memcheck, a memory error detector
==24273== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==24273== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==24273== Command: ./build/x86/arm-linux-user/qemu-arm /home/petmay01/linaro/qemu-misc-tests/auxv-a32
==24273== 
==24273== Warning: set address range perms: large range [0x3a04b000, 0x13104b000) (noaccess)
==24273== Warning: ignored attempt to set SIGKILL handler in sigaction();
==24273==          the SIGKILL signal is uncatchable
==24273== Warning: ignored attempt to set SIGRT32 handler in sigaction();
==24273==          the SIGRT32 signal is used internally by Valgrind

valgrind: the 'impossible' happened:
   Unsupported arch_prtctl option

host stacktrace:
==24273==    at 0x380A484F: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24273==    by 0x380A4944: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24273==    by 0x380A4B71: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24273==    by 0x380A4B9A: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24273==    by 0x3812EA08: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24273==    by 0x380F7FD0: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24273==    by 0x380F48AA: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24273==    by 0x380F5F86: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)
==24273==    by 0x38105810: ??? (in /usr/lib/valgrind/memcheck-amd64-linux)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==24273==    at 0x60993D7: arch_prctl (syscall-template.S:81)
==24273==    by 0x148B07: setup_guest_base_seg (tcg-target.c:1428)
==24273==    by 0x14A685: tcg_target_qemu_prologue (tcg-target.c:2278)
==24273==    by 0x14ACC8: tcg_prologue_init (tcg.c:381)
==24273==    by 0x174ECB: main (main.c:4071)


Incidentally there's a typo in the valgrind error message: it says "prtctl" rather than "prctl".
Comment 7 dimitry 2014-12-05 00:14:30 UTC
We have the same problem on android when running the x86_64 version of dex2oat.

Based on how the kernel handles this call, this is pretty much as simple as enabling the darwin hack for %gs, which assumes that gs is always a constant. It is actually 0x6b on linux, but that should not matter.
Comment 8 Philippe Waroquiers 2014-12-05 22:27:12 UTC
Can you run the crashing programs with --trace-syscalls=yes
and give the trace produced by :
PRE(sys_arch_prctl)
{
   ThreadState* tst;
   PRINT( "arch_prctl ( %ld, %lx )", ARG1, ARG2 ); <<<<<<< this line

Thanks
Comment 9 dimitry 2014-12-05 22:32:53 UTC
Sure, here you go:

SYSCALL[477,1](158) arch_prctl ( 4098, 4057740 ) --> [pre-success] Success(0x0:0x0)
Comment 10 dimitry 2014-12-05 22:34:48 UTC
Created attachment 89839 [details]
test.cpp

Attached the test that reproduces the problem
Comment 11 dimitry 2014-12-05 22:55:59 UTC
this is another one... without [pre-success]:
SYSCALL[477,1](158) arch_prctl ( 4097, 601048 )
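
For reference, the numeric options in these traces can be decoded against the values in the Linux asm/prctl.h header. The sketch below assumes the trace prints the first argument in decimal (the PRINT format quoted in comment 8 uses %ld); the helper name and the X_ prefixed constants are made up for this example.

```c
#include <assert.h>
#include <string.h>

/* arch_prctl option codes on x86-64 Linux (values as in asm/prctl.h),
   redefined locally under sketch-only names so this builds anywhere. */
enum {
    X_ARCH_SET_GS = 0x1001,
    X_ARCH_SET_FS = 0x1002,
    X_ARCH_GET_FS = 0x1003,
    X_ARCH_GET_GS = 0x1004
};

/* Hypothetical helper: map a numeric option to its symbolic name. */
static const char *arch_prctl_option_name(long option)
{
    switch (option) {
    case X_ARCH_SET_GS: return "ARCH_SET_GS";
    case X_ARCH_SET_FS: return "ARCH_SET_FS";
    case X_ARCH_GET_FS: return "ARCH_GET_FS";
    case X_ARCH_GET_GS: return "ARCH_GET_GS";
    default:            return "unknown";
    }
}
```

Read this way, 4098 (0x1002) is ARCH_SET_FS, the call that shows [pre-success], and 4097 (0x1001) is ARCH_SET_GS, the one without it.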
Comment 12 Philippe Waroquiers 2014-12-05 23:37:00 UTC
(In reply to dimitry from comment #7)
> We have the same problem on android when running the x86_64 version of dex2oat.
> 
> Based on how the kernel handles this call, this is pretty much as simple as
> enabling the darwin hack for %gs, which assumes that gs is always a constant.
> It is actually 0x6b on linux, but that should not matter.

Are you really testing on android + x86 64 bits ?
To my knowledge, that is not a supported valgrind platform.
At least, I do not find a trace of VGPV_amd64_linux_android
i.e. the equivalent of e.g. VGPV_x86_linux_android or VGPV_arm64_linux_android
Comment 13 dimitry 2014-12-05 23:46:20 UTC
We run some android binaries on the host (a linux workstation) as well. As for the android kernel, it is basically the linux kernel, so the same code should work on android.
Comment 14 Philippe Waroquiers 2014-12-06 00:08:25 UTC
Created attachment 89841 [details]
set/get_gs hack similar to set/get_fs hack

Implement for GS a hack similar to the one for FS.
This generalises the 0x60 hack that was done for Darwin.
FS_ZERO and GS_0x60 have been renamed to FS_CONST and GS_CONST,
as these hacks in fact work with whatever value FS/GS hold, as
long as that value is not modified.

Feedback about patch welcome (in particular, does it still compile on darwin)

Thanks
Comment 15 dimitry 2014-12-06 00:21:12 UTC
VEX/priv/guest_amd64_helpers.c:
... alwaysDefd <- is there missing ALWAYSDEFD(guest_FS_CONST)?

VEX/pub/libvex_guest_amd64.h:
Looks like the comment for 'guest_GS_CONST' needs an update..

Otherwise lgtm.
Comment 16 Philippe Waroquiers 2014-12-07 17:24:47 UTC
(In reply to dimitry from comment #15)
Thanks for the feedback/testing.

> VEX/priv/guest_amd64_helpers.c:
> ... alwaysDefd <- is there missing ALWAYSDEFD(guest_FS_CONST)?
The current code has a commented-out line that contains such an ALWAYSDEFD for FS
and GS. I am not sure why it is like that, but I see no reason to change it.

> 
> VEX/pub/libvex_guest_amd64.h:
> Looks like the comment for 'guest_GS_CONST' needs an update..
Not too sure what was missing, I have added the hardcoded linux value.
The new comment is:
+      /* HACK to make e.g. tls on darwin work, wine on linux work, ...
+         %gs only ever seems to hold a constant value (e.g. 0x60 on darwin,
+         0x6b on linux), and so guest_GS_CONST holds the 64-bit offset
+         associated with this constant %gs value.  (A direct analogue
+         of the %fs-const hack for amd64-linux). */
+      ULong guest_GS_CONST;

Does that look ok ? Otherwise, what do you suggest ?

Thanks
Comment 17 dimitry 2014-12-08 17:45:06 UTC
looks good to me, thanks!
Comment 18 Philippe Waroquiers 2014-12-17 00:00:46 UTC
Fix committed as 
  VEX Committed revision 3043.
  valgrind Committed revision 14815.