Bug 255009 - drd (and also helgrind) crashing on chmod with invalid parameter (with patch)
Summary: drd (and also helgrind) crashing on chmod with invalid parameter (with patch)
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: drd (show other bugs)
Version: 3.6.0
Platform: Unlisted Binaries Linux
: NOR crash
Target Milestone: ---
Assignee: Bart Van Assche
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-10-23 00:21 UTC by Philippe Waroquiers
Modified: 2011-02-09 23:58 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Philippe Waroquiers 2010-10-23 00:21:55 UTC
For the below (wrong) program, both helgrind and drd are crashing,
showing an internal error.
When running outside of helgrind/drd, the program just reports
a failed system call.
With the addition of the following code at the beginning of
drd_pre_mem_read_asciiz and evh__pre_mem_read_asciiz, no crash occurs
(program behaviour is then the same with or without helgrind/drd):

   // check if the address is valid. If not valid, do nothing.
   // We have a bit of a problem here: what is the length to be checked ?
   // For this, we must call strlen, but this implies to have access to it.
   // Currently, we just check one byte.
   if (!VG_(am_is_valid_for_client) (a, 1, VKI_PROT_NONE))
   {
      VG_(umsg)("System call invoked with invalid string address %p\n",
                (void*)a);
      return;
   }


Note that the wrong chmod happens in a big executable, in a lib for
which we do not have the sources, so bypassing that in  helgrind/drd
would be nice to have.


#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
int main ()
{
  int res;
  char *path = (char *) 0x2;
  res = chmod (path, 0);
  if (res == 0)
    printf ("unexpected success of chmod of brol address\n");
  else
    perror ("show error of chmod of brol address");
}


gcc -g -o chmod_brol chmod_brol.c

install/bin/valgrind --tool=helgrind ./chmod_brol
==26778== Helgrind, a thread error detector
==26778== Copyright (C) 2007-2010, and GNU GPL'd, by OpenWorks LLP et al.
==26778== Using Valgrind-3.6.0 and LibVEX; rerun with -h for copyright info
==26778== Command: ./chmod_brol
==26778== 
--26778-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--26778-- si_code=1;  Faulting address: 0x2;  sp: 0x40345bd38

valgrind: the 'impossible' happened:
   Killed by fatal signal
==26778==    at 0x38019B42: vgPlain_strlen (m_libcbase.c:174)
==26778==    by 0x38004582: evh__pre_mem_read_asciiz (hg_main.c:1769)
==26778==    by 0x380588AF: vgPlain_client_syscall (syswrap-main.c:1443)
==26778==    by 0x38055F71: vgPlain_scheduler (scheduler.c:895)
==26778==    by 0x3807BD9E: run_a_thread_NORETURN (syswrap-linux.c:94)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==26778==    at 0x4EEB6C7: chmod (in /lib64/libc-2.5.so)
==26778==    by 0x400545: main (chmod_brol.c:8)




install/bin/valgrind --tool=drd ./chmod_brol
==26894== drd, a thread error detector
==26894== Copyright (C) 2006-2010, and GNU GPL'd, by Bart Van Assche.
==26894== Using Valgrind-3.6.0 and LibVEX; rerun with -h for copyright info
==26894== Command: ./chmod_brol
==26894== 
--26894-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--26894-- si_code=1;  Faulting address: 0x2;  sp: 0x40305bd58

valgrind: the 'impossible' happened:
   Killed by fatal signal
==26894==    at 0x3800FF40: drd_pre_mem_read_asciiz (drd_main.c:264)
==26894==    by 0x380562DF: vgPlain_client_syscall (syswrap-main.c:1443)
==26894==    by 0x380539A1: vgPlain_scheduler (scheduler.c:895)
==26894==    by 0x380797CE: run_a_thread_NORETURN (syswrap-linux.c:94)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==26894==    at 0x4EF56C7: chmod (in /lib64/libc-2.5.so)
==26894==    by 0x400545: main (chmod_brol.c:8)
Comment 1 Bart Van Assche 2010-10-23 13:11:19 UTC
I can reproduce the reported behavior. But I'm not sure it is the purpose of Helgrind or DRD to report pointer errors instead of crashing on them - reporting pointer errors is the purpose of memcheck. Furthermore, in the manuals of both Helgrind and DRD it is recommended to make a program memcheck-clean before starting to analyze it with Helgrind or DRD. Is it OK for you to close this bug report ?
Comment 2 Philippe Waroquiers 2010-10-23 17:20:39 UTC
(In reply to comment #1)
> I can reproduce the reported behavior. But I'm not sure it is the purpose of
> Helgrind or DRD to report pointer errors instead of crashing on them -
> reporting pointer errors is the purpose of memcheck. Furthermore, in the
> manuals of both Helgrind and DRD it is recommended to make a program
> memcheck-clean before starting to analyze it with Helgrind or DRD. Is it OK for
> you to close this bug report ?
This error happens in a library for which we do not have the sources
=> can't fix it.
memcheck does not crash on such an error, so we can put it in the memcheck
ignore list, but cannot tell drd/helgrind to ignore it, as they both crash.

So, to still allow drd/helgrind to be used, it looks better to either
ignore such unaddressable bytes (with a msg maybe: the patch I did does that)

or alternatively by defining a new kind of drd/helgrind errors
(which would tell something like: 
"unaddressable byte(s) accessed, use memcheck for chasing this error")

With this, drd/helgrind would be more usable (and would not change
the behaviour of system calls).
Comment 3 Julian Seward 2011-02-09 12:36:31 UTC
Philippe, the use of VKI_PROT_NONE in the patch is not quite
right.  It allows the use of VG_(strlen) even if the area is
mapped but does not have read permissions (eg, --- or -x-).

Can you test your patch with VKI_PROT_READ instead of
VKI_PROT_NONE on your big application?  For
the small test case, VKI_PROT_READ works OK.
Comment 4 Julian Seward 2011-02-09 13:51:02 UTC
Modified version of the patch committed as r11533, for both Helgrind
and DRD, on the assumption that VKI_PROT_READ is OK -- shout if not!
Thanks for looking into this.
Comment 5 Philippe Waroquiers 2011-02-09 23:32:58 UTC
(In reply to comment #4)
> Modified version of the patch committed as r11533, for both Helgrind
> and DRD, on the assumption that VKI_PROT_READ is OK -- shout if not!
> Thanks for looking into this.

I am quite sure the patch will avoid the problem on the big executable
(but I will check in any case)

Two comments however:
1. if SHOW_EVENTS is set to >= 1, then I believe the trace will crash.
2. if a page is only PROT_WRITE or PROT_EXEC, posix standard tells
   the implementation may (implicitely) give read access.
   I think linux behaves like that (e.g. PROT_WRITE implicitely gives
   read access, but without having PROT_READ in the protection bits).
   So, a "fully correct" check implies to check for READ
   or (if the kernel gives implicit read) for WRITE or for EXEC.
   Using VKI_PROT_NONE is on the safe side for finding race condition,
   but is on the unsafe side for crashes: 
   VG still crashes if the protection is only PROT_NONE.

   Using VKI_PROT_READ is on the safe side to avoid VG crashes, but might
   cause helgrind/drd to "not see" some reads of a PROT_WRITE or PROT_EXEC.
      
   At the end, the VKI_PROT_READ looks more readable/understandable than
   the VKI_PROT_NONE check, and avoids VG crashes.
   And probably no one will bitterly complain about a false negative
   in such a bizarre PROT_WRITE or PROT_EXEC only case.

   There might however be cases (or other tools)
   where the above subtility is more important,
   so this is why I give this (long) comment that you can ignore :).
 
Thanks for this fix and for the threaded fork fix.

Philippe
Comment 6 Philippe Waroquiers 2011-02-09 23:58:35 UTC
(In reply to comment #5)

> Two comments however:
> 1. if SHOW_EVENTS is set to >= 1, then I believe the trace will crash.
At second reading, this belief is wrong:
it is the pointer a which is traced, not the string.
So, an invalid pointer will not cause a crash in the trace.