Bug 357833 - Valgrind is broken on recent linux kernel
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck
Version: 3.10.0
Platform: Compiled Sources Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-01-11 09:25 UTC by kmu
Modified: 2016-01-21 11:38 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
Remove setting zero RLIMIT_DATA (4.54 KB, patch)
2016-01-20 13:03 UTC, Mark Wielaard

Description kmu 2016-01-11 09:25:14 UTC
Valgrind fails to start on recent linux-next with the following error:

valgrind: mmap(0x600000, 8192) failed in UME with error 12 (Cannot allocate memory).

I bisected the Linux kernel and found that the problem is caused by this patch: https://lkml.org/lkml/2015/12/14/72 . The patch classifies all memory allocations (brk and mmap) into several groups and adds checks against the respective rlimits. When memcheck-amd64-linux starts, it sets RLIMIT_DATA to 0, but then it tries to mmap part of the checked application's binary with the following call (reported by strace):

mmap(0x600000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0)

This fails because PROT_WRITE MAP_PRIVATE mappings are now classified as data (which sounds reasonable to me, because a private writable mmap is essentially an allocation), and therefore the check against RLIMIT_DATA fails.

I don't know the logic behind setting RLIMIT_DATA to 0, but after I commented out the setrlimit call in coregrind/m_main.c the problem was gone.
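The failing sequence can be approximated outside Valgrind with a small standalone helper. This is only a sketch under stated assumptions: the function name try_private_writable_map is hypothetical, it uses an anonymous mapping instead of the file-backed MAP_FIXED mapping shown in the strace output, and it calls plain POSIX setrlimit/mmap rather than Valgrind's internal wrappers. Whether the mapping is actually refused depends on the running kernel, so the helper only reports the outcome.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Hypothetical helper (not Valgrind code): set the soft RLIMIT_DATA to zero,
 * try to create a private writable mapping of `len` bytes, then restore the
 * previous limit. Returns 0 if the mapping succeeded, otherwise the errno of
 * the call that failed (ENOMEM if the kernel refuses the mapping). */
int try_private_writable_map(size_t len)
{
    struct rlimit saved, zero;
    int result = 0;

    if (getrlimit(RLIMIT_DATA, &saved) != 0)
        return errno;

    zero.rlim_cur = 0;               /* soft limit: no data area allowed */
    zero.rlim_max = saved.rlim_max;  /* hard limit left unchanged */
    if (setrlimit(RLIMIT_DATA, &zero) != 0)
        return errno;

    /* PROT_WRITE + MAP_PRIVATE: the kind of mapping the patch counts as data. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        result = errno;
    else
        munmap(p, len);

    /* Restore the original soft limit so the caller is unaffected. */
    setrlimit(RLIMIT_DATA, &saved);
    return result;
}
```

Calling try_private_writable_map(8192) mirrors the size in the error message; a return value of ENOMEM would correspond to the "Cannot allocate memory" failure above.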

Reproducible: Always

Steps to Reproduce:
1. git clone https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
2. cd linux-next
3. git checkout 1f4ea4be97bea8f6d10ebbfef15d32511a81bcc4
4. configure and build linux kernel
5. create and compile a simple "Hello, World"
6. run the program under valgrind on the compiled linux kernel

Actual Results:  
valgrind fails with error:
valgrind: mmap(0x600000, 8192) failed in UME with error 12 (Cannot allocate memory)

Expected Results:  
valgrind finishes successfully without errors
Comment 1 Tom Hughes 2016-01-11 09:56:07 UTC
I presume that this code in coregrind/m_main.c is the issue (so will affect all tools, not just memcheck):

   //--------------------------------------------------------------
   // Get the current process datasize rlimit, and set it to zero.
   // This prevents any internal uses of brk() from having any effect.
   // We remember the old value so we can restore it on exec, so that
   // child processes will have a reasonable brk value.
   VG_(getrlimit)(VKI_RLIMIT_DATA, &VG_(client_rlimit_data));
   zero.rlim_max = VG_(client_rlimit_data).rlim_max;
   VG_(setrlimit)(VKI_RLIMIT_DATA, &zero);

The comment explains the logic, but it seems to be only a guard against mistakes in valgrind itself: any internal attempt to allocate memory with brk() would fail.
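For readers unfamiliar with the pattern, the save/zero/restore logic in that snippet might look roughly like this when written against plain POSIX calls. This is a sketch, not the Valgrind source: the function names zero_data_rlimit and restore_data_rlimit are hypothetical, and it assumes the soft limit is what gets set to zero while the hard limit is preserved.

```c
#include <sys/resource.h>

/* Hypothetical helper: record the current RLIMIT_DATA in *saved and set the
 * soft limit to zero, keeping the hard limit as it was.
 * Returns 0 on success, -1 on error. */
int zero_data_rlimit(struct rlimit *saved)
{
    struct rlimit zero;

    if (getrlimit(RLIMIT_DATA, saved) != 0)
        return -1;

    zero.rlim_cur = 0;
    zero.rlim_max = saved->rlim_max;
    return setrlimit(RLIMIT_DATA, &zero);
}

/* Hypothetical helper: put back the limit recorded by zero_data_rlimit(),
 * for example just before exec, so a child sees a usable brk limit. */
int restore_data_rlimit(const struct rlimit *saved)
{
    return setrlimit(RLIMIT_DATA, saved);
}
```

Because only the soft limit is lowered, an unprivileged process can still raise it back to the saved value later.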
Comment 2 kmu 2016-01-11 10:02:43 UTC
> I presume that this code in coregrind/m_main.c is the issue

Indeed, after I commented out "VG_(setrlimit)(VKI_RLIMIT_DATA, &zero);" the problem was gone (as I wrote in the description).
Comment 3 Mark Wielaard 2016-01-20 10:52:16 UTC
Looks like this commit went into linux git tree recently:

commit 84638335900f1995495838fe1bd4870c43ec1f67
Author:     Konstantin Khlebnikov <koct9i@gmail.com>
AuthorDate: Thu Jan 14 15:22:07 2016 -0800
Commit:     Linus Torvalds <torvalds@linux-foundation.org>
CommitDate: Thu Jan 14 16:00:49 2016 -0800

    mm: rework virtual memory accounting

So it should be possible to replicate this now with a "normal" linux git master build. But I haven't done so yet.
Comment 4 Mark Wielaard 2016-01-20 11:39:09 UTC
Replicated it now with the fedora rawhide kernel 4.5.0-0.rc0.git6.1.fc24.i686+PAE
Not setting the RLIMIT_DATA to zero in coregrind/m_main.c (valgrind_main) does indeed work around it.
Comment 5 Mark Wielaard 2016-01-20 11:44:39 UTC
For reference, here is the full commit message. It explains that the RLIMIT_DATA value was previously mostly harmless, affecting only brk, but now restricts any data area allocation:

commit 84638335900f1995495838fe1bd4870c43ec1f67
Author:     Konstantin Khlebnikov <koct9i@gmail.com>
AuthorDate: Thu Jan 14 15:22:07 2016 -0800
Commit:     Linus Torvalds <torvalds@linux-foundation.org>
CommitDate: Thu Jan 14 16:00:49 2016 -0800

    mm: rework virtual memory accounting
    
    When inspecting a vague code inside prctl(PR_SET_MM_MEM) call (which
    testing the RLIMIT_DATA value to figure out if we're allowed to assign
    new @start_brk, @brk, @start_data, @end_data from mm_struct) it's been
    commited that RLIMIT_DATA in a form it's implemented now doesn't do
    anything useful because most of user-space libraries use mmap() syscall
    for dynamic memory allocations.
    
    Linus suggested to convert RLIMIT_DATA rlimit into something suitable
    for anonymous memory accounting.  But in this patch we go further, and
    the changes are bundled together as:
    
     * keep vma counting if CONFIG_PROC_FS=n, will be used for limits
     * replace mm->shared_vm with better defined mm->data_vm
     * account anonymous executable areas as executable
     * account file-backed growsdown/up areas as stack
     * drop struct file* argument from vm_stat_account
     * enforce RLIMIT_DATA for size of data areas
    
    This way code looks cleaner: now code/stack/data classification depends
    only on vm_flags state:
    
     VM_EXEC & ~VM_WRITE            -> code  (VmExe + VmLib in proc)
     VM_GROWSUP | VM_GROWSDOWN      -> stack (VmStk)
     VM_WRITE & ~VM_SHARED & !stack -> data  (VmData)
    
    The rest (VmSize - VmData - VmStk - VmExe - VmLib) could be called
    "shared", but that might be strange beast like readonly-private or VM_IO
    area.
    
     - RLIMIT_AS            limits whole address space "VmSize"
     - RLIMIT_STACK         limits stack "VmStk" (but each vma individually)
     - RLIMIT_DATA          now limits "VmData"
    
    Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
    Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
    Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
    Cc: Vegard Nossum <vegard.nossum@oracle.com>
    Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Willy Tarreau <w@1wt.eu>
    Cc: Andy Lutomirski <luto@amacapital.net>
    Cc: Kees Cook <keescook@google.com>
    Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
    Cc: Pavel Emelyanov <xemul@virtuozzo.com>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
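The classification table in the commit message can be restated as a small user-space function. This is an illustrative sketch only: the DEMO_VM_* constants, their numeric values, and the classify_area name are assumptions made for this example and are not taken from kernel headers, and checking the stack rule first is also an assumption, since the table does not state a precedence.

```c
/* Illustrative flag bits; the values are assumptions for this sketch and
 * do not come from the kernel's headers. */
enum {
    DEMO_VM_WRITE     = 1 << 0,
    DEMO_VM_EXEC      = 1 << 1,
    DEMO_VM_SHARED    = 1 << 2,
    DEMO_VM_GROWSDOWN = 1 << 3,
    DEMO_VM_GROWSUP   = 1 << 4
};

typedef enum { AREA_CODE, AREA_STACK, AREA_DATA, AREA_OTHER } area_kind;

/* Classify an area from its flags, following the table in the commit
 * message. The stack rule is tested first (an assumption of this sketch). */
area_kind classify_area(int flags)
{
    if (flags & (DEMO_VM_GROWSUP | DEMO_VM_GROWSDOWN))
        return AREA_STACK;                      /* counted as VmStk */
    if ((flags & DEMO_VM_EXEC) && !(flags & DEMO_VM_WRITE))
        return AREA_CODE;                       /* counted as VmExe + VmLib */
    if ((flags & DEMO_VM_WRITE) && !(flags & DEMO_VM_SHARED))
        return AREA_DATA;                       /* counted as VmData */
    return AREA_OTHER;                          /* the "shared" remainder */
}
```

Under these rules a private writable mapping, such as the PROT_READ|PROT_WRITE MAP_PRIVATE mapping from the description, falls into the data group, which is the group RLIMIT_DATA now limits.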
Comment 6 Mark Wielaard 2016-01-20 13:03:05 UTC
Created attachment 96751
Remove setting zero RLIMIT_DATA

The simplest fix seems to be to just remove the zero data rlimit. It also gets rid of some nasty hacks in our execv and spawn wrappers.
Comment 7 Ivo Raisr 2016-01-20 17:03:59 UTC
Hi Mark, Thank you for providing the fix also for Solaris.
It works ok on Solaris 12, regression tests passed.
Comment 8 Mark Wielaard 2016-01-21 11:38:40 UTC
(In reply to Ivo Raisr from comment #7)
> Hi Mark, Thank you for providing the fix also for Solaris.
> It works ok on Solaris 12, regression tests passed.

Thanks for testing!

This has now been committed as valgrind svn r15766.