While profiling a program of mine using callgrind I ran into weird assertion failures in my code. On closer inspection it turned out that an array returned by calloc() in fact contained some elements that were not zero. This also happens with cachegrind and nulgrind, but not with memcheck and massif. Clearly, for code that expects calloc() to return zero-filled memory (as it should), this is a serious problem.

This occurs both with Valgrind 3.1.0 and with the latest SVN sources.

To reproduce: the Glibc sources contain a test program for calloc() that reproduces this bug:

http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/libc/malloc/tst-calloc.c?rev=1.3&content-type=text/plain&cvsroot=glibc

[eelco@tyros:/tmp/glibc-2.3.6/malloc]$ gcc ./tst-calloc.c
[eelco@tyros:/tmp/glibc-2.3.6/malloc]$ ./a.out
[eelco@tyros:/tmp/glibc-2.3.6/malloc]$ /tmp/inst/bin/valgrind --tool=none ./a.out
==17800== Nulgrind, a binary JIT-compiler.
==17800== Copyright (C) 2002-2005, and GNU GPL'd, by Nicholas Nethercote.
==17800== Using LibVEX rev 1574, a library for dynamic binary translation.
==17800== Copyright (C) 2004-2005, and GNU GPL'd, by OpenWorks LLP.
==17800== Using valgrind-3.2.0.SVN, a dynamic binary instrumentation framework.
==17800== Copyright (C) 2000-2005, and GNU GPL'd, by Julian Seward et al.
==17800== For more details, rerun with: -v
==17800==
./a.out: byte not cleared (size 176, element 482, byte 100)
==17800==

[eelco@tyros:/tmp/glibc-2.3.6/malloc]$ uname -a
Linux tyros 2.6.13-15.7-default #1 Tue Nov 29 14:32:29 UTC 2005 i686 i686 i386 GNU/Linux
Wow. That's a good one. You discovered a bug in the brk() simulation introduced by the address space manager rewrite in 3.1.0. The following fixes it for me -- can you try it?

Index: coregrind/m_syswrap/syswrap-generic.c
===================================================================
--- coregrind/m_syswrap/syswrap-generic.c (revision 5646)
+++ coregrind/m_syswrap/syswrap-generic.c (working copy)
@@ -947,6 +947,21 @@
       if (seg && seg->hasT)
          VG_(discard_translations)( newbrk, VG_(brk_limit) - newbrk,
                                     "do_brk(shrink)" );
+      /* Since we're being lazy and not unmapping pages, we have to
+         zero out the area, so that if the area later comes back into
+         circulation, it will be filled with zeroes, as if it really
+         had been unmapped and later remapped.  Be a bit paranoid and
+         try hard to ensure we're not going to segfault by doing the
+         write - check both ends of the range are in the same segment
+         and that segment is writable. */
+      if (seg) {
+         /* pre: newbrk < VG_(brk_limit)
+              => newbrk <= VG_(brk_limit)-1 */
+         NSegment* seg2 = VG_(am_find_nsegment)( VG_(brk_limit)-1 );
+         if (seg2 && seg == seg2 && seg->hasW)
+            VG_(memset)( (void*)newbrk, 0, VG_(brk_limit) - newbrk );
+      }
+
       VG_(brk_limit) = newbrk;
       return newbrk;
    }
It works! Thanks for your speedy fix :-)
Fixed (valgrind r5647).