Bug 205541

Summary: false positive on write after mprotect(,,PROT_WRITE) after mmap(,,PROT_NONE,,,)
Product: [Developer tools] valgrind Reporter: John Reiser <jreiser>
Component: memcheckAssignee: Julian Seward <jseward>
Status: RESOLVED FIXED    
Severity: normal CC: astasenko, dank, jakub, njn, vargaz
Priority: NOR    
Version: 3.5.0   
Target Milestone: blocking3.5.1   
Platform: Unlisted Binaries   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: patch mmap to act like mprotect when PROT_NONE
*.diff after "make regtest"
alternative patch
patch implementing proposal in comment #7

Description John Reiser 2009-08-29 00:03:49 UTC
A large region was mmap()ed at a fixed address using PROT_NONE.  A prefix of the region was mprotect()ed to allow both read and write.  A write was done to the interior of the prefix, but memcheck complained incorrectly.  There is no complaint if the region is mmap()ed initially using PROT_WRITE|PROT_READ instead of PROT_NONE.

-----bug.c
#include <sys/mman.h>

main()
{
	char *addr = mmap((char *)0x7f000000, 1114112, PROT_NONE,
		MAP_ANONYMOUS|MAP_FIXED|MAP_PRIVATE, -1, 0);
	mprotect((char *)0x7f000000, 65536, PROT_WRITE|PROT_READ);
	*(int *)0x7f000054 = 0x12345678;  /* legal: inside 1st 64KiB */
	return 0;
}
-----
$ gcc -g -o bug bug.c
$ valgrind -v --trace-syscalls=yes ./bug
==2236== Memcheck, a memory error detector
==2236== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==2236== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==2236== Command: ./bug
==2236== 
--2236-- Valgrind options:
--2236--    -v
--2236--    --trace-syscalls=yes
--2236-- Contents of /proc/version:
--2236--   Linux version 2.6.31-0.180.rc7.git4.fc12.i686.PAE (mockbuild@x86-4.fedora.phx.redhat.com) (gcc version 4.4.1 20090818 (Red Hat 4.4.1-6) (GCC) ) #1 SMP Wed Aug 26 16:15:07 EDT 2009
--2236-- Arch and hwcaps: X86, x86-sse1-sse2
--2236-- Page sizes: currently 4096, max supported 4096
--2236-- Valgrind library directory: /usr/lib/valgrind
--2236-- Reading syms from /lib/ld-2.10.90.so (0x1ef000)
--2236-- Reading debug info from /usr/lib/debug/lib/ld-2.10.90.so.debug ..
--2236-- .. CRC mismatch (computed 43eb8a91 wanted e2127ff6)
--2236-- Reading syms from /home/jreiser/bug (0x8048000)
--2236-- Reading syms from /usr/lib/valgrind/memcheck-x86-linux (0x38000000)
--2236--    object doesn't have a dynamic symbol table
   <<snip>>
SYSCALL[2236,1](192) sys_mmap2 ( 0x7f000000, 1114112, 0, 50, -1, 0 ) --> [pre-success] Success(0x0:0x7f000000) 
SYSCALL[2236,1](125) sys_mprotect ( 0x7f000000, 65536, 3 )[sync] --> Success(0x0:0x0) 
==2236== Invalid write of size 4
==2236==    at 0x8048456: main (bug.c:22)
==2236==  Address 0x7f000054 is not stack'd, malloc'd or (recently) free'd


$ uname -a
Linux f12a32.local 2.6.31-0.180.rc7.git4.fc12.i686.PAE #1 SMP Wed Aug 26 16:15:07 EDT 2009 i686 i686 i386 GNU/Linux
## This happens to be Fedora 12 Alpha, but the problem also appears on Fedora 11.
Comment 1 John Reiser 2009-08-29 20:36:02 UTC
Created attachment 36564 [details]
patch mmap to act like mprotect when PROT_NONE

Something like this seems plausible.
Comment 2 Julian Seward 2009-08-30 22:31:44 UTC
(In reply to comment #1)
> seems plausible.

Does this work properly (no effect on regtests) ?  The reason
mprotect is not handled consistently with mmap is that doing so
used to screw up m_debuginfo's fragile logic for figuring out
when/where text/data segments got mapped.  All that stuff needs
revisiting anyway.  There are other bugs open on it.
Comment 3 John Reiser 2009-08-30 23:32:44 UTC
Created attachment 36584 [details]
*.diff after "make regtest"

With the patch, running under Fedora 11 on x86_64 [and with "-Wl,-defsym,valt_load_address=0x7c000000"], "make regtest" ends with:

== 534 tests, 9 stderr failures, 0 stdout failures, 0 post failures ==
memcheck/tests/linux/stack_switch        (stderr)
memcheck/tests/long_namespace_xml        (stderr)
none/tests/procfs-linux                  (stderr)
helgrind/tests/tc06_two_races_xml        (stderr)
helgrind/tests/tc20_verifywrap           (stderr)
helgrind/tests/tc23_bogus_condwait       (stderr)
drd/tests/annotate_trace_memory          (stderr)
drd/tests/tc09_bad_unlock                (stderr)
exp-ptrcheck/tests/bad_percentify        (stderr)

and this attachment contains the *.diff.  These look the same to me as the *.diff due to Fedora 11 anyway.
Comment 4 Dan Kegel 2009-10-18 05:37:29 UTC
*** Bug 210268 has been marked as a duplicate of this bug. ***
Comment 5 Julian Seward 2010-01-26 02:38:15 UTC
Created attachment 40234 [details]
alternative patch

Here's another way to make memcheck handle mmap and mprotect
in a mutually consistent way.  John's patch allows memcheck to
believe that memory mmap'd PROT_NONE is still accessible, which
isn't good.  This patch leaves the mmap handling unchanged and
instead makes mprotect behave consistently with mmap -- that is,
if memory is mprotected to be PROT_NONE then it's considered
inaccessible, whereas if it is mprotected to anything else then
it's considered accessible and initialised (oh well :-)

This makes memcheck stop complaining about the test program
provided, and also shuts it up with Wine, which afaics is the
real point here.
Comment 6 John Reiser 2010-01-26 04:23:57 UTC
Yes, the patch attached to comment #5 makes mmap and mprotect consistent.  Yes, it reduces the false positive complaints from wine+memcheck.  However, the patch introduces a new class [or perhaps reinforces an old class] of false negative reports, undetected memory usage errors that could/would have been detected.  mprotect(addr, len, PROT_NONE) followed by mprotect(addr, len, anything_except_PROT_NONE) of the same region will force all the Valid bits to 1, whereas the operating system actually retains the prior contents and validity [the "oh well" from comment #5.]

So choose between easy-to-explain-but-not-always-correct, and correct-but-difficult-to-understand (and the original plain-wrong-but-least-work.)
Comment 7 Julian Seward 2010-01-26 11:52:11 UTC
Ah, I see this is more complex than it looked at 3.30am yesterday :-)

How about the following proposal?

  mmap NONE       ->  make noaccess
  mmap other      ->  make defined

  mprotect NONE   ->  # no change
  mprotect other  ->  change any "noaccess" to "defined"

I think this is what Jakub is suggesting in the very last sentence of
https://bugs.kde.org/show_bug.cgi?id=210268#c1

This gives the behaviours:

   mmap NONE; rd/wr                 -> MC warns, then we get SEGV
   mmap other; rd                   -> reads as initialised
   mmap other; wr                   -> no error
   mmap NONE; mprotect other; rd    -> reads as initialised
   mmap NONE; mprotect NONE; rd/wr  -> MC warns, then we get SEGV

so mmap and mprotect are handled consistently.  The case we can't
deal with really correctly is (eg)

   mmap other; mprotect NONE; rd/wr  -> MC doesn't warn, but we get SEGV

but at least this retains the V bits for the mprotected area, so there
is no chance of false negatives.  And if there is an access then at
least we get a SEGV traceback, which is better than nothing.
Comment 8 Julian Seward 2010-01-26 13:10:08 UTC
Created attachment 40251 [details]
patch implementing proposal in comment #7
Comment 9 John Reiser 2010-01-26 18:18:10 UTC
The analysis in comment #7 and patch in comment #8 are OK.  The byte-by-byte implementation of make_mem_defined_if_noaccess() is OK for now, but high usage or large intervals of mprotect(,,other) could benefit from optimization to take advantage of intervals larger than 1 byte that are all NOACCESS or all not-NOACCESS.
Comment 10 Julian Seward 2010-01-27 11:29:11 UTC
(In reply to comment #9)
> or large intervals of mprotect(,,other) could benefit from optimization [...]

Sure, can unroll/vectorise/whatever if need be.
Comment 11 Julian Seward 2010-01-27 11:29:48 UTC
Fixed (r11031).
Comment 12 Tom Hughes 2010-02-01 22:17:36 UTC
*** Bug 225188 has been marked as a duplicate of this bug. ***