Bug 345824 - aspacem segment mismatch: seen with none/tests/bigcode
Summary: aspacem segment mismatch: seen with none/tests/bigcode
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: 3.10 SVN
Platform: macOS (DMG) macOS
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-04-03 10:55 UTC by Rhys Kidd
Modified: 2015-05-01 06:33 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
proposed patch (2.72 KB, patch)
2015-04-06 16:36 UTC, Florian Krohm
patch for s390 (1.25 KB, patch)
2015-04-07 20:26 UTC, Florian Krohm

Description Rhys Kidd 2015-04-03 10:55:44 UTC
OS X reports the following failure with the recently introduced none/tests/bigcode, although the underlying components of that test have been in place for some time.

 aspacem segment mismatch: V's seg 1st, kernel's 2nd:
 aspacem 143: anon 00006e7000-00019e6fff     19m -wx-- SmFixed d=0x000 i=0       o=0       (-1,-1) (none)
 aspacem ...: .... 00006e7000-00019e6fff     19m rwx.. ....... d=0x000 i=0       o=0       (.) m=. (none)
 aspacem sync check at m_aspacemgr/aspacemgr-linux.c:2087 (Bool vgPlain_am_notify_client_mmap(Addr, SizeT, UInt, UInt, Int, Off64T)): FAILED
 aspacem 
 aspacem Valgrind: FATAL: aspacem assertion failed:
 aspacem   VG_(am_do_sync_check) (__PRETTY_FUNCTION__,__FILE__,__LINE__)
 aspacem   at m_aspacemgr/aspacemgr-linux.c:2087 (Bool vgPlain_am_notify_client_mmap(Addr, SizeT, UInt, UInt, Int, Off64T))
 aspacem Exiting now.

Reproducible: Always

Steps to Reproduce:
1. make regtest
2. $ perl tests/vg_regtest none/tests/bigcode
OR
3. $ ./vg-in-place --num-transtab-sectors=2 --sanity-level=4 ./perf/bigcode 1

Actual Results:  
$ perl tests/vg_regtest none/tests/bigcode
bigcode:         valgrind   --num-transtab-sectors=2 --sanity-level=4 ./../../perf/bigcode 1
*** bigcode failed (stdout) ***
*** bigcode failed (stderr) ***

== 1 test, 1 stderr failure, 1 stdout failure, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==

[Behind the scenes]
==90159== Memcheck, a memory error detector
==90159== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==90159== Using Valgrind-3.11.0.SVN and LibVEX; rerun with -h for copyright info
==90159== Command: ./perf/bigcode 1
==90159== 
mode 1: 20000 copies of f(), 1 reps
--90159:0: aspacem segment mismatch: V's seg 1st, kernel's 2nd:
--90159:0: aspacem 143: anon 00006e7000-00019e6fff     19m -wx-- SmFixed d=0x000 i=0       o=0       (-1,-1) (none)
--90159:0: aspacem ...: .... 00006e7000-00019e6fff     19m rwx.. ....... d=0x000 i=0       o=0       (.) m=. (none)
--90159:0: aspacem sync check at m_aspacemgr/aspacemgr-linux.c:2087 (Bool vgPlain_am_notify_client_mmap(Addr, SizeT, UInt, UInt, Int, Off64T)): FAILED
--90159:0: aspacem 
--90159:0: aspacem Valgrind: FATAL: aspacem assertion failed:
--90159:0: aspacem   VG_(am_do_sync_check) (__PRETTY_FUNCTION__,__FILE__,__LINE__)
--90159:0: aspacem   at m_aspacemgr/aspacemgr-linux.c:2087 (Bool vgPlain_am_notify_client_mmap(Addr, SizeT, UInt, UInt, Int, Off64T))
--90159:0: aspacem Exiting now.

Expected Results:  
$ perl tests/vg_regtest none/tests/bigcode
bigcode:         valgrind   --num-transtab-sectors=2 --sanity-level=4 ./../../perf/bigcode 1

== 1 test, 0 stderr failure, 0 stdout failure, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
Comment 1 Florian Krohm 2015-04-03 12:28:50 UTC
The testcase also fails on this s390 variant (nightly build):

uname -mrs:        Linux 3.0.101-0.47.52-default s390x
Vendor version:    Welcome to SUSE Linux Enterprise Server 11 SP3 

like so (probably not an assert):

=================================================
./valgrind-new/none/tests/bigcode.stdout.diff
=================================================
--- bigcode.stdout.exp	2015-04-03 04:02:18.000000000 +0200
+++ bigcode.stdout.out	2015-04-03 04:09:19.000000000 +0200
@@ -1,2 +1,2 @@
 mode 1: 20000 copies of f(), 1 reps
-....................result = -37457500
+....................result = 100070000


The s390 box at Marist runs this test OK though:
Linux lfedora1.lf-dev.marist.edu 3.18.7-200.fc21.s390x #1 SMP 

The "community" s390 box also fails:

Linux 2.6.9-42.EL s390x
Red Hat Enterprise Linux AS release 4 (Nahant Update 4)

like the darwin machine

==14390== Nulgrind, the minimal Valgrind tool
==14390== Copyright (C) 2002-2013, and GNU GPL'd, by Nicholas Nethercote.
==14390== Using Valgrind-3.11.0.SVN and LibVEX; rerun with -h for copyright info
==14390== Command: ./perf/bigcode 1
==14390== 
mode 1: 20000 copies of f(), 1 reps
--14390:0: aspacem segment mismatch: V's seg 1st, kernel's 2nd:
--14390:0: aspacem   7: anon 0004026000-0005325fff     19m -wx-- SmFixed d=0x000 i=0       o=0       (-1,-1) (none)
--14390:0: aspacem ...: .... 0004026000-0005325fff     19m rwx.. ....... d=0x000 i=0       o=0       (.) m=. (none)
--14390:0: aspacem sync check at m_aspacemgr/aspacemgr-linux.c:2087 (vgPlain_am_notify_client_mmap): FAILED
--14390:0: aspacem 
--14390:0: aspacem Valgrind: FATAL: aspacem assertion failed:
--14390:0: aspacem   VG_(am_do_sync_check) (__PRETTY_FUNCTION__,__FILE__,__LINE__)
--14390:0: aspacem   at m_aspacemgr/aspacemgr-linux.c:2087 (vgPlain_am_notify_client_mmap)
--14390:0: aspacem Exiting now.
Comment 2 Philippe Waroquiers 2015-04-03 16:16:35 UTC
(In reply to Florian Krohm from comment #1)
> The testcase also fails on this s390 variant (nightly build):
> 
> uname -mrs:        Linux 3.0.101-0.47.52-default s390x
> Vendor version:    Welcome to SUSE Linux Enterprise Server 11 SP3 
> 
> like so (probably not an assert):
> 
> =================================================
> ./valgrind-new/none/tests/bigcode.stdout.diff
> =================================================
> --- bigcode.stdout.exp	2015-04-03 04:02:18.000000000 +0200
> +++ bigcode.stdout.out	2015-04-03 04:09:19.000000000 +0200
> @@ -1,2 +1,2 @@
>  mode 1: 20000 copies of f(), 1 reps
> -....................result = -37457500
> +....................result = 100070000

Before committing, I tested the change on x86, amd64 and ppc64,
so I assumed it was reasonably arch-independent.
So either there is a bug, or this test is not arch-independent.

> 
> 

> The "community" s390 box also fails:
...
> --14390:0: aspacem segment mismatch: V's seg 1st, kernel's 2nd:
Normally, aspacemgr should have the same image as the kernel.
I think there is a comment in the darwin code saying that it is normal
for these to diverge on Darwin.
It is also normal for them to diverge for an 'inner' valgrind (as it cannot observe the mmap
done by the outer).

But a divergence on Linux seems suspicious.

Philippe
Comment 3 Philippe Waroquiers 2015-04-03 16:29:44 UTC
(In reply to Philippe Waroquiers from comment #2)
> I think there is a comment in the darwin code saying that it is normal
> for these to diverge on Darwin.
See e.g. aspacemgr-linux.c:867 with the comment
   /* GrP fixme not */
See also other #ifdef darwin or comments mentioning darwin, or search for kludge.

So, on darwin, it looks "normal" to have aspacemgr and kernel desynchronised.
Comment 4 Florian Krohm 2015-04-04 12:00:14 UTC
I looked at the s390 failure:

--14390:0: aspacem segment mismatch: V's seg 1st, kernel's 2nd:
--14390:0: aspacem   7: anon 0004026000-0005325fff     19m -wx-- SmFixed d=0x000 i=0       o=0       (-1,-1) (none)
--14390:0: aspacem ...: .... 0004026000-0005325fff     19m rwx.. ....... d=0x000 i=0       o=0       (.) m=. (none)

The mismatch occurs because the protection is different. Valgrind thinks it is "wx"; the kernel says "rwx".
This error occurs for the call in bigcode.c:
   char* a = mmap(0, FN_SIZE * n_fns, 
                     PROT_EXEC|PROT_WRITE, 
                     MAP_PRIVATE|MAP_ANONYMOUS, -1,0);

which explains why valgrind thinks it is "wx".
Looking at the mmap manpage on that s390 box, it seems perfectly valid for the kernel to give "rwx" protection to the mapped space given the request above:

    An  implementation [of mmap]  may permit accesses other than those specified by prot;  ...

and then some lengthy explanation of what the kernel may or may not do to the protection.
I'm not sure whether this behaviour is allowed in general or is specific to s390. Any opinions?

So, in essence, I think that the check in aspacemgr-linux.c needs adjusting and that's it.
Something similar to what has been done with sloppyXcheck...
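
To see what the kernel actually granted for such a mapping on Linux, one option is to read the mapping's entry back from /proc/self/maps after the mmap call. The following is only a minimal sketch under that assumption; parse_maps_line and lookup_perms are hypothetical helper names, not functions from Valgrind, and the example line in the comment is illustrative.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse the start of one /proc/<pid>/maps line, e.g.
   "04026000-05326000 rwxp 00000000 00:00 0".
   Returns 1 on success, 0 if the line does not have that shape. */
int parse_maps_line(const char *line, unsigned long *start,
                    unsigned long *end, char perms[5])
{
   return sscanf(line, "%lx-%lx %4s", start, end, perms) == 3;
}

/* Hypothetical helper: find the mapping that contains addr in
   /proc/self/maps (Linux-specific) and copy its permission string,
   such as "rwxp" or "-wxp", into perms.
   Returns 1 if a mapping was found, 0 otherwise. */
int lookup_perms(const void *addr, char perms[5])
{
   char line[1024];
   unsigned long a = (unsigned long)addr, start, end;
   FILE *f = fopen("/proc/self/maps", "r");
   int found = 0;

   if (f == NULL)
      return 0;
   while (!found && fgets(line, sizeof line, f) != NULL) {
      char p[5];
      /* The end address in the maps file is exclusive. */
      if (parse_maps_line(line, &start, &end, p) && a >= start && a < end) {
         strcpy(perms, p);
         found = 1;
      }
   }
   fclose(f);
   return found;
}
```

Calling lookup_perms on the pointer returned by the mmap above would be expected to show a leading 'r' on the affected s390 kernel and a leading '-' on kernels that keep the request as is.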
Comment 5 Florian Krohm 2015-04-06 16:36:38 UTC
Created attachment 91910 [details]
proposed patch

Here's the page that says that mmap may grant more permissions than were asked for:
http://pubs.opengroup.org/onlinepubs/009695399/functions/mmap.html
I do not know why the Linux-specific man pages do not mention it:
http://man7.org/linux/man-pages/man2/mmap.2.html

Anyhow, the patch here changes the permission check so that it only requires the permissions granted by the kernel to be at least those being asked for.
Regtested on x86-64, s390, ppc64 with no new failures.
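
As a rough illustration of an "at least those asked for" comparison, here is a minimal sketch. It is not the code from the attached patch; perms_at_least is a hypothetical name, and the sketch works on plain PROT_* bits rather than on aspacem's own segment fields.

```c
#include <sys/mman.h>

/* Hypothetical helper, not taken from the attached patch:
   returns 1 if every permission bit set in 'requested' is also set
   in 'granted', i.e. the kernel granted a superset of the request. */
int perms_at_least(int requested, int granted)
{
   const int mask = PROT_READ | PROT_WRITE | PROT_EXEC;
   return ((requested & mask) & ~granted) == 0;
}
```

With this check a "wx" request matched against an "rwx" mapping passes, while a mapping that lacks a requested bit is still reported as a mismatch.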
Comment 6 Christian Borntraeger 2015-04-07 12:47:33 UTC
Well, on s390 hw-wise x implies r. It seems that some kernels (or glibc?) will
change -wx to rwx (SLES11) and others (3.19) will still show -wx in /proc/*/maps.

I can't find a suspicious commit in the kernel that explains the change in behaviour; maybe it's
a glibc change or config option or whatever.
Comment 7 Tom Hughes 2015-04-07 13:11:05 UTC
It should be easy enough to tell if glibc is tweaking the flags - just use strace to see what flags are really passed to the kernel when you do an mmap with PROT_EXEC but no PROT_READ.
Comment 8 Florian Krohm 2015-04-07 20:25:38 UTC
(In reply to Tom Hughes from comment #7)
> It should be easy enough to tell if glibc is tweaking the flags - just use
> strace to see what flags are really passed to the kernel when you do an mmap
> with PROT_EXEC but no PROT_READ.

Right. glibc does not tweak anything, as strace reveals this:

mmap(NULL, 4980000, PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)

i.e. the permission flags are unchanged. So it's the kernel that is doing the tweaking.
I'm going to apply the following patch, which on s390 ensures that wx and rwx permissions compare equal.
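
One way to picture "wx and rwx compare equal" is to normalise both sides so that execute permission implies read permission before comparing. The sketch below only illustrates that idea and is an assumption, not the contents of the s390 patch; both function names are hypothetical.

```c
#include <sys/mman.h>

/* Hypothetical helper: treat PROT_EXEC as implying PROT_READ,
   as is the case for the s390 hardware discussed above. */
int normalise_x_implies_r(int prot)
{
   if (prot & PROT_EXEC)
      prot |= PROT_READ;
   return prot;
}

/* Hypothetical helper: compare two protections after normalisation,
   so that "-wx" and "rwx" are considered equal. */
int perms_equal_x_implies_r(int a, int b)
{
   return normalise_x_implies_r(a) == normalise_x_implies_r(b);
}
```

Unlike a general superset check, this keeps the comparison strict for every other combination, for example "-w-" against "rw-" still differs.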
Comment 9 Florian Krohm 2015-04-07 20:26:32 UTC
Created attachment 91935 [details]
patch for s390
Comment 10 Julian Seward 2015-04-28 11:19:32 UTC
Patch looks OK to me.
Comment 11 Florian Krohm 2015-04-28 13:08:35 UTC
(In reply to Julian Seward from comment #10)
> Patch looks OK to me.

That patch has already gone in as r15075 on Apr 7.
I think this bug is waiting for Rhys to look at the OS X failure.
Comment 12 Rhys Kidd 2015-05-01 06:33:14 UTC
Resolved for OS X in r15171.