Bug 317127

Summary: Fedora18/x86_64 --sanity-level=3 : aspacem segment mismatch
Product: [Developer tools] valgrind Reporter: Dmitry Djachenko <dimhen>
Component: generalAssignee: Paul Floyd <pjfloyd>
Status: RESOLVED FIXED    
Severity: normal CC: mark, pjfloyd, tom
Priority: NOR    
Version: 3.9.0.SVN   
Target Milestone: ---   
Platform: Compiled Sources   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: patch, amd64 only
Avoid dev/inode check on btrfs with --sanity-level=3

Description Dmitry Djachenko 2013-03-21 11:39:44 UTC
run empty program
int main() { return 0;}
valgrind --tool=none --sanity-level=3 FAIL

Reproducible: Always

Steps to Reproduce:
1. echo "int main() {return 0;}" > tst.c
2. gcc tst.c
3. valrind --tool=none --sanity-level=3 ./a.out
Actual Results:  
$ valgrind --tool=none -v --sanity-level=3 ./a.out 
--17626:0:aspacem  segment mismatch: V's seg 1st, kernel's 2nd:
--17626:0:aspacem    1: file 0000400000-0000400fff    4096 r-x-- SmFixed d=0x024 i=7181509 o=0       (1) m=0 /home/dimhen/errs/V/a.out
--17626:0:aspacem  ...: .... 0000400000-0000400fff    4096 r-x.. ....... d=0x01c i=7181509 o=0       (.) m=. /home/dimhen/errs/V/a.out
--17626:0:aspacem  sync check at m_aspacemgr/aspacemgr-linux.c:1932 (vgPlain_am_get_advisory): FAILED
--17626:0:aspacem  
--17626:0:aspacem  Valgrind: FATAL: aspacem assertion failed:
--17626:0:aspacem    VG_(am_do_sync_check) (__PRETTY_FUNCTION__,__FILE__,__LINE__)
--17626:0:aspacem    at m_aspacemgr/aspacemgr-linux.c:1932 (vgPlain_am_get_advisory)
--17626:0:aspacem  Exiting now.



Expected Results:  
no errors

Fedora 18, debuginfo installed
$ uname -a
Linux localhost.localdomain 3.8.3-203.fc18.x86_64 #1 SMP Mon Mar 18 12:59:28 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
$ LANG=C gcc -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.7.2/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --disable-build-with-cxx --disable-build-poststage1-with-cxx --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.7.2 20121109 (Red Hat 4.7.2-8) (GCC)
Comment 1 Tom Hughes 2013-03-21 11:46:44 UTC
Does this only happen when you turn up the sanity level? and if it does then what made you want to do that? It isn't something I would normally expect an end user to change...
Comment 2 Tom Hughes 2013-03-21 11:53:21 UTC
Anyway the problem seems to be that valgrind thinks that /home/dimhen/errs/V/a.out is on device 0x024 while /proc/xxx/maps says it is on device 0x01c.

Is there anything unusual about the filesystem containing that file?

What does "stat  /home/dimhen/errs/V/a.out" say?
Comment 3 Dmitry Djachenko 2013-03-21 12:28:45 UTC
(In reply to comment #2)
> Anyway the problem seems to be that valgrind thinks that
> /home/dimhen/errs/V/a.out is on device 0x024 while /proc/xxx/maps says it is
> on device 0x01c.
> 
> Is there anything unusual about the filesystem containing that file?
$ mount | grep home
/dev/sda6 on /home type btrfs (rw,relatime,seclabel,space_cache)

> 
> What does "stat  /home/dimhen/errs/V/a.out" say?

$ stat ./a.out 
  File: './a.out'
  Size: 6984      	Blocks: 16         IO Block: 4096   regular file
Device: 24h/36d	Inode: 7181509     Links: 1
Access: (0775/-rwxrwxr-x)  Uid: ( 1000/  dimhen)   Gid: ( 1000/  dimhen)
Context: unconfined_u:object_r:user_home_t:s0
Access: 2013-03-21 15:02:19.649045507 +0400
Modify: 2013-03-21 15:02:16.277933130 +0400
Change: 2013-03-21 15:02:16.277933130 +0400
 Birth: -
Comment 4 Tom Hughes 2013-03-21 12:39:04 UTC
Aha.... btrfs... I wonder if that has anything to do with it.

So stat says 0x24 for the device, which matches what valgrind has recorded, so why is /proc/maps saying something else I wonder.

Is /home actually just one sub-volume and /dev/sda6 the device backing the whole volume?

What does "ls -l /dev/sda6" show? Can you see any devices in /dev with "0, 25" or "0, 36" as their device number?
Comment 5 Dmitry Djachenko 2013-03-21 13:11:48 UTC
(In reply to comment #1)
> Does this only happen when you turn up the sanity level? and if it does then
> what made you want to do that? It isn't something I would normally expect an
> end user to change...

Yes. Only with --sanity-level=3.
Question and PR arose from tests failures.

For me valgrind-trunk 'make regtest' has 5 tests FAIL, 577 PASS

none/tests/map_unmap
none/tests/sigstackgrowth
none/tests/stackgrowth
has '--sanity-level=3' and FAIL

exp-sgcheck/tests/preen_invars -- look very similiar to PR255603
memcheck/tests/origin5-bz2 -- PR316903
Comment 6 Tom Hughes 2013-03-21 13:24:32 UTC
It's quite normal for a few tests to fail, so I wouldn't worry to much about that. If you can answer the questions I asked in comment #4 then we can try and get to the bottom of this specific issue.
Comment 7 Dmitry Djachenko 2013-03-21 13:26:07 UTC
(In reply to comment #4)
> Aha.... btrfs... I wonder if that has anything to do with it.
If i remember correctly another my box with ext4/lvm has the same errs.
I'll re-check.

> 
> So stat says 0x24 for the device, which matches what valgrind has recorded,
> so why is /proc/maps saying something else I wonder.
> 
> Is /home actually just one sub-volume and /dev/sda6 the device backing the
> whole volume?
100Mb fWin hidden area
210 Gb NTFS
524 Mb ext4 /boot
extended partition 4
   /dev/sda5 4.2 Gb swap
   /dev/sda6 286Gb btrfs /

# mount | grep sda6
/dev/sda6 on / type btrfs (rw,relatime,seclabel,space_cache)
/dev/sda6 on /home type btrfs (rw,relatime,seclabel,space_cache)


> 
> What does "ls -l /dev/sda6" show?
# ls -l /dev/sda6
brw-rw----. 1 root disk 8, 6 Mar 21 10:57 /dev/sda6

> Can you see any devices in /dev with "0,
> 25" or "0, 36" as their device number?
No "0, 25", "0, 36"
# ls -l /dev | egrep -w '24|25|36'
crw-rw-rw-. 1 root tty       5,   2 Mar 21 17:25 ptmx
crw--w----. 1 root tty       4,  24 Mar 21 10:57 tty24
crw--w----. 1 root tty       4,  25 Mar 21 10:57 tty25
crw--w----. 1 root tty       4,  36 Mar 21 10:57 tty36
Comment 8 Tom Hughes 2013-03-21 13:31:34 UTC
Right, so it looks like you do have at least two btrfs subvolumes on that device, which is almost certainly the root cause of the problem.

The device numbers being reported do seem very odd anyway, as they all have a major device number of zero. I rather suspect that btrfs has a stat that returns very dubious values in st_dev that don't reflect the underlying device numbers, probably because it can have multiple (sub)volumes on the save physical device and therefore multiple inode numbering spaces.
Comment 9 Tom Hughes 2013-03-21 13:32:49 UTC
Yep - there is a bug in RHBZ describing exactly this problem: https://bugzilla.redhat.com/show_bug.cgi?id=711881
Comment 10 Dmitry Djachenko 2013-03-21 18:31:37 UTC
(In reply to comment #7)
> (In reply to comment #4)
> > Aha.... btrfs... I wonder if that has anything to do with it.
> If i remember correctly another my box with ext4/lvm has the same errs.
> I'll re-check.
With ext4/lvm test PASS.

I think that test checks basic functionality and it's bad to skip it

So what to do?
-- ignore problem segment
-- print warning and not exit, add expected stderr.out

decrease --sanity-level to 2 is not a variant. Test was added in r3265 for this check (as part of 2.4.0 merge)
Comment 11 Dmitry Djachenko 2013-03-21 18:52:06 UTC
(In reply to comment #6)
> It's quite normal for a few tests to fail, so I wouldn't worry to much about
> that.
i hear sometimes somewhere "FAIL free testsuite is Right Thing To Do" :)
Comment 12 Tom Hughes 2013-03-21 19:03:11 UTC
Yes obviously it's not ideal that the test suite is not more reliable, but it turns out to be very hard to construct tests for valgrind that reliably pass everywhere - small changes in the operating environment can causes backtraces to change in subtle ways for example.

In this case the problem isn't the test at all, it's btrfs invalidating a basic unix assumption about the meaning of st_dev.

It may be that we will have to stop comparing device numbers in the sanity check, but certainly the test is not the problem.
Comment 13 Paul Floyd 2024-05-06 16:22:07 UTC
*** Bug 486662 has been marked as a duplicate of this bug. ***
Comment 14 Paul Floyd 2024-05-06 16:31:08 UTC
I think that the answer is to add something like

VG_(do_syscall3)(__NR_statfs, filename, &statfs);
if (statfs.f_type == BTRFS_SUPER_MAGIC) {
   cmp_devino = False;
}

inside the existing VGO_linux block.
Comment 15 Paul Floyd 2024-05-06 18:54:34 UTC
Created attachment 169247 [details]
patch, amd64 only

Initial version of a fix.
Comment 16 Paul Floyd 2024-05-06 19:03:22 UTC
The patch I just added seems to work OK.

There is a plan B. In the patch I've also updated the statx structure which (since kernel 5.8) has an stx_mnt_id which we could try to use. I don't know how many people are still using old kernels with btrfs. Not too many I imagine. The 5.8 kernel dates back to Fedora 33. And no guarantee that stx_mnt_id would work.
Comment 17 Paul Floyd 2024-05-06 19:05:16 UTC
And with the patch I get a clean slate

== 880 tests, 0 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
Comment 18 Mark Wielaard 2024-05-08 15:15:57 UTC
I must admit I don't fully understand what we are trying to do. As far as I understand it, the idea is that we sanity check what the aspacemgr records against what /proc/self/maps shows? But that includes some device specific inode values which we don't have/know? Can't we just make cmp_devino = False the default? aka never check those? Or are we loosing some valuable sanity checks/info that way?
Comment 19 Paul Floyd 2024-05-08 16:24:41 UTC
(In reply to Mark Wielaard from comment #18)
> I must admit I don't fully understand what we are trying to do. As far as I
> understand it, the idea is that we sanity check what the aspacemgr records
> against what /proc/self/maps shows? But that includes some device specific
> inode values which we don't have/know? Can't we just make cmp_devino = False
> the default? aka never check those? Or are we loosing some valuable sanity
> checks/info that way?

I think the goal is to check as many things as possible. Is it valuable? Hard to imagine reading the wrong exe or library that is mmap'd but if ever that did happen it would be fairly catastrophic.

The problem with btrfs is that /proc/self/maps uses the device id for the supernode whilst statx returns the device id for the btrfs subvolume. The two are different so the sanity check fails.

This check is turned off for Darwin. I'm just testing turning it back on for FreeBSD - it seems to work with a couple of tweaks.
Comment 20 Mark Wielaard 2024-06-30 21:06:33 UTC
Patch works for me.

It is a little unfortunate that vki_statfs is arch specific.

uapi/asm-generic/statfs.h says:

/*
 * Most 64-bit platforms use 'long', while most 32-bit platforms use '__u32'.
 * Yes, they differ in signedness as well as size.
 * Special cases can override it for themselves -- except for S390x, which
 * is just a little too special for us. And MIPS, which I'm not touching
 * with a 10' pole.
 */

But I think we can make it linux-generic even without updating vki_statfs everywhere.
We don't seem to use the new f_flags anywhere. We just need to check f_type.
Comment 21 Mark Wielaard 2024-07-03 20:25:18 UTC
Could we make it linux generic by just using __NR_statfs and struct statfs (which seems defined for each linux arch already):

diff --git a/coregrind/m_aspacemgr/aspacemgr-linux.c b/coregrind/m_aspacemgr/aspacemgr-linux.c
index 9362f65af3c8..3da68db381e1 100644
--- a/coregrind/m_aspacemgr/aspacemgr-linux.c
+++ b/coregrind/m_aspacemgr/aspacemgr-linux.c
@@ -891,6 +891,14 @@ static void sync_check_mapping_callback ( Addr addr, SizeT len, UInt prot,
       /* hack apparently needed on MontaVista Linux */
       if (filename && VG_(strstr)(filename, "/.lib-ro/"))
          cmp_devino = False;
+
+#if defined(VGO_linux)
+      struct vki_statfs statfs = {0};
+      SysRes res = VG_(do_syscall2)(__NR_statfs, (UWord)filename, (UWord)&statfs);
+      if (!sr_isError(res) && statfs.f_type == VKI_BTRFS_SUPER_MAGIC) {
+         cmp_devino = False;
+      }
+#endif
 #endif
       
       /* If we are doing sloppy execute permission checks then we
diff --git a/include/vki/vki-linux.h b/include/vki/vki-linux.h
index be3d76690cee..708a12b64d03 100644
--- a/include/vki/vki-linux.h
+++ b/include/vki/vki-linux.h
@@ -5455,6 +5460,12 @@ struct vki_open_how {
 #define VKI_CLOSE_RANGE_UNSHARE (1U << 1)
 #define VKI_CLOSE_RANGE_CLOEXEC (1U << 2)
 
+//----------------------------------------------------------------------
+// From linux/magic.h
+//----------------------------------------------------------------------
+
+#define VKI_BTRFS_SUPER_MAGIC    0x9123683E
+
 /*--------------------------------------------------------------------*/
 /*--- end                                                          ---*/
 /*--------------------------------------------------------------------*/
Comment 22 Mark Wielaard 2024-07-04 13:50:19 UTC
Created attachment 171372 [details]
Avoid dev/inode check on btrfs with --sanity-level=3

With --sanity-level=3 or higher the aspacemgr sanity checks the
device/inode numbers from /proc/self/maps to the file stat
results. These don't match on btrfs. So detect when a file is on a
btrfs volume and ignore the check in that case.

On Linux we can use the statfs call for that, except on nanomips
(which also doesn't have a sys_fstatfs syswrap).

Tested on (linux) i686, x86_64, aarch64, ppc64le and s390x.
Comment 23 Paul Floyd 2024-07-05 05:48:35 UTC
commit 3b06d458ffc5cc8de8d701926e5d86979185fa04 (HEAD -> master, origin/master, origin/HEAD)
Author: Mark Wielaard <mark@klomp.org>
Date:   Thu Jul 4 15:21:39 2024 +0200

    Avoid dev/inode check on btrfs with --sanity-level=3
    
    With --sanity-level=3 or higher the aspacemgr sanity checks the
    device/inode numbers from /proc/self/maps to the file stat
    results. These don't match on btrfs. So detect when a file is on a
    btrfs volume and ignore the check in that case.
    
    https://bugs.kde.org/show_bug.cgi?id=317127