Running an empty program (int main() { return 0;}) under valgrind --tool=none --sanity-level=3 FAILs.
Reproducible: Always
Steps to Reproduce:
1. echo "int main() {return 0;}" > tst.c
2. gcc tst.c
3. valgrind --tool=none --sanity-level=3 ./a.out
Actual Results:
$ valgrind --tool=none -v --sanity-level=3 ./a.out
--17626:0:aspacem segment mismatch: V's seg 1st, kernel's 2nd:
--17626:0:aspacem 1: file 0000400000-0000400fff 4096 r-x-- SmFixed d=0x024 i=7181509 o=0 (1) m=0 /home/dimhen/errs/V/a.out
--17626:0:aspacem ...: .... 0000400000-0000400fff 4096 r-x.. ....... d=0x01c i=7181509 o=0 (.) m=. /home/dimhen/errs/V/a.out
--17626:0:aspacem sync check at m_aspacemgr/aspacemgr-linux.c:1932 (vgPlain_am_get_advisory): FAILED
--17626:0:aspacem
--17626:0:aspacem Valgrind: FATAL: aspacem assertion failed:
--17626:0:aspacem VG_(am_do_sync_check) (__PRETTY_FUNCTION__,__FILE__,__LINE__)
--17626:0:aspacem at m_aspacemgr/aspacemgr-linux.c:1932 (vgPlain_am_get_advisory)
--17626:0:aspacem Exiting now.
Expected Results:
no errors
Fedora 18, debuginfo installed
$ uname -a
Linux localhost.localdomain 3.8.3-203.fc18.x86_64 #1 SMP Mon Mar 18 12:59:28 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
$ LANG=C gcc -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.7.2/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --disable-build-with-cxx --disable-build-poststage1-with-cxx --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --enable-java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib --with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.7.2 20121109 (Red Hat 4.7.2-8) (GCC)
Does this only happen when you turn up the sanity level? And if it does, then what made you want to do that? It isn't something I would normally expect an end user to change...
Anyway the problem seems to be that valgrind thinks that /home/dimhen/errs/V/a.out is on device 0x024 while /proc/xxx/maps says it is on device 0x01c. Is there anything unusual about the filesystem containing that file? What does "stat /home/dimhen/errs/V/a.out" say?
(In reply to comment #2)
> Anyway the problem seems to be that valgrind thinks that
> /home/dimhen/errs/V/a.out is on device 0x024 while /proc/xxx/maps says it is
> on device 0x01c.
>
> Is there anything unusual about the filesystem containing that file?
$ mount | grep home
/dev/sda6 on /home type btrfs (rw,relatime,seclabel,space_cache)
>
> What does "stat /home/dimhen/errs/V/a.out" say?
$ stat ./a.out
File: './a.out'
Size: 6984 Blocks: 16 IO Block: 4096 regular file
Device: 24h/36d Inode: 7181509 Links: 1
Access: (0775/-rwxrwxr-x) Uid: ( 1000/ dimhen) Gid: ( 1000/ dimhen)
Context: unconfined_u:object_r:user_home_t:s0
Access: 2013-03-21 15:02:19.649045507 +0400
Modify: 2013-03-21 15:02:16.277933130 +0400
Change: 2013-03-21 15:02:16.277933130 +0400
Birth: -
Aha.... btrfs... I wonder if that has anything to do with it. So stat says 0x24 for the device, which matches what valgrind has recorded, so why is /proc/maps saying something else I wonder. Is /home actually just one sub-volume and /dev/sda6 the device backing the whole volume? What does "ls -l /dev/sda6" show? Can you see any devices in /dev with "0, 25" or "0, 36" as their device number?
(In reply to comment #1)
> Does this only happen when you turn up the sanity level? and if it does then
> what made you want to do that? It isn't something I would normally expect an
> end user to change...
Yes. Only with --sanity-level=3.
The question and this PR arose from test failures.
For me, valgrind-trunk 'make regtest' gives 5 tests FAIL, 577 PASS:
none/tests/map_unmap
none/tests/sigstackgrowth
none/tests/stackgrowth
These three have '--sanity-level=3' and FAIL.
exp-sgcheck/tests/preen_invars -- looks very similar to PR255603
memcheck/tests/origin5-bz2 -- PR316903
It's quite normal for a few tests to fail, so I wouldn't worry too much about that. If you can answer the questions I asked in comment #4 then we can try and get to the bottom of this specific issue.
(In reply to comment #4)
> Aha.... btrfs... I wonder if that has anything to do with it.
If I remember correctly, another box of mine with ext4/lvm shows the same errors.
I'll re-check.
>
> So stat says 0x24 for the device, which matches what valgrind has recorded,
> so why is /proc/maps saying something else I wonder.
>
> Is /home actually just one sub-volume and /dev/sda6 the device backing the
> whole volume?
100Mb fWin hidden area
210 Gb NTFS
524 Mb ext4 /boot
extended partition 4
/dev/sda5 4.2 Gb swap
/dev/sda6 286Gb btrfs /
# mount | grep sda6
/dev/sda6 on / type btrfs (rw,relatime,seclabel,space_cache)
/dev/sda6 on /home type btrfs (rw,relatime,seclabel,space_cache)
>
> What does "ls -l /dev/sda6" show?
# ls -l /dev/sda6
brw-rw----. 1 root disk 8, 6 Mar 21 10:57 /dev/sda6
> Can you see any devices in /dev with "0,
> 25" or "0, 36" as their device number?
No "0, 25", "0, 36"
# ls -l /dev | egrep -w '24|25|36'
crw-rw-rw-. 1 root tty 5, 2 Mar 21 17:25 ptmx
crw--w----. 1 root tty 4, 24 Mar 21 10:57 tty24
crw--w----. 1 root tty 4, 25 Mar 21 10:57 tty25
crw--w----. 1 root tty 4, 36 Mar 21 10:57 tty36
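Device numbers with major 0 usually have no node under /dev, so listing /dev may not be the best place to look for them. A rough sketch of an alternative is to read the major:minor field of /proc/self/mountinfo. The function names and the sample lines below are made up for illustration and were not taken from the reporter's machine.

```python
def parse_mountinfo_line(line):
    """Pick out the fields of one /proc/<pid>/mountinfo line that matter here."""
    # Before the lone "-" separator: mount ID, parent ID, major:minor, root,
    # mount point, mount options, optional fields.
    # After it: filesystem type, mount source, super options.
    before, _, after = line.partition(" - ")
    pre = before.split()
    post = after.split()
    # mountinfo prints major:minor in decimal.
    major, minor = (int(n) for n in pre[2].split(":"))
    return {
        "dev": (major, minor),
        "mount_point": pre[4],
        "fstype": post[0],
        "source": post[1],
    }


def anonymous_mounts(lines):
    """Return the mounts whose device number has major 0."""
    mounts = [parse_mountinfo_line(l) for l in lines if l.strip()]
    return [m for m in mounts if m["dev"][0] == 0]


# Hypothetical sample lines, NOT copied from the reporter's system.
SAMPLE = [
    "60 1 0:28 / / rw,relatime shared:1 - btrfs /dev/sda6 rw,seclabel,space_cache",
    "75 60 8:1 / /boot rw,relatime shared:30 - ext4 /dev/sda1 rw,seclabel",
]

for m in anonymous_mounts(SAMPLE):
    print(m["mount_point"], m["fstype"], m["source"], "%d:%d" % m["dev"])
```

On a real system the same functions could be fed the lines of /proc/self/mountinfo instead of SAMPLE.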
Right, so it looks like you do have at least two btrfs subvolumes on that device, which is almost certainly the root cause of the problem. The device numbers being reported do seem very odd anyway, as they all have a major device number of zero. I rather suspect that btrfs has a stat that returns very dubious values in st_dev that don't reflect the underlying device numbers, probably because it can have multiple (sub)volumes on the same physical device and therefore multiple inode numbering spaces.
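The mismatch can be checked outside valgrind by comparing the device field of a /proc/<pid>/maps line with what stat reports for the same file. The following is a minimal sketch only: the helper names are invented, the maps line is reconstructed from the values in this report rather than captured from the machine, and it assumes d=0x01c corresponds to major 0, minor 0x1c.

```python
import os


def parse_maps_device(line):
    """Return (major, minor) from the device field of one /proc/<pid>/maps line."""
    # Fields: address range, perms, offset, dev (hex major:minor), inode, [path]
    fields = line.split()
    major_hex, minor_hex = fields[3].split(":")
    return int(major_hex, 16), int(minor_hex, 16)


def stat_device(path):
    """Return (major, minor) of st_dev as reported by stat for path."""
    st = os.stat(path)
    return os.major(st.st_dev), os.minor(st.st_dev)


def devices_agree(maps_line, stat_dev):
    """True if the maps entry and the stat result name the same device."""
    return parse_maps_device(maps_line) == stat_dev


# Reconstructed (hypothetical) maps line using the numbers from this report.
line = "00400000-00401000 r-xp 00000000 00:1c 7181509 /home/dimhen/errs/V/a.out"

# stat reported "Device: 24h/36d", taken here as major 0, minor 0x24.
print(devices_agree(line, (0, 0x24)))  # False
```

For a live check, stat_device(path) can be passed as the second argument for a file that the current process has mapped.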
Yep - there is a bug in RHBZ describing exactly this problem: https://bugzilla.redhat.com/show_bug.cgi?id=711881
(In reply to comment #7)
> (In reply to comment #4)
> > Aha.... btrfs... I wonder if that has anything to do with it.
> If i remember correctly another my box with ext4/lvm has the same errs.
> I'll re-check.
With ext4/lvm the test PASSes.
I think this test checks basic functionality, and it would be bad to skip it.
So what to do?
-- ignore the problem segment
-- print a warning and do not exit, add expected stderr.out
Decreasing --sanity-level to 2 is not an option.
The test was added in r3265 for this check (as part of the 2.4.0 merge).
(In reply to comment #6)
> It's quite normal for a few tests to fail, so I wouldn't worry to much about
> that.
I sometimes hear somewhere that a "FAIL free testsuite is Right Thing To Do" :)
Yes, obviously it's not ideal that the test suite is not more reliable, but it turns out to be very hard to construct tests for valgrind that reliably pass everywhere - small changes in the operating environment can cause backtraces to change in subtle ways, for example. In this case the problem isn't the test at all, it's btrfs invalidating a basic Unix assumption about the meaning of st_dev. It may be that we will have to stop comparing device numbers in the sanity check, but certainly the test is not the problem.
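To make the last idea concrete, here is a purely hypothetical sketch of a segment comparison in which the device number check can be switched off. It is not Valgrind's code; the Segment type, its field names and the compare_dev flag are invented for illustration, and the values are the ones printed in the failing sync check above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: int
    end: int
    perms: str
    dev: int
    ino: int
    offset: int


def segments_match(ours, kernels, compare_dev=True):
    """Compare two segment records, optionally ignoring the device number."""
    same = (
        ours.start == kernels.start
        and ours.end == kernels.end
        and ours.perms == kernels.perms
        and ours.ino == kernels.ino
        and ours.offset == kernels.offset
    )
    if compare_dev:
        same = same and ours.dev == kernels.dev
    return same


# Values from the report: identical apart from d=0x024 versus d=0x01c.
v_seg = Segment(0x400000, 0x400FFF, "r-x", 0x24, 7181509, 0)
k_seg = Segment(0x400000, 0x400FFF, "r-x", 0x1C, 7181509, 0)

print(segments_match(v_seg, k_seg))                     # False
print(segments_match(v_seg, k_seg, compare_dev=False))  # True
```

Dropping the device comparison weakens the check, since two files on different filesystems can share an inode number, so this is a trade-off rather than a clean fix.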