Bug 473870 - FreeBSD 14 applications fail early at startup
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: 3.22 GIT
Platform: Other FreeBSD
Importance: NOR crash
Target Milestone: ---
Assignee: Julian Seward
Reported: 2023-08-28 21:49 UTC by Paul Floyd
Modified: 2023-08-31 12:23 UTC
Description Paul Floyd 2023-08-28 21:49:11 UTC
Something has changed on FreeBSD 14 amd64.

This seems to affect memcheck, drd, helgrind and dhat.
none, massif, lackey, cachegrind and callgrind seem OK.

./vg-in-place -q pwd

valgrind: m_redir.c:1212 (Addr vgPlain_redir_do_lookup(Addr, Bool *)): Assertion 'iFuncWrapper' failed.

host stacktrace:
==42610==    at 0x3810C626: ??? (in /home/paulf/valgrind/memcheck/memcheck-amd64-freebsd)
==42610==    by 0x1002AA9FDF: ???
==42610==    by 0x38105789: ??? (in /home/paulf/valgrind/memcheck/memcheck-amd64-freebsd)
==42610==    by 0x3810C625: ??? (in /home/paulf/valgrind/memcheck/memcheck-amd64-freebsd)
==42610==    by 0x1002AA978F: ???

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 100883)
==42610==    at 0x49B1BD0: stpcpy (in /lib/libc.so.7)
==42610==    by 0x40073D3: ??? (in /libexec/ld-elf.so.1)
==42610==    by 0x400A7CF: ??? (in /libexec/ld-elf.so.1)
==42610==    by 0x400975E: ??? (in /libexec/ld-elf.so.1)
==42610==    by 0x4006B88: ??? (in /libexec/ld-elf.so.1)
client stack range: [0x1FFBFFE000 0x1FFC000FFF] client SP: 0x1FFBFFFDE8
valgrind stack range: [0x10029AA000 0x1002AA9FFF] top usage: 7072 of 1048576


Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using.  Thanks.

I think that the assert is a red herring: it fires because iFuncWrapper is not correctly initialized.

My first impression is that there is a problem with reading the mmap'd memcheck exe:

--43056:2: aspacem   Reading /proc/self/maps
--43056:2: aspacem   <<< SHOW_SEGMENTS: With contents of /proc/self/maps (16 segments)
--43056:2: aspacem   1 segment names in 1 slots
--43056:2: aspacem   freelist is empty
--43056:2: aspacem   (0,4,3) /home/paulf/valgrind/memcheck/memcheck-amd64-freebsd
--43056:2: aspacem     0: RSVN 0000000000-0003ffffff     64m ----- SmFixed
--43056:2: aspacem     1:      0004000000-0037ffffff    832m
--43056:2: aspacem     2: FILE 0038000000-00380c2fff  798720 r---- d=0x05a i=2490977 o=0       (0,4)
--43056:2: aspacem     3: FILE 00380c3000-0038270fff 1761280 r-x-- d=0x05a i=2490977 o=794624  (0,4)
--43056:2: aspacem     4: ANON 0038271000-003a84efff     37m rw---

The same output on FreeBSD 13.2 (which works OK):

--2474:2: aspacem   Reading /proc/self/maps
--2474:2: aspacem   <<< SHOW_SEGMENTS: With contents of /proc/self/maps (15 segments)
--2474:2: aspacem   1 segment names in 1 slots
--2474:2: aspacem   freelist is empty
--2474:2: aspacem   (0,4,5) /usr/home/paulf/scratch/valgrind/memcheck/memcheck-amd64-freebsd
--2474:2: aspacem     0: RSVN 0000000000-0003ffffff     64m ----- SmFixed
--2474:2: aspacem     1:      0004000000-0037ffffff    832m
--2474:2: aspacem     2: FILE 0038000000-00380c4fff  806912 r---- d=0x696e301b i=2438781 o=0       (0,4)
--2474:2: aspacem     3: FILE 00380c5000-0038274fff 1769472 r-x-- d=0x696e301b i=2438781 o=802816  (0,4)
--2474:2: aspacem     4: FILE 0038275000-0038275fff    4096 rw--- d=0x696e301b i=2438781 o=2568192 (0,4)
--2474:2: aspacem     5: ANON 0038276000-003a852fff     37m rw---

Where has segment number 4 (the file-backed RW segment) gone?

Could parse_procselfmaps be at fault?
Comment 1 Paul Floyd 2023-08-29 20:36:36 UTC
First a quick overview of Valgrind startup.

Each tool has 3 parts
- the tool exe e.g. memcheck-amd64-freebsd
- core preload vgpreload_core-amd64-freebsd.so
- tool preload e.g. vgpreload_memcheck-amd64-freebsd.so

Despite those shared libraries, the tool exe does not link with anything. The way that it 'links' is to parse the debuginfo of files as they get mmap'd and then redirect any interesting functions. The tool is already running when the preloads, the guest and its dependent shared libraries get loaded, so it can trigger the debuginfo parsing from their mmap system calls.

The exception to that is the tool itself. Obviously it can't trigger anything from its own mmap to memory. Instead it uses /proc (on Solaris and Linux) and sysctl KERN_PROC_VMMAP on FreeBSD. In order for the debuginfo parsing to be triggered it needs to see that a "standard" ELF binary has been loaded. Normally that means 3 ELF PT_LOAD segments for binaries built with GNU bfd-ld (RO, RX and RW) and 4 segments for binaries built with LLVM lld (RO, RX and 2xRW). The RO is ignored so really what it is looking for is either RX+RW or RX+2xRW. Since the Valgrind tools do not link with anything they don't have the extra RW PT_LOAD stuff like GOT-PLT. If it doesn't see that last RW segment there's no debuginfo reading and things start to go wrong.
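As a rough illustration of the "RX followed by RW or 2xRW" check described above, here is a minimal sketch in C. It is not Valgrind's actual code: the struct seg type and the has_rx_then_rw function are hypothetical names invented for this example, and the real check works on Valgrind's own segment data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one mapping (not Valgrind's own type). */
struct seg {
    unsigned long start;   /* first address of the mapping */
    unsigned long end;     /* last address of the mapping (inclusive) */
    bool r, w, x;          /* permissions */
    bool file_backed;      /* FILE rather than ANON */
    unsigned long inode;   /* only meaningful when file_backed is true */
};

/* True if s is a file-backed rw- mapping of the file with this inode. */
static bool is_file_rw(const struct seg *s, unsigned long inode)
{
    return s->file_backed && s->inode == inode && s->r && s->w && !s->x;
}

/* Look for a file-backed r-x mapping that is immediately followed by one
   or two file-backed rw- mappings of the same file (RX+RW or RX+2xRW). */
bool has_rx_then_rw(const struct seg *segs, size_t n)
{
    assert(segs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct seg *rx = &segs[i];
        if (!rx->file_backed || !rx->r || rx->w || !rx->x)
            continue;
        size_t rw = 0;
        for (size_t j = i + 1; j < n && is_file_rw(&segs[j], rx->inode); j++)
            rw++;
        if (rw == 1 || rw == 2)
            return true;
    }
    return false;
}
```

Fed with the memcheck segments from the description, this accepts the 13.2 layout (FILE r--, FILE r-x, FILE rw-, ANON rw-) and rejects the 14 layout, where the r-x mapping is followed directly by an ANON rw- mapping.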

From what I see, FreeBSD 14 and 15 now have some optimization that allows the RW PT_LOAD to be marked as swap-backed rather than being mmap'd from the file. I've tried poking around, but I can't see anything relevant in the places I consider likely (rtld, mmap, lld). Details below.

Here's the output for 13.2

paulf> objdump -p .in_place/memcheck-amd64-freebsd

.in_place/memcheck-amd64-freebsd:     file format elf64-x86-64-freebsd

Program Header:
    PHDR off    0x0000000000000040 vaddr 0x0000000038000040 paddr 0x0000000038000040 align 2**3
         filesz 0x0000000000000188 memsz 0x0000000000000188 flags r--
    LOAD off    0x0000000000000000 vaddr 0x0000000038000000 paddr 0x0000000038000000 align 2**12
         filesz 0x00000000000c44ac memsz 0x00000000000c44ac flags r--
    LOAD off    0x00000000000c44b0 vaddr 0x00000000380c54b0 paddr 0x00000000380c54b0 align 2**12
         filesz 0x00000000001af7cf memsz 0x00000000001af7cf flags r-x
    LOAD off    0x0000000000273c80 vaddr 0x0000000038275c80 paddr 0x0000000038275c80 align 2**12
         filesz 0x0000000000000a90 memsz 0x00000000025dd010 flags rw-
EH_FRAME off    0x0000000000093180 vaddr 0x0000000038093180 paddr 0x0000000038093180 align 2**2
         filesz 0x00000000000069fc memsz 0x00000000000069fc flags r--
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**0
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
    NOTE off    0x00000000000001c8 vaddr 0x00000000380001c8 paddr 0x00000000380001c8 align 2**2
         filesz 0x0000000000000030 memsz 0x0000000000000030 flags r--

And for 14.0

paulf@freebsd:~/valgrind $ objdump -p .in_place/memcheck-amd64-freebsd

.in_place/memcheck-amd64-freebsd:       file format elf64-x86-64

Program Header:
    PHDR off    0x0000000000000040 vaddr 0x0000000038000040 paddr 0x0000000038000040 align 2**3
         filesz 0x0000000000000188 memsz 0x0000000000000188 flags r--
    LOAD off    0x0000000000000000 vaddr 0x0000000038000000 paddr 0x0000000038000000 align 2**12
         filesz 0x00000000000c26fc memsz 0x00000000000c26fc flags r--
    LOAD off    0x00000000000c2700 vaddr 0x00000000380c3700 paddr 0x00000000380c3700 align 2**12
         filesz 0x00000000001acbaf memsz 0x00000000001acbaf flags r-x
    LOAD off    0x000000000026f2b0 vaddr 0x00000000382712b0 paddr 0x00000000382712b0 align 2**12
         filesz 0x0000000000000a90 memsz 0x00000000025dcfe0 flags rw-
EH_FRAME off    0x00000000000939a0 vaddr 0x00000000380939a0 paddr 0x00000000380939a0 align 2**2
         filesz 0x00000000000069ec memsz 0x00000000000069ec flags r--
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**64
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
    NOTE off    0x00000000000001c8 vaddr 0x00000000380001c8 paddr 0x00000000380001c8 align 2**2
         filesz 0x0000000000000030 memsz 0x0000000000000030 flags r--


On 14, procstat -v gives:

  PID              START                END PRT  RES PRES REF SHD FLAG TP PATH
 4649         0x38000000         0x380c3000 r--  195 2544   2   0 CN--- vn /home/paulf/valgrind/memcheck/memcheck-amd64-freebsd
 4649         0x380c3000         0x38271000 r-x  430 2544   2   0 CN--- vn /home/paulf/valgrind/memcheck/memcheck-amd64-freebsd
 4649         0x38271000         0x3a84f000 rw-   10   10   1   0 ----- sw
 4649        0x838821000        0x858801000 ---    0    0   0   0 ----- gd
 4649        0x858801000        0x858821000 rw-    1    1   1   0 ---D- sw
 4649        0x85947a000        0x85947b000 r-x    1    1  25   0 ----- ph
 4649     0x7ffffffff000     0x800000000000 ---    0    0   0   0 ----- gd


This is the line that seems to be the problem:

 4649         0x38271000         0x3a84f000 rw-   10   10   1   0 ----- sw

The RW PT_LOAD is marked as swap.

The procstat -v output for 13.2 is

paulf> procstat -v 5310
  PID              START                END PRT  RES PRES REF SHD FLAG TP PATH
 5310         0x38000000         0x380c5000 r--  197 2552   3   0 CN--- vn /usr/home/paulf/scratch/valgrind/memcheck/memcheck-amd64-freebsd
 5310         0x380c5000         0x38275000 r-x  432 2552   3   0 CN--- vn /usr/home/paulf/scratch/valgrind/memcheck/memcheck-amd64-freebsd
 5310         0x38275000         0x38276000 rw-    1 2552   3   0 CN--- vn /usr/home/paulf/scratch/valgrind/memcheck/memcheck-amd64-freebsd
 5310         0x38276000         0x3a853000 rw-   11   11   1   0 ----- df
 5310        0x838cd8000        0x858cb8000 ---    0    0   0   0 ----- gd
 5310        0x858cb8000        0x858cd8000 rw-    2    2   1   0 ---D- df
 5310     0x7ffffffff000     0x800000000000 r-x    1    1  88   0 ----- ph

But the "none" tool works OK on 14:

paulf@freebsd:~/valgrind $ procstat -v 7622
  PID              START                END PRT  RES PRES REF SHD FLAG  TP PATH
 7622         0x38000000         0x380ab000 r--  171 2168   9   1 CN--- vn /home/paulf/valgrind/none/none-amd64-freebsd
 7622         0x380ab000         0x38218000 r-x  365    0   1   0 C---- vn /home/paulf/valgrind/none/none-amd64-freebsd
 7622         0x38218000         0x38219000 rw-    1 2168   9   1 CN--- vn /home/paulf/valgrind/none/none-amd64-freebsd
 7622         0x38219000         0x397dd000 rw-    1    1   1   0 ----- sw 
 7622     0x7fffdffff000     0x7ffffffdf000 ---    0    0   0   0 ----- gd 
 7622     0x7ffffffdf000     0x7ffffffff000 rw-    1    1   1   0 ---D- sw 
 7622     0x7ffffffff000     0x800000000000 r-x    1    1  26   0 ----- ph

So it looks like the rw map simply isn't loaded.

Some ktraces
  7601 valgrind NAMI  "/home/paulf/valgrind/./.in_place/memcheck-amd64-freebsd"
  7601 memcheck-amd64-free RET   execve JUSTRETURN
  7601 memcheck-amd64-free CALL  __sysctlbyname(0x3803ec36,0x24,0x3a384518,0x3a384520,0,0)
  7601 memcheck-amd64-free SCTL  "security.bsd.unprivileged_proc_debug"
  7601 memcheck-amd64-free RET   __sysctlbyname 0
  7601 memcheck-amd64-free CALL  getpid
  7601 memcheck-amd64-free RET   getpid 7601/0x1db1
  7601 memcheck-amd64-free CALL  __sysctl(0x3a3843d0,0x4,0x39782a70,0x3a3843c0,0,0)
  7601 memcheck-amd64-free SCTL  "kern.proc.vmmap.7601"
  7601 memcheck-amd64-free RET   __sysctl 0
  7601 memcheck-amd64-free CALL  mmap(0x1002001000,0x400000,0x7<PROT_READ|PROT_WRITE|PROT_EXEC>,0x1012<MAP_PRIVATE|MAP_FIXED|MAP_ANON>,0xffffffffffffffff,0)
  7601 memcheck-amd64-free RET   mmap 68753035264/0x1002001000

and

  7597 valgrind NAMI  "/home/paulf/valgrind/./.in_place/none-amd64-freebsd"
  7597 none-amd64-freebsd RET   execve JUSTRETURN
  7597 none-amd64-freebsd CALL  __sysctlbyname(0x3803688d,0x24,0x39312da8,0x39312db0,0,0)
  7597 none-amd64-freebsd SCTL  "security.bsd.unprivileged_proc_debug"
  7597 none-amd64-freebsd RET   __sysctlbyname 0
  7597 none-amd64-freebsd CALL  getpid
  7597 none-amd64-freebsd RET   getpid 7597/0x1dad
  7597 none-amd64-freebsd CALL  __sysctl(0x39312c60,0x4,0x38711300,0x39312c50,0,0)
  7597 none-amd64-freebsd SCTL  "kern.proc.vmmap.7597"
  7597 none-amd64-freebsd RET   __sysctl 0
  7597 none-amd64-freebsd CALL  mmap(0x1002001000,0x400000,0x7<PROT_READ|PROT_WRITE|PROT_EXEC>,0x1012<MAP_PRIVATE|MAP_FIXED|MAP_ANON>,0xffffffffffffffff,0)
  7597 none-amd64-freebsd RET   mmap 68753035264/0x1002001000

Not very interesting.
Comment 2 Paul Floyd 2023-08-31 06:02:16 UTC
The mapping of the tool RW PT_LOAD may be a red herring (but still a FreeBSD bug).

If I add a global array to memcheck to make the segment larger than 1 page then it seems to show up correctly.
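One possible reading of the page-size observation, as a sketch only: ASSUME that when memsz is larger than filesz the kernel ELF loader maps only the whole pages of file data and copies the trailing partial page into anonymous memory. That assumption is not confirmed anywhere in this report, and the helper names below are made up for the example. The offsets and sizes are the RW PT_LOAD values from the objdump -p output in comment 1.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 0x1000ULL   /* "align 2**12" in the objdump output */

/* Round an address or file offset down to a page boundary. */
uint64_t trunc_page_sketch(uint64_t a)
{
    return a & ~(PAGE_SIZE_SKETCH - 1);
}

/* Round an address or file offset up to a page boundary. */
uint64_t round_page_sketch(uint64_t a)
{
    return trunc_page_sketch(a + PAGE_SIZE_SKETCH - 1);
}

/* Number of bytes of a PT_LOAD that would be mapped straight from the file,
   UNDER THE ASSUMPTION that only whole pages of file data are file-mapped
   when the segment also has zero-filled space (memsz > filesz). */
uint64_t file_mapped_len(uint64_t off, uint64_t filesz, uint64_t memsz)
{
    assert(memsz >= filesz);
    if (filesz == 0)
        return 0;
    uint64_t file_start = trunc_page_sketch(off);
    uint64_t file_end = (memsz > filesz) ? trunc_page_sketch(off + filesz)
                                         : round_page_sketch(off + filesz);
    return file_end - file_start;
}
```

With the 13.2 values (off 0x273c80, filesz 0xa90) the file data crosses a page boundary, so the sketch yields one file-backed page starting at file offset 0x273000, which agrees with the o=2568192 FILE rw- segment in the description. With the 14 values (off 0x26f2b0, filesz 0xa90) the data sits inside a single page and the sketch yields zero file-backed bytes, which would be consistent with the whole range 0x38271000-0x3a84f000 showing up as one anonymous mapping.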

Other differences that I'm seeing compared with 13.2:
- Some extra memory mappings. These two entries are new, the first being an r-x anon mapping:

--PID:2: aspacem    10: ANON 085990b000-085990bfff    4096 r-x--
--PID:2: aspacem    11:      085990c000-1001ffffff  31366m

  Need to investigate further.
- A few changes to the order of redirs.
Comment 3 Paul Floyd 2023-08-31 06:02:31 UTC
The last syscalls

SYSCALL[1610,1](  5) sys_open ( 0x4825008(/lib/libc.so.7), 3145728 ) --> [async] ... 
SYSCALL[1610,1](  5) ... [async] --> Success(0x3) 
SYSCALL[1610,1](551) sys_fstat ( 3, 0x1ffbfff460 )[sync] --> Success(0x0) 
SYSCALL[1610,1](556) sys_fstatfs ( 3, 0x1ffbfff540 )[sync] --> Success(0x0) 
SYSCALL[1610,1](477) sys_mmap ( 0x0, 4096, 1, 262146, 3, 0x0) --> [pre-success] Success(0x4841000) 
SYSCALL[1610,1](477) sys_mmap ( 0x0, 4194304, 0, 8192, 4294967295, 0x0) --> [pre-success] Success(0x485a000) 
SYSCALL[1610,1](477) sys_mmap ( 0x485a000, 540672, 1, 393234, 3, 0x0) --> [pre-success] Success(0x485a000) 
SYSCALL[1610,1](477) sys_mmap ( 0x48de000, 1347584, 5, 393234, 3, 0x83000) --> [pre-success] Success(0x48de000) 
SYSCALL[1610,1](477) sys_mmap ( 0x4a27000, 40960, 3, 262162, 3, 0x1cb000) --> [pre-success] Success(0x4a27000) 
SYSCALL[1610,1](477) sys_mmap ( 0x4a31000, 28672, 3, 262162, 3, 0x1d4000) --> [pre-success] Success(0x4a31000) 
SYSCALL[1610,1](477) sys_mmap ( 0x4a38000, 2236416, 3, 4114, 4294967295, 0x0) --> [pre-success] Success(0x4a38000) 
SYSCALL[1610,1]( 73) sys_munmap ( 0x4841000, 4096 )[sync] --> Success(0x0) 
SYSCALL[1610,1](  6) sys_close ( 3 )[sync] --> Success(0x0) 
SYSCALL[1610,1]( 74) sys_mprotect ( 0x4844000, 4096, 1 )[sync] --> Success(0x0) 
SYSCALL[1610,1]( 74) sys_mprotect ( 0x4858000, 4096, 1 )[sync] --> Success(0x0) 
SYSCALL[1610,1]( 74) sys_mprotect ( 0x4a27000, 36864, 1 )[sync] --> Success(0x0) 
SYSCALL[1610,1](165) sys_sysarch ( 129, 0x1ffbfffec0 )sys_amd64_set_fsbase ( 0x1ffbfffec0 ) --> [pre-success] Success(0x483a120) 
SYSCALL[1610,1](340) sys_sigprocmask ( 1, 0x401fba4, 0x1ffbfffe78 ) --> [pre-success] Success(0x0) 
SYSCALL[1610,1]( 74) sys_mprotect ( 0x4a27000, 36864, 3 )[sync] --> Success(0x0) 
SYSCALL[1610,1](340) sys_sigprocmask ( 3, 0x401fdc4, 0x0 ) --> [pre-success] Success(0x0) 
SYSCALL[1610,1](340) sys_sigprocmask ( 1, 0x401fba4, 0x1ffbfffde8 ) --> [pre-success] Success(0x0) 
SYSCALL[1610,1](340) sys_sigprocmask ( 3, 0x401fdc4, 0x0 ) --> [pre-success] Success(0x0) 
SYSCALL[1610,1](340) sys_sigprocmask ( 1, 0x401fba4, 0x1ffbfffde8 ) --> [pre-success] Success(0x0) 
SYSCALL[1610,1](340) sys_sigprocmask ( 3, 0x401fdc4, 0x0 ) --> [pre-success] Success(0x0) 
SYSCALL[1610,1](340) sys_sigprocmask ( 1, 0x401fba4, 0x1ffbfffde8 ) --> [pre-success] Success(0x0) 
SYSCALL[1610,1](340) sys_sigprocmask ( 3, 0x401fdc4, 0x0 ) --> [pre-success] Success(0x0) 

valgrind: m_redir.c:1212 (Addr vgPlain_redir_do_lookup(Addr, Bool *)): Assertion 'iFuncWrapper' failed.
Comment 4 Paul Floyd 2023-08-31 09:17:17 UTC
I'm trying to see where the redir is going wrong, and I think that there is an iFunc involved.

On RHEL, for instance, libc contains plenty:
nm /lib64/libc.so.6 | grep " i "
...
000000000008b100 i strcat
...

Just came across this:
https://maskray.me/blog/2021-01-18-gnu-indirect-function

And according to https://github.com/freebsd/freebsd-src/blob/main/lib/libc/amd64/string/stpcpy.S and https://github.com/freebsd/freebsd-src/blob/main/lib/libc/amd64/amd64_archlevel.h FreeBSD now has GNU indirect functions.

That means that vg_preloaded.c needs to change from

#elif defined(VGO_freebsd)

// nothing specific currently

#elif defined(VGO_solaris)

to containing something like the Linux version:

void * VG_NOTIFY_ON_LOAD(ifunc_wrapper) (void)
{
    OrigFn fn;
    Addr result = 0;
    Addr fnentry;

    /* Call the original indirect function and get its result */
    VALGRIND_GET_ORIG_FN(fn);
    CALL_FN_W_v(result, fn);

    fnentry = result;

    VALGRIND_DO_CLIENT_REQUEST_STMT(VG_USERREQ__ADD_IFUNC_TARGET,
                                    fn.nraddr, fnentry, 0, 0, 0);
    return (void*)result;
}
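
As a rough model of what that wrapper does (call the ifunc resolver once, record which implementation it picked, and hand that implementation back), here is a plain C sketch. It does not use the Valgrind client request API: stpcpy_baseline, resolve_stpcpy, add_ifunc_target_model and ifunc_wrapper_model are hypothetical names for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef char *(*stpcpy_fn)(char *, const char *);
typedef stpcpy_fn (*resolver_fn)(void);

/* One candidate implementation that a resolver could select. */
char *stpcpy_baseline(char *dst, const char *src)
{
    size_t n = strlen(src);
    memcpy(dst, src, n + 1);   /* copy the string including its NUL */
    return dst + n;            /* stpcpy returns a pointer to the NUL */
}

/* Stand-in for an ifunc resolver: it returns the implementation to use. */
stpcpy_fn resolve_stpcpy(void)
{
    return stpcpy_baseline;
}

/* Stand-in for the tool's bookkeeping of resolver -> chosen target. */
resolver_fn recorded_resolver;
stpcpy_fn recorded_target;

void add_ifunc_target_model(resolver_fn resolver, stpcpy_fn target)
{
    recorded_resolver = resolver;
    recorded_target = target;
}

/* Model of the wrapper: run the original resolver, record its result so
   that redirection can later be applied to it, and return the result. */
stpcpy_fn ifunc_wrapper_model(resolver_fn orig)
{
    assert(orig != NULL);
    stpcpy_fn result = orig();
    add_ifunc_target_model(orig, result);
    return result;
}
```

Calling ifunc_wrapper_model(resolve_stpcpy) returns stpcpy_baseline and leaves the resolver and target pair recorded. In the real wrapper that recording step is the VG_USERREQ__ADD_IFUNC_TARGET client request.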
Comment 5 Paul Floyd 2023-08-31 12:23:49 UTC
commit c934430d56c2add25002ea8e321bd8bdab80fc99 (HEAD -> master, origin/master, origin/HEAD)
Author: Paul Floyd <pjfloyd@wanadoo.fr>
Date:   Thu Aug 31 15:32:21 2023 +0200

    Bug 473870 - FreeBSD 14 applications fail early at startup
    
    FreeBSD recently started adding some functions using @gnu_indirect_function,
    specifically stpcpy which was causing this crash.
    
    When running and encountering this ifunc Valgrind looked for the
    ifunc_handler. But there wasn't one for FreeBSD so Valgrind asserted.