Bug 481679 - posix_spawn under valgrind succeeds when it would fail natively due to a missing executable
Status: REPORTED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: 3.22.0
Platform: Ubuntu Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-02-22 15:21 UTC by Godmar Back
Modified: 2024-02-22 19:55 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
posix_spawn_valgrind_test.c test case (442 bytes, text/x-csrc)
2024-02-22 15:21 UTC, Godmar Back

Description Godmar Back 2024-02-22 15:21:37 UTC
Created attachment 166013
posix_spawn_valgrind_test.c test case

SUMMARY
Under valgrind, posix_spawn returns success even though the same call fails natively because the executable does not exist.

STEPS TO REPRODUCE
1. Compile the attached program.
2. Run the program natively and observe:
   posix_spawn: No such file or directory
3. Run the program under valgrind.

OBSERVED RESULT

==85857== Memcheck, a memory error detector
==85857== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==85857== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==85857== Command: ./posix_spawn_valgrind_test
==85857== 
==85858== 
==85858== HEAP SUMMARY:
==85858==     in use at exit: 0 bytes in 0 blocks
==85858==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==85858== 
==85858== All heap blocks were freed -- no leaks are possible
==85858== 
==85858== For lists of detected and suppressed errors, rerun with: -s
==85858== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
PID of child: 85858
==85857== 
==85857== HEAP SUMMARY:
==85857==     in use at exit: 0 bytes in 0 blocks
==85857==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==85857== 
==85857== All heap blocks were freed -- no leaks are possible
==85857== 
==85857== For lists of detected and suppressed errors, rerun with: -s
==85857== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


EXPECTED RESULT

valgrind should emulate the call faithfully: posix_spawn should fail with "No such file or directory", as it does natively.

SOFTWARE/OS VERSIONS
Ubuntu 22.04 LTS stock

ADDITIONAL INFORMATION

posix_spawn is a tricky function to emulate. Its specification is also vague about which errors are returned, and it gives the implementation leeway. For instance, a failure during the exec step may be reported as initial success, with the child then exiting with status 127. On Linux, however, the implementation uses vfork semantics: the parent waits until it knows whether the child's exec succeeded. It can therefore report a failure during the exec step directly to the caller rather than reporting success.

Note that (at least in the version I'm looking at) the POSIX_SPAWN_USEVFORK flag is not used; the implementation always uses `clone()` with CLONE_VFORK set.

This may be related to https://bugs.kde.org/show_bug.cgi?id=373192

Is there any way to implement the correct semantics here, perhaps with a proper emulation of vfork?
It would be of tremendous help to people debugging shells under valgrind.
Comment 1 Paul Floyd 2024-02-22 19:55:33 UTC
Valgrind doesn't support clone3, which posix_spawn uses on Linux (and on FreeBSD, posix_spawn uses rfork, which Valgrind supports only with the options that posix_spawn uses).

Sticking to Linux, since Valgrind returns ENOSYS for clone3, glibc falls back to using clone.

I'll need to look more closely at what is happening here. However, fork and clone syscalls are very tricky to emulate. For instance, supporting the CLONE_CLEAR_SIGHAND flag would involve turning the flag off initially (the child valgrind needs to keep its signal handlers) and then finding some way to get the child valgrind to clear the signal handlers in its guest.