Created attachment 166013
posix_spawn_valgrind_test.c test case

SUMMARY
Under valgrind, posix_spawn returns success even though it fails due to a missing executable name

STEPS TO REPRODUCE
1. Compile the attached program.
2. Run the program natively, observe:
   posix_spawn: No such file or directory
3. Run the program under valgrind

OBSERVED RESULT
==85857== Memcheck, a memory error detector
==85857== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==85857== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==85857== Command: ./posix_spawn_valgrind_test
==85857==
==85858==
==85858== HEAP SUMMARY:
==85858==     in use at exit: 0 bytes in 0 blocks
==85858==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==85858==
==85858== All heap blocks were freed -- no leaks are possible
==85858==
==85858== For lists of detected and suppressed errors, rerun with: -s
==85858== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
PID of child: 85858
==85857==
==85857== HEAP SUMMARY:
==85857==     in use at exit: 0 bytes in 0 blocks
==85857==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==85857==
==85857== All heap blocks were freed -- no leaks are possible
==85857==
==85857== For lists of detected and suppressed errors, rerun with: -s
==85857== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

EXPECTED RESULT
valgrind should properly emulate the call.

SOFTWARE/OS VERSIONS
Ubuntu 22.04 LTS stock

ADDITIONAL INFORMATION
posix_spawn is obviously a tricky function to emulate. Its specification is also unclear as to what errors are returned and gives the implementation leeway. For instance, any failure during the exec step may be reported as initial success, with the child then exiting with status 127.

On Linux, however, it uses vfork, and the parent waits until it knows whether the child's exec was successful or not. This way, it can report a failure during the exec step directly to the caller rather than reporting success. Note that (at least in the version I'm looking at) the POSIX_SPAWN_USEVFORK flag is not used - the implementation always uses `clone()` with CLONE_VFORK set.

This may be related to https://bugs.kde.org/show_bug.cgi?id=373192

Is there any way to implement the correct semantics here, perhaps with a proper emulation of vfork? It would be of tremendous help to people debugging shells under valgrind.
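For readers without access to the attachment, here is a minimal sketch of the kind of check the report describes. It is not the attached posix_spawn_valgrind_test.c: the helper name `try_spawn` and the path used in the usage note are assumptions made for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper (not from the attachment): try to spawn `path`
 * and return posix_spawn's result. If the spawn is reported as
 * successful, reap the child and store its exit status in *exit_code
 * (or -1 if the child did not exit normally). */
static int try_spawn(const char *path, int *exit_code)
{
    assert(path != NULL && exit_code != NULL);

    char *const argv[] = { (char *)path, NULL };
    pid_t pid;
    *exit_code = -1;

    /* posix_spawn returns an error number directly; it does not set errno. */
    int rc = posix_spawn(&pid, path, NULL, NULL, argv, environ);
    if (rc != 0) {
        fprintf(stderr, "posix_spawn: %s\n", strerror(rc));
        return rc;
    }

    printf("PID of child: %ld\n", (long)pid);

    int status;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        *exit_code = WEXITSTATUS(status);
    return 0;
}
```

Calling `try_spawn("/nonexistent/posix-spawn-test", &code)` from `main` natively on Linux should hit the error branch, matching the "No such file or directory" message in the report. The other behaviour the specification permits is a return value of 0 with the child exiting with status 127, which is why the helper also collects the exit status.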
Valgrind doesn't support clone3, which posix_spawn uses on Linux (on FreeBSD, posix_spawn uses rfork, which Valgrind only supports with the posix_spawn options).

Sticking to Linux: since Valgrind returns ENOSYS for clone3, glibc falls back to using clone. I'll need to look more at the specifics of what is happening.

However, fork and clone syscalls are very tricky to emulate. For instance, handling the CLONE_CLEAR_SIGHAND flag would involve turning the flag off initially (the child valgrind needs to keep its signal handlers) and then finding some way to get the child valgrind to clear the signal handlers in its guest.
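The ENOSYS fallback mentioned above can be illustrated with a rough, Linux-only sketch. This is not glibc's code: the function name `fork_like` is hypothetical, it creates a plain fork-style child rather than a CLONE_VFORK one, and it falls back to `fork()` instead of a raw clone call to keep the example short. It assumes kernel headers recent enough to provide `struct clone_args`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/sched.h>   /* struct clone_args */
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a child process via the raw clone3
 * syscall and fall back to fork() only when clone3 is reported as
 * unimplemented (ENOSYS). Returns 0 in the child, the child's PID in
 * the parent, or -1 with errno set on failure. */
static pid_t fork_like(void)
{
#ifdef SYS_clone3
    struct clone_args args;
    memset(&args, 0, sizeof(args));
    args.exit_signal = SIGCHLD;   /* no flags: behaves like fork */

    long rc = syscall(SYS_clone3, &args, sizeof(args));
    if (rc >= 0 || errno != ENOSYS)
        return (pid_t)rc;         /* success, or a failure other than ENOSYS */
#endif
    /* clone3 unavailable (old kernel, or an emulator that rejects it). */
    return fork();
}
```

The point of the pattern is that only ENOSYS triggers the fallback; any other failure is passed back to the caller unchanged.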