Bug 445011 - SIGCHLD is sent when valgrind uses debuginfod-find
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck
Version: unspecified
Platform: Fedora RPMs Linux
Importance: NOR crash
Target Milestone: ---
Assignee: Aaron Merey
Reported: 2021-11-05 12:01 UTC by vladislavs.sokurenko
Modified: 2022-04-07 20:16 UTC

Attachments
small program to reproduce the issue (798 bytes, text/x-csrc)
2021-11-05 12:01 UTC, vladislavs.sokurenko
small program to reproduce the issue that only set signal and sleep (805 bytes, text/x-csrc)
2021-11-05 12:55 UTC, vladislavs.sokurenko
patch (9.53 KB, patch)
2022-01-26 01:38 UTC, Aaron Merey

Description vladislavs.sokurenko 2021-11-05 12:01:29 UTC
Created attachment 143235
small program to reproduce the issue

SUMMARY
SIGCHLD is delivered to the application when it runs under valgrind, which makes the application think that a child process has unexpectedly crashed. No such signal is delivered with an older valgrind or when valgrind is not used.

valgrind-3.18.1 (all versions after 3.0.16)
gcc (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)
Linux localhost.localdomain 5.14.15-300.fc35.x86_64 #1 SMP Wed Oct 27 15:53:39 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Fedora release 35 (Thirty Five)

STEPS TO REPRODUCE
1. gcc sigchld.c; valgrind --leak-check=full --trace-children=yes --track-origins=yes --read-var-info=yes --leak-resolution=high ./a.out

OBSERVED RESULT
==274919== Memcheck, a memory error detector
==274919== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==274919== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==274919== Command: ./a.out
==274919== 
One child process died (PID:274926,exitcode/signal:0). Exiting ...
One child process died (PID:274927,exitcode/signal:0). Exiting ...
==274919== 
==274919== HEAP SUMMARY:
==274919==     in use at exit: 0 bytes in 0 blocks
==274919==   total heap usage: 192 allocs, 192 frees, 27,887 bytes allocated
==274919== 
==274919== All heap blocks were freed -- no leaks are possible
==274919== 
==274919== For lists of detected and suppressed errors, rerun with: -s
==274919== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


EXPECTED RESULT
No SIGCHLD is delivered, so the application does not assume that a child process has exited.
Comment 1 Tom Hughes 2021-11-05 12:08:24 UTC
Well there must be a child that has exited - valgrind isn't going to suddenly create new processes unless the client program requests it and it's not going to just invent signals.
Comment 2 Tom Hughes 2021-11-05 12:16:00 UTC
Ah it's the children that are created to run /usr/bin/debuginfod-find and fetch debug information.
Comment 3 vladislavs.sokurenko 2021-11-05 12:49:41 UTC
(In reply to Tom Hughes from comment #2)
> Ah it's the children that are created to run /usr/bin/debuginfod-find and
> fetch debug information.

Thank you for the quick response; that explains it. In fact, simply setting a signal handler and placing a sleep in main is now enough to reproduce the issue.
The only reason it showed up at getservbyname() is that, for some reason, this function takes a long time to execute under valgrind.
Please let me know if I can help with anything.
Comment 4 vladislavs.sokurenko 2021-11-05 12:55:10 UTC
Created attachment 143243
small program to reproduce the issue that only set signal and sleep
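
For readers without access to the attachment, here is a minimal sketch of what such a reproducer could look like. It is not the attached file: the function names `install_sigchld_handler` and `wait_and_count` are made up for illustration. The program never forks, so any SIGCHLD it counts must come from a child it did not create itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Written from the handler, so keep them sig_atomic_t. */
static volatile sig_atomic_t sigchld_count = 0;
static volatile sig_atomic_t last_child_pid = 0;

/* SA_SIGINFO handler: record how often SIGCHLD arrived and from which PID. */
static void on_sigchld(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)ctx;
    sigchld_count++;
    last_child_pid = (sig_atomic_t)info->si_pid;
}

/* Hypothetical helper: install the handler; returns 0 on success. */
int install_sigchld_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigchld;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Hypothetical helper: sleep without forking anything, then report how
 * many SIGCHLDs were seen. sleep() returns early when a signal arrives,
 * so keep sleeping until the full interval has elapsed. */
int wait_and_count(unsigned seconds)
{
    unsigned left = seconds;
    while (left > 0)
        left = sleep(left);
    return (int)sigchld_count;
}
```

Calling `install_sigchld_handler()` and then `wait_and_count(5)` from `main` should report zero outside valgrind. Going by the report above, the count can be non-zero under an affected valgrind with debuginfod enabled.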
Comment 5 Mark Wielaard 2021-11-05 15:52:11 UTC
Oh, that is an interesting side-effect of using debuginfod-find. valgrind probably should make sure that SIGCHLDs delivered for these children aren't propagated to the actual program running under valgrind. The easiest workaround is to run with DEBUGINFOD_URLS="" or to unset DEBUGINFOD_URLS.
Comment 6 vladislavs.sokurenko 2021-11-05 16:07:48 UTC
Thanks, I confirm that setting DEBUGINFOD_URLS="" helps.
Comment 7 Tom Hughes 2021-11-12 12:42:48 UTC
I looked into this, and I think m_signals.c probably needs to provide a way to register pids whose SIGCHLD should be ignored rather than passed on to the client.

Then either VG_(fork) or readelf.c could register the pids - currently the only direct user of VG_(fork) is readelf.c for starting the debuginfo fetcher.

The only other use of VG_(fork) is to implement VG_(spawn) on non-Solaris systems, which in turn is only used by VG_(system). That does make some attempt to deal with SIGCHLD, though I'm not sure it's correct: aside from anything else, it changes the CHLD handler after the fork has already been done, so there would appear to be a race condition.

Users of VG_(system) are the PDB and MACHO debug readers, which use it to run external tools, and the gdb server, so it's possible those can also trigger false SIGCHLD reports to the client.
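
To illustrate the ordering concern raised for VG_(system), here is a minimal sketch in plain POSIX C of arranging signal handling before the fork rather than after it. It is not valgrind code, and `run_child_quietly` is a hypothetical name. It assumes a single-threaded caller and that a blocked SIGCHLD stays pending until consumed, as on Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, reap it, and
 * swallow the SIGCHLD it raises. Returns the exit status or -1 on error. */
int run_child_quietly(int code)
{
    sigset_t chld, old, pending;
    int status = 0, sig = 0, was_pending;
    pid_t pid;

    sigemptyset(&chld);
    sigaddset(&chld, SIGCHLD);

    /* Block SIGCHLD BEFORE forking, so there is no window in which the
     * child can exit while the signal is still deliverable. */
    if (sigprocmask(SIG_BLOCK, &chld, &old) != 0)
        return -1;

    /* Remember whether a SIGCHLD from some other child was already queued. */
    sigpending(&pending);
    was_pending = sigismember(&pending, SIGCHLD);

    pid = fork();
    if (pid == 0)
        _exit(code);                      /* child: exit immediately */
    if (pid < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }

    /* Reap our own child, retrying if interrupted by another signal. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            sigprocmask(SIG_SETMASK, &old, NULL);
            return -1;
        }
    }

    /* Consume the SIGCHLD our child raised, unless one was already
     * pending before the fork and therefore belongs to someone else. */
    sigpending(&pending);
    if (!was_pending && sigismember(&pending, SIGCHLD))
        sigwait(&chld, &sig);

    sigprocmask(SIG_SETMASK, &old, NULL);  /* restore the caller's mask */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because standard signals coalesce, this approach can also swallow a SIGCHLD from an unrelated child that exits in the same window. Filtering by PID avoids that ambiguity.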
Comment 8 Aaron Merey 2022-01-26 01:38:09 UTC
Created attachment 145943
patch

I've attached a patch for this bug.
Comment 9 Mark Wielaard 2022-04-07 20:16:17 UTC
Hi Aaron,

Very nice patch. The pipe trick is cool.
Note that I believe this should also work fine for non-gnu-linux arches.
But I left the #if defined(VGO_linux) in deliver_signal since none of the others
try to use debuginfod-find.

Pushed as:

commit 2ad93350446a49a5b0093548b63d43195d99d4ae
Author: Aaron Merey <amerey@redhat.com>
Date:   Tue Jan 25 20:24:18 2022 -0500

    Bug 445011: SIGCHLD is sent when valgrind uses debuginfod-find
    
    Valgrind fork+execs debuginfod-find in order to perform debuginfod
    queries. Any SIGCHLD debuginfod-find sends upon termination can
    mistakenly be delivered to the client running under valgrind.
    
    To prevent this, record in a hash table the PID of each process
    valgrind forks for internal use. Do not send SIGCHLD to the client
    if it is from a PID in this hash table.
    
    https://bugs.kde.org/show_bug.cgi?id=445011
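
The approach in the commit message can be illustrated with a minimal, standalone sketch. It is not the committed patch and does not use valgrind's internal hash table API. The names `internal_pid_add`, `internal_pid_contains` and `should_forward_sigchld`, and the fixed capacity, are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define PID_SLOTS 64                 /* hypothetical capacity, power of two */

static pid_t pid_slots[PID_SLOTS];   /* 0 marks an empty slot */

/* Multiplicative hash reduced to a slot index. */
static size_t pid_hash(pid_t pid)
{
    return ((size_t)pid * 2654435761u) & (PID_SLOTS - 1);
}

/* Record a PID forked for internal use. Returns false if the table is full. */
bool internal_pid_add(pid_t pid)
{
    if (pid <= 0)
        return false;
    size_t i = pid_hash(pid);
    for (size_t n = 0; n < PID_SLOTS; n++, i = (i + 1) & (PID_SLOTS - 1)) {
        if (pid_slots[i] == pid)
            return true;             /* already recorded */
        if (pid_slots[i] == 0) {
            pid_slots[i] = pid;      /* claim the first free slot */
            return true;
        }
    }
    return false;
}

/* Linear-probe lookup; entries are never removed in this sketch. */
bool internal_pid_contains(pid_t pid)
{
    if (pid <= 0)
        return false;
    size_t i = pid_hash(pid);
    for (size_t n = 0; n < PID_SLOTS; n++, i = (i + 1) & (PID_SLOTS - 1)) {
        if (pid_slots[i] == pid)
            return true;
        if (pid_slots[i] == 0)
            return false;            /* hit an empty slot: not present */
    }
    return false;
}

/* Decide whether a signal should reach the client: drop a SIGCHLD whose
 * sender is one of the recorded internal PIDs, forward everything else. */
bool should_forward_sigchld(const siginfo_t *info)
{
    return !(info->si_signo == SIGCHLD && internal_pid_contains(info->si_pid));
}
```

In this sketch the PID would be recorded right after the fork that starts debuginfod-find, and the filter would be consulted at the point where a signal is about to be handed to the client.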