Since glibc 2.41, extra frames are inserted before a syscall is made, to support proper thread cancellation. This breaks various suppressions and regtests that check syscall arguments.

As an example, take memcheck/tests/sendmsg.

Before glibc 2.41:

==1929378== Syscall param sendmsg(msg) points to uninitialised byte(s)
==1929378==    at 0x4971514: sendmsg (sendmsg.c:28)
==1929378==    by 0x40128B: main (sendmsg.c:46)
==1929378==  Address 0x1ffefff640 is on thread 1's stack
==1929378==  in frame #1, created by main (sendmsg.c:13)

After, it looks like:

==2670784== Syscall param sendmsg(msg) points to uninitialised byte(s)
==2670784==    at 0x48D9AE6: __internal_syscall_cancel (cancellation.c:64)
==2670784==    by 0x48D9B03: __syscall_cancel (cancellation.c:75)
==2670784==    by 0x49628F0: sendmsg (sendmsg.c:28)
==2670784==    by 0x4005CB: main (sendmsg.c:46)
==2670784==  Address 0x1ffeffff40 is on thread 1's stack
==2670784==  in frame #3, created by main (sendmsg.c:13)

There is also __syscall_cancel_arch, which shows up in some gdb_server testcases.

Mailing list discussion:
https://inbox.sourceware.org/libc-alpha/4954a5131faf35cbe4d88ac7729a1fa3ba4b2cb8.camel@klomp.org/T/#t

The proposal is to filter out those extra top frames early when the platform is VGO_linux and we are handling PRE/POST syscalls.

There is still an open question whether these functions doing tail calls has any impact, since a tail call might hide the actual caller frame. This should be solved on the glibc side.
The gdb_server tests part (for x86_64) seems simple to fix:

commit f3f30becff5851b0d0b2caa7e96e661c7889f7d1
Author: Mark Wielaard <mark@klomp.org>
Date:   Fri Mar 28 13:44:35 2025 +0100

    filter_gdb.in: __syscall_cancel_arch is just in a syscall

    Since glibc 2.41 some extra syscall_cancel frames are inserted before
    that actual syscall is made. Just filter out __syscall_cancel_arch
    from the gdb output and replace it with "in syscall ..." to make the
    regtest .exp match.

    https://bugs.kde.org/show_bug.cgi?id=502126

diff --git a/gdbserver_tests/filter_gdb.in b/gdbserver_tests/filter_gdb.in
index 2bef9f3ee57b..e2b329a60483 100755
--- a/gdbserver_tests/filter_gdb.in
+++ b/gdbserver_tests/filter_gdb.in
@@ -134,6 +134,9 @@ s/^>[> ]*//
 # anonymise a 'general' system calls stack trace part
 s/in _dl_sysinfo_int80 () from \/lib\/ld-linux.so.*/in syscall .../
 
+# in __syscall_cancel_arch is just in a syscall
+s/in __syscall_cancel_arch .*/in syscall .../
+
 # anonymise kill syscall.
 s/in kill ().*$/in syscall .../

Also pushed to VALGRIND_3_24_BRANCH.

This fixes:

gdbserver_tests/mcinfcallWSRU (stderrB)
gdbserver_tests/nlcontrolc (stdoutB)
gdbserver_tests/nlvgdbsigqueue (stdoutB)

Failures that still need some tweaks on the valgrind side:

memcheck/tests/sendmsg (stderr)
none/tests/fdbaduse (stderr)
none/tests/fdleak_cmsg_supp (stderr)
none/tests/fdleak_creat_sup (stderr)
none/tests/fdleak_ipv4 (stderr)
none/tests/file_dclose (stderr)
none/tests/file_dclose_sup (stderr)
none/tests/socket_close (stderr)
none/tests/use_after_close (stderr)
Created attachment 179822
Skip syscall_cancel frames

Proposed patch that, for VGO_linux, skips __syscall_cancel_arch, __internal_syscall_cancel and __syscall_cancel if a backtrace is requested while handling a syscall.

Tested on x86_64, ppc64le and s390x, where it seems to work as intended.

Also tested on i386, where another frame, __libc_do_syscall, is in the way, and it looks like there is some tail call which prevents getting a backtrace that includes the actual glibc function that made the syscall.

Testing on aarch64 also seems to miss the calling frame, but works otherwise.
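To illustrate the idea of the patch described above, here is a minimal standalone sketch in C. It is not the attached patch and does not use valgrind's internal APIs: the function and parameter names (count_leading_cancel_frames, fn_names, in_syscall) are hypothetical, and it assumes the backtrace is already available as an array of function names. Only the three wrapper names come from this report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* glibc 2.41 cancellation wrapper frames named in this bug report. */
static const char *const cancel_wrappers[] = {
    "__syscall_cancel_arch",
    "__internal_syscall_cancel",
    "__syscall_cancel",
};

/* Return 1 if fn is one of the wrapper frames, 0 otherwise. */
static int is_cancel_wrapper(const char *fn)
{
    if (fn == NULL)
        return 0;
    for (size_t i = 0; i < sizeof cancel_wrappers / sizeof cancel_wrappers[0]; i++)
        if (strcmp(fn, cancel_wrappers[i]) == 0)
            return 1;
    return 0;
}

/* Hypothetical helper: count how many frames at the top of the
 * backtrace (index 0 is the innermost frame) should be dropped.
 * Frames are only dropped while a syscall is being handled, and only
 * from the top of the stack, so a wrapper name appearing deeper in
 * the trace is left alone. */
size_t count_leading_cancel_frames(const char *const *fn_names,
                                   size_t n_frames, int in_syscall)
{
    assert(fn_names != NULL || n_frames == 0);
    if (!in_syscall)
        return 0;
    size_t skip = 0;
    while (skip < n_frames && is_cancel_wrapper(fn_names[skip]))
        skip++;
    return skip;
}
```

For the glibc 2.41 sendmsg trace shown earlier, whose top frames are __internal_syscall_cancel and __syscall_cancel, this would skip two frames so that sendmsg becomes the reported top frame again. It does nothing for the i386 and aarch64 cases where a tail call has already removed the calling frame.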
More gdb tests filtering:

commit ddcb3aa3ed3188cd28c193225245a76e928b850b
Author: Mark Wielaard <mark@klomp.org>
Date:   Sun Mar 30 13:08:55 2025 +0200

    filter_gdb.in: filter out __libc_do_syscall

    On i386 and armhf __libc_do_syscall might be used to invoke a
    syscall. Replace __libc_do_syscall with "in syscall ..." and filter
    out possible extra (assembly) source file lines containing
    libc-do-syscall.S from the gdb output.

    https://bugs.kde.org/show_bug.cgi?id=502126

Also pushed to VALGRIND_3_24_BRANCH.
There is still the arm64 issue of syscall_cancel tail calls obscuring the call stack. But that is a glibc issue:
https://inbox.sourceware.org/libc-alpha/874izmtu4w.fsf@oldenburg.str.redhat.com/

The valgrind side is done.