The setrlimit05 LTP testcase kills valgrind. Attached patch fixes the problem for me: 41$ valgrind --version valgrind-3.26.0.GIT 41$ /usr/local/bin/valgrind --tool=none /home/mcermak/WORK/ltp/ltp/testcases/kernel/syscalls/setrlimit/setrlimit05 ==3409713== Nulgrind, the minimal Valgrind tool ==3409713== Copyright (C) 2002-2024, and GNU GPL'd, by Nicholas Nethercote et al. ==3409713== Using Valgrind-3.26.0.GIT and LibVEX; rerun with -h for copyright info ==3409713== Command: /home/mcermak/WORK/ltp/ltp/testcases/kernel/syscalls/setrlimit/setrlimit05 ==3409713== tst_test.c:1945: TINFO: LTP version: 20250130-253-g4a0e3a8fa tst_test.c:1949: TINFO: Tested kernel: 6.12.9-200.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jan 9 16:05:40 UTC 2025 x86_64 tst_kconfig.c:88: TINFO: Parsing kernel config '/lib/modules/6.12.9-200.fc41.x86_64/config' tst_test.c:1765: TINFO: Overall timeout per run is 0h 00m 30s --3409716-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting --3409716-- si_code=2; Faulting address: 0x483F000; sp: 0x100288dd20 valgrind: the 'impossible' happened: Killed by fatal signal host stacktrace: ==3409716== at 0x58086D85: vgSysWrap_linux_sys_prlimit64_before (syswrap-linux.c:2303) ==3409716== by 0x58015401: vgPlain_client_syscall (syswrap-main.c:2410) ==3409716== by 0x5801197A: handle_syscall (scheduler.c:1214) ==3409716== by 0x58013856: vgPlain_scheduler (scheduler.c:1588) ==3409716== by 0x5807EE7A: run_a_thread_NORETURN (syswrap-linux.c:102) sched status: running_tid=1 Thread 1: status = VgTs_Runnable syscall 302 (lwpid 3409716) ==3409716== at 0x493D125: setrlimit (setrlimit64.c:40) ==3409716== by 0x401F36: verify_setrlimit (setrlimit05.c:29) ==3409716== by 0x40B91B: fork_testrun.isra.0 (tst_test.c:1592) ==3409716== by 0x40DC63: tst_run_tcases (tst_test.c:1963) ==3409716== by 0x401E0D: main (tst_test.h:729) client stack range: [0x1FFEFFC000 0x1FFF000FFF] client SP: 0x1FFEFFF7B8 valgrind stack range: [0x100278E000 0x100288DFFF] top usage: 10600 of 1048576 Note: see also the FAQ in the source distribution. It contains workarounds to several common problems. In particular, if Valgrind aborted or crashed after identifying problems in your program, there's a good chance that fixing those problems will prevent Valgrind aborting or crashing, especially if it happened in m_mallocfree.c. If that doesn't help, please report this bug to: www.valgrind.org In the bug report, send all the above text, the valgrind version, and what OS and version you are using. Thanks. setrlimit05.c:60: TFAIL: child exited with 1 ==3409715== Summary: passed 0 failed 1 broken 0 skipped 0 warnings 0 ==3409713== 41$
Created attachment 181390 [details] possible patch
Almost correct. But in this particular case there is a tricky logic issue where the else clause should be taken only if the ARG is not NULL (not when it contains an inaccessible argument). Things are a little complicated here because we explicitly use SET_STATUS_Success( 0 ) and SET_STATUS_Failure( VKI_EINVAL ) to pretend the syscall succeeded or failed but if we call those macros we do things ourselves and don't actually call the syscall. So I think the correct way is having this nested if statement instread of && all conditions together: if (ARG3) { if (ML_(safe_to_deref)( (void*)(Addr)ARG3, sizeof(struct vki_rlimit64) )) { if (((struct vki_rlimit64 *)(Addr)ARG3)->rlim_cur > ((struct vki_rlimit64 *)(Addr)ARG3)->rlim_max) { SET_STATUS_Failure( VKI_EINVAL ); } } } else if (ARG1 == 0 || ARG1 == VG_(getpid)()) { You should be able to test with the same setrlimit05 LTP testcase if he logic is correct.
Created attachment 181502 [details] possible patch Indeed, with your update the testcase works as expected: 41$ ./vg-in-place --tool=none --trace-syscalls=no ./auxprogs/auxchecks/ltp-full-20250130/testcases/kernel/syscalls/setrlimit/setrlimit05 ==3870247== Nulgrind, the minimal Valgrind tool ==3870247== Copyright (C) 2002-2024, and GNU GPL'd, by Nicholas Nethercote et al. ==3870247== Using Valgrind-3.26.0.GIT and LibVEX; rerun with -h for copyright info ==3870247== Command: ./auxprogs/auxchecks/ltp-full-20250130/testcases/kernel/syscalls/setrlimit/setrlimit05 ==3870247== tst_test.c:1900: TINFO: LTP version: VALGRIND_3_25_0-35-gc0791f593 tst_test.c:1904: TINFO: Tested kernel: 6.12.9-200.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jan 9 16:05:40 UTC 2025 x86_64 tst_kconfig.c:88: TINFO: Parsing kernel config '/lib/modules/6.12.9-200.fc41.x86_64/config' tst_test.c:1722: TINFO: Overall timeout per run is 0h 00m 30s setrlimit05.c:37: TPASS: setrlimit() failed as expected: EFAULT (14) ==3870254== ==3870253== Summary: passed 1 failed 0 broken 0 skipped 0 warnings 0 ==3870247== 41$ strace -qq -eprlimit64 --raw=all --signal='!all' -f auxprogs/auxchecks/ltp-full-20250130/testcases/kernel/syscalls/setrlimit/setrlimit05 prlimit64(0, 0x3, 0, 0x7ffc030e89c0) = 0 tst_test.c:1900: TINFO: LTP version: VALGRIND_3_25_0-35-gc0791f593 tst_test.c:1904: TINFO: Tested kernel: 6.12.9-200.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jan 9 16:05:40 UTC 2025 x86_64 tst_kconfig.c:88: TINFO: Parsing kernel config '/lib/modules/6.12.9-200.fc41.x86_64/config' tst_test.c:1722: TINFO: Overall timeout per run is 0h 00m 30s [pid 3870266] prlimit64(0, 0x7, 0x7f676f208000, 0) = -1 EFAULT (Bad address) setrlimit05.c:37: TPASS: setrlimit() failed as expected: EFAULT (14) Summary: passed 1 failed 0 broken 0 skipped 0 warnings 0 41$ ./auxprogs/auxchecks/ltp-full-20250130/testcases/kernel/syscalls/setrlimit/setrlimit05 tst_test.c:1900: TINFO: LTP version: VALGRIND_3_25_0-35-gc0791f593 tst_test.c:1904: TINFO: Tested kernel: 6.12.9-200.fc41.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Jan 9 16:05:40 UTC 2025 x86_64 tst_kconfig.c:88: TINFO: Parsing kernel config '/lib/modules/6.12.9-200.fc41.x86_64/config' tst_test.c:1722: TINFO: Overall timeout per run is 0h 00m 30s setrlimit05.c:37: TPASS: setrlimit() failed as expected: EFAULT (14) Summary: passed 1 failed 0 broken 0 skipped 0 warnings 0 41$
This looks good. Pushed as: commit 859d267a456c2921772f0c957bf24f463c51bd93 Author: Martin Cermak <mcermak@redhat.com> Date: Mon May 19 11:45:04 2025 +0200 PR504341: Prevent LTP setrlimit05 syscall test from crashing valgrind Prevent ltp/testcases/kernel/syscalls/setrlimit/setrlimit05 testcase from crashing valgrind when passing 0xffffffffffff as ARG3 and then trying to dereference it. https://bugs.kde.org/show_bug.cgi?id=504341 It might be good as a followup to also wrap the other accesses of ARG3 and ARG4 as vki_rlimit64 pointers in ML_(safe_to_deref) checks. The trick there is that they occur after the SET_STATUS_Success( 0 ); call. So they should probably call SET_STATUS_Failure( VKI_FAULT ); themselves if the safe_to_deref check fails.
Oops. Logic still is wrong make regtest shows two regressions: ================================================= ./valgrind-new/none/tests/rlimit64_nofile.stderr.diff ================================================= --- rlimit64_nofile.stderr.exp 2025-05-20 00:13:00.432065010 +0000 +++ rlimit64_nofile.stderr.out 2025-05-20 00:24:45.392038888 +0000 @@ -1,2 +1,3 @@ +setrlimit64 changing hardlimit must return -1 ================================================= ./valgrind-new/none/tests/rlimit_nofile.stderr.diff ================================================= --- rlimit_nofile.stderr.exp 2025-05-20 00:13:00.432065010 +0000 +++ rlimit_nofile.stderr.out 2025-05-20 00:24:45.522038886 +0000 @@ -1,2 +1,3 @@ +setrlimit changing hardlimit must return -1 See also https://sourceforge.net/p/valgrind/mailman/message/59185817/
So on the mailinglist Matthias zzam@gentoo.org suggested to do the ML_(safe_to_deref) checks up front and immediately flag VKI_EFAULT. That way the logic later in the PRE handler can stay the same. So if we revert the change, then that would look like: diff --git a/coregrind/m_syswrap/syswrap-linux.c b/coregrind/m_syswrap/syswrap-linux.c index d4653d027396..500bc43acf8e 100644 --- a/coregrind/m_syswrap/syswrap-linux.c +++ b/coregrind/m_syswrap/syswrap-linux.c @@ -2295,10 +2295,20 @@ PRE(sys_prlimit64) vki_pid_t, pid, unsigned int, resource, const struct rlimit64 *, new_rlim, struct rlimit64 *, old_rlim); - if (ARG3) + if (ARG3) { PRE_MEM_READ( "rlimit64(new_rlim)", ARG3, sizeof(struct vki_rlimit64) ); - if (ARG4) + if (!ML_(safe_to_deref)((void*)(Addr)ARG3, sizeof(struct vki_rlimit64))) { + SET_STATUS_Failure(VKI_EFAULT); + return; + } + } + if (ARG4) { PRE_MEM_WRITE( "rlimit64(old_rlim)", ARG4, sizeof(struct vki_rlimit64) ); + if (!ML_(safe_to_deref)((void*)(Addr)ARG4, sizeof(struct vki_rlimit64))) { + SET_STATUS_Failure(VKI_EFAULT); + return; + } + } if (ARG3 && ((struct vki_rlimit64 *)(Addr)ARG3)->rlim_cur Testing that now against both regtest and ltpchecks
commit f51744ed2d1f07c814b72853ca946da3c94de0f1 Author: Mark Wielaard <mark@klomp.org> Date: Tue May 20 12:09:13 2025 +0200 PRE(sys_prlimit64): Check ARG3 and ARG4 ML_(safe_to_deref) up front The previous commit 859d267a456c "PR504341: Prevent LTP setrlimit05 syscall test from crashing valgrind" changed the checking logic of the PRE handler changing the if-else control flow. Do the ARG3 and ARG4 ML_(safe_to_deref) checking up front and return EFAULT early so the later checking logic doesn't need to change. https://bugs.kde.org/show_bug.cgi?id=504341 Suggested-by: Matthias <zzam@gentoo.org>