Bug 509407 - LTP testcase fchownat03 fails under valgrind --tool=none on ppc64le
Summary: LTP testcase fchownat03 fails under valgrind --tool=none on ppc64le
Status: REPORTED
Alias: None
Product: valgrind
Classification: Developer tools
Component: vex (other bugs)
Version First Reported In: 3.25 GIT
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
 
Reported: 2025-09-12 07:59 UTC by mcermak
Modified: 2025-09-12 08:01 UTC (History)
CC List: 0 users

Attachments
Patch against LTP sources adding some debug logs (1.26 KB, patch)
2025-09-12 07:59 UTC, mcermak
testcase binary (xz'd) (289.14 KB, application/x-xz)
2025-09-12 08:00 UTC, mcermak
--trace-flags=10000001 log (xz'd) (409.95 KB, application/x-xz)
2025-09-12 08:01 UTC, mcermak

Description mcermak 2025-09-12 07:59:16 UTC
Created attachment 184903
Patch against LTP sources adding some debug logs

The Linux Test Project (LTP) has a testcase, testcases/kernel/syscalls/fchownat/fchownat03.
It needs to be run as root.  On ppc64le it passes when run on its own.  However,
when it is run under valgrind --tool=none, it segfaults.  Note that --tool=none is
important here: other tools such as memcheck don't show this problem.
LTP sources: https://github.com/linux-test-project/ltp (commit 5a03d7653).
An example test run without valgrind:

> el9 ppc64le # pwd
> /root/ltp/testcases/kernel/syscalls/fchownat
> el9 ppc64le # ./fchownat03
> tst_buffers.c:57: TINFO: Test is using guarded buffers
> tst_tmpdir.c:316: TINFO: Using /tmp/LTP_fch1ZtHIc as tmpdir (xfs filesystem)
> tst_test.c:1175: TINFO: Mounting (null) to /tmp/LTP_fch1ZtHIc/ro_mntpoint fstyp=tmpfs flags=21
> tst_test.c:2008: TINFO: LTP version: 20250530-193-g5a03d7653
> tst_test.c:2011: TINFO: Tested kernel: 5.14.0-613.el9.ppc64le #1 SMP Sat Sep 6 11:25:15 EDT 2025 ppc64le
> tst_kconfig.c:88: TINFO: Parsing kernel config '/lib/modules/5.14.0-613.el9.ppc64le/config'
> tst_test.c:1829: TINFO: Overall timeout per run is 0h 00m 30s
> fchownat03.c:75: TPASS: fchownat(3, eaccess/eaccess, 65534, 0, 0) : EACCES (13)
> fchownat03.c:75: TPASS: fchownat(-1, testfile, 65534, 0, 0) : EBADF (9)
> fchownat03.c:75: TPASS: fchownat(3, Invalid address, 65534, 0, 0) : EFAULT (14)
> fchownat03.c:75: TPASS: fchownat(3, testfile, 65534, 0, 9999) : EINVAL (22)
> fchownat03.c:75: TPASS: fchownat(3, testfile_eloop, 65534, 0, 0) : ELOOP (40)
> fchownat03.c:75: TPASS: fchownat(3, aaaa..., 65534, 0, 0) : ENAMETOOLONG (36)
> fchownat03.c:75: TPASS: fchownat(3, /tmp/does/not/exist, 65534, 0, 0) : ENOENT (2)
> fchownat03.c:75: TPASS: fchownat(4, testfile, 65534, 0, 0) : ENOTDIR (20)
> fchownat03.c:75: TPASS: fchownat(3, /dev/null, 65534, 0, 0) : EPERM (1)
> fchownat03.c:75: TPASS: fchownat(3, ro_mntpoint/file, 65534, 0, 0) : EROFS (30)
> 
> Summary:
> passed   10
> failed   0
> broken   0
> skipped  0
> warnings 0
> el9 ppc64le #

Now a test run under valgrind --tool=none demonstrating the segv:

> el9 ppc64le # vg-in-place --tool=none ./fchownat03
> ==62103== Nulgrind, the minimal Valgrind tool
> ==62103== Copyright (C) 2002-2024, and GNU GPL'd, by Nicholas Nethercote et al.
> ==62103== Using Valgrind-3.26.0.GIT and LibVEX; rerun with -h for copyright info
> ==62103== Command: ./fchownat03
> ==62103==
> tst_buffers.c:57: TINFO: Test is using guarded buffers
> ==62103==
> ==62103== Process terminating with default action of signal 11 (SIGSEGV): dumping core
> ==62103==  Bad permissions for mapped region at address 0x4310000
> ==62103==    at 0x41908CC: __strcpy_power9 (in /usr/lib64/libc.so.6)
> ==62103==    by 0x1001EEFF: tst_buffers_alloc (string_fortified.h:79)
> ==62103==    by 0x100052BF: main (tst_test.h:738)
> ==62103==
> Segmentation fault (core dumped)
> el9 ppc64le #

This test allocates a memory region, then mprotects part of it, and finally
tries to write to the "allowed" part of the region, which is supposed to work.
On most architectures it does.  But on ppc64le the write is done by
__strcpy_power9(), which uses the vector instruction stxvl.  Under valgrind,
stxvl *seems* to write more bytes than expected, stepping into the mprotected
part of the region and causing the segv.  I've prepared a patch against the
LTP sources that demonstrates this (attached).

> el9 ppc64le # vg-in-place --tool=none ./fchownat03
> ==63362== Nulgrind, the minimal Valgrind tool
> ==63362== Copyright (C) 2002-2024, and GNU GPL'd, by Nicholas Nethercote et al.
> ==63362== Using Valgrind-3.26.0.GIT and LibVEX; rerun with -h for copyright info
> ==63362== Command: ./fchownat03
> ==63362==
> ../include/tst_safe_macros_inline.h:250: TINFO: YYY mmapped addr range    0x40a0000 ... 0x40b0000
> tst_buffers.c:57: TINFO: Test is using guarded buffers
> ../include/tst_safe_macros_inline.h:250: TINFO: YYY mmapped addr range    0x4300000 ... 0x4320000
> tst_buffers.c:65: TINFO: YYY mprotected addr range 0x4310000 ... 0x4320000
> tst_buffers.c:152: TINFO: YYY attempting to write 15 characters to address 0x430fff0 so that the end address would be 0x430ffff
> ==63362==
> ==63362== Process terminating with default action of signal 11 (SIGSEGV): dumping core
> ==63362==  Bad permissions for mapped region at address 0x4310000
> ==63362==    at 0x41908CC: __strcpy_power9 (in /usr/lib64/libc.so.6)
> ==63362==    by 0x1001EEF7: tst_strdup (string_fortified.h:79)
> ==63362==    by 0x100193F3: tst_run_tcases (tst_test.c:1467)
> ==63362==    by 0x100052BF: main (tst_test.h:738)
> ==63362==
> Segmentation fault (core dumped)
> el9 ppc64le #
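
As a sanity check on the YYY lines in the log above, here is a small Python sketch that redoes the address arithmetic.  It assumes that the 15 characters are followed by a terminating NUL (so the write covers 16 bytes) and that the printed end of the mprotected range is exclusive; the helper name overlaps() is made up for illustration.

```python
# Addresses copied from the YYY debug lines in the log above.
BUF_START = 0x430FFF0    # destination of the write
PROT_START = 0x4310000   # first mprotected byte
PROT_END = 0x4320000     # end of the mprotected range (assumed exclusive)

# Assumption: 15 characters plus the terminating NUL byte.
WRITE_LEN = 15 + 1


def overlaps(start, length, lo, hi):
    """Return True if [start, start+length) intersects [lo, hi)."""
    return start < hi and start + length > lo


last_byte = BUF_START + WRITE_LEN - 1
print(hex(last_byte))                                        # 0x430ffff
print(overlaps(BUF_START, WRITE_LEN, PROT_START, PROT_END))  # False
```

So a write of exactly these 16 bytes stays below the mprotected range; only an access that extends past 0x430ffff can reach it.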

As shown above, the test attempts to write 15 characters to address 0x430fff0,
so that the end address is 0x430ffff.  Such a write should be allowed,
because the mprotected region starts at 0x4310000.  But on ppc64le under
valgrind --tool=none it ends in a segmentation fault.  I'm using the following
command to stop right before the segv happens:

gdb -q -ex 'set remote exec-file ./fchownat03' -ex 'set sysroot /' -ex 'target extended-remote | vgdb --multi --vargs -q --tool=none' ./fchownat03 -ex 'b tst_strdup' -ex r -ex 'b __strcpy_power9' -ex c -ex 'layout asm' -ex si -ex si ...

With this, I can trigger the segv with the next si:

>    0x41908b8 <__strcpy_power9+184> b       0x4190850 <__strcpy_power9+80>
>    0x41908bc <__strcpy_power9+188> nop
>    0x41908c0 <__strcpy_power9+192> vctzlsbb r8,v6
>    0x41908c4 <__strcpy_power9+196> addi    r9,r8,1
>    0x41908c8 <__strcpy_power9+200> sldi    r9,r9,56
>   >0x41908cc <__strcpy_power9+204> stxvl   vs32,r11,r9

(gdb) si
Program received signal SIGSEGV, Segmentation fault.
__strcpy_power9 () at ../sysdeps/powerpc/powerpc64/le/power9/strcpy.S:119
(gdb)
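
A rough Python model of the three instructions leading up to the stxvl, as far as I understand them.  It assumes v6 holds the result of a byte-wise compare against zero, so that vctzlsbb yields the number of bytes before the first NUL in the 16-byte chunk; addi then includes the NUL itself, and sldi moves the count into the top byte of r9, which is where stxvl takes its length from.  The function names and the sample data are illustrative only.

```python
def length_for_stxvl(chunk: bytes) -> int:
    """Model vctzlsbb + addi + sldi: return the value left in r9."""
    assert len(chunk) == 16 and 0 in chunk
    r8 = chunk.index(0)   # vctzlsbb: bytes before the first NUL
    r9 = r8 + 1           # addi: include the NUL itself
    return r9 << 56       # sldi: length goes into the top byte


def stxvl_nb(r9: int) -> int:
    """Bytes stxvl is expected to store: top byte of RB, capped at 16."""
    return min((r9 >> 56) & 0xFF, 16)


chunk = b"abcdefg\0" + bytes(8)   # made-up data: 7 characters, then NUL
r9 = length_for_stxvl(chunk)
print(hex(r9))        # 0x800000000000000
print(stxvl_nb(r9))   # 8
```

In other words, the final stxvl is meant to store only the remaining characters plus the NUL, not a full 16 bytes.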

To show the conversion into IR, and final assembly, I've used the following
command to generate the attached log.xz:

> vg-in-place --tool=none --vex-guest-chase=no --trace-flags=10000001 --trace-notbelow=1816 /root/ltp/testcases/kernel/syscalls/fchownat/fchownat03

Specifically it shows:

> 0x41908CC:  stxvl 32,r11,r9
> 
>       ------ IMark(0x41908CC, 4, 0) ------
>       t61 = Add64(GET:I64(104),GET:I64(88))
>       t60 = GET:V128(784)
>       t70 = Shr64(GET:I64(88),0x38:I8)
>       t69 = 64to8(t70)
>       t67 = 1Sto8(CmpLT64U(Shr64(t70,0x4:I8),0x1:I64))
>       t68 = 64HLtoV128(1Sto64(CmpEQ64(t70,0x0:I64)),1Sto64(CmpEQ64(t70,0x0:I64)))
>       t71 = And8(t69,t67)
>       t66 = 64to8(Mul64(Sub64(0x10:I64,8Uto64(t71)),0x8:I64))
>       t72 = GET:I64(104)
>       t73 = 64HLtoV128(LDle:I64(Add64(t72,0x8:I64)),LDle:I64(t72))
>       t75 = OrV128(AndV128(t68,V128{0x0000}),AndV128(ShrV128(V128{0xFFFF},t66),NotV128(t68)))
>       t74 = OrV128(AndV128(t60,t75),AndV128(NotV128(t75),t73))
>       t62 = Shr64(V128HIto64(t74),0x20:I8)
>       t63 = And64(V128HIto64(t74),0xFFFFFFFF:I64)
>       t64 = Shr64(V128to64(t74),0x20:I8)
>       t65 = And64(V128to64(t74),0xFFFFFFFF:I64)
>       STle(Add64(t72,0x0:I64)) = 64to32(t65)
>       STle(Add64(t72,0x4:I64)) = 64to32(t64)
>       STle(Add64(t72,0x8:I64)) = 64to32(t63)
>       STle(Add64(t72,0xC:I64)) = 64to32(t62)
>       PUT(1296) = 0x41908D0:I64
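
Reading the IR above, the two 64-bit loads (t73) and the four 32-bit stores always cover the full 16 bytes starting at r11 (t72), independent of the length in r9.  The length only appears to select, via the mask t75, which bytes come from the vector register and which are written back unchanged.  If that reading is right, the touched address ranges can be modelled as below.  The ea and nb values are hypothetical, chosen only to illustrate the effect; the actual r11 and r9 at the time of the fault are not shown in this report.

```python
PROT_START = 0x4310000   # first mprotected byte, from the YYY log line


def arch_footprint(ea, nb):
    """Bytes an stxvl with length nb is expected to touch."""
    nb = min(nb, 16)
    return range(ea, ea + nb)


def ir_footprint(ea, nb):
    """Bytes touched by the IR: 16-byte load, merge, 16-byte store."""
    return range(ea, ea + 16)   # nb only affects the merge mask


def faults(footprint):
    """True if any touched byte lies at or above the protected boundary."""
    return any(addr >= PROT_START for addr in footprint)


ea, nb = 0x430FFF8, 8    # hypothetical: last 8 bytes before the guard
print(faults(arch_footprint(ea, nb)))   # False
print(faults(ir_footprint(ea, nb)))     # True
```

With such inputs the instruction as specified stays inside the writable area, while the read-modify-write sequence in the IR crosses into the mprotected range, which would match the fault address 0x4310000.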

I'm also attaching the testcase binary (with my patch).

Mark Wielaard says that this is possibly similar to bug 430354 (https://bugs.kde.org/show_bug.cgi?id=430354).
Comment 1 mcermak 2025-09-12 08:00:22 UTC
Created attachment 184904
testcase binary (xz'd)
Comment 2 mcermak 2025-09-12 08:01:28 UTC
Created attachment 184905
--trace-flags=10000001 log (xz'd)