Bug 453084

Summary: Misleading error with memmove: Source and destination overlap in memcpy_chk
Product: [Developer tools] valgrind Reporter: mliska
Component: memcheckAssignee: Julian Seward <jseward>
Status: RESOLVED DUPLICATE    
Severity: normal CC: mark, sam, siddhesh, tom
Priority: NOR    
Version: 3.18.1   
Target Milestone: ---   
Platform: Other   
OS: Linux   
See Also: https://bugs.kde.org/show_bug.cgi?id=402833
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: Fedora dockerfile

Description mliska 2022-04-27 12:26:21 UTC
Created attachment 148408 [details]
Fedora dockerfile

The following fails for me, __memcpy_chk is used instead of __memmove_chk:

$ cat drpm.c
#define SIZE 32

int size = SIZE;
char data[SIZE];
int zero = 0;

int main() {
  __builtin___memmove_chk(data + zero, data, size, size);
  return 0;
}

$ gcc drpm.c -g -fuse-ld=gold -lrpm && valgrind ./a.out
...
==2468== Source and destination overlap in memcpy_chk(0x204060, 0x204060, 32)
==2468==    at 0x4850ED2: __memcpy_chk (vg_replace_strmem.c:1617)
==2468==    by 0x20117A: main (drpm.c:8)

Can be reproduce with Fedora dockerfile as well:

$ podman build -t test -f Dockerfile-fedora
STEP 1/4: FROM fedora:latest
STEP 2/4: RUN dnf --nogpgcheck -y install rpm-devel gcc wget valgrind binutils-gold
--> Using cache 27002ecd458a78d4860b4b774a3474868fafd2f49f50822e28b94878792cc2aa
--> 27002ecd458
STEP 3/4: RUN wget https://gist.github.com/marxin/b5041eb13da32cfd4b4b701a54e8c165/raw/1c993107b290fa4af9780953f394c3e5e5b8b656/drpm.c
--> Using cache fdafad79d772f8e9c8f31f1037ef779505c8ce9d03a72f271d1f6b88ebe90f19
--> fdafad79d77
STEP 4/4: RUN gcc drpm.c -g -fuse-ld=gold -lrpm && valgrind --error-exitcode=123 ./a.out
==1== Memcheck, a memory error detector
==1== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==1== Command: ./a.out
==1== 
==1== Source and destination overlap in memcpy_chk(0x402040, 0x402040, 32)
==1==    at 0x484F292: __memcpy_chk (vg_replace_strmem.c:1723)
==1==    by 0x40062A: main (drpm.c:8)
==1== 
==1== 
==1== HEAP SUMMARY:
==1==     in use at exit: 0 bytes in 0 blocks
==1==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==1== 
==1== All heap blocks were freed -- no leaks are possible
==1== 
==1== For lists of detected and suppressed errors, rerun with: -s
==1== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Error: error building at STEP "RUN gcc drpm.c -g -fuse-ld=gold -lrpm && valgrind --error-exitcode=123 ./a.out": error while running runtime: exit status 123
Comment 1 mliska 2022-04-27 12:27:03 UTC
May be related to PR402833.
Comment 2 Tom Hughes 2022-04-27 12:47:00 UTC
Well which function did the compiler actually use for that?

It doesn't really matter what you wrote in the source - if the compiler has decided that it's safe to replace it with memcpy (because it knows the memcpy implementation is overlap safe) then it might do that but valgrind has no way to know that so will warn about it.

I'd try and reproduce it to see for myself but it's likely to be very environment sensitive and you haven't even said what version of Fedora you're using.
Comment 3 mliska 2022-04-27 12:50:54 UTC
(In reply to Tom Hughes from comment #2)
> Well which function did the compiler actually use for that?
> 
> It doesn't really matter what you wrote in the source - if the compiler has
> decided that it's safe to replace it with memcpy (because it knows the
> memcpy implementation is overlap safe) then it might do that but valgrind
> has no way to know that so will warn about it.

No, it's not the case:

$ objdump -S ./a.out
...
int main() {
  201146:	55                   	push   %rbp
  201147:	48 89 e5             	mov    %rsp,%rbp
...
  201176:	e8 95 fe ff ff       	call   201010 <__memmove_chk@plt>

> 
> I'd try and reproduce it to see for myself but it's likely to be very
> environment sensitive and you haven't even said what version of Fedora
> you're using.

Fedora 35 (latest), please see the provided Dockerfile.
Comment 4 Tom Hughes 2022-04-27 12:51:16 UTC
I've managed to reproduce it on Fedora 34 and it does seem to be linked to use __memmove_chk and __memmove_chk and __memcpy_chkdo seem to be different symbols in glibc (sometimes the two are the same if memcpy is known to be overlap safe) so it does seem a bit odd.
Comment 5 Tom Hughes 2022-04-27 12:52:19 UTC
Sorry I meant I reproduced in Fedora 35 and without having to figure out docker which is a totally unnecessary distraction.
Comment 6 Tom Hughes 2022-04-27 12:58:09 UTC
I think this (from --trace-redir=yes) explains is:

--3163534-- REDIR: 0x4a5f6c0 (libc.so.6:__memmove_chk) redirected to 0x48391ea (_vgnU_ifunc_wrapper)
--3163534-- Adding redirect for indirect function 0x4a5f6c0 from 0x493ff50 -> 0x484eba0
--3163534-- REDIR: 0x493ff50 (libc.so.6:__memcpy_chk_sse2_unaligned) redirected to 0x484f220 (__memcpy_chk)

So __memmove_chk is actually an IFUNC that, on first call, resolves to the optimal implementation for the current preprocessor.

In this case it chose __memcpy_chk_sse2_unaligned presumably because it knows that is in fact overlap safe, so as far as valgrind is concerned the call did in fact wind up going to memcpy and hence got checked.

I'm not sure there's a huge amount we can do to avoid this sadly.
Comment 7 mliska 2022-04-27 13:15:26 UTC
> So __memmove_chk is actually an IFUNC that, on first call, resolves to the
> optimal implementation for the current preprocessor.
> 
> In this case it chose __memcpy_chk_sse2_unaligned presumably because it
> knows that is in fact overlap safe, so as far as valgrind is concerned the
> call did in fact wind up going to memcpy and hence got checked.

Are you sure about it? If I use gdb, then it's resolved to the following function:

Breakpoint 1, __memmove_chk_avx_unaligned () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:206
206	../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: Directory not empty.
(gdb) bt
#0  __memmove_chk_avx_unaligned () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:206
#1  0x000000000040069b in main () at drpm.c:8

which is proved if I put breakpoint to __memmove_chk_ifunc:

│  > 0x7ffff7e1ed50 <__memmove_chk_ifunc+290>        ret

(gdb) p /x $rax
$2 = 0x7ffff7e88820

(gdb) p & __memmove_chk_avx_unaligned
$3 = (<text variable, no debug info> *) 0x7ffff7e88820 <__memmove_chk_avx_unaligned>
Comment 8 mliska 2022-04-27 13:39:44 UTC
> In this case it chose __memcpy_chk_sse2_unaligned presumably because it
> knows that is in fact overlap safe

Note that ifunc resolver does not operate on the actual arguments (of the first invocation). So it cannot assume that an overlap is safe.
Comment 9 Siddhesh Poyarekar 2022-04-27 16:11:38 UTC
(In reply to Tom Hughes from comment #6)
> In this case it chose __memcpy_chk_sse2_unaligned presumably because it
> knows that is in fact overlap safe, so as far as valgrind is concerned the
> call did in fact wind up going to memcpy and hence got checked.

The problem is that __memcpy_chk_sse2_unaligned and __memmove_chk_sse2_unaligned are aliases to the same function:

```
$ nm /lib64/libc.so.6 | grep "__mem\(cpy\|move\)_chk_sse2_unaligned$"
00000000000b9c50 t __memcpy_chk_sse2_unaligned
00000000000b9c50 t __memmove_chk_sse2_unaligned
```

Maybe valgrind could see if the function has a *memmove* alias before it triggers the warning?

(In reply to mliska from comment #7)
> Are you sure about it? If I use gdb, then it's resolved to the following
> function:
> 
> Breakpoint 1, __memmove_chk_avx_unaligned () at
> ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:206
> 206	../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S: Directory not
> empty.

Again, the memcpy and memmove variants are aliases and one just happened to be selected over another, or maybe gdb has some smarts to make it happen, I don't know:

```
$ nm /lib64/libc.so.6 | grep "__mem\(cpy\|move\)_chk_avx_unaligned$"
00000000001828a0 t __memcpy_chk_avx_unaligned
00000000001828a0 t __memmove_chk_avx_unaligned
```
Comment 10 Mark Wielaard 2023-05-29 12:14:20 UTC
(In reply to mliska from comment #1)
> May be related to PR402833.

Yes, it is, both are about the valgrind ifunc resolver aliasing two memcpy/memmove variants, causing the memmove variant to trigger the overlap check (or the other way around, the memcpy variant not triggering the overlap check as it should).

*** This bug has been marked as a duplicate of bug 402833 ***