492255 – Hangs before main() on any code compiled with clang -fsanitize=memory

Summary: Hangs before main() on any code compiled with clang -fsanitize=memory

Status:	RESOLVED NOT A BUG

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	memcheck (other bugs)
Version First Reported In:	3.20.0
Platform:	Debian unstable Linux

Importance:	NOR grave
Target Milestone:	---
Assignee:	Julian Seward

Reported:	2024-08-27 08:53 UTC by Marko Mäkelä
Modified:	2024-08-29 07:04 UTC
CC List:	2 users

Description Marko Mäkelä 2024-08-27 08:53:11 UTC

SUMMARY

Valgrind gets into a seemingly infinite loop when executing a trivial program that was compiled with clang -fsanitize=memory (MemorySanitizer).

STEPS TO REPRODUCE

1. echo "int main(){return 0;}" > m.c
2. clang -fsanitize=memory m.c
3. valgrind ./a.out

OBSERVED RESULT

==1186257== Memcheck, a memory error detector
==1186257== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==1186257== Using Valgrind-3.20.0 and LibVEX; rerun with -h for copyright info
==1186257== Command: ./a.out
==1186257== 
==1186257== Warning: set address range perms: large range [0x10000000000, 0x100000000000) (defined)

This is followed by 100% CPU usage in Valgrind, in a call stack that includes multiple avl_insert() frames under vgSysWrap_amd64_linux_sys_mmap_before().

EXPECTED RESULT

Valgrind should refuse to run the program, similar to when -fsanitize=address is used:

==1186156== Memcheck, a memory error detector
==1186156== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==1186156== Using Valgrind-3.20.0 and LibVEX; rerun with -h for copyright info
==1186156== Command: ./a.out
==1186156== 
==1186156==Shadow memory range interleaves with an existing memory mapping. ASan cannot proceed correctly. ABORTING.
==1186156==ASan shadow was supposed to be located in the [0x00007fff7000-0x10007fff7fff] range.
==1186156==This might be related to ELF_ET_DYN_BASE change in Linux 4.12.
==1186156==See https://github.com/google/sanitizers/issues/856 for possible workarounds.
==1186156==Process memory map follows:
…
==1186156==End of process memory map.
==1186156== 
==1186156== HEAP SUMMARY:
==1186156==     in use at exit: 0 bytes in 0 blocks
==1186156==   total heap usage: 86 allocs, 86 frees, 2,737 bytes allocated
==1186156== 
==1186156== All heap blocks were freed -- no leaks are possible
==1186156== 
==1186156== For lists of detected and suppressed errors, rerun with: -s
==1186156== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

SOFTWARE/OS VERSIONS

dpkg --status valgrind|grep Version
Version: 1:3.20.0-2.1
dpkg --status clang-18|grep Version
Version: 1:18.1.8-9

Comment 1 Paul Floyd 2024-08-27 13:09:08 UTC

Don't try to run ASAN and Valgrind together. It is not supported.

Comment 2 Marko Mäkelä 2024-08-28 08:07:30 UTC

Paul Floyd, it looks like you misread the report.

As the EXPECTED RESULT section explains, invoking Valgrind on code compiled with -fsanitize=address already results in an instant failure. I agree that that is not a bug, but this report is about something else.

This report is about MemorySanitizer (clang -fsanitize=memory). Instead of instantly failing, Valgrind would start to spend excessive CPU time on processing an mmap() system call that is part of the MemorySanitizer initialization.

I filed this bug at the request of the maintainer of https://www.lysator.liu.se/~nisse/nettle/. The configure script in Nettle 3.10 attempts to invoke valgrind on a trivial program, and there is no way to disable that check (other than by removing valgrind from $PATH). I need to compile Nettle with -fsanitize=memory because it is a dependency of something that needs to be tested with -fsanitize=memory: https://jira.mariadb.org/browse/MDEV-20377

Comment 3 Paul Floyd 2024-08-28 09:00:32 UTC

I mistyped. Don't use Valgrind with either MSAN or ASAN. It isn't supported with either.

Comment 4 Paul Floyd 2024-08-28 09:24:06 UTC

You should be able to detect an msan linked exe. Try something like

nm exe | grep __msan_init
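
A slightly more general form of that check could look like the sketch below. It is only an illustration: the function name has_sanitizer_init is hypothetical, it assumes __asan_init is the analogous ASan symbol, and it can only work on binaries that have not been stripped. The function reads an nm-style listing from standard input, so the matching logic can be tried without a sanitizer build at hand.

```shell
# Hypothetical helper: succeed if the nm-style symbol listing on stdin
# mentions a sanitizer runtime init symbol (MSan or ASan).
# Assumption: the binary is not stripped, otherwise nm lists nothing useful.
has_sanitizer_init() {
    grep -E '__(msan|asan)_init' >/dev/null
}

# Usage ("exe" is a placeholder path, as in the command above):
#   if nm exe | has_sanitizer_init; then
#       echo "sanitizer runtime detected, not running valgrind"
#   fi
```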

Comment 5 Niels Möller 2024-08-28 13:01:31 UTC

Hi! I see absolutely no problem with valgrind not supporting msan or asan executables in any meaningful way.

But it would be very helpful if valgrind rejected unsupported executables with an error message.

My context is the GNU Nettle crypto library. When possible, I use valgrind in my testsuite, to detect side channel leakage due to memory accesses or branches depending on secret data (which valgrind does very well, also covering assembly code the C compiler is unaware of). But I can't just use valgrind whenever it is installed. E.g., in a cross compiling setup, where the testsuite runs executables under test via binfmt and qemu, I can't use valgrind. That's why I have a configure test to check if valgrind works on the executables produced by whatever compiler the user has configured for the build. But then there's a problem if that configure test hangs forever for certain compiler configurations.

One could maybe use nm to check for a list of symbols indicating potential trouble, or use the timeout command to kill the process if it appears to hang, but since those tools aren't universally available, using them in the configure script would be a real hassle with additional tests for those tools, fallbacks if they aren't found, etc.
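
For illustration only, a time limit can be approximated without the timeout command, using just background jobs, sleep, kill and wait from POSIX sh. The sketch below is not what the Nettle configure script does; the function name run_with_limit and its variables are hypothetical, and it uses SIGKILL on the assumption that a process spinning inside valgrind may not react promptly to gentler signals.

```shell
# Hypothetical helper: run_with_limit SECONDS COMMAND [ARGS...]
# Runs COMMAND and kills it if it is still running after SECONDS.
# Returns COMMAND's exit status (nonzero if it had to be killed).
run_with_limit() {
    limit=$1
    shift
    "$@" &
    cmd_pid=$!
    # Watchdog: after $limit seconds, kill the command if it still exists.
    # Its stdio is detached so a leftover sleep cannot hold pipes open.
    ( sleep "$limit"; kill -9 "$cmd_pid" 2>/dev/null ) </dev/null >/dev/null 2>&1 &
    dog_pid=$!
    status=0
    wait "$cmd_pid" || status=$?
    # The command is done; stop the watchdog if it is still waiting.
    kill "$dog_pid" 2>/dev/null || true
    wait "$dog_pid" 2>/dev/null || true
    return "$status"
}

# Usage (a.out stands for the configure test program):
#   if run_with_limit 10 valgrind -q ./a.out; then
#       echo "valgrind works with this compiler configuration"
#   fi
```

This still relies on sleep and kill being present and on the shell supporting background jobs, so it reduces the portability concern rather than removing it.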

Is it unreasonable to expect that valgrind terminates with an error message if passed inputs that it doesn't support? Would it be difficult to implement, at least for well known classes of unsupported input executables?

Comment 6 Paul Floyd 2024-08-28 16:45:38 UTC

(In reply to Niels Möller from comment #5)
> Hi! I see absolutely no problem with valgrind not supporting msan or  asan
> executables in any meaningful way.
> 
> But it would be very helpful if valgrind rejected unsupported executables
> with an error message.
...
> Is it unreasonable to expect that valgrind terminates with an error message
> if passed inputs that it doesn't support? Would it be difficult to
> implement, at least for well known classes of unsupported input executables?

What are we going to try to detect? If the binary isn't stripped then we could look for ASAN or MSAN symbols.

Do they have some equivalent of RUNNING_ON_VALGRIND that we could check?

Comment 7 Niels Möller 2024-08-29 07:04:06 UTC

In my opinion, even a crude check for msan/asan symbols would be an improvement.

That said, I don't understand why valgrind appears to get into some kind of infinite loop. If the msan executable violates some implicit assumption made by valgrind, a more robust fix would turn that assumption into an explicit check.