SUMMARY
```
[satishk@tcn3 temp]$ valgrind ./core_avx2.out
valgrind: mmap(0x400000, 4096) failed in UME with error 22 (Invalid argument).
valgrind: this can be caused by executables with very large text, data or bss segments.
[satishk@tcn3 temp]$ size core_avx2.out
   text    data     bss     dec     hex filename
   1831     560       1    2392     958 core_avx2.out
[satishk@tcn3 temp]$
```

STEPS TO REPRODUCE
1. Running valgrind on any executable results in this error.

OBSERVED RESULT
```
[satishk@tcn3 temp]$ valgrind -d ./core_avx2.out
--1960444:1:debuglog DebugLog system started by Stage 1, level 1 logging requested
--1960444:1:launcher no tool requested, defaulting to 'memcheck'
--1960444:1:launcher selected platform 'amd64-linux'
--1960444:1:launcher launching /sw/arch/RHEL9/EB_production/2024/software/Valgrind/3.24.0-gompi-2024a/libexec/valgrind/memcheck-amd64-linux
--1960444:1:debuglog DebugLog system started by Stage 2 (main), level 1 logging requested
--1960444:1: main Welcome to Valgrind version 3.24.0 debug logging
--1960444:1: main Checking current stack is plausible
--1960444:1: main Checking initial stack was noted
--1960444:1: main Starting the address space manager
--1960444:1: main Address space manager is running
--1960444:1: main Starting the dynamic memory manager
--1960444:1:mallocfr newSuperblock at 0x1002001000 (pszB 4194272) owner VALGRIND/core
--1960444:1:mallocfr deferred_reclaimSuperblock at 0x1002001000 (pszB 4194272) (prev 0x0) owner VALGRIND/core
--1960444:1: main Dynamic memory manager is running
--1960444:1: main Initialise m_debuginfo
--1960444:1: main VG_(libdir) = /sw/arch/RHEL9/EB_production/2024/software/Valgrind/3.24.0-gompi-2024a/libexec/valgrind
--1960444:1: main Getting launcher's name ...
--1960444:1: main ... /gpfs/admin/_hpc/sw/arch/AMD-ZEN2/RHEL9/EB_production/2024/software/Valgrind/3.24.0-gompi-2024a/bin/valgrind
--1960444:1: main Get hardware capabilities ...
--1960444:1: cache Autodetected cache info is sensible
--1960444:1: cache Cache info:
--1960444:1: cache #levels = 3
--1960444:1: cache #caches = 4
--1960444:1: cache cache #0:
--1960444:1: cache kind = data
--1960444:1: cache level = 1
--1960444:1: cache size = 32768 bytes
--1960444:1: cache linesize = 64 bytes
--1960444:1: cache assoc = 8
--1960444:1: cache cache #1:
--1960444:1: cache kind = insn
--1960444:1: cache level = 1
--1960444:1: cache size = 32768 bytes
--1960444:1: cache linesize = 64 bytes
--1960444:1: cache assoc = 8
--1960444:1: cache cache #2:
--1960444:1: cache kind = unified
--1960444:1: cache level = 2
--1960444:1: cache size = 524288 bytes
--1960444:1: cache linesize = 64 bytes
--1960444:1: cache assoc = 8
--1960444:1: cache cache #3:
--1960444:1: cache kind = unified
--1960444:1: cache level = 3
--1960444:1: cache size = 268435456 bytes
--1960444:1: cache linesize = 64 bytes
--1960444:1: cache assoc = 1
--1960444:1: main ... arch = AMD64, hwcaps = amd64-cx16-lzcnt-rdtscp-sse3-ssse3-avx-avx2-bmi-f16c-rdrand-rdseed-fma
--1960444:1: main Getting the working directory at startup
--1960444:1: main ... /gpfs/home5/satishk/temp
--1960444:1: main Split up command line
--1960444:1: main (early_) Process Valgrind's command line options
--1960444:1: main Create initial image
--1960444:1: initimg Loading client
valgrind: mmap(0x400000, 4096) failed in UME with error 22 (Invalid argument).
valgrind: this can be caused by executables with very large text, data or bss segments.
```

EXPECTED RESULT

SOFTWARE/OS VERSIONS
Linux/KDE Plasma:
```
[satishk@tcn3 temp]$ cat /etc/os-release
NAME="Red Hat Enterprise Linux"
VERSION="9.4 (Plow)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="9.4"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Red Hat Enterprise Linux 9.4 (Plow)"
ANSI_COLOR="0;31"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:redhat:enterprise_linux:9::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 9"
REDHAT_BUGZILLA_PRODUCT_VERSION=9.4
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.4"
[satishk@tcn3 temp]$ uname -r
5.14.0-427.31.1.el9_4.x86_64
```

ADDITIONAL INFORMATION:
Even though the executable is a bare dummy program with a single function, valgrind fails with that error.
Valgrind was built with Easybuild. The configure statement for the installation:
```
== 2024-11-14 11:22:09,009 run.py:260 INFO running cmd: ./configure --prefix=/sw/arch/RHEL9/EB_production/2024/software/Valgrind/3.24.0-gompi-2024a --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --with-mpicc="$MPICC"
```
In this case it's a small mmap (a single 4 KiB page) at the default load address 0x400000 that is failing. All fairly banal. Does the machine have anything like security hardening that might be causing the mmap to fail?
Please could you run with -v -v -v -v -d -d -d and redirect stderr to a file then attach that file here?
Created attachment 176088: requested stderr output, redirected to a file.
Sorry for the late reply. Please find it attached.
--13790:2: aspacem (0,4,7) /gpfs/home5/satishk/.local/easybuild/RHEL9/2024/software/Valgrind/3.24.0-gompi-2024a/libexec/valgrind/memcheck-amd64-linux
--13790:2: aspacem 0: RSVN 0000000000-00003fffff 4194304 ----- SmFixed
--13790:2: aspacem 1: FILE 0000400000-0000400fff 4096 r---- d=0x031 i=149280198 o=0 (0,4)
--13790:2: aspacem 2: RSVN 0000401000-0003ffffff 59m ----- SmFixed
--13790:2: aspacem 3: 0004000000-0057ffffff 1344m
--13790:2: aspacem 4: FILE 0058000000-00581cffff 1900544 r-x-- d=0x031 i=149280198 o=4096 (0,4)
--13790:2: aspacem 5: FILE 00581d0000-0058287fff 753664 r---- d=0x031 i=149280198 o=1904640 (0,4)
--13790:2: aspacem 6: FILE 0058288000-005828afff 12288 rw--- d=0x031 i=149280198 o=2658304 (0,4)

Something has gone horribly wrong there. When the Valgrind tools get built, they use a linker flag to change the load address of the tool to 0x58000000 rather than 0x400000, the default for non-PIE executables. But in your case a 4k read-only block of the tool itself is getting loaded at 0x400000, which is exactly where the client executable needs to be mapped. To me that means that either your exes are not being built properly or that you have something like a security tool that is forcing the first load at 0x400000.

Could you run objdump -p on /home/satishk/.local/easybuild/RHEL9/2024/software/Valgrind/3.24.0-gompi-2024a/libexec/valgrind/memcheck-amd64-linux and post the output here?
I get

[paulf@archlinux .in_place]$ objdump -p memcheck-amd64-linux

memcheck-amd64-linux:     file format elf64-x86-64

Program Header:
    LOAD off    0x0000000000000000 vaddr 0x0000000058000000 paddr 0x0000000058000000 align 2**12
         filesz 0x000000000000028c memsz 0x000000000000028c flags r--
    LOAD off    0x0000000000001000 vaddr 0x0000000058001000 paddr 0x0000000058001000 align 2**12
         filesz 0x00000000001fda6e memsz 0x00000000001fda6e flags r-x
    LOAD off    0x00000000001ff000 vaddr 0x00000000581ff000 paddr 0x00000000581ff000 align 2**12
         filesz 0x00000000000ab76c memsz 0x00000000000ab76c flags r--
    LOAD off    0x00000000002aaf00 vaddr 0x00000000582abf00 paddr 0x00000000582abf00 align 2**12
         filesz 0x0000000000004cbc memsz 0x0000000001a10550 flags rw-

You can see that the first LOAD is at 0x0000000058000000 as expected.
```
[satishk@tcn3 ~]$ objdump -p ~/.local/easybuild/RHEL9/2024/software/Valgrind/3.24.0-gompi-2024a/libexec/valgrind/memcheck-amd64-linux

/home/satishk/.local/easybuild/RHEL9/2024/software/Valgrind/3.24.0-gompi-2024a/libexec/valgrind/memcheck-amd64-linux:     file format elf64-x86-64

Program Header:
    LOAD off    0x0000000000000000 vaddr 0x0000000000400000 paddr 0x0000000000400000 align 2**12
         filesz 0x0000000000000ce4 memsz 0x0000000000000ce4 flags r--
    LOAD off    0x0000000000001000 vaddr 0x0000000058000000 paddr 0x0000000058000000 align 2**12
         filesz 0x00000000001cf107 memsz 0x00000000001cf107 flags r-x
    LOAD off    0x00000000001d1000 vaddr 0x00000000581d0000 paddr 0x00000000581d0000 align 2**12
         filesz 0x00000000000b76c0 memsz 0x00000000000b76c0 flags r--
    LOAD off    0x0000000000289000 vaddr 0x0000000058288000 paddr 0x0000000058288000 align 2**12
         filesz 0x0000000000002bbc memsz 0x0000000001a0e450 flags rw-
    NOTE off    0x0000000000000200 vaddr 0x0000000000400200 paddr 0x0000000000400200 align 2**3
         filesz 0x0000000000000030 memsz 0x0000000000000030 flags r--
    NOTE off    0x0000000000000230 vaddr 0x0000000000400230 paddr 0x0000000000400230 align 2**2
         filesz 0x0000000000000ab4 memsz 0x0000000000000ab4 flags r--
0x6474e553 off    0x0000000000000200 vaddr 0x0000000000400200 paddr 0x0000000000400200 align 2**3
         filesz 0x0000000000000030 memsz 0x0000000000000030 flags r--
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
[satishk@tcn3 ~]$
```
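As a rough aid for comparing program header listings like the two above, the following Python sketch pulls the LOAD addresses out of `objdump -p` output and reports any that sit below 0x58000000. The function names and the abbreviated sample strings are illustrative assumptions, not part of Valgrind or binutils; only the 0x58000000 base and the header values are taken from this thread.

```python
import re

# Base address the Valgrind build asks the linker to use for its tools
# (value quoted in this thread).
TOOL_BASE = 0x58000000

# Matches the first line of a LOAD entry in `objdump -p` output.
LOAD_RE = re.compile(r"LOAD\s+off\s+0x[0-9a-fA-F]+\s+vaddr\s+0x([0-9a-fA-F]+)")


def load_vaddrs(objdump_text):
    """Return the vaddr of every LOAD program header, in listing order."""
    return [int(m.group(1), 16) for m in LOAD_RE.finditer(objdump_text)]


def misplaced_loads(objdump_text, base=TOOL_BASE):
    """Return the LOAD vaddrs that fall below the expected tool base."""
    return [v for v in load_vaddrs(objdump_text) if v < base]


# Abbreviated samples: first two LOAD headers of each listing in the thread.
EASYBUILD_SAMPLE = """
    LOAD off    0x0000000000000000 vaddr 0x0000000000400000 paddr 0x0000000000400000 align 2**12
    LOAD off    0x0000000000001000 vaddr 0x0000000058000000 paddr 0x0000000058000000 align 2**12
"""
REFERENCE_SAMPLE = """
    LOAD off    0x0000000000000000 vaddr 0x0000000058000000 paddr 0x0000000058000000 align 2**12
    LOAD off    0x0000000000001000 vaddr 0x0000000058001000 paddr 0x0000000058001000 align 2**12
"""

print([hex(v) for v in misplaced_loads(EASYBUILD_SAMPLE)])   # ['0x400000']
print([hex(v) for v in misplaced_loads(REFERENCE_SAMPLE)])   # []
```

A non-empty result means part of the tool image occupies low addresses that the client executable may need, which matches the failing `mmap(0x400000, 4096)` reported here.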
It looks like your Valgrind is being incorrectly built.

Can you try building Valgrind yourself rather than using Easybuild? Either from a tarball:
https://valgrind.org/downloads/current.html
or from git head:
https://valgrind.org/downloads/repository.html

You should have no difficulty on Red Hat 9; it is one of the best supported platforms.
@Paul which linker flag are you talking about specifically? I can try to look for it in the Easybuild log file.

In the meantime I also performed a build from source, and that works. So the Easybuild one is indeed built incorrectly.

The one built from source:
```
[satishk@tcn3 valgrind]$ objdump -p memcheck-amd64-linux

memcheck-amd64-linux:     file format elf64-x86-64

Program Header:
    LOAD off    0x0000000000000000 vaddr 0x0000000058000000 paddr 0x0000000058000000 align 2**12
         filesz 0x0000000000000fd8 memsz 0x0000000000000fd8 flags r--
    LOAD off    0x0000000000001000 vaddr 0x0000000058001000 paddr 0x0000000058001000 align 2**12
         filesz 0x00000000001c7e6e memsz 0x00000000001c7e6e flags r-x
    LOAD off    0x00000000001c9000 vaddr 0x00000000581c9000 paddr 0x00000000581c9000 align 2**12
         filesz 0x00000000000ba94c memsz 0x00000000000ba94c flags r--
    LOAD off    0x0000000000284000 vaddr 0x0000000058284000 paddr 0x0000000058284000 align 2**12
         filesz 0x0000000000002bbc memsz 0x0000000001a0e450 flags rw-
    NOTE off    0x0000000000000200 vaddr 0x0000000058000200 paddr 0x0000000058000200 align 2**3
         filesz 0x0000000000000030 memsz 0x0000000000000030 flags r--
    NOTE off    0x0000000000000230 vaddr 0x0000000058000230 paddr 0x0000000058000230 align 2**2
         filesz 0x0000000000000da8 memsz 0x0000000000000da8 flags r--
0x6474e553 off    0x0000000000000200 vaddr 0x0000000058000200 paddr 0x0000000058000200 align 2**3
         filesz 0x0000000000000030 memsz 0x0000000000000030 flags r--
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
[satishk@tcn3 valgrind]$
```

Trying this out on the same executable:
```
[satishk@tcn3 bin]$ ./valgrind ~/temp/core_avx2.out
==367863== Memcheck, a memory error detector
==367863== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==367863== Using Valgrind-3.24.0 and LibVEX; rerun with -h for copyright info
==367863== Command: /home/satishk/temp/core_avx2.out
==367863== 
100 200 300 400 500 0 0 0
==367863== 
==367863== HEAP SUMMARY:
==367863==     in use at exit: 124 bytes in 4 blocks
==367863==   total heap usage: 38 allocs, 34 frees, 9,050 bytes allocated
==367863== 
==367863== LEAK SUMMARY:
==367863==    definitely lost: 0 bytes in 0 blocks
==367863==    indirectly lost: 0 bytes in 0 blocks
==367863==      possibly lost: 0 bytes in 0 blocks
==367863==    still reachable: 124 bytes in 4 blocks
==367863==         suppressed: 0 bytes in 0 blocks
==367863== Rerun with --leak-check=full to see details of leaked memory
==367863== 
==367863== For lists of detected and suppressed errors, rerun with: -s
==367863== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[satishk@tcn3 bin]$
```
It depends if you are using GNU bfd ld or another linker like lld or gold. See below for the description in configure.ac.

We try, in this order:
--image-base
-Ttext-segment
-Ttext

Can you tell which one Easybuild is using? It should be in config.log, something like this:

{failing --image-base test}
configure:12669: checking if the linker accepts -Wl,-Ttext-segment
configure:12678: /path/to/bin/gcc -o conftest -static -nodefaultlibs -nostartfiles -Wl,-Ttext-segment=0x58000000 -Werror conftest.c >&5
configure:12678: $? = 0
configure:12683: result: yes

configure.ac extract:

# We want to use use the -Ttext-segment option to the linker.
# GNU (bfd) ld supports this directly. Newer GNU gold linkers
# support it as an alias of -Ttext. Sadly GNU (bfd) ld's -Ttext
# semantics are NOT what we want (GNU gold -Ttext is fine).
#
# For GNU (bfd) ld -Ttext-segment chooses the base at which ELF headers
# will reside. -Ttext aligns just the .text section start (but not any
# other section).
#
# LLVM ld.lld 10.0 changed the semantics of its -Ttext. See "Breaking changes"
# in https://releases.llvm.org/10.0.0/tools/lld/docs/ReleaseNotes.html
# The --image-base option (since version 6.0?) provides the semantics needed.
# -Ttext-segment generates an error, but -Ttext now more closely
# follows the GNU (bfd) ld's -Ttext.
#
# So test first for --image-base support, and if that fails then
# for -Ttext-segment which is supported by all bfd ld versions
# and use that if it exists. If it doesn't exist it must be an older
# version of gold and we can fall back to using -Ttext which has the
# right semantics.
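The fallback order described in that configure.ac comment can be sketched in Python as follows. This is only an illustration of the selection logic, not the actual configure code: `pick_load_address_flag` and the `linker_accepts` callback are hypothetical names, and the callback stands in for the real configure test that tries to link a small program with the candidate flag.

```python
def pick_load_address_flag(linker_accepts, base=0x58000000):
    """Return the first usable load-address flag, in the order the thread gives.

    linker_accepts is a hypothetical callback: it takes a complete -Wl,...
    flag and returns True if a test link with that flag succeeds.
    """
    for option in ("--image-base", "-Ttext-segment"):
        flag = f"-Wl,{option}={base:#x}"
        if linker_accepts(flag):
            return flag
    # Last resort. Per the configure.ac comment this only has the wanted
    # semantics with older gold; GNU (bfd) ld's -Ttext moves just .text.
    return f"-Wl,-Ttext={base:#x}"


# A linker that rejects --image-base but accepts -Ttext-segment:
print(pick_load_address_flag(lambda flag: "-Ttext-segment" in flag))
# -Wl,-Ttext-segment=0x58000000
```

If both probes fail for some unrelated reason while the link is actually done by GNU (bfd) ld, the `-Ttext` fallback would leave the ELF headers at the default address, which would be consistent with the 0x400000 LOAD seen in the Easybuild binary. That is an inference from the quoted comment, not something confirmed in this thread.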
The Easybuild log regarding the linker:
```
checking if the linker accepts -Wl,--image-base... no
checking if the linker accepts -Wl,-Ttext-segment... no
configure: ld -Ttext used, need to strip build-id NOTEs.
checking if the linker accepts -Wl,--build-id=none... yes
```
In the log, I also see both ld.bfd and ld.gold recognized.
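For anyone checking their own build log for the same pattern, a small Python sketch along these lines could extract the linker probe results and flag the case where configure fell back to plain -Ttext. The helper names are made up for illustration, and the regular expression assumes the "checking if the linker accepts ..." wording shown in the log above.

```python
import re

# Matches lines such as:
#   checking if the linker accepts -Wl,--image-base... no
CHECK_RE = re.compile(r"checking if the linker accepts (-Wl,\S+?)\.\.\. (yes|no)")


def linker_checks(configure_output):
    """Map each probed linker flag to True (accepted) or False (rejected)."""
    return {flag: answer == "yes"
            for flag, answer in CHECK_RE.findall(configure_output)}


def fell_back_to_ttext(configure_output):
    """True if both --image-base and -Ttext-segment were probed and rejected."""
    checks = linker_checks(configure_output)
    return (checks.get("-Wl,--image-base") is False
            and checks.get("-Wl,-Ttext-segment") is False)


# The four lines quoted from the Easybuild log in this thread.
EASYBUILD_LOG = """\
checking if the linker accepts -Wl,--image-base... no
checking if the linker accepts -Wl,-Ttext-segment... no
configure: ld -Ttext used, need to strip build-id NOTEs.
checking if the linker accepts -Wl,--build-id=none... yes
"""

print(linker_checks(EASYBUILD_LOG))
print(fell_back_to_ttext(EASYBUILD_LOG))  # True
```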
Can you tell why Easybuild gets it wrong? Does Easybuild use its own 'environment' that includes gold, but outside Easybuild there is just GNU ld?
It indeed may be using the gold linker within the Easybuild env. Are you saying that valgrind should NOT be linked with ld.gold at all?
Maybe. I need to install gold and do some tests with it.
I just tried with gold on Fedora 41 (and mold while I was at it). No problem. This looks like an Easybuild issue, so I'm closing this as worksforme.
Thanks, Paul, for checking this out. I will keep debugging to find out what is happening within the Easybuild environment.