Bug 496468 - valgrind: mmap(0x400000, 8192) failed in UME with error 22 (Invalid argument). valgrind: this can be caused by executables with very large text, data or bss segments.
Status: RESOLVED WORKSFORME
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: 3.24 GIT
Platform: RedHat Enterprise Linux Linux
Importance: NOR crash
Target Milestone: ---
Assignee: Julian Seward
Reported: 2024-11-19 16:33 UTC by Satish Santhosh
Modified: 2024-11-27 15:32 UTC

Attachments
Requested stderr into a file (6.02 KB, text/plain)
2024-11-24 20:02 UTC, Satish Santhosh

Description Satish Santhosh 2024-11-19 16:33:26 UTC
SUMMARY
```
[satishk@tcn3 temp]$ valgrind ./core_avx2.out
valgrind: mmap(0x400000, 4096) failed in UME with error 22 (Invalid argument).
valgrind: this can be caused by executables with very large text, data or bss segments.
[satishk@tcn3 temp]$ size core_avx2.out
   text    data     bss     dec     hex filename
   1831     560       1    2392     958 core_avx2.out
[satishk@tcn3 temp]$
```


STEPS TO REPRODUCE
1. Run valgrind on any executable; every run fails with the error shown above.

OBSERVED RESULT
```
[satishk@tcn3 temp]$ valgrind -d ./core_avx2.out
--1960444:1:debuglog DebugLog system started by Stage 1, level 1 logging requested
--1960444:1:launcher no tool requested, defaulting to 'memcheck'
--1960444:1:launcher selected platform 'amd64-linux'
--1960444:1:launcher launching /sw/arch/RHEL9/EB_production/2024/software/Valgrind/3.24.0-gompi-2024a/libexec/valgrind/memcheck-amd64-linux
--1960444:1:debuglog DebugLog system started by Stage 2 (main), level 1 logging requested
--1960444:1:    main Welcome to Valgrind version 3.24.0 debug logging
--1960444:1:    main Checking current stack is plausible
--1960444:1:    main Checking initial stack was noted
--1960444:1:    main Starting the address space manager
--1960444:1:    main Address space manager is running
--1960444:1:    main Starting the dynamic memory manager
--1960444:1:mallocfr newSuperblock at 0x1002001000 (pszB 4194272)  owner VALGRIND/core
--1960444:1:mallocfr deferred_reclaimSuperblock at 0x1002001000 (pszB 4194272)  (prev 0x0) owner VALGRIND/core
--1960444:1:    main Dynamic memory manager is running
--1960444:1:    main Initialise m_debuginfo
--1960444:1:    main VG_(libdir) = /sw/arch/RHEL9/EB_production/2024/software/Valgrind/3.24.0-gompi-2024a/libexec/valgrind
--1960444:1:    main Getting launcher's name ...
--1960444:1:    main ... /gpfs/admin/_hpc/sw/arch/AMD-ZEN2/RHEL9/EB_production/2024/software/Valgrind/3.24.0-gompi-2024a/bin/valgrind
--1960444:1:    main Get hardware capabilities ...
--1960444:1:   cache Autodetected cache info is sensible
--1960444:1:   cache Cache info:
--1960444:1:   cache   #levels = 3
--1960444:1:   cache   #caches = 4
--1960444:1:   cache      cache #0:
--1960444:1:   cache         kind = data
--1960444:1:   cache         level = 1
--1960444:1:   cache         size = 32768 bytes
--1960444:1:   cache         linesize = 64 bytes
--1960444:1:   cache         assoc = 8
--1960444:1:   cache      cache #1:
--1960444:1:   cache         kind = insn
--1960444:1:   cache         level = 1
--1960444:1:   cache         size = 32768 bytes
--1960444:1:   cache         linesize = 64 bytes
--1960444:1:   cache         assoc = 8
--1960444:1:   cache      cache #2:
--1960444:1:   cache         kind = unified
--1960444:1:   cache         level = 2
--1960444:1:   cache         size = 524288 bytes
--1960444:1:   cache         linesize = 64 bytes
--1960444:1:   cache         assoc = 8
--1960444:1:   cache      cache #3:
--1960444:1:   cache         kind = unified
--1960444:1:   cache         level = 3
--1960444:1:   cache         size = 268435456 bytes
--1960444:1:   cache         linesize = 64 bytes
--1960444:1:   cache         assoc = 1
--1960444:1:    main ... arch = AMD64, hwcaps = amd64-cx16-lzcnt-rdtscp-sse3-ssse3-avx-avx2-bmi-f16c-rdrand-rdseed-fma
--1960444:1:    main Getting the working directory at startup
--1960444:1:    main ... /gpfs/home5/satishk/temp
--1960444:1:    main Split up command line
--1960444:1:    main (early_) Process Valgrind's command line options
--1960444:1:    main Create initial image
--1960444:1: initimg Loading client
valgrind: mmap(0x400000, 4096) failed in UME with error 22 (Invalid argument).
valgrind: this can be caused by executables with very large text, data or bss segments.

```

EXPECTED RESULT


SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 
```
[satishk@tcn3 temp]$ cat /etc/os-release
NAME="Red Hat Enterprise Linux"
VERSION="9.4 (Plow)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="9.4"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Red Hat Enterprise Linux 9.4 (Plow)"
ANSI_COLOR="0;31"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:redhat:enterprise_linux:9::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9"
BUG_REPORT_URL="https://bugzilla.redhat.com/"

REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 9"
REDHAT_BUGZILLA_PRODUCT_VERSION=9.4
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.4"
[satishk@tcn3 temp]$ uname -r
5.14.0-427.31.1.el9_4.x86_64
```
ADDITIONAL INFORMATION: Even though the executable is a bare dummy program with a single function, valgrind crashes with that error. Valgrind was built with Easybuild.
The configure statement for the installation:
```
== 2024-11-14 11:22:09,009 run.py:260 INFO running cmd:  ./configure --prefix=/sw/arch/RHEL9/EB_production/2024/software/Valgrind/3.24.0-gompi-2024a  --build=x86_64-pc-linux-gnu  --host=x86_64-pc-linux-gnu  --with-mpicc="$MPICC" 
```
Comment 1 Paul Floyd 2024-11-19 20:56:59 UTC
In this case it's a small mmap (a single 4 KiB page) at the default load address 0x400000 that is failing. All fairly banal.

Does the machine have anything like security hardening that might be causing the mmap to fail?
Comment 2 Paul Floyd 2024-11-23 18:17:09 UTC
Please could you run with -v -v -v -v -d -d -d and redirect stderr to a file then attach that file here?
Comment 3 Satish Santhosh 2024-11-24 20:02:02 UTC
Created attachment 176088 [details]
Requested stderr into a file

Sorry for the late reply. Please find the file attached.
Comment 4 Paul Floyd 2024-11-24 20:38:38 UTC
--13790:2: aspacem   (0,4,7) /gpfs/home5/satishk/.local/easybuild/RHEL9/2024/software/Valgrind/3.24.0-gompi-2024a/libexec/valgrind/memcheck-amd64-linux
--13790:2: aspacem     0: RSVN 0000000000-00003fffff 4194304 ----- SmFixed
--13790:2: aspacem     1: FILE 0000400000-0000400fff    4096 r---- d=0x031 i=149280198 o=0       (0,4)
--13790:2: aspacem     2: RSVN 0000401000-0003ffffff     59m ----- SmFixed
--13790:2: aspacem     3:      0004000000-0057ffffff   1344m
--13790:2: aspacem     4: FILE 0058000000-00581cffff 1900544 r-x-- d=0x031 i=149280198 o=4096    (0,4)
--13790:2: aspacem     5: FILE 00581d0000-0058287fff  753664 r---- d=0x031 i=149280198 o=1904640 (0,4)
--13790:2: aspacem     6: FILE 0058288000-005828afff   12288 rw--- d=0x031 i=149280198 o=2658304 (0,4)

Something has gone horribly wrong there. When the Valgrind tools get built they use a linker flag to change the load address of the tool to 0x58000000 rather than the non-PIE default of 0x400000.

But in your case a 4k read only block is getting loaded at 0x400000. To me that means that either your exes are not being built properly or that you have something like a security tool that is forcing the first load at 0x400000.
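To make the collision described above concrete, here is a minimal Python sketch that parses the address space manager lines quoted in this comment and checks whether the client's requested mapping at 0x400000 overlaps a segment that is already in use. The function names and the `FREE` label for the unlabelled gap are illustrative choices, not Valgrind terminology, and the thread does not confirm that this overlap is the exact reason mmap returns error 22.

```python
import re

# One aspacem row: index, optional kind (RSVN, FILE, ...), then start-end in hex.
SEG_RE = re.compile(
    r"aspacem\s+(\d+):\s+(?:([A-Za-z]+)\s+)?([0-9a-fA-F]+)-([0-9a-fA-F]+)"
)


def parse_segments(log_text):
    """Return (kind, start, end) tuples; rows without a kind are labelled FREE."""
    segments = []
    for line in log_text.splitlines():
        match = SEG_RE.search(line)
        if match:
            kind = match.group(2) or "FREE"  # "FREE" is a label chosen for this sketch
            segments.append((kind, int(match.group(3), 16), int(match.group(4), 16)))
    return segments


def conflicts(segments, addr, length):
    """Return the non-free segments that overlap [addr, addr + length - 1]."""
    last = addr + length - 1
    return [s for s in segments if s[0] != "FREE" and s[1] <= last and addr <= s[2]]


# Rows 0 to 3 copied from the log quoted in this comment.
log = """\
--13790:2: aspacem     0: RSVN 0000000000-00003fffff 4194304 ----- SmFixed
--13790:2: aspacem     1: FILE 0000400000-0000400fff    4096 r---- d=0x031 i=149280198 o=0       (0,4)
--13790:2: aspacem     2: RSVN 0000401000-0003ffffff     59m ----- SmFixed
--13790:2: aspacem     3:      0004000000-0057ffffff   1344m
"""
segments = parse_segments(log)
hits = conflicts(segments, 0x400000, 4096)
for kind, start, end in hits:
    print(f"{kind} {start:#x}-{end:#x}")
```

With the rows above, the only overlapping entry is the tool's own read-only FILE mapping at 0x400000, which is the block singled out in this comment.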

Could you run objdump -p on /home/satishk/.local/easybuild/RHEL9/2024/software/Valgrind/3.24.0-gompi-2024a/libexec/valgrind/memcheck-amd64-linux and post the output here?

I get

[paulf@archlinux .in_place]$ objdump -p memcheck-amd64-linux 

memcheck-amd64-linux:     file format elf64-x86-64

Program Header:
    LOAD off    0x0000000000000000 vaddr 0x0000000058000000 paddr 0x0000000058000000 align 2**12
         filesz 0x000000000000028c memsz 0x000000000000028c flags r--
    LOAD off    0x0000000000001000 vaddr 0x0000000058001000 paddr 0x0000000058001000 align 2**12
         filesz 0x00000000001fda6e memsz 0x00000000001fda6e flags r-x
    LOAD off    0x00000000001ff000 vaddr 0x00000000581ff000 paddr 0x00000000581ff000 align 2**12
         filesz 0x00000000000ab76c memsz 0x00000000000ab76c flags r--
    LOAD off    0x00000000002aaf00 vaddr 0x00000000582abf00 paddr 0x00000000582abf00 align 2**12
         filesz 0x0000000000004cbc memsz 0x0000000001a10550 flags rw-

You can see that the first LOAD is at 0x0000000058000000 as expected.
Comment 5 Satish Santhosh 2024-11-25 13:10:34 UTC
```
[satishk@tcn3 ~]$ objdump -p ~/.local/easybuild/RHEL9/2024/software/Valgrind/3.24.0-gompi-2024a/libexec/valgrind/memcheck-amd64-linux 

/home/satishk/.local/easybuild/RHEL9/2024/software/Valgrind/3.24.0-gompi-2024a/libexec/valgrind/memcheck-amd64-linux:     file format elf64-x86-64

Program Header:
    LOAD off    0x0000000000000000 vaddr 0x0000000000400000 paddr 0x0000000000400000 align 2**12
         filesz 0x0000000000000ce4 memsz 0x0000000000000ce4 flags r--
    LOAD off    0x0000000000001000 vaddr 0x0000000058000000 paddr 0x0000000058000000 align 2**12
         filesz 0x00000000001cf107 memsz 0x00000000001cf107 flags r-x
    LOAD off    0x00000000001d1000 vaddr 0x00000000581d0000 paddr 0x00000000581d0000 align 2**12
         filesz 0x00000000000b76c0 memsz 0x00000000000b76c0 flags r--
    LOAD off    0x0000000000289000 vaddr 0x0000000058288000 paddr 0x0000000058288000 align 2**12
         filesz 0x0000000000002bbc memsz 0x0000000001a0e450 flags rw-
    NOTE off    0x0000000000000200 vaddr 0x0000000000400200 paddr 0x0000000000400200 align 2**3
         filesz 0x0000000000000030 memsz 0x0000000000000030 flags r--
    NOTE off    0x0000000000000230 vaddr 0x0000000000400230 paddr 0x0000000000400230 align 2**2
         filesz 0x0000000000000ab4 memsz 0x0000000000000ab4 flags r--
0x6474e553 off    0x0000000000000200 vaddr 0x0000000000400200 paddr 0x0000000000400200 align 2**3
         filesz 0x0000000000000030 memsz 0x0000000000000030 flags r--
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-

[satishk@tcn3 ~]$ 

```
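The difference between this listing and the one in comment 4 can be checked mechanically. The Python sketch below pulls the LOAD virtual addresses out of `objdump -p` text and reports whether any of them falls below 0x58000000. The helper names are hypothetical, and the sample rows are the first two LOAD lines copied from each listing.

```python
import re

# Matches the LOAD rows printed by `objdump -p` and captures the virtual address.
LOAD_RE = re.compile(r"^\s*LOAD\s+off\s+0x[0-9a-fA-F]+\s+vaddr\s+0x([0-9a-fA-F]+)")

# Load address that, according to this thread, the Valgrind tools are linked at.
EXPECTED_BASE = 0x58000000


def load_vaddrs(objdump_text):
    """Return the vaddr of every LOAD program header, in listing order."""
    vaddrs = []
    for line in objdump_text.splitlines():
        match = LOAD_RE.match(line)
        if match:
            vaddrs.append(int(match.group(1), 16))
    return vaddrs


def tool_base_ok(objdump_text, expected=EXPECTED_BASE):
    """True if there is at least one LOAD segment and none sits below `expected`."""
    vaddrs = load_vaddrs(objdump_text)
    return bool(vaddrs) and min(vaddrs) >= expected


# First two LOAD rows of the Easybuild-built tool (this comment).
bad = """
    LOAD off    0x0000000000000000 vaddr 0x0000000000400000 paddr 0x0000000000400000 align 2**12
    LOAD off    0x0000000000001000 vaddr 0x0000000058000000 paddr 0x0000000058000000 align 2**12
"""
# First two LOAD rows of the reference build (comment 4).
good = """
    LOAD off    0x0000000000000000 vaddr 0x0000000058000000 paddr 0x0000000058000000 align 2**12
    LOAD off    0x0000000000001000 vaddr 0x0000000058001000 paddr 0x0000000058001000 align 2**12
"""
print("Easybuild build ok:", tool_base_ok(bad))
print("Reference build ok:", tool_base_ok(good))
```

In practice the text would come from running `objdump -p` on the installed `memcheck-amd64-linux`; the strings are inlined here only to keep the sketch self-contained.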
Comment 6 Paul Floyd 2024-11-25 15:27:10 UTC
It looks like your Valgrind is being incorrectly built. Can you try building Valgrind yourself rather than using Easybuild?
Either a tarball from https://valgrind.org/downloads/current.html
Or from git head https://valgrind.org/downloads/repository.html

You should have no difficulty on RedHat 9, it is one of the best supported platforms.
Comment 7 Satish Santhosh 2024-11-25 16:07:58 UTC
@Paul which linker flag are you talking about specifically? I can try to look for it in the easybuild log file. In the meantime I also performed a build from source and that works. So indeed the easybuild one is built incorrectly.

One built from source:
```
[satishk@tcn3 valgrind]$ objdump -p memcheck-amd64-linux 

memcheck-amd64-linux:     file format elf64-x86-64

Program Header:
    LOAD off    0x0000000000000000 vaddr 0x0000000058000000 paddr 0x0000000058000000 align 2**12
         filesz 0x0000000000000fd8 memsz 0x0000000000000fd8 flags r--
    LOAD off    0x0000000000001000 vaddr 0x0000000058001000 paddr 0x0000000058001000 align 2**12
         filesz 0x00000000001c7e6e memsz 0x00000000001c7e6e flags r-x
    LOAD off    0x00000000001c9000 vaddr 0x00000000581c9000 paddr 0x00000000581c9000 align 2**12
         filesz 0x00000000000ba94c memsz 0x00000000000ba94c flags r--
    LOAD off    0x0000000000284000 vaddr 0x0000000058284000 paddr 0x0000000058284000 align 2**12
         filesz 0x0000000000002bbc memsz 0x0000000001a0e450 flags rw-
    NOTE off    0x0000000000000200 vaddr 0x0000000058000200 paddr 0x0000000058000200 align 2**3
         filesz 0x0000000000000030 memsz 0x0000000000000030 flags r--
    NOTE off    0x0000000000000230 vaddr 0x0000000058000230 paddr 0x0000000058000230 align 2**2
         filesz 0x0000000000000da8 memsz 0x0000000000000da8 flags r--
0x6474e553 off    0x0000000000000200 vaddr 0x0000000058000200 paddr 0x0000000058000200 align 2**3
         filesz 0x0000000000000030 memsz 0x0000000000000030 flags r--
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-

[satishk@tcn3 valgrind]$ 

```
Trying this out on the same executable:
```
[satishk@tcn3 bin]$ ./valgrind ~/temp/core_avx2.out 
==367863== Memcheck, a memory error detector
==367863== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==367863== Using Valgrind-3.24.0 and LibVEX; rerun with -h for copyright info
==367863== Command: /home/satishk/temp/core_avx2.out
==367863== 
100 200 300 400 500 0 0 0
==367863== 
==367863== HEAP SUMMARY:
==367863==     in use at exit: 124 bytes in 4 blocks
==367863==   total heap usage: 38 allocs, 34 frees, 9,050 bytes allocated
==367863== 
==367863== LEAK SUMMARY:
==367863==    definitely lost: 0 bytes in 0 blocks
==367863==    indirectly lost: 0 bytes in 0 blocks
==367863==      possibly lost: 0 bytes in 0 blocks
==367863==    still reachable: 124 bytes in 4 blocks
==367863==         suppressed: 0 bytes in 0 blocks
==367863== Rerun with --leak-check=full to see details of leaked memory
==367863== 
==367863== For lists of detected and suppressed errors, rerun with: -s
==367863== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
[satishk@tcn3 bin]$ 
```
Comment 8 Paul Floyd 2024-11-25 17:28:20 UTC
It depends if you are using GNU bfd ld or another linker like lld or gold.

See below for the description in configure.ac
We try, in this order: --image-base, -Ttext-segment and -Ttext.

Can you tell which one Easybuild is using? It should be in config.log, something like this

{failing --image-base test}
configure:12669: checking if the linker accepts -Wl,-Ttext-segment
configure:12678: /path/to/bin/gcc -o conftest -static -nodefaultlibs -nostartfiles -Wl,-Ttext-segment=0x58000000 -Werror   conftest.c  >&5
configure:12678: $? = 0
configure:12683: result: yes


configure.ac extract:

# We want to use the -Ttext-segment option to the linker.
# GNU (bfd) ld supports this directly. Newer GNU gold linkers
# support it as an alias of -Ttext. Sadly GNU (bfd) ld's -Ttext
# semantics are NOT what we want (GNU gold -Ttext is fine).
#
# For GNU (bfd) ld -Ttext-segment chooses the base at which ELF headers
# will reside. -Ttext aligns just the .text section start (but not any
# other section).
#
# LLVM ld.lld 10.0 changed the semantics of its -Ttext. See "Breaking changes"
# in https://releases.llvm.org/10.0.0/tools/lld/docs/ReleaseNotes.html
# The --image-base option (since version 6.0?) provides the semantics needed.
# -Ttext-segment generates an error, but -Ttext now more closely
# follows the GNU (bfd) ld's -Ttext.
#
# So test first for --image-base support, and if that fails then
# for -Ttext-segment which is supported by all bfd ld versions
# and use that if it exists. If it doesn't exist it must be an older
# version of gold and we can fall back to using -Ttext which has the
# right semantics.
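
The fallback order described in the extract can be sketched in a few lines of Python. This is only an illustration of the decision logic as the comments describe it, not the actual configure code; `pick_base_flag` and the `accepted` mapping are hypothetical names.

```python
# Order in which the extract says the options are tried; the last is the fallback.
PREFERENCE = ("--image-base", "-Ttext-segment", "-Ttext")


def pick_base_flag(accepted):
    """Pick the linker option for setting the tool load address.

    `accepted` maps an option name to True/False, standing in for the result
    of the corresponding configure probe. -Ttext is not probed in this sketch;
    it is simply used when both earlier options are rejected.
    """
    for flag in PREFERENCE[:-1]:
        if accepted.get(flag, False):
            return flag
    return PREFERENCE[-1]


# Typical bfd ld case from the config.log sample above: --image-base fails,
# -Ttext-segment is accepted.
print(pick_base_flag({"--image-base": False, "-Ttext-segment": True}))

# Both probes rejected: the -Ttext fallback is selected.
print(pick_base_flag({"--image-base": False, "-Ttext-segment": False}))
```

The second case matters because, per the extract, -Ttext only has the desired semantics with older gold; with GNU (bfd) ld it positions just the .text section.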
Comment 9 Satish Santhosh 2024-11-25 20:08:47 UTC
The easybuild log regarding the linker:

```
checking if the linker accepts -Wl,--image-base... no
checking if the linker accepts -Wl,-Ttext-segment... no
configure: ld -Ttext used, need to strip build-id NOTEs.
checking if the linker accepts -Wl,--build-id=none... yes
```
In the log, I also see both ld.bfd and ld.gold recognized.
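
For anyone checking their own build, a small Python sketch like the following can extract these probe results from configure output and flag the case where both preferred options were rejected. It assumes the `checking if the linker accepts -Wl,<option>... yes|no` line format shown above; the function names are made up for this example.

```python
import re

# Matches lines such as "checking if the linker accepts -Wl,--image-base... no".
PROBE_RE = re.compile(r"checking if the linker accepts -Wl,(\S+?)\.\.\. (yes|no)")


def linker_probes(configure_output):
    """Map each probed linker option to True (accepted) or False (rejected)."""
    results = {}
    for line in configure_output.splitlines():
        match = PROBE_RE.search(line)
        if match:
            results[match.group(1)] = match.group(2) == "yes"
    return results


def used_ttext_fallback(results):
    """True when neither --image-base nor -Ttext-segment was accepted."""
    return not (results.get("--image-base") or results.get("-Ttext-segment"))


# The four lines quoted from the easybuild log above.
log = """\
checking if the linker accepts -Wl,--image-base... no
checking if the linker accepts -Wl,-Ttext-segment... no
configure: ld -Ttext used, need to strip build-id NOTEs.
checking if the linker accepts -Wl,--build-id=none... yes
"""
probes = linker_probes(log)
print(probes)
print("fell back to -Ttext:", used_ttext_fallback(probes))
```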
Comment 10 Paul Floyd 2024-11-25 20:12:48 UTC
Can you tell why Easybuild gets it wrong?

Does Easybuild use its own 'environment' that includes gold, but outside Easybuild there is just GNU ld?
Comment 11 Satish Santhosh 2024-11-26 11:40:14 UTC
It indeed may be using the gold linker within the Easybuild env. Are you saying that valgrind should NOT be linked with ld.gold at all?
Comment 12 Paul Floyd 2024-11-26 12:54:39 UTC
Maybe. I need to install gold and do some tests with it.
Comment 13 Paul Floyd 2024-11-27 08:28:06 UTC
I just tried with gold on Fedora 41 (and mold while I was at it). No problem.

This looks like an Easybuild issue, so I'm closing this as worksforme.
Comment 14 Satish Santhosh 2024-11-27 15:32:33 UTC
Thanks Paul for checking this out. I will keep debugging to find out what is happening within the Easybuild environment.