Bug 389412 - Failed assertion in readelf.c, line 697 for clang binaries with coverage information
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck
Version: 3.13.0
Platform: Compiled Sources Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-01-25 14:32 UTC by Peter Klotz
Modified: 2021-02-03 11:25 UTC (History)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Sources to generate an executable which triggers the assertion (2.14 KB, application/octet-stream)
2020-01-09 15:08 UTC, Christian Maurer
example sections as stated in comment 3 (6.49 KB, text/plain)
2020-01-09 15:09 UTC, Christian Maurer

Description Peter Klotz 2018-01-25 14:32:19 UTC
Approximately 1% of our internal regression tests show a failed assertion when run with Valgrind 3.13.0:

  "valgrind: m_debuginfo/readelf.c:697 (get_elf_symbol_info): Assertion 'in_rx' failed."

This only happens if the code is compiled with clang (5.0.1 on RHEL 7 x86_64) together with coverage information options ("-fprofile-instr-generate -fcoverage-mapping"). Valgrind runs fine without the coverage options or if gcc is used. The problem occurs with the Google gold linker and also with the LLVM lld linker.

We tracked it down to a single find_rx_mapping() call that returns NULL (via the "return NULL" at the end of the function).

Call: find_rx_mapping(DebugInfo=0x100287C470, lo=3ef0e8, hi=42a0f7)

In the "readelf -aW" output, the symbol causing the assertion seems to be "__llvm_coverage_mapping":

  9458: 00000000003ef0e8 0x3b010 OBJECT  LOCAL  DEFAULT   33 __llvm_coverage_mapping
  9459: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS /.../SourceFile.cpp

Since the variable "in_text" in get_elf_symbol_info() is true for this symbol while no matching r-x mapping is found, the assertion "vg_assert(in_rx);" fails.
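
The numbers above are consistent with each other: the lo/hi arguments of the call are the symbol's value and value + size - 1. The following Python sketch reproduces that arithmetic and models a range lookup which, as an assumption about how find_rx_mapping() behaves, only succeeds when the whole range lies inside a single r-x mapping. The mapping address and size used here are hypothetical and are not taken from the affected binary.

```python
# Value and size from the readelf line for __llvm_coverage_mapping.
SYM_VALUE = 0x3EF0E8
SYM_SIZE = 0x3B010


def symbol_range(value, size):
    """Return the inclusive [lo, hi] address range covered by a symbol."""
    return value, value + size - 1


def find_containing_mapping(mappings, lo, hi):
    """Model of a range lookup (an assumption, not Valgrind's actual code):
    succeed only if one mapping contains the whole range [lo, hi]."""
    for start, size in mappings:
        if size > 0 and start <= lo and hi < start + size:
            return (start, size)
    return None


lo, hi = symbol_range(SYM_VALUE, SYM_SIZE)
print(hex(lo), hex(hi))

# Hypothetical r-x mapping that begins inside the symbol's range.
HYPOTHETICAL_RX = [(0x400000, 0x100000)]
print(find_containing_mapping(HYPOTHETICAL_RX, lo, hi))
```

In this model the symbol overlaps the mapping without being contained in it, so the lookup yields None, which corresponds to the "return NULL" path described above.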

Here is the "--trace-symtab=yes" output from Valgrind:

----------------------
raw symbol [9453]: LOC FUN : svma 0x000042de90, sz   18  __cxx_global_var_init.68
    rec(t) [9453]:            val 0x000042de90, sz   18  __cxx_global_var_init.68
raw symbol [9454]: LOC FUN : svma 0x000042deb0, sz   18  __cxx_global_var_init.69
    rec(t) [9454]:            val 0x000042deb0, sz   18  __cxx_global_var_init.69
raw symbol [9455]: LOC FUN : svma 0x000042d6f0, sz   26  __cxx_global_var_init.7
    rec(t) [9455]:            val 0x000042d6f0, sz   26  __cxx_global_var_init.7
raw symbol [9456]: LOC FUN : svma 0x000042d710, sz   26  __cxx_global_var_init.8
    rec(t) [9456]:            val 0x000042d710, sz   26  __cxx_global_var_init.8
raw symbol [9457]: LOC FUN : svma 0x000042d730, sz   26  __cxx_global_var_init.9
    rec(t) [9457]:            val 0x000042d730, sz   26  __cxx_global_var_init.9
raw symbol [9458]: LOC OBJ : svma 0x00003ef0e8, sz 241680  __llvm_coverage_mapping

valgrind: m_debuginfo/readelf.c:697 (get_elf_symbol_info): Assertion 'in_rx' failed.
----------------------

Please let me know if any additional information or further testing is needed.

Regards, Peter.
Comment 1 Julian Seward 2018-08-06 08:28:58 UTC
This may well have been fixed by recent commits to the debuginfo reader
(within the past month).  Can you try again?
Comment 2 Peter Klotz 2018-08-20 15:58:09 UTC
Hi Julian

A test with the git snapshot from today showed that the assertion is still triggered (just the line number has changed):

valgrind: m_debuginfo/readelf.c:715 (get_elf_symbol_info): Assertion 'in_rx' failed.

Regards, Peter
Comment 3 Christian Maurer 2020-01-09 15:06:23 UTC
Hi Julian,
Hi Peter,

I am able to reproduce this error with a test program and hope I can offer some insights.

According to https://llvm.org/docs/CoverageMappingFormat.html#function-record section "LLVM IR Representation"
"The coverage mapping data is stored in the LLVM IR using a single global constant structure variable called __llvm_coverage_mapping with the __llvm_covmap section specifier."

I constructed an executable (out of a small main.cpp and a generated large mylib.cpp), which triggers the assertion.

--- START data from executable ---
readelf -S ValgrindTestGenerated
[output: see llvmcov_analysisOffendingSections_090120.txt for __llvm, relevant entries:]
There are 45 section headers, starting at offset 0x2995150:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [14] .text             PROGBITS         00000000002c7000  000c7000
       0000000000db6e77  0000000000000000  AX       0     0     16
  [33] __llvm_covmap     PROGBITS         0000000000000000  0101f328
       000000000035fd24  0000000000000000           0     0     8

readelf -sw ValgrindTestGenerated | grep cov
Symbol table '.symtab' contains 80343 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
    16: 0000000000000000   132 OBJECT  LOCAL  DEFAULT   33 __llvm_coverage_mapping
 40024: 0000000000000088 0x35fc9c OBJECT  LOCAL  DEFAULT   33 __llvm_coverage_mapping
--- END data from executable ---

Situation in the elf executable:

As stated in the documentation of the "LLVM IR Representation" (link given above), we can see that the __llvm_coverage_mapping symbols point to regions in the __llvm_covmap section.

Note that the last __llvm_coverage_mapping (referencing mylib.o) ends at 000000000035fd24 (0000000000000088 + 0x35fc9c).

Also note that the last __llvm_coverage_mapping symbol
* starts above address 0
* starts below the start of .text
* ends inside .text ( 00000000002c7000 < 0000000000000088 + 0x35fc9c < 00000000002c7000 + 0000000000db6e77 )
=> these are the conditions that trigger the error!
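
The three conditions can be checked mechanically against the readelf values quoted in this comment. The Python sketch below is only an illustration of that arithmetic; the function name is made up for this example and nothing here is Valgrind code.

```python
# Values from the "readelf -S" and "readelf -sw" output in this comment.
TEXT_START = 0x2C7000
TEXT_SIZE = 0xDB6E77
# (value, size) of the two __llvm_coverage_mapping entries.
SYMBOLS = [(0x0, 132), (0x88, 0x35FC9C)]


def meets_error_conditions(value, size):
    """True if the symbol starts above 0, starts below .text, and ends inside .text."""
    end = value + size
    text_end = TEXT_START + TEXT_SIZE
    return value > 0 and value < TEXT_START and TEXT_START < end < text_end


for value, size in SYMBOLS:
    print(hex(value), hex(value + size), meets_error_conditions(value, size))
```

Only the second entry (value 0x88) satisfies all three conditions; the first entry starts at 0 and ends at 0x84, well below the start of .text.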

I used the attached program (see kdebug389412_package.zip) to generate an executable that fulfills these conditions and triggers the assertion when run with valgrind.

--- START generating the executable ---
mkdir ValgrindTestGenerated
cp var/main.cpp ValgrindTestGenerated
cp var/mylib.h ValgrindTestGenerated
# note: 10000 worked for me, to provide the conditions stated above
ValgrindTest 10000 > ValgrindTestGenerated/mylib.cpp

Then compile and link the sources in "ValgrindTestGenerated" with the options -fprofile-instr-generate -fcoverage-mapping.
--- END generating the executable ---

I used vg-in-place to run the executable and had to use --max-stackframe:
./vg-in-place  --max-stackframe=17200080 --tool=memcheck --num-callers=25 --leak-check=full --child-silent-after-fork=yes --error-limit=no <pathtotheexecutable>

--- START output with some debug info added ---
in_rx is not set, assertion will occur!
text-present:Y
text-size: 14380663
text-svma: 0x2c7000
sym-svma: 0x88
text-bias: 0
sym-avmas-out.main: 0x88
sym-size-out: 3538076

valgrind: m_debuginfo/readelf.c:727 (get_elf_symbol_info): Assertion 'in_rx' failed.
--- END output with some debug info added ---
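
The debug output prints sizes in decimal while readelf prints them in hex. As a quick cross-check, the following Python sketch compares the two sets of numbers; the dictionary keys mirror the labels in the debug output, and the helper name is made up for this example.

```python
# Decimal/hex values as printed in the debug output above.
DEBUG_OUTPUT = {
    "text-size": 14380663,
    "text-svma": 0x2C7000,
    "sym-svma": 0x88,
    "sym-size-out": 3538076,
}

# Corresponding values from the readelf output earlier in this comment.
READELF = {
    "text-size": 0xDB6E77,
    "text-svma": 0x2C7000,
    "sym-svma": 0x88,
    "sym-size-out": 0x35FC9C,
}


def mismatches(debug_values, readelf_values):
    """Return the labels whose values differ between the two sources."""
    return [key for key in debug_values if debug_values[key] != readelf_values[key]]


print(mismatches(DEBUG_OUTPUT, READELF))
```

An empty list means the values Valgrind sees for .text and for the symbol are the same ones readelf reports.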


Kind regards,
Christian
Comment 4 Christian Maurer 2020-01-09 15:08:52 UTC
Created attachment 125001 [details]
Sources to generate an executable which triggers the assertion

see: comment 3
Comment 5 Christian Maurer 2020-01-09 15:09:42 UTC
Created attachment 125002 [details]
example sections as stated in comment 3
Comment 6 Peter Klotz 2021-02-03 11:25:48 UTC
After switching to clang 11 the problem disappeared.

We assume that the way older clang versions (particularly 5 and 7) wrote the debug information into the binary was a contributing factor.