Bug 452802 - Handle lld 9+ split RW PT_LOAD segments correctly
Summary: Handle lld 9+ split RW PT_LOAD segments correctly
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (show other bugs)
Version: 3.20 GIT
Platform: Compiled Sources Linux
: NOR major
Target Milestone: ---
Assignee: Paul Floyd
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-04-20 15:45 UTC by Paul Floyd
Modified: 2022-06-09 21:32 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Patch for reading debuginfo correctly with lld 9+ (15.61 KB, patch)
2022-05-08 14:22 UTC, Paul Floyd
Details
Patch for reading debuginfo correctly with lld 9+ (44.49 KB, patch)
2022-06-07 22:00 UTC, Paul Floyd
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Paul Floyd 2022-04-20 15:45:27 UTC
Since llvm 9.0.0, lld has produced split RW PT_LOAD segments

ld.bfd and llvm ld  before 9.0.0

One RW segment containing
RW PT_LOAD(PT_GNU_RELRO(.data.rel.ro .bss.rel.ro) .data .bss)

llvm lld after 9.0.0

Two RW segments containing
RW PT_LOAD(PT_GNU_RELRO(.data.rel.ro .bss.rel.ro))
RW PT_LOAD(.data. .bss)

Valgrind is hard coded to only read debuginfo for one RW PT_LOAD segment and ignores the second.

See debuginfo.c line 1269 at the time of writing which contains

line 1269

/* PJF this is true for the 2nd RW PT_LOAD which finds di->have_dinfo from the 1st */
if (di->have_dinfo) {
   if (debug)
      VG_(dmsg)("di_notify_mmap-4x: "
         "ignoring mapping because we already read debuginfo "
         "for DebugInfo* %p\n", di);
      return 0;
}

I see two possible solutions
1. Add full handling of multiple RW PT_LOAD segments
2. Merge contiguous RW PT_LOAD segments so that they appear like ld.bfd single RW PT_LOAD segemnts.

I'll start with 2 as it sounds easier to me.
Comment 1 Paul Floyd 2022-04-24 20:23:56 UTC
Looking at the segment handling code, maybe_merge_nsegments should be doing this.

readelf -e says

Program Headers:
  Type            Offset                               VirtAddr                         PhysAddr
                       FileSiz                               MemSiz                          Flg    Align
  PHDR           0x0000000000000040 0x0000000000200040 0x0000000000200040
                       0x0000000000000268 0x0000000000000268  R      0x8
  INTERP         0x00000000000002a8 0x00000000002002a8 0x00000000002002a8
                       0x0000000000000015 0x0000000000000015  R      0x1
      [Requesting program interpreter: /libexec/ld-elf.so.1]
  LOAD           0x0000000000000000 0x0000000000200000 0x0000000000200000
                       0x00000000000005b4 0x00000000000005b4  R      0x1000
  LOAD           0x00000000000005c0 0x00000000002015c0 0x00000000002015c0
                       0x00000000000003b0 0x00000000000003b0  R E    0x1000
  LOAD           0x0000000000000970 0x0000000000202970 0x0000000000202970
                       0x0000000000000190 0x0000000000000190  RW     0x1000
  LOAD           0x0000000000000b00 0x0000000000203b00 0x0000000000203b00
                       0x0000000000000048 0x0000000000000058  RW     0x1000
  DYNAMIC        0x00000000000009a0 0x00000000002029a0 0x00000000002029a0
                       0x0000000000000160 0x0000000000000160  RW     0x8
  GNU_RELRO      0x0000000000000970 0x0000000000202970 0x0000000000202970
                       0x0000000000000190 0x0000000000000690  R      0x1
  GNU_EH_FRAME   0x0000000000000564 0x0000000000200564 0x0000000000200564
                       0x0000000000000014 0x0000000000000014  R      0x4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                       0x0000000000000000 0x0000000000000000  RW     0
  NOTE           0x00000000000002c0 0x00000000002002c0 0x00000000002002c0
                       0x0000000000000048 0x0000000000000048  R      0x4


I added some debug printfs to maybe_merge_nsegments and that gives me

DEBUG: maybe_merge_nsegments R 1 W 1 X 0
DEBUG: maybe_merge_nsegments matching RWX dev and ino
DEBUG: maybe_merge_nsegments s1 start 202000
DEBUG: maybe_merge_nsegments s1 end 202fff
DEBUG: maybe_merge_nsegments s1 offset 0
DEBUG: maybe_merge_nsegments s2 start 203000
DEBUG: maybe_merge_nsegments s2 end 203fff
DEBUG: maybe_merge_nsegments s2 offset 0
DEBUG: maybe_merge_nsegments s1 offset + delta 1000

That's for the varinfo5 executable in memcheck/tests. There's a similar block for the link loader /libexec/ld-elf.so.1

I'm struggling to see how that corresponds to what readelf shows.

  LOAD           0x0000000000000970 0x0000000000202970 0x0000000000202970
                       0x0000000000000190 0x0000000000000190  RW     0x1000
  LOAD           0x0000000000000b00 0x0000000000203b00 0x0000000000203b00
                       0x0000000000000048 0x0000000000000058  RW     0x1000

start = PhysAddr - ofset

I really don't understand where the offsets come from. Are they both being rounded down to zero?

The condition for merging segments is

         if (s1->hasR == s2->hasR 
             && s1->hasW == s2->hasW && s1->hasX == s2->hasX
             && s1->dev == s2->dev && s1->ino == s2->ino
             && s2->offset == s1->offset
                              + ((Long)s2->start) - ((Long)s1->start) ) {

OK, same perms, same file,

But then the offset test?

0 == 0 + =0x203000 - 0x202000

That's false.

What would work is

            && s2->start == s1->end + 1 ) {

Or possibly some way of taking into consideration the value of 'Align'

mapelf has

         res = VG_(am_mmap_file_fixed_client)(
                  VG_PGROUNDDN(addr),
                  VG_PGROUNDUP(bss)-VG_PGROUNDDN(addr),
                  prot, /*VKI_MAP_FIXED|VKI_MAP_PRIVATE, */
                  e->fd, VG_PGROUNDDN(off)
               );

which shows off being rounded down to the nearest pagesize (4098 or 0x1000).
Comment 2 Paul Floyd 2022-04-27 08:42:00 UTC
I have a theory, not really based on a firm understanding or thorough debugging.

For this

         res = VG_(am_mmap_file_fixed_client)(
                  VG_PGROUNDDN(addr),
                  VG_PGROUNDUP(bss)-VG_PGROUNDDN(addr),
                  prot, /*VKI_MAP_FIXED|VKI_MAP_PRIVATE, */
                  e->fd, VG_PGROUNDDN(off)
               );

Should 'off' be the offset of the segment only, or the offset of the segment from the start of the merged?

The readelf output gives the offsets as 0x970 and 0xb00. They both round down to 0 with VG_PGROUNDDN.

If the offset of the second ought to be the sum of the two then that would round down to 0x1000.

And I think that would resolve a lot of problems that I've seen (merging, symbol lookup in shared libraries).
Comment 3 Paul Floyd 2022-04-27 09:34:16 UTC
What I'm going to try is, in elf.c


for (i = 0; i < e->e.e_phnum; i++) {
      ESZ(Phdr) *ph = &e->p[i];
      ESZ(Phdr) *phprev = NULL;
      ESZ(Addr) addr, bss, brkaddr;
      ESZ(Off) off;
      ESZ(Word) filesz;
      ESZ(Word) memsz;
      unsigned prot = 0;

      if (ph->p_type != PT_LOAD)
         continue;

       if (i)
            phprev = &e->p[i-1];

      if (ph->p_flags & PF_X) prot |= VKI_PROT_EXEC;
      if (ph->p_flags & PF_W) prot |= VKI_PROT_WRITE;
      if (ph->p_flags & PF_R) prot |= VKI_PROT_READ;

      addr    = ph->p_vaddr+base;
      off     = ph->p_offset;
      if (phprev && ph->p_flags == phprev->p_flags)
           off += phprev->p_offset;
Comment 4 Paul Floyd 2022-04-27 11:34:48 UTC
Needs more work. Maybe store 'offset' and a 'mergeoffset'
Comment 5 Paul Floyd 2022-04-27 12:24:08 UTC
Or more precisely 'offset' and 'merge_remainder' or something like that.

where 'merge_remainder' would be 

PGRNDDN(off - PGRNDDN(off) + prevoff - PGRNDDN(prefoff))

(unless there is a page remainder macro).

This would either be 0 or 1 page.
Comment 6 Paul Floyd 2022-04-28 20:15:55 UTC
I see a similar problem on Linux (llvm 12,0,1 on Fedora 34).

memcheck/tests/varinfo5 fails, mainly failing to read debuginfo form the .so to get variable names.
Comment 7 Paul Floyd 2022-05-06 06:46:46 UTC
After barking up the wrong tree for some time (merging RW PT_LOADs), I think that I now see what I need to do.

1. ULong VG_(di_notify_mmap)( Addr a, Bool allow_SkFileV, Int use_fd )

This is probably the trickier one. It needs to handle 3 possible cases
A. only 1 RW PT_LOAD, behaviour as present.
B. first of 2 RW PT_LOADs, add di to XA and return 0
C. second of 2 RW PT OADs, add di to XA and return di_notify_ACHIEVE_ACCEPT_STATE ( di )

For B I can look at the next NSegment VG_(am_find_nsegment)(seg->end+1);
And for C I can look at the previous NSegment VG_(am_find_nsegment)(seg->start-1);


2. Change Bool ML_(read_elf_debug_info) ( struct _DebugInfo* di )

This should be able to see when there are 2 RW dinfos, and act accordingly to look for .data .got.plt .bss in the second one.
Comment 8 Paul Floyd 2022-05-06 22:08:26 UTC
No, not quite.

It seems like Valgrind loads the exe (and normally the interpreter, which is does with VG_(load_ELF) and readelf. I think that these read the phdrs and shdrs, and pre-allocates the NSegements. That means that when a PT_LOAD gets mmap'd, if there is a subsequent mmap the NSeggment is already in place.

For other libs (libc.so, varinfo50so.so for example) it's ld.so that is running and loading them. This is doing the mmaps of the PT_LOADs one at a time. So the NSegment lookahead does not work.

If I'm going to to lookahead, that means I'd have to do something like parsing the ELF in the shared lib and working out if there will be further RW PT_LOAD. Sounds hard.

Plan B. Do as now, then see the 2nd PT_LOAD and try to clean up the mess.
Comment 9 Paul Floyd 2022-05-07 17:02:58 UTC
I've gone with plan A. 

In ULong VG_(di_notify_mmap) I replaced the call to ML_(is_elf_object_file) with a call to a new function ML_(check_elf_and_get_rw_loads) .

This is an ultra-stripped down version of ML_(read_elf_debug_info) that just
1. calls ML_(is_elf_object_file) and then
2. iterates over the PHDRs counting RW PT_LOADs

Also modified ML_(read_elf_debug_info) to count RW mappings, look for a second one if the count is two and to use this second rw mapping if present for bss data and got.plt

The good news is that this fixes all of the failures in varinfo5 like this

- Address 0x........ is 7 bytes inside data symbol "static_local_undef.XXXX"
+ Address 0x........ is in a rw- mapped file /usr/home/paulf/scratch/valgrind/memcheck/tests/varinfo5so.so segment

(not reading debuginfo in shared libraries correctly)

The bad news is that I'm getting 3 new failures

< == 759 tests, 13 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
---
> == 759 tests, 16 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
2212a2220
> drd/tests/annotate_hb_race               (stderr)
2217a2226,2227
> none/tests/mmap_fcntl_bug                (stderr)
> none/tests/vgprintf                      (stderr)

--pid-- di_notify_mmap-0:
--pid-- di_notify_mmap-1: 0x200000-0x200fff r--
--pid-- di_notify_mmap-2: /usr/home/paulf/scratch/valgrind/none/tests/vgprintf
--pid-- di_notify_mmap-3: is_rx_map 0, is_rw_map 0, is_ro_map 1
--pid-- di_notify_mmap-4: noting details in DebugInfo* at 0x40247C220
--pid-- di_notify_mmap-6: no dinfo loaded /usr/home/paulf/scratch/valgrind/none/tests/vgprintf (no rx or no rw mapping)
--pid-- di_notify_mmap-0:
--pid-- di_notify_mmap-1: 0x201000-0x201fff r-x
--pid-- di_notify_mmap-2: /usr/home/paulf/scratch/valgrind/none/tests/vgprintf
--pid-- di_notify_mmap-3: is_rx_map 1, is_rw_map 0, is_ro_map 0
--pid-- di_notify_mmap-4: noting details in DebugInfo* at 0x40247C220
--pid-- di_notify_mmap-6: no dinfo loaded /usr/home/paulf/scratch/valgrind/none/tests/vgprintf (no rx or no rw mapping)
--pid-- di_notify_mmap-0:
--pid-- di_notify_mmap-1: 0x202000-0x203fff rw-
--pid-- di_notify_mmap-2: /usr/home/paulf/scratch/valgrind/none/tests/vgprintf
--pid-- di_notify_mmap-3: is_rx_map 0, is_rw_map 1, is_ro_map 0
--pid-- di_notify_mmap-4: noting details in DebugInfo* at 0x40247C220
--pid-- di_notify_mmap-6: no dinfo loaded /usr/home/paulf/scratch/valgrind/none/tests/vgprintf (no rx or no rw mapping)

Here is the problem. Why no second notify_mmap for the other RW PT_LOAD?

--pid-- di_notify_mmap-0:
--pid-- di_notify_mmap-1: 0x4000000-0x4005fff r--
--pid-- di_notify_mmap-2: /libexec/ld-elf.so.1
--pid-- di_notify_mmap-3: is_rx_map 0, is_rw_map 0, is_ro_map 1
--pid-- di_notify_mmap-4: noting details in DebugInfo* at 0x40247C7C0
--pid-- di_notify_mmap-6: no dinfo loaded /libexec/ld-elf.so.1 (no rx or no rw mapping)
Comment 10 Paul Floyd 2022-05-07 20:15:19 UTC
OK, this is to do with how the ELF exe or shared lib gets loaded. There are 3 cases.
1. the tool itself, already loaded and we look at the maps and work backwards to get NSegments and dinfo.
2. The client (and interpreter). We read the ELF headers and do the loading afterwards. That means the NSegments are done before the mmaping and dinfo loading occurs.
3. Any client shared libraries. These get loaded by the interpreter when the client is running under the host, we intercept the mmap calls and add NSegments and mmap/dinfo.

The problem is in point 2 when there is a merge of the NSegements. That will result in just one call to VG_(di_notify_mmap) for the merged segement. And VG_(di_notify_mmap) needs two calls when the ELF headers contain 2 RW PT_LOADs.

Maybe I need to use both (looking at NSegments for client load, looking at the ELF headers for a shared lib load). In that case I need to know which context VG_(di_notify_mmap) is being called from. Mayb the elf header e_type could do that.
Comment 11 Paul Floyd 2022-05-08 12:39:46 UTC
Almost there.

none/tests/mmap_fcntl_bug is failing 

mmap_fcntl_bug: Child got lock, we must have dropped it (TEST FAILED)

And this happens whether the client was built with GCC of clang.

It seems that the extra fopen/fclose that I'm doing is interfering with the fcntl to lock the file.

I need to see if I can use pread (a bit like the existing code does with a 1k stack buffer). Or else fseek back to where the fd was before.
Comment 12 Paul Floyd 2022-05-08 14:22:02 UTC
Created attachment 148662 [details]
Patch for reading debuginfo correctly with lld 9+
Comment 13 Paul Floyd 2022-05-08 17:16:55 UTC
On FreeBSD 13 amd64, OK with both clang and gcc.
On Linux Fedora 34 amd64 also OK.
Comment 14 Paul Floyd 2022-05-08 19:35:19 UTC
On Fedora 34, git head and "./configure CC=clang CXX=clang++ LDFLAGS=-fuse-ld=lld" with thispatch the results look fairly similar to what I get on FreeBSD.

I didn't get all of the problems that I had with FreeBSD .got.plt errors on Helgrind and DRD, but I think that is because my Fedora 34 was all or mostly built with GCC. That means that libc, libpthread etc don't have the split RW PT_LOADS. On the other hand my FreeBSD 13 is all or mostly built with clang and the system libs will have them split.

The situation might be different on ChromeOS and Android.
Comment 15 Mark Wielaard 2022-05-31 18:46:24 UTC
I wish I understood this part of the code better. But I also struggling
a bit. Let me try to talk us through the patch to see if we agree on
the why/what:

di_notify_mmap is triggered by an actual mmap (and some startup code,
for valgrind tools itself?). The problem we are dealing with is when
"enough" or "the right" parts of the file have been mapped by ld.so to
start searching for the debuginfo and setting up the offsets to map
debug addresses to addresses in memory where the code is mapped.

The goal of di_notify_mmap is to reach di_notify_ACHIEVE_ACCEPT_STATE
at which point the actual debuginfo is read. This is guarded by:

  if (di->fsm.have_rx_map && di->fsm.have_rw_map)

You patch changes that to:

   if (di->fsm.have_rx_map &&
       rw_load_count >= 1 &&
       di->fsm.have_rw_map == rw_load_count)

Now have_rw_map is slightly misnamed, but ok.

The rw_load_count comes from the new check_elf_and_get_rw_loads.
Which tries to get an elf image from the file descriptor and traverses
the phdrs looking for PT_LOAD RW mappings, with this funny counter
example:

   /*
    * Hold your horses
    * Just because The ELF file contains 2 RW PT_LOAD segments it
    * doesn't mean that Valgrind will also make 2 calls to
    * VG_(di_notify_mmap). If the stars are all aligned
    * (which usually means that the ELF file is the client
    * executable with the segment offset for the
    * second PT_LOAD falls exactly on 0x1000) then the NSegements
    * will get merged and VG_(di_notify_mmap) only gets called once. */
   if (*rw_load_count == 2 &&
       ehdr_m.e_type == ET_EXEC &&
       a_phdr.p_offset == VG_PGROUNDDN(a_phdr.p_offset) )
   {
      *rw_load_count = 1;
   }

I assume in that case ld.so merges the mmap request into one?

I cannot say I really understand everything here, but I understand why
you want to delay to actual debuginfo loading till all mapping are
there.

Is the above an accurate description of the proposed patch logic?
Comment 16 Paul Floyd 2022-05-31 19:12:47 UTC
(In reply to Mark Wielaard from comment #15)

For macos I don't know if any of these changes are applicable so I wanted to keep the old code if #if blocks.

I'll clean up unrelated changes and push them separately.

> I wish I understood this part of the code better. But I also struggling
> a bit. Let me try to talk us through the patch to see if we agree on
> the why/what:
> 
> di_notify_mmap is triggered by an actual mmap (and some startup code,
> for valgrind tools itself?). The problem we are dealing with is when
> "enough" or "the right" parts of the file have been mapped by ld.so to
> start searching for the debuginfo and setting up the offsets to map
> debug addresses to addresses in memory where the code is mapped.

Essentially that's it. "enough" is when the RX and the RW PT_LOAD(s) have been seen.


As I understand it, di_notify_mmap can be called in 2 situations

"HOST"
1a. For the tool exe and tool/core shared libs. These are already mmap'd when the host starts so we look at something like the /proc filesystem to get the mapping after the event and build up the NSegments from that.

1b. Then the host loads ld.so and the guest exe. This is done in the sequence load_client -> VG_(do_exec) -> VG_(do_exec_inner) -> exe_handlers->load_fn ( == VG_(load_ELF) ). This does the mmap'ing and creats the associated NSegments.

The NSegments may get merged, so there could be fewer mmaps and PT_LOADs than there are NSegemnts.

This is around line 1893 of m_main.c:

     for (i = 0; i < n_seg_starts; i++) {
        anu.ull = VG_(di_notify_mmap)( seg_starts[i], True/*allow_SkFileV*/,
                                       -1/*Don't use_fd*/);

"GUEST"
2. When the guest loads any further shared libs (libc, other dependencies, dlopens).

There are a few variations for syswraps/platforms, but the one that we are concerned with is syswrap-generic.c:


      /* Load symbols? */
      di_handle = VG_(di_notify_mmap)( (Addr)sr_Res(sres),
                                       False/*allow_SkFileV*/, (Int)arg5 );

In this case the NSegment could possibly be merged, but that is irrelevant because di_notify_mmap is being called based directy on the mmap result.

> The goal of di_notify_mmap is to reach di_notify_ACHIEVE_ACCEPT_STATE
> at which point the actual debuginfo is read. This is guarded by:
> 
>   if (di->fsm.have_rx_map && di->fsm.have_rw_map)
> 
> You patch changes that to:
> 
>    if (di->fsm.have_rx_map &&
>        rw_load_count >= 1 &&
>        di->fsm.have_rw_map == rw_load_count)
> 
> Now have_rw_map is slightly misnamed, but ok.

Yes, count_rw_map would be better.

> 
> The rw_load_count comes from the new check_elf_and_get_rw_loads.
> Which tries to get an elf image from the file descriptor and traverses
> the phdrs looking for PT_LOAD RW mappings, with this funny counter
> example:
> 
>    /*
>     * Hold your horses
>     * Just because The ELF file contains 2 RW PT_LOAD segments it
>     * doesn't mean that Valgrind will also make 2 calls to
>     * VG_(di_notify_mmap). If the stars are all aligned
>     * (which usually means that the ELF file is the client
>     * executable with the segment offset for the
>     * second PT_LOAD falls exactly on 0x1000) then the NSegements
>     * will get merged and VG_(di_notify_mmap) only gets called once. */
>    if (*rw_load_count == 2 &&
>        ehdr_m.e_type == ET_EXEC &&
>        a_phdr.p_offset == VG_PGROUNDDN(a_phdr.p_offset) )
>    {
>       *rw_load_count = 1;
>    }
> 
> I assume in that case ld.so merges the mmap request into one?

No it's Valgrind that merges the NSegements, as above.

> I cannot say I really understand everything here, but I understand why
> you want to delay to actual debuginfo loading till all mapping are
> there.
> 
> Is the above an accurate description of the proposed patch logic?

Yes, that's about it.

Lastly, I thought that the code to handle either 1 or 2 segments in ML_(read_elf_debug_info) is a bit messy.

Do you have any ideas how it could be done more cleanly?
Comment 17 Paul Floyd 2022-06-07 22:00:42 UTC
Created attachment 149540 [details]
Patch for reading debuginfo correctly with lld 9+

Added more comments, some extra checks and simplified some code.
Comment 18 Mark Wielaard 2022-06-09 15:59:58 UTC
(In reply to Paul Floyd from comment #16)
> > di_notify_mmap is triggered by an actual mmap (and some startup code,
> > for valgrind tools itself?). The problem we are dealing with is when
> > "enough" or "the right" parts of the file have been mapped by ld.so to
> > start searching for the debuginfo and setting up the offsets to map
> > debug addresses to addresses in memory where the code is mapped.
> 
> Essentially that's it. "enough" is when the RX and the RW PT_LOAD(s) have
> been seen.

OK.
  
> As I understand it, di_notify_mmap can be called in 2 situations
> 
> "HOST"
> 1a. For the tool exe and tool/core shared libs. These are already mmap'd
> when the host starts so we look at something like the /proc filesystem to
> get the mapping after the event and build up the NSegments from that.
> 
> 1b. Then the host loads ld.so and the guest exe. This is done in the
> sequence load_client -> VG_(do_exec) -> VG_(do_exec_inner) ->
> exe_handlers->load_fn ( == VG_(load_ELF) ). This does the mmap'ing and
> creats the associated NSegments.
> 
> The NSegments may get merged, so there could be fewer mmaps and PT_LOADs
> than there are NSegemnts.
> 
> This is around line 1893 of m_main.c:
> 
>      for (i = 0; i < n_seg_starts; i++) {
>         anu.ull = VG_(di_notify_mmap)( seg_starts[i], True/*allow_SkFileV*/,
>                                        -1/*Don't use_fd*/);
> 
> "GUEST"
> 2. When the guest loads any further shared libs (libc, other dependencies,
> dlopens).
> 
> There are a few variations for syswraps/platforms, but the one that we are
> concerned with is syswrap-generic.c:
> 
> 
>       /* Load symbols? */
>       di_handle = VG_(di_notify_mmap)( (Addr)sr_Res(sres),
>                                        False/*allow_SkFileV*/, (Int)arg5 );
> 
> In this case the NSegment could possibly be merged, but that is irrelevant
> because di_notify_mmap is being called based directy on the mmap result.

Thanks for this overview. I forgot about the "HOST" part.

> > The goal of di_notify_mmap is to reach di_notify_ACHIEVE_ACCEPT_STATE
> > at which point the actual debuginfo is read. This is guarded by:
> > 
> >   if (di->fsm.have_rx_map && di->fsm.have_rw_map)
> > 
> > You patch changes that to:
> > 
> >    if (di->fsm.have_rx_map &&
> >        rw_load_count >= 1 &&
> >        di->fsm.have_rw_map == rw_load_count)
> > 
> > Now have_rw_map is slightly misnamed, but ok.
> 
> Yes, count_rw_map would be better.

Thanks for updating that in the new patch.
 
> > The rw_load_count comes from the new check_elf_and_get_rw_loads.
> > Which tries to get an elf image from the file descriptor and traverses
> > the phdrs looking for PT_LOAD RW mappings, with this funny counter
> > example:
> > 
> >    /*
> >     * Hold your horses
> >     * Just because The ELF file contains 2 RW PT_LOAD segments it
> >     * doesn't mean that Valgrind will also make 2 calls to
> >     * VG_(di_notify_mmap). If the stars are all aligned
> >     * (which usually means that the ELF file is the client
> >     * executable with the segment offset for the
> >     * second PT_LOAD falls exactly on 0x1000) then the NSegements
> >     * will get merged and VG_(di_notify_mmap) only gets called once. */
> >    if (*rw_load_count == 2 &&
> >        ehdr_m.e_type == ET_EXEC &&
> >        a_phdr.p_offset == VG_PGROUNDDN(a_phdr.p_offset) )
> >    {
> >       *rw_load_count = 1;
> >    }
> > 
> > I assume in that case ld.so merges the mmap request into one?
> 
> No it's Valgrind that merges the NSegements, as above.

Right, this is the do_exec case. Thanks.
 
> Lastly, I thought that the code to handle either 1 or 2 segments in
> ML_(read_elf_debug_info) is a bit messy.
> 
> Do you have any ideas how it could be done more cleanly?

Yes, it is a bit messy, but I don't really have an idea to do is more cleanly.

I think the cleaned up patch is good to go.

But I am a little hesitant about the new varinfo5.stderr.exp-freebsd
Are you sure it is really correct now?

Are the differences really because of different debuginfo generated by clang vs gcc?
Or do we still get some addresses/offsets wrong?
Comment 19 Paul Floyd 2022-06-09 20:01:22 UTC
For the diffs, I see

1 scope problem
twice the following two
1 naming
1 static
1 paths + inlining

I haven't gone into it in detail. About half of the differences are reasonable.

I still want to keep this expected as it's the only one I know of that indicates problems reading debuginfo in shared libraries.