Bug 432215 - Add debuginfod functionality
Summary: Add debuginfod functionality
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (other bugs)
Version First Reported In: unspecified
Platform: Other Linux
: NOR wishlist
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-01-27 18:11 UTC by Aaron Merey
Modified: 2021-02-26 01:10 UTC (History)
3 users (show)

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:


Attachments
debuginfod.patch (4.14 KB, patch)
2021-01-27 18:11 UTC, Aaron Merey
Details
debuginfod.patch (13.37 KB, patch)
2021-02-09 00:43 UTC, Aaron Merey
Details
debuginfod.patch (16.77 KB, patch)
2021-02-19 04:36 UTC, Aaron Merey
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Aaron Merey 2021-01-27 18:11:39 UTC
Created attachment 135243 [details]
debuginfod.patch

Debuginfod is an client/server in elfutils that automatically distributes DWARF debuginfo to debugging tools across HTTP. Debuginfod functionality in valgrind would allow for debuginfo to be automatically downloaded when valgrind is otherwise unable to locate the file.

I've attached a patch that adds the necessary client code to valgrind. To avoid linking a debuginfod shared library we call out to the debuginfod-find command line tool which queries servers for the file in question and, if successful, provides valgrind with the path of the downloaded file. This debuginfo then goes on to be processed like any other.

For more information about debuginfod see https://sourceware.org/elfutils/Debuginfod.html
Comment 1 Mark Wielaard 2021-01-27 19:08:28 UTC
Nice. Adding Philippe to CC because he probably knows most about executing subprocesses from valgrind core.

My first comments would be:

- It would be nice to not even try if the system doesn't have debuginfod-find installed or if DEBUGINFOD_URLS is not set or empty. You can check the second with VG_(getenv)("DEBUGINFOD_URLS") and the first by taking out ML_(find_executable)("debuginfod-find")) from the new function and setting a static debuginfo_find_path variable in slightly earlier startup code. You would also set debuginfo_find_path to NULL if DEBUGINFOD_URLS wouldn't be set.

- I think you need to drain stderr first in a while loop before doing the VG_(waitpid)(pid, NULL, 0); and VG_(read)(out_fds[0], buf, BUF_SIZE); because based if the user has set DEBUGINFOD_PROGRESS or DEBUGINFOD_VERBOSE stderr can see a lot of output (stdout is always just the filename if anything was found).
Comment 2 Mark Wielaard 2021-01-27 19:25:49 UTC
Also I think all the VG_(umsg) need to be guarded by if (VG_(clo_verbosity) > 0) so that there is no extra output when running with valgrind -q.
Comment 3 Aaron Merey 2021-01-27 19:48:35 UTC
(In reply to Mark Wielaard from comment #1)
> - It would be nice to not even try if the system doesn't have
> debuginfod-find installed or if DEBUGINFOD_URLS is not set or empty. You can
> check the second with VG_(getenv)("DEBUGINFOD_URLS") and the first by taking
> out ML_(find_executable)("debuginfod-find")) from the new function and
> setting a static debuginfo_find_path variable in slightly earlier startup
> code. You would also set debuginfo_find_path to NULL if DEBUGINFOD_URLS
> wouldn't be set.

Agreed, this will reduce clutter being shown to the user in case debuginfod-find is installed but debuginfod has been disabled by unsetting DEBUGINFOD_URLS. Otherwise they could see multiple "Server query failed" messages unnecessarily.

> - I think you need to drain stderr first in a while loop before doing the
> VG_(waitpid)(pid, NULL, 0); and VG_(read)(out_fds[0], buf, BUF_SIZE);
> because based if the user has set DEBUGINFOD_PROGRESS or DEBUGINFOD_VERBOSE
> stderr can see a lot of output (stdout is always just the filename if
> anything was found).

Good point. It may be easier to simply redirect debuginfod-find's stderr to a temporary file to avoid any complications with the pipe filling up. The error message can instead be read from that file if needed.

(In reply to Mark Wielaard from comment #2)
> Also I think all the VG_(umsg) need to be guarded by if (VG_(clo_verbosity)
> > 0) so that there is no extra output when running with valgrind -q.

Ok will add.
Comment 4 Mark Wielaard 2021-01-28 09:39:07 UTC
(In reply to Aaron Merey from comment #3)
> (In reply to Mark Wielaard from comment #1)
> > - I think you need to drain stderr first in a while loop before doing the
> > VG_(waitpid)(pid, NULL, 0); and VG_(read)(out_fds[0], buf, BUF_SIZE);
> > because based if the user has set DEBUGINFOD_PROGRESS or DEBUGINFOD_VERBOSE
> > stderr can see a lot of output (stdout is always just the filename if
> > anything was found).
> 
> Good point. It may be easier to simply redirect debuginfod-find's stderr to
> a temporary file to avoid any complications with the pipe filling up. The
> error message can instead be read from that file if needed.

You might have to experiment with how that interacts with the DEBUGINFOD_PROGRESS output. It might be that VG_(umsg) and "interactive" stderr from the inferior don't really mix anyway.
Comment 5 Aaron Merey 2021-02-09 00:43:07 UTC
Created attachment 135525 [details]
debuginfod.patch

I revised the handling of debuginfod-find's error output. It is now written to a temporary file. Grep is used (via fork+exec) to pull the error message out of the temp file if needed. I also added a debuginfod testcase.
Comment 6 Mark Wielaard 2021-02-11 22:05:12 UTC
Random comment that isn't really about the actual patch:

+#include "../m_initimg/priv_initimg_pathscan.h"

I see you need this for ML_(find_executable). Maybe this should moved somewhere else so you don't need to import a priv header from another module. Maybe it can be moved into coregrind/m_libcproc.c and the files that need it then include pub_core_libcproc.h ?
Comment 7 Mark Wielaard 2021-02-11 22:11:15 UTC
The patch includes some changes to tests/vg_regtest.in. I believe they aren't needed anymore because you already submitted them in bug #432672. Or is the explicit clearing of DEBUGINFOD_URLS still needed?
Comment 8 Mark Wielaard 2021-02-11 22:25:04 UTC
+   if (VG_(getenv)("DEBUGINFOD_URLS") == NULL
+       || VG_(strcmp)("", VG_(getenv("DEBUGINFOD_URLS"))) == 0)
+      return NULL;
+
+   if ((path = ML_(find_executable)("debuginfod-find")) == NULL)
+      return NULL;

It would be nice to move this somewhere so that it only has to be checked ones.
Maybe something like:

static
const HChar* debuginfod_find_path (void)
{
   static const HChar *path = (const HChar*) -1;
   if (path == (const HChar*) -1) {
      if (VG_(getenv)("DEBUGINFOD_URLS") == NULL
          || VG_(strcmp)("", VG_(getenv("DEBUGINFOD_URLS"))) == 0)
         path = NULL;
      else
         path = ML_(find_executable)("debuginfod-find");
   }
   return path;
}

So you can check as

   if ((path = debuginfod_find_path()) == NULL)
      return NULL;
Comment 9 Mark Wielaard 2021-02-11 22:41:20 UTC
Do we really need to redirect stderr to a temporary file and then use grep to filter out any errors? If possible it would be nicer if we could simply do one exec and simply drain stderr from the process till it is done and then read stdout, which is always just one line with the file path. Did you try that?
Comment 10 Aaron Merey 2021-02-19 04:36:50 UTC
Created attachment 135885 [details]
debuginfod.patch

(In reply to Mark Wielaard from comment #9)
> Do we really need to redirect stderr to a temporary file and then use grep
> to filter out any errors? If possible it would be nicer if we could simply
> do one exec and simply drain stderr from the process till it is done and
> then read stdout, which is always just one line with the file path. Did you
> try that?

So we need to drain stderr in case $DEBUGINFOD_PROGRESS or $DEBUGINFOD_VERBOSE
are set and possibly cause the pipe to fill up. But since this this output
doesn't work well with umesg() I just unset these environment variables in
the child process that exec's debuginfod-find. This way if there is an error
in debuginfod-find we are guaranteed a one line message in stderr which can be
printed with umesg(). In the valgrind manpage I mention that these environment
variables will be ignored. These changes plus the others you asked for are in
the revised attachment.
Comment 11 Mark Wielaard 2021-02-26 01:10:07 UTC
Thanks for adding the documentation, nice to get it in both the manual and the man page. I tested it on Fedora and Debian using DEBUGINFOD_URLS=https://debuginfod.elfutils.org/ and it seems to work.

I made a few small changes, added .gitignore entries, removed the extra warning in the check so it looks less scary and just produces an error and a prereq skip, turned the read into a while loop, not because I think the results will really be partial but to keep the length of what was read so we can check for \n at the end instead of needing to use strchr. I also moved the find_executable code up into pub_core_pathscan.h so it can be used from everywhere.

Pushed as:

commit fd4e3fb0ffa3f751f0e7f2e7f4639d4c3773305d (HEAD -> master, origin/master, origin/HEAD)
Author: Aaron Merey <amerey@redhat.com>
Date:   Thu Feb 18 22:58:25 2021 -0500

    PR432215 Add debuginfod functionality
    
    debuginfod is an HTTP server for distributing ELF/DWARF debugging
    information.  When a debuginfo file cannot be found locally, Valgrind
    is able to query debuginfod servers for the file using its build-id.
    
    readelf.c: Add debuginfod_find_debug_file(). Spawns a child process to
    exec `debuginfod-find` in order to query servers for the debuginfo
    file. Also add helper debuginfod_find_path().
    
    pub_core_pathscan.h: Moved from priv_initimg_pathscan.h in order to use
    VG_(find_executable)() in readelf.c.
    
    docs: Add information regarding debuginfod to valgrind.1
    
    memcheck/tests/linux: Add new test debuginfod-check.
    
    tests/vg_regtest.in: Clear $DEBUGINFOD_URLS before running any tests.
    
    https://bugs.kde.org/show_bug.cgi?id=432215