Bug 493433 - Add a new fds only mode to --track-fds (--modify-fds)
Summary: Add a new fds only mode to --track-fds (--modify-fds)
Status: CONFIRMED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (show other bugs)
Version: 3.22 GIT
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Alexandra Hajkova
URL:
Keywords:
Depends on:
Blocks: 502359
  Show dependency treegraph
 
Reported: 2024-09-21 12:17 UTC by Mark Wielaard
Modified: 2025-04-04 14:20 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
patch (12.92 KB, patch)
2025-02-13 11:35 UTC, Alexandra Hajkova
Details
Emit pp_ExeContext after the file descriptor backtrace (2.01 KB, patch)
2025-03-03 11:47 UTC, Alexandra Hajkova
Details
patch (16.41 KB, patch)
2025-03-12 13:38 UTC, Alexandra Hajkova
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Wielaard 2024-09-21 12:17:21 UTC
Normally a newly recreated file descriptor gets the lowest available number.  This might cause old file descriptor numbers to be reused and hides bad file descriptor accesses (because the old number is new again).

It would be nice to have a mode where valgrind makes sure new file descriptors always get a new, not yet used before number. This might be implemented by making record_fd_open* return a new file descriptor (by dupping the given file descriptor only a higher never used before number and closing the one returned from the kernel) and returning that to the application.

We might have to make an exception for file descriptors 0, 1, 2 which programs might explicitly want to reassign.
Comment 1 Mark Wielaard 2024-11-06 15:58:27 UTC
This new mode (--track-fds=new ?) would help catch issues like the following:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int
main ()
{
  int oldfd = open ("foobar.txt", O_RDWR|O_CREAT, S_IRUSR | S_IWUSR);
  printf ("got foobar.txt as: %d\n", oldfd);
  /*... do something with oldfd ...*/
  close (oldfd);

  /* Lets open another file... */
  int newfd = open ("foobad.txt", O_RDWR|O_CREAT, S_IRUSR | S_IWUSR);
  printf ("got foobad.txt as: %d\n", newfd);
  /* ... oops we are using the wrong fd (but same number...) */
  dprintf (oldfd, "some new text\n");

  close (newfd);
  return 0;
}
Comment 2 Alexandra Hajkova 2025-02-13 11:35:58 UTC
Created attachment 178266 [details]
patch

 Add --modify-fds=[no|high] option
    
    Normally a newly recreated file descriptor gets the lowest available
    number.  This might cause old file descriptor numbers to be reused
    and hides bad file descriptor accesses (because the old number is
    new again).
    
    When enabled, when the program opens a new file descriptor,
    the highest available file descriptor is returned instead of the
    lowest one.
    
    Add the none/tests/track_new.stderr.exp test to test this
    new option.
Comment 3 Mark Wielaard 2025-02-13 12:41:20 UTC
Nitpick:

 Bool   VG_(clo_run_cxx_freeres) = True;
 UInt   VG_(clo_track_fds)      = 0;
+UInt   VG_(clo_modify_fds)      = 0;
 Bool   VG_(clo_show_below_main)= False;
 Bool   VG_(clo_keep_debuginfo) = False;

The = signs don't line up.

+   (not yet seem before) file descriptor.  */

s/seem/seen/

@@ -5897,12 +5903,13 @@ PRE(sys_openat)
 POST(sys_openat)
 {
    vg_assert(SUCCESS);
+   POST_newFd_RES;
    if (!ML_(fd_allowed)(RES, "openat", tid, True)) {
       VG_(close)(RES);
       SET_STATUS_Failure( VKI_EMFILE );
    } else {
       if (VG_(clo_track_fds))
-         ML_(record_fd_open_with_given_name)(tid, RES, (HChar*)(Addr)ARG2);
+        ML_(record_fd_open_with_given_name)(tid, RES, (HChar*)(Addr)ARG2);
    }
 }

Unnecessary (and wrong) whitespace indentation change.


+extern UInt  VG_(clo_modify_fds);

Needs comment what the values are/mean.

--- /dev/null
+++ b/none/tests/track_new.stderr.exp
@@ -0,0 +1,12 @@
+File descriptor 1020 was closed already
+ Previously closed
+   at 0x........: close (in /...libc...)
+   by 0x........: main (track_new.c:11)

That (1020) seems an oddly specific fd number.

+ Originally opened
+   at 0x........: open (in /...libc...)
+   by 0x........: main (track_new.c:8)
+   at 0x........: write (in /...libc...)
+   by 0x........: __printf_buffer_flush_dprintf (in /...libc...)
+   by 0x........: __vdprintf_internal (in /...libc...)
+   by 0x........: dprintf (in /...libc...)
+   by 0x........: main (track_new.c:17)

The __ functions might be different on different glibc versions/arches?

--- /dev/null
+++ b/none/tests/track_new.stdout.exp
@@ -0,0 +1,2 @@
+got foobar.txt as: 1020
+got foobad.txt as: 1019

Again oddly specific fd numbers.

Given the new option/argument I would have expected some updates to the ./none/tests/cmdline*vgtests?

There should probably some documentation that this is against POSIX which guarantees a new file descriptor is always the lowest possible number.

POST_newFd_RES works for those syscalls that return the new fd as result (RES) of the syscall.
What about other syscalls/fcntls that return a new fd some other way?

grepping for record_fd_open I think I spot some syscalls missing:
sys_inotify_init, sys_inotify_init1, sys_mq_open, timerfd_create, pipe[2], ...
pipe2 is interesting since it returns 2 fds.

Should all these other syscalls also hava fd adjustments?
Comment 4 Alexandra Hajkova 2025-03-03 11:47:19 UTC
Created attachment 179068 [details]
Emit pp_ExeContext after the file descriptor backtrace
Comment 5 Mark Wielaard 2025-03-05 15:18:08 UTC
(In reply to Alexandra Hajkova from comment #4)
> Created attachment 179068 [details]
> Emit pp_ExeContext after the file descriptor backtrace

Thanks that looks good. Pushed to git trunk and the VALGRIND_3_24_BRANCH since it might be a small bug, the output is really confusing.

commit 838dc01d2c42f7f22785c771bd74bc4e595da444
Author: Alexandra Hájková <ahajkova@redhat.com>
Date:   Mon Mar 3 06:14:08 2025 -0500

    syswrap-generic: Emit pp_ExeContext after the file descriptor backtrace
    
    Adjust use_after_close test for the change.

VALGRIND_3_24_BRANCH
commit ec7335142384ec9da66871036803b96319b590eb
Author: Alexandra Hájková <ahajkova@redhat.com>
Date:   Mon Mar 3 06:14:08 2025 -0500

    syswrap-generic: Emit pp_ExeContext after the file descriptor backtrace
    
    Adjust use_after_close test for the change.
    
    (cherry picked from commit 838dc01d2c42f7f22785c771bd74bc4e595da444)
Comment 6 Alexandra Hajkova 2025-03-12 13:38:09 UTC
Created attachment 179341 [details]
patch

Add --modify-fds=[no|high] option
    
    Normally a newly recreated file descriptor gets the lowest number
    available. This might cause old file descriptor numbers to be reused
    and hides bad file descriptor accesses (because the old number is
    new again).
    
    When enabled, when the program opens a new file descriptor,
    the highest available file descriptor is returned instead of the
    lowest one.
    
    Add the none/tests/track_new.stderr.exp test to test this
    new option.
    use_after_close
    Adjust none/tests/filter_fdleak to filter the track_new.vgtest,
    because of the fdleak changes, the output of the use_after_close
    test also had to be adjusted. ALso adjust the none/tests/cmdline1
    and none/tests/cmdline2 output as the new --modify-fds=no|high is
    displayed.
    
    https://bugs.kde.org/show_bug.cgi?id=493433
Comment 7 Mark Wielaard 2025-03-27 14:02:57 UTC
There was a try build here:
https://builder.sourceware.org/buildbot/#/changes/70972

With the following issues:

- valgrind-try-fedora-x86_64

--- track_new.stderr.exp	2025-03-13 13:11:28.926443833 +0000
+++ track_new.stderr.out	2025-03-13 13:24:27.514375927 +0000
@@ -1,4 +1,4 @@
-File descriptor 1020 was closed already
+File descriptor 20464 was closed already
    at 0x........: write (in /...libc...)
    by 0x........: __printf_buffer_flush_dprintf (in /...libc...)
    by 0x........: __vdprintf_internal (in /...libc...)

- valgrind-try-fedora-arm64

track_new PASS

- valgrind-try-debian-ppc64

--- track_new.stderr.exp	2025-03-13 13:33:28.997830006 +0000
+++ track_new.stderr.out	2025-03-13 14:05:32.273155198 +0000
@@ -1,12 +1,12 @@
 File descriptor 1020 was closed already
-   at 0x........: write (in /...libc...)
-   by 0x........: __printf_buffer_flush_dprintf (in /...libc...)
-   by 0x........: __vdprintf_internal (in /...libc...)
-   by 0x........: dprintf (in /...libc...)
+   at 0x........: write (write.c:26)
+   by 0x........: __printf_buffer_flush_dprintf (iovdprintf.c:49)
+   by 0x........: __vdprintf_internal (iovdprintf.c:72)
+   by 0x........: dprintf@@GLIBC_2.4 (dprintf.c:30)
    by 0x........: main (track_new.c:15)
  Previously closed
-   at 0x........: close (in /...libc...)
+   at 0x........: close (close.c:27)
    by 0x........: main (track_new.c:10)
  Originally opened
-   at 0x........: open (in /...libc...)
+   at 0x........: open (open64.c:41)
    by 0x........: main (track_new.c:8)

- valgrind-try-fedora-s390x

--- track_new.stderr.exp	2025-03-13 13:39:02.530937442 +0000
+++ track_new.stderr.out	2025-03-13 13:50:56.870871483 +0000
@@ -2,7 +2,7 @@
    at 0x........: write (in /...libc...)
    by 0x........: __printf_buffer_flush_dprintf (in /...libc...)
    by 0x........: __vdprintf_internal (in /...libc...)
-   by 0x........: dprintf (in /...libc...)
+   by 0x........: dprintf@@GLIBC_2.4 (in /...libc...)
    by 0x........: main (track_new.c:15)
  Previously closed
    at 0x........: close (in /...libc...)
Comment 8 Mark Wielaard 2025-03-27 14:18:23 UTC
- valgrind-try-fedora-ppc64le

--- track_new.stderr.exp	2025-03-13 14:19:50.376095914 +0000
+++ track_new.stderr.out	2025-03-13 14:50:59.480715273 +0000
@@ -2,7 +2,7 @@
    at 0x........: write (in /...libc...)
    by 0x........: __printf_buffer_flush_dprintf (in /...libc...)
    by 0x........: __vdprintf_internal (in /...libc...)
-   by 0x........: dprintf (in /...libc...)
+   by 0x........: __dprintfieee128 (in /...libc...)
    by 0x........: main (track_new.c:15)
  Previously closed
    at 0x........: close (in /...libc...)
Comment 9 Mark Wielaard 2025-03-27 14:23:34 UTC
fedora-riscv
track_new:       valgrind   -q --track-fds=yes --modify-fds=high ./track_new
PASS
Comment 10 Alexandra Hajkova 2025-03-27 14:42:06 UTC
I've regression tested on x86_64 with Fedora 40 and Fedora 41 and the results were absolutely clean:
== 839 tests, 0 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
Comment 11 Alexandra Hajkova 2025-03-27 14:44:14 UTC
I've regression tested on aarch64 with Fedora 40 and Fedora 41 and there were no regressions but there were some failures:
gdbserver_tests/hgtls                    (stdoutB)		
memcheck/tests/dw4                       (stderr)		
memcheck/tests/varinfo2                  (stderr)		
memcheck/tests/varinfo4                  (stderr)		
memcheck/tests/varinfo5                  (stderr)		
memcheck/tests/varinfo6                  (stderr)		
memcheck/tests/varinforestrict           (stderr)		
helgrind/tests/hg05_race2                (stderr)		
helgrind/tests/tc20_verifywrap           (stderr)