Bug 471222 - support tracking of file descriptors being double closed
Summary: support tracking of file descriptors being double closed
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (other bugs)
Version First Reported In: unspecified
Platform: Other Linux
: NOR normal
Target Milestone: ---
Assignee: Alexandra Hajkova
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-06-19 12:35 UTC by Mark Wielaard
Modified: 2024-03-13 14:47 UTC (History)
3 users (show)

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:


Attachments
patch (39.97 KB, patch)
2024-02-27 14:15 UTC, Alexandra Hajkova
Details
patch (33.29 KB, patch)
2024-02-27 14:27 UTC, Alexandra Hajkova
Details
patch (29.49 KB, patch)
2024-02-28 09:03 UTC, Alexandra Hajkova
Details
patch (34.85 KB, patch)
2024-03-05 15:16 UTC, Alexandra Hajkova
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Mark Wielaard 2023-06-19 12:35:45 UTC
valgrind already supports tracking file descriptors with --track-fds=yes.
This shows file descriptors that "leak", because they are never closed.

It would be really nice to extend this functionality to detecting double closes.

For example in the following a file descriptor is (accidentally) closed twice:

#include <unistd.h>
int fd;
int main ()
{
  fd = dup (1);
  if (fd != -1)
    close (fd);
  /* ... */
  close (fd);
  return 0;
}

These might seem innocent at first. But it isn't in if between the two closes (possibly in another thread) a new file descriptor is opened. Because that might take the same number as the one closed and then it might be closed on the second close.
Comment 1 Mark Wielaard 2024-01-12 13:55:22 UTC
Sasha has been working on this.
Comment 2 Alexandra Hajkova 2024-02-27 14:15:13 UTC
Created attachment 166136 [details]
patch
Comment 3 Alexandra Hajkova 2024-02-27 14:15:58 UTC
Submitting a patch:

[PATCH] syswrap-generic.c: Warn when file descriptor

is closed the second time.

Note that close_range is handled similar to closing each descriptor
individually. We might want to treat close_range more special.

Handle close_range so a double close by close_range isn't an error.
Add a new test case - none/tests/socket_close.vgtest.

https://bugs.kde.org/show_bug.cgi?id=471222

Co-Authored-By: Mark Wielaard <mark@klomp.org>
Comment 4 Alexandra Hajkova 2024-02-27 14:27:26 UTC
Created attachment 166140 [details]
patch
Comment 5 Paul Floyd 2024-02-27 17:50:18 UTC
Do you need a feature test for close_range (is it available on older Linux systems)?. I don't think that macOS or Solaris have it. On FreeBSD it's fairly recent, though it has been backported so I'm not certain exactly which versions have it. I need to check if it's in the upcoming 13.3 and whether it was in 13.2 or not.

+#include <linux/close_range.h>

^^^ will require conditional compilation. close_range is in unistd.h on FreeBSD.
Comment 6 Alexandra Hajkova 2024-02-28 09:03:39 UTC
Created attachment 166151 [details]
patch
Comment 7 Mark Wielaard 2024-02-28 10:41:20 UTC
I forgot about freebsd, sorry.
But I see the close_range wrappers are almost identical, so that is good.
We can adapt them together.

And freebsd already has ./memcheck/tests/freebsd/close_range.vgtest
Lets see if we can make that a generic testcase, I see there is already a close_range configure check.

I note that freebsd also has a closefrom syscall plus wrapper. But this doesn't seem to implement file descriptor tracking?
It should probably use the newly introduced record_fd_close_range.
Comment 8 Mark Wielaard 2024-02-28 11:27:47 UTC
BTW. In that testcase memcheck/tests/freebsd/close_range.c it has:

/* It looks like close_range was initially implemented for FreeBSD 13
 * but without CLOSE_RANGE_CLOEXEC
 * That implementation got back ported to FreeBSD 12.2
 * And then CLOSE_RANGE_CLOEXEC added to 13 but not backported
 * so 12 has close_range but not CLOSE_RANGE_CLOEXEC */
#if !defined(CLOSE_RANGE_CLOEXEC)
#define CLOSE_RANGE_CLOEXEC 1
#endif

Shouldn't that be 0 instead of 1?
Real CLOSE_RANGE_CLOEXEC is 4.
What does 1 mean in this case?
Comment 9 Paul Floyd 2024-02-28 11:34:56 UTC
> Shouldn't that be 0 instead of 1?
> Real CLOSE_RANGE_CLOEXEC is 4.
> What does 1 mean in this case?

Must have been my mistake it is definitely 4 (or 1<<2)

https://github.com/freebsd/freebsd-src/blob/865baeaf1abeb14327ad6a4a1f8ce722e242ff73/sys/sys/unistd.h#L210
Comment 10 Paul Floyd 2024-02-28 11:53:37 UTC
(In reply to Mark Wielaard from comment #7)

> I note that freebsd also has a closefrom syscall plus wrapper. But this
> doesn't seem to implement file descriptor tracking?
> It should probably use the newly introduced record_fd_close_range.

Agreed, I'll update the wrapper after this patch goes in.
Comment 11 Alexandra Hajkova 2024-03-05 15:16:17 UTC
Created attachment 166427 [details]
patch

A new version with more test cases added and more explanatory commit message:

syswrap-generic.c: Warn when file descriptor
    
    is closed the second time.
    
    We moved the record_fd_close call from POST to PRE sys_close handler,
    because the POST handler is only called on success. Even if the close
    syscall fails the file descriptor is still really closed/invalid.
    In the PRE handler the file descriptor is about to be closed, but hasn't
    been yet so we can capture also the description.
    
    This patch add new field fd_closed to OpenFd structure to record if
    the file descriptor was already closed.
    
    We now capture a backtrace when closing file descriptors to be able to
    print it in a case of a double close.  Always add '<' brackets '>' around
    "unbound" in the description for consistency.
    
    getsockdetails now takes and returns a buffer describing the socket
    because we want to record it, not just print it.
    
    Note that close_range is handled similar to closing each descriptor
    individually. But the case when the close_range is called with an
    infinite endis treated special. Add a new record_fd_close_range
    function which handles close_range with an infinite end so double
    close by close_range isn't an error because we don't want to loop
    over such a wide range.
    
    Add a new test cases:
     - none/tests/socket_close.vgtest - tests double closing a socket
     - none/tests/double_close_range.vgtest - uses close_range to double close the file descriptors
     - none/tests/file_dclose.vgtest - double closing regular file with regular close syscall
    
    https://bugs.kde.org/show_bug.cgi?id=471222
    
    Co-Authored-By: Mark Wielaard <mark@klomp.org>
Comment 12 Paul Floyd 2024-03-08 08:08:13 UTC
I'll test the new patch on FreeBSD tonight.
Comment 13 Paul Floyd 2024-03-10 09:59:30 UTC
I had to modify double_close_range.c and socket_close.c like this

#if defined(__linux__)
#include <linux/close_range.h>
#endif

I get 3 new failures

The first one has two things. Line number changes because of the Linux header. close/_close needs filtering.

paulf> cat none/tests/socket_close*diff
--- socket_close.stderr.exp     2024-03-10 10:30:00.201382000 +0100
+++ socket_close.stderr.out     2024-03-10 10:48:23.710309000 +0100
@@ -2,12 +2,12 @@
 close socket_fd 3
 and close the socket again 3
 File descriptor 3: AF_UNIX socket 3: <unknown> is already closed
-   at 0x........: close (in /...libc...)
-   by 0x........: main (socket_close.c:37)
+   at 0x........: _close (in /...libc...)
+   by 0x........: main (socket_close.c:39)
  Previously closed
-   at 0x........: close (in /...libc...)
-   by 0x........: main (socket_close.c:33)
+   at 0x........: _close (in /...libc...)
+   by 0x........: main (socket_close.c:35)
  Originally opened
    at 0x........: socket (in /...libc...)
-   by 0x........: open_socket (socket_close.c:14)
-   by 0x........: main (socket_close.c:28)
+   by 0x........: open_socket (socket_close.c:16)
+   by 0x........: main (socket_close.c:30)

The second one just needs a filter

paulf> cat none/tests/file_dclose*diff
--- file_dclose.stderr.exp      2024-03-10 10:30:00.199191000 +0100
+++ file_dclose.stderr.out      2024-03-10 10:47:29.153702000 +0100
@@ -1,13 +1,13 @@
 close 3
 time passes and we close 3 again
 File descriptor 3: file_dclose.tmp is already closed
-   at 0x........: close (in /...libc...)
+   at 0x........: _close (in /...libc...)
    by 0x........: closefile (file_dclose.c:16)
    by 0x........: main (file_dclose.c:32)
  Previously closed
-   at 0x........: close (in /...libc...)
+   at 0x........: _close (in /...libc...)
    by 0x........: main (file_dclose.c:28)
  Originally opened
-   at 0x........: creat (in /...libc...)
+   at 0x........: _openat (in /...libc...)
    by 0x........: openfile (file_dclose.c:10)
    by 0x........: main (file_dclose.c:25)

And the last one is new.

--- fdleak_ipv4.stderr.exp      2024-03-10 10:30:00.197868000 +0100
+++ fdleak_ipv4.stderr.out      2024-03-10 10:47:25.622064000 +0100
@@ -1,4 +1,16 @@
 
+File descriptor 4: internet is already closed
+   at 0x........: _close (in /...libc...)
+   by 0x........: client (fdleak_ipv4.c:70)
+   by 0x........: main (fdleak_ipv4.c:90)
+ Previously closed
+   at 0x........: _close (in /...libc...)
+   by 0x........: client (fdleak_ipv4.c:69)
+   by 0x........: main (fdleak_ipv4.c:90)
+ Originally opened
+   at 0x........: dup (in /...libc...)
+   by 0x........: client (fdleak_ipv4.c:68)
+   by 0x........: main (fdleak_ipv4.c:90)

Is that a difference in the behaviour of fork between Linux and FreeBSD, or a but in my fork() code? Not sure.

I can add the filters after you push the changes.
Comment 14 Paul Floyd 2024-03-10 10:08:13 UTC
On second thoughts, isn't the fdleak_ipv4 double close message expected?
Comment 15 Mark Wielaard 2024-03-13 14:31:50 UTC
I needed the following on a system that didn't have close_range:

diff --git a/none/tests/Makefile.am b/none/tests/Makefile.am
index f56eb7faab67..a37bb27257a6 100644
--- a/none/tests/Makefile.am
+++ b/none/tests/Makefile.am
@@ -282,8 +282,11 @@ check_PROGRAMS = \
        process_vm_readv_writev \
        sigprocmask \
        socket_close \
-       file_dclose \
-       double_close_range
+       file_dclose
+
+if HAVE_CLOSE_RANGE
+   check_PROGRAMS += double_close_range
+endif
 
 if HAVE_NESTED_FUNCTIONS
    check_PROGRAMS += nestedfns
Comment 16 Mark Wielaard 2024-03-13 14:35:05 UTC
(In reply to Paul Floyd from comment #13)
> I had to modify double_close_range.c and socket_close.c like this
> 
> #if defined(__linux__)
> #include <linux/close_range.h>
> #endif
> 
> I get 3 new failures
> 
> The first one has two things. Line number changes because of the Linux
> header. close/_close needs filtering.

I added that to none/tests/double_close_range.c
none/tests/socket_close.c doesn't need it, since it doesn't use any of the constants.
I did it so the expected line numbers stay the same (removed two unused headers, added an extra line of whitespace)
Comment 17 Mark Wielaard 2024-03-13 14:36:35 UTC
(In reply to Paul Floyd from comment #14)
> On second thoughts, isn't the fdleak_ipv4 double close message expected?

Yes, it is. It also failed for me locally. I added it to the none/tests/fdleak_ipv4.stderr.exp file.
Comment 18 Mark Wielaard 2024-03-13 14:47:11 UTC
commit 6a28e665f8dd7c031aef5aa0eaa4acbbd8ba27a9
Author: Alexandra Hájková <ahajkova@redhat.com>
Date:   Wed Feb 28 09:02:15 2024 +0100

    With --track-fds=yes warn when file descriptor is closed a second time
    
    We moved the record_fd_close call from POST to PRE sys_close handler,
    because the POST handler is only called on success. Even if the close
    syscall fails the file descriptor is still really closed/invalid.
    In the PRE handler the file descriptor is about to be closed, but hasn't
    been yet so we can capture also the description.
    
    This patch add new field fd_closed to OpenFd structure to record if
    the file descriptor was already closed.
    
    We now capture a backtrace when closing file descriptors to be able to
    print it in a case of a double close.  Always add '<' brackets '>' around
    "unbound" in the description for consistency.
    
    getsockdetails now takes and returns a buffer describing the socket
    because we want to record it, not just print it.
    
    Note that close_range is handled similar to closing each descriptor
    individually. But the case when the close_range is called with an
    infinite end (~0U) is treated special. Add a new record_fd_close_range
    function which handles close_range with an infinite end so double
    close by close_range isn't an error because we don't want to loop
    over such a wide range.
    
    Add a new test cases:
     - none/tests/socket_close.vgtest
       - tests double closing a socket
     - none/tests/double_close_range.vgtest
       - uses close_range to double close the file descriptors
     - none/tests/file_dclose.vgtest
       - double closing regular file with regular close syscall
    
    https://bugs.kde.org/show_bug.cgi?id=471222
    
    Co-Authored-By: Mark Wielaard <mark@klomp.org>