valgrind already supports tracking file descriptors with --track-fds=yes. This shows file descriptors that "leak", because they are never closed. It would be really nice to extend this functionality to detecting double closes. For example in the following a file descriptor is (accidentally) closed twice: #include <unistd.h> int fd; int main () { fd = dup (1); if (fd != -1) close (fd); /* ... */ close (fd); return 0; } These might seem innocent at first. But it isn't in if between the two closes (possibly in another thread) a new file descriptor is opened. Because that might take the same number as the one closed and then it might be closed on the second close.
Sasha has been working on this.
Created attachment 166136 [details] patch
Submitting a patch: [PATCH] syswrap-generic.c: Warn when file descriptor is closed the second time. Note that close_range is handled similar to closing each descriptor individually. We might want to treat close_range more special. Handle close_range so a double close by close_range isn't an error. Add a new test case - none/tests/socket_close.vgtest. https://bugs.kde.org/show_bug.cgi?id=471222 Co-Authored-By: Mark Wielaard <mark@klomp.org>
Created attachment 166140 [details] patch
Do you need a feature test for close_range (is it available on older Linux systems)?. I don't think that macOS or Solaris have it. On FreeBSD it's fairly recent, though it has been backported so I'm not certain exactly which versions have it. I need to check if it's in the upcoming 13.3 and whether it was in 13.2 or not. +#include <linux/close_range.h> ^^^ will require conditional compilation. close_range is in unistd.h on FreeBSD.
Created attachment 166151 [details] patch
I forgot about freebsd, sorry. But I see the close_range wrappers are almost identical, so that is good. We can adapt them together. And freebsd already has ./memcheck/tests/freebsd/close_range.vgtest Lets see if we can make that a generic testcase, I see there is already a close_range configure check. I note that freebsd also has a closefrom syscall plus wrapper. But this doesn't seem to implement file descriptor tracking? It should probably use the newly introduced record_fd_close_range.
BTW. In that testcase memcheck/tests/freebsd/close_range.c it has: /* It looks like close_range was initially implemented for FreeBSD 13 * but without CLOSE_RANGE_CLOEXEC * That implementation got back ported to FreeBSD 12.2 * And then CLOSE_RANGE_CLOEXEC added to 13 but not backported * so 12 has close_range but not CLOSE_RANGE_CLOEXEC */ #if !defined(CLOSE_RANGE_CLOEXEC) #define CLOSE_RANGE_CLOEXEC 1 #endif Shouldn't that be 0 instead of 1? Real CLOSE_RANGE_CLOEXEC is 4. What does 1 mean in this case?
> Shouldn't that be 0 instead of 1? > Real CLOSE_RANGE_CLOEXEC is 4. > What does 1 mean in this case? Must have been my mistake it is definitely 4 (or 1<<2) https://github.com/freebsd/freebsd-src/blob/865baeaf1abeb14327ad6a4a1f8ce722e242ff73/sys/sys/unistd.h#L210
(In reply to Mark Wielaard from comment #7) > I note that freebsd also has a closefrom syscall plus wrapper. But this > doesn't seem to implement file descriptor tracking? > It should probably use the newly introduced record_fd_close_range. Agreed, I'll update the wrapper after this patch goes in.
Created attachment 166427 [details] patch A new version with more test cases added and more explanatory commit message: syswrap-generic.c: Warn when file descriptor is closed the second time. We moved the record_fd_close call from POST to PRE sys_close handler, because the POST handler is only called on success. Even if the close syscall fails the file descriptor is still really closed/invalid. In the PRE handler the file descriptor is about to be closed, but hasn't been yet so we can capture also the description. This patch add new field fd_closed to OpenFd structure to record if the file descriptor was already closed. We now capture a backtrace when closing file descriptors to be able to print it in a case of a double close. Always add '<' brackets '>' around "unbound" in the description for consistency. getsockdetails now takes and returns a buffer describing the socket because we want to record it, not just print it. Note that close_range is handled similar to closing each descriptor individually. But the case when the close_range is called with an infinite endis treated special. Add a new record_fd_close_range function which handles close_range with an infinite end so double close by close_range isn't an error because we don't want to loop over such a wide range. Add a new test cases: - none/tests/socket_close.vgtest - tests double closing a socket - none/tests/double_close_range.vgtest - uses close_range to double close the file descriptors - none/tests/file_dclose.vgtest - double closing regular file with regular close syscall https://bugs.kde.org/show_bug.cgi?id=471222 Co-Authored-By: Mark Wielaard <mark@klomp.org>
I'll test the new patch on FreeBSD tonight.
I had to modify double_close_range.c and socket_close.c like this #if defined(__linux__) #include <linux/close_range.h> #endif I get 3 new failures The first one has two things. Line number changes because of the Linux header. close/_close needs filtering. paulf> cat none/tests/socket_close*diff --- socket_close.stderr.exp 2024-03-10 10:30:00.201382000 +0100 +++ socket_close.stderr.out 2024-03-10 10:48:23.710309000 +0100 @@ -2,12 +2,12 @@ close socket_fd 3 and close the socket again 3 File descriptor 3: AF_UNIX socket 3: <unknown> is already closed - at 0x........: close (in /...libc...) - by 0x........: main (socket_close.c:37) + at 0x........: _close (in /...libc...) + by 0x........: main (socket_close.c:39) Previously closed - at 0x........: close (in /...libc...) - by 0x........: main (socket_close.c:33) + at 0x........: _close (in /...libc...) + by 0x........: main (socket_close.c:35) Originally opened at 0x........: socket (in /...libc...) - by 0x........: open_socket (socket_close.c:14) - by 0x........: main (socket_close.c:28) + by 0x........: open_socket (socket_close.c:16) + by 0x........: main (socket_close.c:30) The second one just needs a filter paulf> cat none/tests/file_dclose*diff --- file_dclose.stderr.exp 2024-03-10 10:30:00.199191000 +0100 +++ file_dclose.stderr.out 2024-03-10 10:47:29.153702000 +0100 @@ -1,13 +1,13 @@ close 3 time passes and we close 3 again File descriptor 3: file_dclose.tmp is already closed - at 0x........: close (in /...libc...) + at 0x........: _close (in /...libc...) by 0x........: closefile (file_dclose.c:16) by 0x........: main (file_dclose.c:32) Previously closed - at 0x........: close (in /...libc...) + at 0x........: _close (in /...libc...) by 0x........: main (file_dclose.c:28) Originally opened - at 0x........: creat (in /...libc...) + at 0x........: _openat (in /...libc...) by 0x........: openfile (file_dclose.c:10) by 0x........: main (file_dclose.c:25) And the last one is new. --- fdleak_ipv4.stderr.exp 2024-03-10 10:30:00.197868000 +0100 +++ fdleak_ipv4.stderr.out 2024-03-10 10:47:25.622064000 +0100 @@ -1,4 +1,16 @@ +File descriptor 4: internet is already closed + at 0x........: _close (in /...libc...) + by 0x........: client (fdleak_ipv4.c:70) + by 0x........: main (fdleak_ipv4.c:90) + Previously closed + at 0x........: _close (in /...libc...) + by 0x........: client (fdleak_ipv4.c:69) + by 0x........: main (fdleak_ipv4.c:90) + Originally opened + at 0x........: dup (in /...libc...) + by 0x........: client (fdleak_ipv4.c:68) + by 0x........: main (fdleak_ipv4.c:90) Is that a difference in the behaviour of fork between Linux and FreeBSD, or a but in my fork() code? Not sure. I can add the filters after you push the changes.
On second thoughts, isn't the fdleak_ipv4 double close message expected?
I needed the following on a system that didn't have close_range: diff --git a/none/tests/Makefile.am b/none/tests/Makefile.am index f56eb7faab67..a37bb27257a6 100644 --- a/none/tests/Makefile.am +++ b/none/tests/Makefile.am @@ -282,8 +282,11 @@ check_PROGRAMS = \ process_vm_readv_writev \ sigprocmask \ socket_close \ - file_dclose \ - double_close_range + file_dclose + +if HAVE_CLOSE_RANGE + check_PROGRAMS += double_close_range +endif if HAVE_NESTED_FUNCTIONS check_PROGRAMS += nestedfns
(In reply to Paul Floyd from comment #13) > I had to modify double_close_range.c and socket_close.c like this > > #if defined(__linux__) > #include <linux/close_range.h> > #endif > > I get 3 new failures > > The first one has two things. Line number changes because of the Linux > header. close/_close needs filtering. I added that to none/tests/double_close_range.c none/tests/socket_close.c doesn't need it, since it doesn't use any of the constants. I did it so the expected line numbers stay the same (removed two unused headers, added an extra line of whitespace)
(In reply to Paul Floyd from comment #14) > On second thoughts, isn't the fdleak_ipv4 double close message expected? Yes, it is. It also failed for me locally. I added it to the none/tests/fdleak_ipv4.stderr.exp file.
commit 6a28e665f8dd7c031aef5aa0eaa4acbbd8ba27a9 Author: Alexandra Hájková <ahajkova@redhat.com> Date: Wed Feb 28 09:02:15 2024 +0100 With --track-fds=yes warn when file descriptor is closed a second time We moved the record_fd_close call from POST to PRE sys_close handler, because the POST handler is only called on success. Even if the close syscall fails the file descriptor is still really closed/invalid. In the PRE handler the file descriptor is about to be closed, but hasn't been yet so we can capture also the description. This patch add new field fd_closed to OpenFd structure to record if the file descriptor was already closed. We now capture a backtrace when closing file descriptors to be able to print it in a case of a double close. Always add '<' brackets '>' around "unbound" in the description for consistency. getsockdetails now takes and returns a buffer describing the socket because we want to record it, not just print it. Note that close_range is handled similar to closing each descriptor individually. But the case when the close_range is called with an infinite end (~0U) is treated special. Add a new record_fd_close_range function which handles close_range with an infinite end so double close by close_range isn't an error because we don't want to loop over such a wide range. Add a new test cases: - none/tests/socket_close.vgtest - tests double closing a socket - none/tests/double_close_range.vgtest - uses close_range to double close the file descriptors - none/tests/file_dclose.vgtest - double closing regular file with regular close syscall https://bugs.kde.org/show_bug.cgi?id=471222 Co-Authored-By: Mark Wielaard <mark@klomp.org>