Bug 381815 - Assertion 'newfd >= VG_(fd_hard_limit)' failed
Summary: Assertion 'newfd >= VG_(fd_hard_limit)' failed
Status: CONFIRMED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: 3.14 SVN
Platform: macOS (DMG) macOS
Importance: NOR normal
Target Milestone: ---
Assignee: Rhys Kidd
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-06-30 08:22 UTC by Kirill A. Korinsky
Modified: 2017-07-14 09:53 UTC
CC List: 4 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
don't crash on huge limits (1.51 KB, patch)
2017-07-10 08:39 UTC, Kirill A. Korinsky
Description Kirill A. Korinsky 2017-06-30 08:22:54 UTC
Good day!

I tried to run valgrind (trunk, revision 16457) on macOS 10.12.5, and it fails when I set a huge file descriptor limit with ulimit.

For example:


➜  /tmp cat test.c      
int main() {
	return 0;
}
➜  /tmp clang test.c 
➜  /tmp ulimit -n 1024  
➜  /tmp valgrind ./a.out
==28447== Memcheck, a memory error detector
==28447== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==28447== Using Valgrind-3.14.0.SVN and LibVEX; rerun with -h for copyright info
==28447== Command: ./a.out
==28447== 
==28447== Syscall param msg->desc.port.name points to uninitialised byte(s)
==28447==    at 0x1003A734A: mach_msg_trap (in /usr/lib/system/libsystem_kernel.dylib)
==28447==    by 0x1003A6796: mach_msg (in /usr/lib/system/libsystem_kernel.dylib)
==28447==    by 0x1003A0485: task_set_special_port (in /usr/lib/system/libsystem_kernel.dylib)
==28447==    by 0x10053C10E: _os_trace_create_debug_control_port (in /usr/lib/system/libsystem_trace.dylib)
==28447==    by 0x10053C458: _libtrace_init (in /usr/lib/system/libsystem_trace.dylib)
==28447==    by 0x1000A59DF: libSystem_initializer (in /usr/lib/libSystem.B.dylib)
==28447==    by 0x100017A1A: ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==28447==    by 0x100017C1D: ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==28447==    by 0x1000134A9: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==28447==    by 0x100013440: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==28447==    by 0x100012523: ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==28447==    by 0x1000125B8: ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) (in /usr/lib/dyld)
==28447==  Address 0x10488accc is on thread 1's stack
==28447==  in frame #2, created by task_set_special_port (???:)
==28447== 
==28447== 
==28447== HEAP SUMMARY:
==28447==     in use at exit: 18,307 bytes in 162 blocks
==28447==   total heap usage: 178 allocs, 16 frees, 24,451 bytes allocated
==28447== 
==28447== LEAK SUMMARY:
==28447==    definitely lost: 408 bytes in 8 blocks
==28447==    indirectly lost: 6,888 bytes in 8 blocks
==28447==      possibly lost: 72 bytes in 3 blocks
==28447==    still reachable: 32 bytes in 1 blocks
==28447==         suppressed: 10,907 bytes in 142 blocks
==28447== Rerun with --leak-check=full to see details of leaked memory
==28447== 
==28447== For counts of detected and suppressed errors, rerun with: -v
==28447== Use --track-origins=yes to see where uninitialised values come from
==28447== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 4)
➜  /tmp ulimit -n 262144
➜  /tmp valgrind ./a.out

valgrind: m_libcfile.c:68 (Int vgPlain_safe_fd(Int)): Assertion 'newfd >= VG_(fd_hard_limit)' failed.
[1]    28477 segmentation fault  valgrind ./a.out
➜  /tmp
Comment 1 Tom Hughes 2017-06-30 09:22:32 UTC
Well, as far as I can see, this means that macOS has not processed our fcntl correctly.

What that code does is basically this:

  newfd = VG_(fcntl)(oldfd, VKI_F_DUPFD, VG_(fd_hard_limit));
  ...
  vg_assert(newfd >= VG_(fd_hard_limit));

The first line explicitly asks it to duplicate to an FD that is greater than or equal to our fake hard limit (i.e. in the valgrind reserved space), so if the assertion fails, it must have given us an FD less than that.

The only other option, I guess, is that the fcntl returned an error: we don't check for that, other than when deciding whether to close the old descriptor.
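
The unchecked-error case described above can be illustrated with a minimal sketch in plain POSIX C. This is not Valgrind's code: `move_fd_at_least` is a hypothetical helper that mirrors the shape of the quoted snippet but returns the failure to the caller instead of asserting on the result. POSIX documents `EINVAL` for `F_DUPFD` when the argument is negative or not below the maximum number of descriptors the process may have open.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not Valgrind's VG_(safe_fd)): move oldfd to a
   descriptor numbered >= min_fd. Returns the new descriptor, or -1 with
   errno set if F_DUPFD failed; on failure oldfd is left open. */
int move_fd_at_least(int oldfd, int min_fd)
{
    assert(oldfd >= 0);

    int newfd = fcntl(oldfd, F_DUPFD, min_fd);
    if (newfd == -1)
        return -1;   /* report the failure instead of asserting on -1 */

    close(oldfd);
    /* Mark the moved descriptor close-on-exec. */
    (void)fcntl(newfd, F_SETFD, FD_CLOEXEC);
    return newfd;
}
```

With a check like this, a failed `fcntl` surfaces as an ordinary error return, and the caller can decide whether to retry with a smaller lower bound or give up.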

Probably worth getting a system call trace (does macOS have strace/truss or equivalent?) so we can see what that call is actually returning.
Comment 2 Kirill A. Korinsky 2017-06-30 10:08:10 UTC
Thanks for the update.

OS X doesn't have strace, but it has dtruss, which works only as root, and this bug doesn't reproduce when running as root.

Let me take some time to find a way to get the trace.
Comment 3 Kirill A. Korinsky 2017-07-01 00:19:00 UTC
Tom, I have one idea.

Can you confirm that fd_hard_limit is obtained from getrlimit? If so, I think the right approach on OS X is to use getdtablesize and make sure that the F_DUPFD argument is less than the value returned by getdtablesize.
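
A minimal sketch of that idea, assuming only that `getdtablesize()` reports the size of the process's descriptor table. The names `clamp_dupfd_floor` and `dupfd_floor_for_this_process` are hypothetical, and how `VG_(fd_hard_limit)` is actually computed is not shown in this report.

```c
#define _DEFAULT_SOURCE   /* expose getdtablesize() on glibc */
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: clamp a desired F_DUPFD lower bound so that it is
   strictly below table_size. Returns 0 if table_size is not usable. */
int clamp_dupfd_floor(int wanted, int table_size)
{
    int floor_fd = wanted;

    if (table_size <= 0)
        return 0;                    /* no usable table size: fall back to 0 */
    if (floor_fd >= table_size)
        floor_fd = table_size - 1;   /* too large: use the last valid slot */
    if (floor_fd < 0)
        floor_fd = 0;

    assert(floor_fd >= 0 && floor_fd < table_size);
    return floor_fd;
}

/* Apply the clamp using the table size reported for this process. */
int dupfd_floor_for_this_process(int wanted)
{
    return clamp_dupfd_floor(wanted, getdtablesize());
}
```

Clamping to the last slot is only for illustration; a real implementation would presumably leave room below the top of the table for several reserved descriptors.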
Comment 4 Kirill A. Korinsky 2017-07-09 00:49:18 UTC
Hey,

I can't trace valgrind because it calls fcntl via this code: https://gist.github.com/catap/a20b8b1f46b0ba79b9f7297e73df0563

But I can run it in lldb and provide some output to you.

With ulimit -n 1024:

➜  bin ulimit -n 1024
➜  bin env RETRACE_CONFIG=./retrace.conf.rnp VALGRIND_LAUNCHER=/Users/catap/Documents/Riboseinc/valgrind/bin/executable_path=./valgrind lldb -- ../lib/valgrind/memcheck-amd64-darwin /tmp/a.out
(lldb) target create "../lib/valgrind/memcheck-amd64-darwin"
Current executable set to '../lib/valgrind/memcheck-amd64-darwin' (x86_64).
(lldb) settings set -- target.run-args  "/tmp/a.out"
(lldb) b vgPlain_safe_fd
Breakpoint 1: where = memcheck-amd64-darwin`vgPlain_safe_fd + 18 at m_libcfile.c:59, address = 0x0000000258081452
(lldb) r
Process 14822 launched: '../lib/valgrind/memcheck-amd64-darwin' (x86_64)
Process 14822 stopped
* thread #1, stop reason = breakpoint 1.1
    frame #0: 0x0000000258081452 memcheck-amd64-darwin`vgPlain_safe_fd(oldfd=3) at m_libcfile.c:59
   56  	{
   57  	   Int newfd;
   58  	
-> 59  	   vg_assert(VG_(fd_hard_limit) != -1);
   60  	
   61  	   newfd = VG_(fcntl)(oldfd, VKI_F_DUPFD, VG_(fd_hard_limit));
   62  	   if (newfd != -1)
(lldb) n
Process 14822 stopped
* thread #1, stop reason = step over
    frame #0: 0x00000002580814ac memcheck-amd64-darwin`vgPlain_safe_fd(oldfd=3) at m_libcfile.c:61
   58  	
   59  	   vg_assert(VG_(fd_hard_limit) != -1);
   60  	
-> 61  	   newfd = VG_(fcntl)(oldfd, VKI_F_DUPFD, VG_(fd_hard_limit));
   62  	   if (newfd != -1)
   63  	      VG_(close)(oldfd);
   64  	
(lldb) n
Process 14822 stopped
* thread #1, stop reason = step over
    frame #0: 0x00000002580814ba memcheck-amd64-darwin`vgPlain_safe_fd(oldfd=3) at m_libcfile.c:62
   59  	   vg_assert(VG_(fd_hard_limit) != -1);
   60  	
   61  	   newfd = VG_(fcntl)(oldfd, VKI_F_DUPFD, VG_(fd_hard_limit));
-> 62  	   if (newfd != -1)
   63  	      VG_(close)(oldfd);
   64  	
   65  	   /* Set the close-on-exec flag for this fd. */
(lldb) p newfd
(Int) $0 = 1024
(lldb) ^D


With ulimit -n 262144:

➜  bin env RETRACE_CONFIG=./retrace.conf.rnp VALGRIND_LAUNCHER=/Users/catap/Documents/Riboseinc/valgrind/bin/executable_path=./valgrind lldb -- ../lib/valgrind/memcheck-amd64-darwin /tmp/a.out
(lldb) target create "../lib/valgrind/memcheck-amd64-darwin"
Current executable set to '../lib/valgrind/memcheck-amd64-darwin' (x86_64).
(lldb) settings set -- target.run-args  "/tmp/a.out"
(lldb) b vgPlain_safe_fd
Breakpoint 1: where = memcheck-amd64-darwin`vgPlain_safe_fd + 18 at m_libcfile.c:59, address = 0x0000000258081452
(lldb) r
Process 14653 launched: '../lib/valgrind/memcheck-amd64-darwin' (x86_64)
Process 14653 stopped
* thread #1, stop reason = breakpoint 1.1
    frame #0: 0x0000000258081452 memcheck-amd64-darwin`vgPlain_safe_fd(oldfd=3) at m_libcfile.c:59
   56  	{
   57  	   Int newfd;
   58  	
-> 59  	   vg_assert(VG_(fd_hard_limit) != -1);
   60  	
   61  	   newfd = VG_(fcntl)(oldfd, VKI_F_DUPFD, VG_(fd_hard_limit));
   62  	   if (newfd != -1)
(lldb) n
Process 14653 stopped
* thread #1, stop reason = step over
    frame #0: 0x00000002580814ac memcheck-amd64-darwin`vgPlain_safe_fd(oldfd=3) at m_libcfile.c:61
   58  	
   59  	   vg_assert(VG_(fd_hard_limit) != -1);
   60  	
-> 61  	   newfd = VG_(fcntl)(oldfd, VKI_F_DUPFD, VG_(fd_hard_limit));
   62  	   if (newfd != -1)
   63  	      VG_(close)(oldfd);
   64  	
(lldb) n
Process 14653 stopped
* thread #1, stop reason = step over
    frame #0: 0x00000002580814ba memcheck-amd64-darwin`vgPlain_safe_fd(oldfd=3) at m_libcfile.c:62
   59  	   vg_assert(VG_(fd_hard_limit) != -1);
   60  	
   61  	   newfd = VG_(fcntl)(oldfd, VKI_F_DUPFD, VG_(fd_hard_limit));
-> 62  	   if (newfd != -1)
   63  	      VG_(close)(oldfd);
   64  	
   65  	   /* Set the close-on-exec flag for this fd. */
(lldb) p newfd
(Int) $0 = -1
(lldb)
Comment 5 Kirill A. Korinsky 2017-07-10 08:39:27 UTC
Created attachment 106527
don't crash on huge limits
Comment 6 Kirill A. Korinsky 2017-07-10 09:12:14 UTC
Anyway, I just fixed it :)
Comment 7 Beimbet Daribayev 2017-07-12 06:10:10 UTC
Kirill A. Korinsky, do you speak Russian? )
Comment 8 Beimbet Daribayev 2017-07-12 06:13:02 UTC
I have the same problem. OS: macOS 10.12.5. What does this error/warning mean?

==66220== Syscall param msg->desc.port.name points to uninitialised byte(s)
==66220==    at 0x1004E834A: mach_msg_trap (in /usr/lib/system/libsystem_kernel.dylib)
==66220==    by 0x1004E7796: mach_msg (in /usr/lib/system/libsystem_kernel.dylib)
==66220==    by 0x1004E1485: task_set_special_port (in /usr/lib/system/libsystem_kernel.dylib)
==66220==    by 0x10067D10E: _os_trace_create_debug_control_port (in /usr/lib/system/libsystem_trace.dylib)
==66220==    by 0x10067D458: _libtrace_init (in /usr/lib/system/libsystem_trace.dylib)
==66220==    by 0x1000F89DF: libSystem_initializer (in /usr/lib/libSystem.B.dylib)
==66220==    by 0x10001FA1A: ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==66220==    by 0x10001FC1D: ImageLoaderMachO::doInitialization(ImageLoader::LinkContext const&) (in /usr/lib/dyld)
==66220==    by 0x10001B4A9: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==66220==    by 0x10001B440: ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==66220==    by 0x10001A523: ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) (in /usr/lib/dyld)
==66220==    by 0x10001A5B8: ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) (in /usr/lib/dyld)
==66220==  Address 0x104892cdc is on thread 1's stack
==66220==  in frame #2, created by task_set_special_port (???:)
Comment 9 Tom Hughes 2017-07-12 06:28:44 UTC
Actually, that's not the same at all. In fact, it's not even a bug (in valgrind); it's just a normal error report about a bug in your program, where it is passing uninitialised data to a system call.

Please use the users mailing list if you need more help understanding what it means - it is not relevant to this bug.