Bug 385279 - [PATCH] unhandled syscall: mach:43 (mach_generate_activity_id)
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: 3.14 SVN
Platform: macOS (DMG) macOS
Importance: NOR normal
Target Milestone: ---
Assignee: Rhys Kidd
Duplicates: 387045 395136
Blocks: 365327
Reported: 2017-10-01 22:16 UTC by Rhys Kidd
Modified: 2018-06-13 16:54 UTC
CC List: 8 users

Attachments
Handle mach_generate_activity_id for Mac OS >= 10.12 (3.09 KB, patch)
2017-11-29 07:48 UTC, Louis Brunner
Description Rhys Kidd 2017-10-01 22:16:41 UTC
SYSCALL[38163,1](unix:399) sys_close ( 3 )[sync] --> Success(0x0:0x0)
SYSCALL[38163,1](mach:43) unimplemented (by the kernel) syscall: mach:43! (ni_syscall)
 --> [pre-fail] Failure(0x4e) eq_SyscallStatus:
  {78 0 43}
  {78 0 40}

valgrind: m_syswrap/syswrap-main.c:438 (Bool eq_SyscallStatus(UInt, SyscallStatus *, SyscallStatus *)): the 'impossible' happened.

host stacktrace:
==90616==    at 0x2580523BB: ???
==90616==    by 0x25805274C: ???
==90616==    by 0x258052723: ???
==90616==    by 0x2580EB9D4: ???
==90616==    by 0x2580EAFB9: ???
==90616==    by 0x2580E91E0: ???
==90616==    by 0x2580E79A0: ???
==90616==    by 0x2580F985E: ???

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable (lwpid 771)
==90616==    at 0x10076AEFA: mach_generate_activity_id (in /usr/lib/system/libsystem_kernel.dylib)
==90616==    by 0x1003A7F57: _voucher_activity_id_allocate_slow (in /usr/lib/system/libdispatch.dylib)
==90616==    by 0x1003A7209: voucher_activity_create_with_data (in /usr/lib/system/libdispatch.dylib)
==90616==    by 0x1007CEFE2: _os_activity_create_addr (in /usr/lib/system/libsystem_trace.dylib)
==90616==    by 0x10055D219: ds_user_byuid (in /usr/lib/system/libsystem_info.dylib)
==90616==    by 0x10055CD5B: si_user_byuid (in /usr/lib/system/libsystem_info.dylib)
==90616==    by 0x10055CE3E: search_item_bynumber (in /usr/lib/system/libsystem_info.dylib)
==90616==    by 0x10055CD96: search_user_byuid (in /usr/lib/system/libsystem_info.dylib)
==90616==    by 0x10055CD5B: si_user_byuid (in /usr/lib/system/libsystem_info.dylib)
==90616==    by 0x10055C27E: getpwuid (in /usr/lib/system/libsystem_info.dylib)
==90616==    by 0x10002B958: setupvals (in /bin/zsh)
==90616==    by 0x10002D0C5: zsh_main (in /bin/zsh)
==90616==    by 0x1003FA144: start (in /usr/lib/system/libdyld.dylib)
Comment 1 Rhys Kidd 2017-10-01 22:19:53 UTC
Mach syscall 43 was re-added in macOS 10.12 as mach_generate_activity_id

See further: https://opensource.apple.com/source/xnu/xnu-4570.1.46/osfmk/mach/syscall_sw.h.auto.html
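When triaging reports like this one, the syscall class and number can be pulled straight out of the trace lines shown in the description. The following is a minimal sketch, assuming the line layout seen above (`SYSCALL[pid,tid](class:number) ...`); the helper names `parse_syscall_line` and `unimplemented_syscalls` are made up for illustration and are not part of Valgrind.

```python
import re

# Matches the prefix of trace lines such as:
#   SYSCALL[38163,1](mach:43) unimplemented (by the kernel) syscall: mach:43! (ni_syscall)
SYSCALL_RE = re.compile(r"SYSCALL\[(\d+),(\d+)\]\((\w+):(\d+)\)\s*(.*)")


def parse_syscall_line(line):
    """Return (pid, tid, syscall_class, number, rest), or None if the line does not match."""
    m = SYSCALL_RE.match(line.strip())
    if m is None:
        return None
    pid, tid, cls, num, rest = m.groups()
    return int(pid), int(tid), cls, int(num), rest


def unimplemented_syscalls(lines):
    """Collect (class, number) pairs from lines that report an unimplemented syscall."""
    found = []
    for line in lines:
        parsed = parse_syscall_line(line)
        if parsed and "unimplemented" in parsed[4]:
            found.append((parsed[2], parsed[3]))
    return found


# The two trace lines from the description of this bug.
trace = [
    "SYSCALL[38163,1](unix:399) sys_close ( 3 )[sync] --> Success(0x0:0x0)",
    "SYSCALL[38163,1](mach:43) unimplemented (by the kernel) syscall: mach:43! (ni_syscall)",
]
print(unimplemented_syscalls(trace))  # [('mach', 43)]
```

The resulting pair can then be looked up by hand in the syscall_sw.h listing linked above to find the name, here mach_generate_activity_id.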
Comment 2 Rhys Kidd 2017-11-18 21:57:42 UTC
*** Bug 387045 has been marked as a duplicate of this bug. ***
Comment 3 Louis Brunner 2017-11-29 07:48:48 UTC
Created attachment 109108
Handle mach_generate_activity_id for Mac OS >= 10.12

Hi Rhys,

This patch adds handling for mach_generate_activity_id on Mac OS >= 10.12 to Valgrind, which allows it to run on High Sierra (10.13).
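The attachment itself is not reproduced in this thread. As a rough illustration of the idea only, the toy model below shows a syscall lookup that is gated on the OS version, so that mach:43 is handled on 10.12 and later and remains unhandled before that. The table and the function name `lookup_wrapper` are hypothetical and do not reflect Valgrind's actual data structures or the contents of the patch.

```python
# Toy model only: the table and function names here are hypothetical and are
# not Valgrind's actual data structures.

# (syscall class, number) -> (wrapper name, minimum macOS version that provides it)
WRAPPERS = {
    ("mach", 43): ("mach_generate_activity_id", (10, 12)),
}


def lookup_wrapper(syscall_class, number, os_version):
    """Return the wrapper name, or None when the syscall is unhandled on this OS version."""
    entry = WRAPPERS.get((syscall_class, number))
    if entry is None:
        return None
    name, min_version = entry
    # Tuples compare element-wise, so (10, 13) >= (10, 12) is True.
    return name if os_version >= min_version else None


print(lookup_wrapper("mach", 43, (10, 13)))  # mach_generate_activity_id
print(lookup_wrapper("mach", 43, (10, 11)))  # None
```

A `None` result corresponds to the "unhandled syscall" situation reported in the description.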
Comment 4 Rhys Kidd 2017-12-11 04:54:52 UTC
*** Bug 387690 has been marked as a duplicate of this bug. ***
Comment 5 FX 2017-12-21 14:37:32 UTC
Louis' patch fixes it for me. I've been able to run various binaries (bash, zsh, clang, etc.) under valgrind on macOS 10.13.2 without any error.

Many binaries, however, generate messages like these early in their execution:

--55772-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option
--55772-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 2 times)
--55772-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 4 times)
--55772-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 8 times)
Comment 6 Rhys Kidd 2017-12-21 14:40:43 UTC
(In reply to FX from comment #5)
> Louis' patch fixes it for me. I've been able to run various binaries (bash,
> zsh, clang, etc.) under valgrind on macOS 10.13.2 without any error.
Thanks for the additional testing feedback -- this patch is looking good to get into the next release of Valgrind.

> Many binaries, however, generate messages like these early in their execution:
> 
> --55772-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option
> --55772-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 2 times)
> --55772-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 4 times)
> --55772-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 8 times)
That is a separate issue, which is tracked at https://bugs.kde.org/show_bug.cgi?id=343306
Comment 7 FX 2017-12-21 14:42:08 UTC
No, I wrote too fast :)
I found another issue:

$ /tmp/bin/valgrind /usr/bin/lpq
==55783== Memcheck, a memory error detector
==55783== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==55783== Using Valgrind-3.14.0.GIT and LibVEX; rerun with -h for copyright info
==55783== Command: /usr/bin/lpq
==55783== 
--55783-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option
--55783-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 2 times)
--55783-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 4 times)
--55783-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 8 times)
==55783== valgrind: Unrecognised instruction at address 0x1010bfd93.
==55783==    at 0x1010BFD93: _dispatch_kq_init (in /usr/lib/system/libdispatch.dylib)
==55783==    by 0x1010BDD4F: _dispatch_client_callout (in /usr/lib/system/libdispatch.dylib)
==55783==    by 0x1010BDD02: dispatch_once_f (in /usr/lib/system/libdispatch.dylib)
==55783==    by 0x1010DBAAE: _dispatch_kq_poll (in /usr/lib/system/libdispatch.dylib)
==55783==    by 0x1010DB735: _dispatch_kq_drain (in /usr/lib/system/libdispatch.dylib)
==55783==    by 0x1010DAB03: _dispatch_kq_unote_update (in /usr/lib/system/libdispatch.dylib)
==55783==    by 0x1010D5E33: _dispatch_source_refs_register (in /usr/lib/system/libdispatch.dylib)
==55783==    by 0x1010D5F5D: _dispatch_source_finalize_activation (in /usr/lib/system/libdispatch.dylib)
==55783==    by 0x1010CEF49: _dispatch_queue_resume_finalize_activation (in /usr/lib/system/libdispatch.dylib)
==55783==    by 0x1014734F6: _notify_lib_init (in /usr/lib/system/libsystem_notify.dylib)
==55783==    by 0x101473B89: notify_register_dispatch (in /usr/lib/system/libsystem_notify.dylib)
==55783==    by 0x100332C27: CFUserNotificationCreate (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
==55783== Your program just tried to execute an instruction that Valgrind
==55783== did not recognise.  There are two possible reasons for this.

Several other native Apple programs (Preview, TextEdit) as well as non-Apple (Atom) give the same crash and error stack (with different addresses).
Comment 8 Rhys Kidd 2017-12-21 14:45:08 UTC
(In reply to FX from comment #7)
> ==55783== valgrind: Unrecognised instruction at address 0x1010bfd93.
> ==55783==    at 0x1010BFD93: _dispatch_kq_init (in /usr/lib/system/libdispatch.dylib)
> ==55783==    by 0x1010BDD4F: _dispatch_client_callout (in /usr/lib/system/libdispatch.dylib)
> ==55783==    by 0x1010BDD02: dispatch_once_f (in /usr/lib/system/libdispatch.dylib)
> ==55783==    by 0x1010DBAAE: _dispatch_kq_poll (in /usr/lib/system/libdispatch.dylib)
> ==55783==    by 0x1010DB735: _dispatch_kq_drain (in /usr/lib/system/libdispatch.dylib)
> ==55783==    by 0x1010DAB03: _dispatch_kq_unote_update (in /usr/lib/system/libdispatch.dylib)
> ==55783==    by 0x1010D5E33: _dispatch_source_refs_register (in /usr/lib/system/libdispatch.dylib)
> ==55783==    by 0x1010D5F5D: _dispatch_source_finalize_activation (in /usr/lib/system/libdispatch.dylib)
> ==55783==    by 0x1010CEF49: _dispatch_queue_resume_finalize_activation (in /usr/lib/system/libdispatch.dylib)
> ==55783==    by 0x1014734F6: _notify_lib_init (in /usr/lib/system/libsystem_notify.dylib)
> ==55783==    by 0x101473B89: notify_register_dispatch (in /usr/lib/system/libsystem_notify.dylib)
> ==55783==    by 0x100332C27: CFUserNotificationCreate (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
> ==55783== Your program just tried to execute an instruction that Valgrind
> ==55783== did not recognise.  There are two possible reasons for this.
> 
> Several other native Apple programs (Preview, TextEdit) as well as non-Apple
> (Atom) give the same crash and error stack (with different addresses).
That's another known issue, tracked at https://bugs.kde.org/show_bug.cgi?id=383723

There is a draft patch available, which needs further testing and review.
Comment 9 FX 2017-12-21 14:54:02 UTC
The lpq issue: with the patch from this PR, I trigger 383723. Adding the patch at 383723, I get 380269. Adding the tentative patch posted at 380269, I still get an error:

==63832== Thread 2:
==63832== Invalid read of size 4
==63832==    at 0x1014E1E9B: _pthread_wqthread (in /usr/lib/system/libsystem_pthread.dylib)
==63832==    by 0x1014E1C4C: start_wqthread (in /usr/lib/system/libsystem_pthread.dylib)
==63832==  Address 0x18 is not stack'd, malloc'd or (recently) free'd
==63832== 
==63832== 
==63832== Process terminating with default action of signal 11 (SIGSEGV)
==63832==  Access not within mapped region at address 0x18
==63832==    at 0x1014E1E9B: _pthread_wqthread (in /usr/lib/system/libsystem_pthread.dylib)
==63832==    by 0x1014E1C4C: start_wqthread (in /usr/lib/system/libsystem_pthread.dylib)
==63832==  If you believe this happened as a result of a stack
==63832==  overflow in your program's main thread (unlikely but
==63832==  possible), you can try to increase the size of the
==63832==  main thread stack using the --main-stacksize= flag.
==63832==  The main thread stack size used in this run was 8388608.
--63832:0:schedule VG_(sema_down): read returned -4

valgrind: m_scheduler/scheduler.c:952 (void run_thread_for_a_while(HWord *, Int *, ThreadId, HWord, Bool)): Assertion 'VG_(in_generated_code) == False' failed.

host stacktrace:
==63832==    at 0x258053C2B: ???
==63832==    by 0x258053FBC: ???
==63832==    by 0x258053F93: ???
==63832==    by 0x2580EADFB: ???
==63832==    by 0x2580E9788: ???
==63832==    by 0x2580FB7DE: ???
==63832==    by 0x2580FBAAA: ???

sched status:
  running_tid=3

Thread 1: status = VgTs_WaitSys (lwpid 771)
==63832==    at 0x1014AC562: __workq_kernreturn (in /usr/lib/system/libsystem_kernel.dylib)
==63832==    by 0x1014E1C27: _pthread_workqueue_addthreads (in /usr/lib/system/libsystem_pthread.dylib)
==63832==    by 0x1010D1E4D: _dispatch_global_queue_poke_slow (in /usr/lib/system/libdispatch.dylib)
==63832==    by 0x1010D91BD: _dispatch_mach_send_push (in /usr/lib/system/libdispatch.dylib)
==63832==    by 0x1010DD8C8: _voucher_activity_debug_channel_init (in /usr/lib/system/libdispatch.dylib)
==63832==    by 0x1010DBAFC: _dispatch_kq_poll (in /usr/lib/system/libdispatch.dylib)
==63832==    by 0x1010DB735: _dispatch_kq_drain (in /usr/lib/system/libdispatch.dylib)
==63832==    by 0x1010DAB03: _dispatch_kq_unote_update (in /usr/lib/system/libdispatch.dylib)
==63832==    by 0x1010D5E33: _dispatch_source_refs_register (in /usr/lib/system/libdispatch.dylib)
==63832==    by 0x1010D5F5D: _dispatch_source_finalize_activation (in /usr/lib/system/libdispatch.dylib)
==63832==    by 0x1010CEF49: _dispatch_queue_resume_finalize_activation (in /usr/lib/system/libdispatch.dylib)
==63832==    by 0x1014734F6: _notify_lib_init (in /usr/lib/system/libsystem_notify.dylib)
==63832==    by 0x101473B89: notify_register_dispatch (in /usr/lib/system/libsystem_notify.dylib)
==63832==    by 0x100332C27: CFUserNotificationCreate (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
==63832==    by 0x1010BDD4F: _dispatch_client_callout (in /usr/lib/system/libdispatch.dylib)
==63832==    by 0x1010BDD02: dispatch_once_f (in /usr/lib/system/libdispatch.dylib)
==63832==    by 0x100332ADD: CFUserNotificationGetTypeID (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
==63832==    by 0x1001ED210: -[CFPrefsPlistSource createSynchronizeMessage] (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
==63832==    by 0x100375F96: -[NSOrderedSet objectsWithOptions:passingTest:] (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
==63832==    by 0x10037723E: +[NSOrderedSet orderedSetWithObject:] (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
==63832==    by 0x100375CD2: -[NSOrderedSet objectWithOptions:passingTest:] (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
==63832==    by 0x100375B64: -[NSOrderedSet isSubsetOfSet:] (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
==63832==    by 0x10039BCE2: __CFGenerateReport (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
==63832==    by 0x1001F5252: +[_CFXNotificationTokenRegistration createTokenRegistration:token:connection:notifyToken:options:queue:handler:] (in /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation)
==63832==    by 0x1000E33E8: cupsLangGet (in /usr/lib/libcups.2.dylib)
==63832==    by 0x1000E2A42: _cupsSetLocale (in /usr/lib/libcups.2.dylib)
==63832==    by 0x100000B72: ??? (in /usr/bin/lpq)
==63832==    by 0x101131114: start (in /usr/lib/system/libdyld.dylib)

Thread 2: status = VgTs_Yielding (lwpid 4107)
==63832==    at 0x1014E1E9B: _pthread_wqthread (in /usr/lib/system/libsystem_pthread.dylib)
==63832==    by 0x1014E1C4C: start_wqthread (in /usr/lib/system/libsystem_pthread.dylib)

Thread 3: status = VgTs_Runnable (lwpid 5635)
==63832==    at 0x1014E1C40: start_wqthread (in /usr/lib/system/libsystem_pthread.dylib)

I'm afraid that's as far as I can go :(
Comment 10 Rhys Kidd 2017-12-21 14:58:13 UTC
We know there's some low-level threading state we aren't handling properly. Sounds like you're hitting that.

Helpfully, there's a bug report tracking that too and a draft patch for that at https://bugs.kde.org/show_bug.cgi?id=380269
Comment 11 FX 2017-12-21 15:16:37 UTC
(In reply to Rhys Kidd from comment #10)
> Helpfully, there's a bug report tracking that too and a draft patch for that
> at https://bugs.kde.org/show_bug.cgi?id=380269

That's what I was saying. I'm seeing that error, after applying the draft patch at 380269.
Comment 12 Chris Wilson 2018-02-11 14:27:13 UTC
I would like to echo comment #5: the patch works for me too, thank you Louis and FX!
Comment 13 Rhys Kidd 2018-02-12 01:01:40 UTC
Thanks for the patch and testing!

Fixed in dcb83cf84 ("macos: Fix unhandled syscall: mach:43 (mach_generate_activity_id). bz#385279"), which will be part of the next Valgrind release.
Comment 14 Rhys Kidd 2018-06-13 16:54:04 UTC
*** Bug 395136 has been marked as a duplicate of this bug. ***