Bug 398086

Summary: Unrecognised instruction with X11 + OpenGL programs
Product: [Developer tools] valgrind Reporter: Jamie Zawinski <jwz>
Component: general Assignee: Rhys Kidd <rhyskidd>
Status: REPORTED ---    
Severity: crash CC: jseward, jwz, mark, pjfloyd, rhyskidd, tom
Priority: NOR    
Version: 3.14 SVN   
Target Milestone: ---   
Platform: unspecified   
OS: macOS   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Bug Depends on: 383199    
Bug Blocks:    
Attachments: valgrind -v log
run log 2

Description Jamie Zawinski 2018-08-31 02:41:22 UTC
Created attachment 114712
valgrind -v log

Valgrind works fine on X11 applications, but any that try to use OpenGL fail with an illegal instruction. Log attached.

macOS 10.13.6 (17G65), iMacPro1,1
Darwin traitor.local 17.7.0 Darwin Kernel Version 17.7.0: Thu Jun 21 22:53:14 PDT 2018; root:xnu-4570.71.2~1/RELEASE_X86_64 x86_64

(This is with anything from xscreensaver/hacks/glx/ in case you'd like to try and duplicate it yourself.)
Comment 1 Mark Wielaard 2018-08-31 07:46:45 UTC
Relevant part of the log:

--172-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option
--172-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 2 times)
--172-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 4 times)
--172-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 8 times)
--172-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 16 times)
--172-- WARNING: unhandled amd64-darwin syscall: unix:463
==172==    at 0x10210B8C2: openat (in /usr/lib/system/libsystem_kernel.dylib)
==172==    by 0x102174733: _os_trace_read_file_at (in /usr/lib/system/libsystem_trace.dylib)
==172==    by 0x10217494B: _os_trace_read_plist_at (in /usr/lib/system/libsystem_trace.dylib)
==172==    by 0x1021702A0: _os_log_preferences_load (in /usr/lib/system/libsystem_trace.dylib)
==172==    by 0x10217100F: _os_log_preferences_refresh (in /usr/lib/system/libsystem_trace.dylib)
==172==    by 0x102170BBB: os_log_create (in /usr/lib/system/libsystem_trace.dylib)
==172==    by 0x104A270B5: __SLSLogInit_block_invoke (in /System/Library/PrivateFrameworks/SkyLight.framework/Versions/A/SkyLight)
==172==    by 0x101C89DB7: _dispatch_client_callout (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101C89D6A: dispatch_once_f (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x104B175E7: __SLSInitialize_block_invoke (in /System/Library/PrivateFrameworks/SkyLight.framework/Versions/A/SkyLight)
==172==    by 0x101C89DB7: _dispatch_client_callout (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101C89D6A: dispatch_once_f (in /usr/lib/system/libdispatch.dylib)
--172-- You may be able to write your own handler.
--172-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--172-- Nevertheless we consider this a bug.  Please report
--172-- it at http://valgrind.org/support/bug_reports.html.
--172-- WARNING: unhandled amd64-darwin syscall: unix:463
==172==    at 0x10210B8C2: openat (in /usr/lib/system/libsystem_kernel.dylib)
==172==    by 0x102174733: _os_trace_read_file_at (in /usr/lib/system/libsystem_trace.dylib)
==172==    by 0x10217494B: _os_trace_read_plist_at (in /usr/lib/system/libsystem_trace.dylib)
==172==    by 0x102170360: _os_log_preferences_load (in /usr/lib/system/libsystem_trace.dylib)
==172==    by 0x10217100F: _os_log_preferences_refresh (in /usr/lib/system/libsystem_trace.dylib)
==172==    by 0x102170BBB: os_log_create (in /usr/lib/system/libsystem_trace.dylib)
==172==    by 0x104A270B5: __SLSLogInit_block_invoke (in /System/Library/PrivateFrameworks/SkyLight.framework/Versions/A/SkyLight)
==172==    by 0x101C89DB7: _dispatch_client_callout (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101C89D6A: dispatch_once_f (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x104B175E7: __SLSInitialize_block_invoke (in /System/Library/PrivateFrameworks/SkyLight.framework/Versions/A/SkyLight)
==172==    by 0x101C89DB7: _dispatch_client_callout (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101C89D6A: dispatch_once_f (in /usr/lib/system/libdispatch.dylib)
--172-- You may be able to write your own handler.
--172-- Read the file README_MISSING_SYSCALL_OR_IOCTL.
--172-- Nevertheless we consider this a bug.  Please report
--172-- it at http://valgrind.org/support/bug_reports.html.
==172== valgrind: Unrecognised instruction at address 0x101c8bdfc.
==172==    at 0x101C8BDFC: _dispatch_kq_init (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101C89DB7: _dispatch_client_callout (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101C89D6A: dispatch_once_f (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101CA7ABA: _dispatch_kq_poll (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101CA7741: _dispatch_kq_drain (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101CA6B0F: _dispatch_kq_unote_update (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101CA1E3E: _dispatch_source_refs_register (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101CA1F68: _dispatch_source_finalize_activation (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101C9AF54: _dispatch_queue_resume_finalize_activation (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x1049DCEF5: (anonymous namespace)::RunElsewhere::instance() (in /System/Library/PrivateFrameworks/SkyLight.framework/Versions/A/SkyLight)
==172==    by 0x104B17565: __SLSInitialize_block_invoke (in /System/Library/PrivateFrameworks/SkyLight.framework/Versions/A/SkyLight)
==172==    by 0x101C89DB7: _dispatch_client_callout (in /usr/lib/system/libdispatch.dylib)
==172== Your program just tried to execute an instruction that Valgrind
==172== did not recognise.  There are two possible reasons for this.
==172== 1. Your program has a bug and erroneously jumped to a non-code
==172==    location.  If you are running Memcheck and you just saw a
==172==    warning about a bad jump, it's probably your program's fault.
==172== 2. The instruction is legitimate but Valgrind doesn't handle it,
==172==    i.e. it's Valgrind's fault.  If you think this is the case or
==172==    you are not sure, please let us know and we'll try to fix it.
==172== Either way, Valgrind will now raise a SIGILL signal which will
==172== probably kill your program.
==172== 
==172== Process terminating with default action of signal 4 (SIGILL)
==172==  Illegal opcode at address 0x101C8BDFC
==172==    at 0x101C8BDFC: _dispatch_kq_init (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101C89DB7: _dispatch_client_callout (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101C89D6A: dispatch_once_f (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101CA7ABA: _dispatch_kq_poll (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101CA7741: _dispatch_kq_drain (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101CA6B0F: _dispatch_kq_unote_update (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101CA1E3E: _dispatch_source_refs_register (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101CA1F68: _dispatch_source_finalize_activation (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x101C9AF54: _dispatch_queue_resume_finalize_activation (in /usr/lib/system/libdispatch.dylib)
==172==    by 0x1049DCEF5: (anonymous namespace)::RunElsewhere::instance() (in /System/Library/PrivateFrameworks/SkyLight.framework/Versions/A/SkyLight)
==172==    by 0x104B17565: __SLSInitialize_block_invoke (in /System/Library/PrivateFrameworks/SkyLight.framework/Versions/A/SkyLight)
==172==    by 0x101C89DB7: _dispatch_client_callout (in /usr/lib/system/libdispatch.dylib)
==172== 

So it could be some issue with the unknown syscalls. It would also be helpful to have the disassembly at 0x101C8BDFC in _dispatch_kq_init (in /usr/lib/system/libdispatch.dylib) to know precisely which instruction it is.
Comment 2 Rhys Kidd 2018-08-31 19:57:25 UTC
Thanks for the bug report and the guide to reproducing the issue. There are two issues here, as Mark extracted from the log:

1. WARNING: unhandled amd64-darwin syscall: unix:463

This is already being tracked at bz#383199.

2. 0x101C8BDFC: _dispatch_kq_init (in /usr/lib/system/libdispatch.dylib)

The illegal instruction can be printed with the following command (I think; I haven't actually checked, as I'm not in front of a Terminal):

otool -tvV /usr/lib/system/libdispatch.dylib | grep 101c8bdfc
Comment 3 Julian Seward 2018-09-04 06:40:04 UTC
It didn't actually print the undecodeable bytes, as it normally does in
such cases.  So I'd guess the "failing" insn is ud2 for which I think we
indeed don't print the failing bytes.  If I had to guess I'd say it is
libdispatch.dylib's way of throwing a fatal assertion failure following
the syscall failures, which it presumably can't recover from.
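
One way to confirm the ud2 guess, once the raw bytes at the faulting address are known (from a disassembler or a hex dump), is to compare them against the ud2 encoding, 0F 0B. A minimal sketch in Python; the helper name is hypothetical and not part of Valgrind or any Apple tool:

```python
# x86 / x86-64 encoding of the ud2 instruction.
UD2 = bytes([0x0F, 0x0B])


def starts_with_ud2(code_bytes):
    """Return True if the given code bytes begin with ud2 (0F 0B).

    `code_bytes` is whatever was read at the faulting address; obtaining
    those bytes is outside the scope of this sketch.
    """
    return bytes(code_bytes[:2]) == UD2
```

If the check succeeds, the trap was most likely placed there deliberately by the library rather than being an instruction Valgrind fails to decode.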

The otool command might not find the insn because it assumes that 
libdispatch.dylib got mapped into memory with zero "slide" -- at the
same address that it statically contains -- which is unlikely.  One
strategy is for jwz to rerun with --demangle=no --sym-offsets=yes, and
then find the insn by using the symbolname + offset pairing.
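
The symbolname + offset approach can be sketched as follows. This is only an illustration under stated assumptions: the frame format is assumed to match the log lines in this report with "+offset" appended to the symbol name (the parser accepts decimal or 0x-prefixed offsets because the exact format was not checked), and `symbol_addrs` is a hypothetical dictionary of static symbol addresses that would have to be taken from the library's symbol table by other means.

```python
import re

# Assumed frame shape: "at 0xADDR: symbol+offset (in /path/to/image)".
FRAME_RE = re.compile(
    r"(?:at|by) 0x(?P<addr>[0-9A-Fa-f]+): "
    r"(?P<sym>[^\s+]+)\+(?P<off>0x[0-9A-Fa-f]+|\d+) "
    r"\(in (?P<image>[^)]+)\)"
)


def parse_frame(line):
    """Extract runtime address, symbol, offset and image from one frame line."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    off = m.group("off")
    return {
        "runtime_addr": int(m.group("addr"), 16),
        "symbol": m.group("sym"),
        "offset": int(off, 16) if off.startswith("0x") else int(off),
        "image": m.group("image"),
    }


def static_address(symbol_addrs, symbol, offset):
    """Address of the instruction as it appears in the on-disk library."""
    return symbol_addrs[symbol] + offset


def slide(runtime_addr, static_addr):
    """Difference between where the code was mapped and its static address."""
    return runtime_addr - static_addr
```

The static address is the one to search for in the otool disassembly; a non-zero slide would explain why searching for the runtime address finds nothing.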
Comment 4 Rhys Kidd 2018-09-04 19:47:43 UTC
(In reply to Julian Seward from comment #3)
> It didn't actually print the undecodeable bytes, as it normally does in
> such cases.  So I'd guess the "failing" insn is ud2 for which I think we
> indeed don't print the failing bytes.  If I had to guess I'd say it is
> libdispatch.dylib's way of throwing a fatal assertion failure following
> the syscall failures, which it presumably can't recover from.

Makes sense. I recall seeing something similar on another occasion, where valgrind didn't provide support for a mach syscall and a 'ud2' error was hit shortly thereafter.

I'll leave this bug open for now, but I suspect it will be addressed by the eventual fix for bz#383199, which will provide unix:463 (openat) support.
Comment 5 Jamie Zawinski 2018-09-04 20:38:04 UTC
Created attachment 114783
run log 2

Here's the output with -v --demangle=no --sym-offsets=yes
Comment 6 Rhys Kidd 2018-10-12 17:57:23 UTC
Thanks, I'll take another look at this one.
Comment 7 Tom Hughes 2019-08-04 09:04:31 UTC
There is a similar example in https://bugs.kde.org/show_bug.cgi?id=410562 of a _dispatch_kq routine hitting an undefined instruction, which is confirmed there as ud2.

I won't close this as a dupe, as it has a second issue, but https://bugs.kde.org/show_bug.cgi?id=191062 is the main ud2 bug.