uname -a
Linux soft22 3.10.0-514.16.1.el7.x86_64 #1 SMP Fri Mar 10 13:12:32 EST 2017 x86_64 x86_64 x86_64 GNU/Linux

cpu-info lists a bunch of:
vendor_id : GenuineIntel
cpu family : 6
model : 61
model name : Intel Core Processor (Broadwell)
stepping : 2
microcode : 0x1
cpu MHz : 2299.996
cache size : 4096 KB
physical id : 11
siblings : 3
core id : 2
cpu cores : 3
apicid : 46
initial apicid : 46
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch arat fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt
bogomips : 4599.99
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

Log file:
Valgrind options:
   -v
   --tool=callgrind
Contents of /proc/version:
  Linux version 3.10.0-514.16.1.el7.x86_64 (mockbuild@x86-039.build.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) ) #1 SMP Fri Mar 10 13:12:32 EST 2017
Arch and hwcaps: AMD64, LittleEndian, amd64-cx16-lzcnt-rdtscp-sse3-ssse3-avx-avx2-bmi-f16c-rdrand
Page sizes: currently 4096, max supported 4096
Valgrind library directory: /usr/libexec/valgrind
For interactive control, run 'callgrind_control -h'.
Reading syms from /opt/proprietary/bin/stuff
Reading syms from /usr/lib64/ld-2.17.so
Reading syms from /usr/libexec/valgrind/callgrind-amd64-linux
   object doesn't have a symbol table
   object doesn't have a dynamic symbol table
Scheduler: using generic scheduler lock implementation.
embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-288683-by-so-on-soft22
embedded gdbserver: writing to /tmp/vgdb-pipe-to-vgdb-from-288683-by-so-on-soft22
embedded gdbserver: shared mem /tmp/vgdb-pipe-shared-mem-vgdb-288683-by-so-on-soft22
Reading syms from /usr/libexec/valgrind/vgpreload_core-amd64-linux.so
Reading syms from /opt/opencobol/ocesql-1.2.0/lib/libocesql.so.0.0.0
Reading syms from /opt/mqm_7.0.1/lib64/libmqicb.so
Reading syms from /opt/lib/libmd5.so
Reading syms from /opt/proprietary/stuff
Reading syms from /usr/lib64/libdl-2.17.so
Reading syms from /usr/lib64/libc-2.17.so
Reading syms from /usr/lib64/libm-2.17.so
Reading syms from /usr/lib64/libpthread-2.17.so
Reading syms from /usr/lib64/libnsl-2.17.so
Reading syms from /usr/lib64/librt-2.17.so
Reading syms from /usr/lib64/libaio.so.1.0.1
   object doesn't have a symbol table
Reading syms from /usr/lib64/libresolv-2.17.so
Reading syms from /opt/postgres/cli12/lib/libpq.so.5.12
Reading syms from /usr/lib64/libxml2.so.2.9.1
   object doesn't have a symbol table
Reading syms from /opt/mqm_7.0.1/lib64/libmqiz.so
Reading syms from /usr/lib64/libgcc_s-4.8.5-20150702.so.1
   object doesn't have a symbol table
Reading syms from /usr/lib64/libz.so.1.2.7
   object doesn't have a symbol table
Reading syms from /usr/lib64/liblzma.so.5.2.2
   object doesn't have a symbol table
Reading syms from /opt/mqm_7.0.1/lib64/libmqmcs.so
REDIR: 0xffffffffff600000 (???:???) redirected to 0x580a48c7 (???)
Reading syms from /usr/lib64/libnss_files-2.17.so
Reading syms from /usr/lib64/libnss_sss.so.2
   object doesn't have a symbol table
REDIR: 0xffffffffff600400 (???:???) redirected to 0x580a48d1 (???)
vex amd64->IR: unhandled instruction bytes: 0xF3 0x49 0xF 0x6F 0x9C 0x24 0x60 0x2 0x0 0x0
vex amd64->IR: REX=1 REX.W=1 REX.R=0 REX.X=0 REX.B=1
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=1
   at 0x426A2B6: ???
   by 0x4257482: ???
   by 0x423F095: ???
   by 0x4157222: ???
   by 0x98886BB: func_call (in /opt/proprietary/lib/stuff.so.3)
   by 0x985B4A1: do_call (in /opt/proprietary/lib/stuff.so.3)
   by 0x9857432: prop_main (in /opt/proprietary/lib/stuff.so.3)
   by 0x98574EE: prop_init (in /opt/proprietary/lib/stuff.so.3)
   by 0x4022C7: main (in /opt/proprietary/bin/stuff)

Note: memcheck did not report a bad jump.

Sadly, while I can (currently) reproduce it, this happens in /opt/proprietary/bin/stuff (or its library), so I cannot share a reproducer. And "of course" profiling another app with a similar call stack works just fine, obviously because the instruction is in the proprietary stuff (which I actually do not want to profile, as there's no option for replacing it; I only want to profile/memcheck the library it calls: libocesql->libpq).
Unless I'm misreading something, that is movdqu, but I'm surprised that isn't implemented. Perhaps it is some detail of the operands that it doesn't like.
Ah yes, it has a redundant REX.W prefix, so sz is 8, but we only match an sz of 4.
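For anyone following along, here is a rough sketch of how the prefix bytes from the error message map onto the reported REX/PFX flags and onto an operand size. This is a simplified model written for this comment, not valgrind's actual decoder: the function name decode_prefixes and the rule "REX.W gives 8, a 66 prefix gives 2, otherwise 4" are assumptions based on the sz values mentioned above.

```python
def decode_prefixes(insn: bytes) -> dict:
    """Split legacy 66/F2/F3 prefixes and an optional REX byte from the opcode.

    Simplified illustration only (hypothetical helper, not valgrind code).
    """
    i = 0
    pfx = {"66": False, "F2": False, "F3": False}
    while insn[i] in (0x66, 0xF2, 0xF3):
        pfx[f"{insn[i]:02X}"] = True
        i += 1

    rex = {"W": 0, "R": 0, "X": 0, "B": 0}
    if (insn[i] & 0xF0) == 0x40:  # REX bytes are 0x40..0x4F
        b = insn[i]
        rex = {"W": (b >> 3) & 1, "R": (b >> 2) & 1, "X": (b >> 1) & 1, "B": b & 1}
        i += 1

    # Assumed size rule: REX.W wins, then the 66 prefix, otherwise 4.
    sz = 8 if rex["W"] else (2 if pfx["66"] else 4)
    return {"pfx": pfx, "rex": rex, "sz": sz, "opcode": insn[i:i + 2]}


# Bytes taken from the "unhandled instruction bytes" line in the report.
reported = bytes([0xF3, 0x49, 0x0F, 0x6F, 0x9C, 0x24, 0x60, 0x02, 0x00, 0x00])
info = decode_prefixes(reported)
print(info)
```

With the reported bytes this gives REX.W=1 and REX.B=1 with the F3 prefix set, matching the flags valgrind printed, and a size of 8 rather than 4.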
However you figured that out (a pointer to how you did it would be cool): marvellous, and it sounds like a patch is nearly done, too. After that is done I "only" need to find a nightly tarball and a way to compile it soon...
I used the Intel manuals (https://software.intel.com/content/www/us/en/develop/articles/intel-sdm.html) to manually decode the instruction, then checked the valgrind source to see what conditions we were applying when decoding it. There's a long-standing issue of unusual compilers setting the REX.W flag (which valgrind has listed as set in its error) on instructions that do the same thing with or without it, so it's the kind of thing we're quite used to seeing.
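To illustrate the manual decoding step, here is a minimal sketch of reading the ModRM and SIB bytes that follow the opcode. It is only an illustration: decode_modrm and REGS64 are names made up for this comment, and it deliberately covers just the simple addressing forms (no scaled index, no RIP-relative addressing).

```python
import struct

REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
          "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_modrm(rest: bytes, rex_r: int = 0, rex_x: int = 0, rex_b: int = 0):
    """Return (G operand, E operand) as text for the bytes after the opcode.

    Hypothetical helper covering register-direct and [base + disp] forms only.
    """
    modrm = rest[0]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    g = f"xmm{reg | (rex_r << 3)}"
    if mod == 3:  # register-direct: E is another xmm register
        return g, f"xmm{rm | (rex_b << 3)}"

    pos = 1
    if rm == 4:  # a SIB byte follows
        sib = rest[pos]
        pos += 1
        index = ((sib >> 3) & 7) | (rex_x << 3)
        base = sib & 7
        if index != 4 or (mod == 0 and base == 5):
            raise NotImplementedError("scaled index / no-base forms not covered")
        base_reg = REGS64[base | (rex_b << 3)]
    else:
        if mod == 0 and rm == 5:
            raise NotImplementedError("RIP-relative addressing not covered")
        base_reg = REGS64[rm | (rex_b << 3)]

    disp = 0
    if mod == 1:  # signed 8-bit displacement
        disp = struct.unpack_from("<b", rest, pos)[0]
    elif mod == 2:  # signed 32-bit little-endian displacement
        disp = struct.unpack_from("<i", rest, pos)[0]
    return g, (f"[{base_reg}{disp:+#x}]" if disp else f"[{base_reg}]")


# Bytes after F3 49 0F 6F in the report; REX.B=1 as printed by valgrind.
print(decode_modrm(bytes([0x9C, 0x24, 0x60, 0x02, 0x00, 0x00]), rex_b=1))
# -> ('xmm3', '[r12+0x260]')
```

So the reported bytes read as a movdqu load into xmm3 from [r12+0x260], which has the same effect with or without REX.W set.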
Created attachment 139502
Patch to ignore REX.W for MOVDQU

Try this patch and see if it helps...
(In reply to Tom Hughes from comment #5)
> Created attachment 139502
> Patch to ignore REX.W for MOVDQU
>
> Try this patch and see if it helps...

Yes, it brought me directly to the next one:

vex amd64->IR: unhandled instruction bytes: 0xF3 0x49 0xF 0x7F 0x4B 0xE0 0xF3 0x49 0xF 0x7F
vex amd64->IR: REX=1 REX.W=1 REX.R=0 REX.X=0 REX.B=1
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=1

Following the lead of the patch, I did the same for the 0x7F case

/* F3 0F 7F = MOVDQU -- move from G (xmm) to E (mem or xmm). */

allowing a size of either 4 or 8. Now the application can be run with valgrind.

Thank you very much. I hope those two patches are good enough to get into the next release.
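As a rough picture of what the two changes amount to, the sketch below matches both MOVDQU forms and accepts a size of 4 or 8. It is not the actual patch: match_movdqu, MOVDQU_FORMS and the allowed_sizes parameter are hypothetical names, and the size rule (REX.W set means 8, otherwise 4) is an assumption taken from the discussion above.

```python
# Hypothetical table: second opcode byte after 0F, with an F3 prefix.
MOVDQU_FORMS = {
    0x6F: "load: E (mem or xmm) -> G (xmm)",
    0x7F: "store: G (xmm) -> E (mem or xmm)",
}


def match_movdqu(insn: bytes, allowed_sizes=(4, 8)):
    """Return the MOVDQU form for F3 [REX] 0F 6F/7F, or None if not matched."""
    if len(insn) < 3 or insn[0] != 0xF3:
        return None
    i = 1
    sz = 4
    if (insn[i] & 0xF0) == 0x40:  # optional REX byte
        if insn[i] & 0x08:  # REX.W set
            sz = 8
        i += 1
    if sz not in allowed_sizes:
        return None  # models the old behaviour when only 4 is allowed
    if insn[i:i + 1] != b"\x0f":
        return None
    op = insn[i + 1:i + 2]
    return MOVDQU_FORMS.get(op[0]) if op else None


first = bytes([0xF3, 0x49, 0x0F, 0x6F, 0x9C, 0x24, 0x60, 0x02, 0x00, 0x00])
second = bytes([0xF3, 0x49, 0x0F, 0x7F, 0x4B, 0xE0])
print(match_movdqu(first), "|", match_movdqu(second))
print(match_movdqu(second, allowed_sizes=(4,)))  # strict check rejects it
```

Passing allowed_sizes=(4,) reproduces the rejection seen before the patches, while the default accepts both reported instructions.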
I've added both those changes as 43543527a293e626e601202ca4eeb2216f40815d.