Bug 428004 - Illegal opcode in NTL C++ library
Summary: Illegal opcode in NTL C++ library
Status: RESOLVED DUPLICATE of bug 383010
Alias: None
Product: valgrind
Classification: Developer tools
Component: callgrind
Version: 3.13.0
Platform: Ubuntu Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Josef Weidendorfer
Reported: 2020-10-19 22:44 UTC by shaveer.bajpeyi
Modified: 2024-02-25 02:10 UTC

Description shaveer.bajpeyi 2020-10-19 22:44:11 UTC
SUMMARY
Hi, I encountered this error trace when trying to run a program using the NTL C++ library from https://shoup.net/ntl/.


vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0x75 0x48 0xEF 0xC9 0xC5 0xF9 0x2E 0xC1
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==11959== valgrind: Unrecognised instruction at address 0x177dab.
==11959==    at 0x177DAB: _ntl_IsFinite(double*) (ctools.cpp:130)
==11959==    by 0x144EDA: NTL::conv(NTL::RR&, double) (tools.h:404)
==11959==    by 0x14A4DE: NTL::ReallyComputePi(NTL::RR&) (RR.h:420)
==11959==    by 0x14A983: NTL::ComputePi(NTL::RR&) (RR.cpp:1666)
==11959==    by 0x10EDBB: _GLOBAL__sub_I_main (in /home/ubuntu/environment/HEAAN/HEAAN/run/TestHEAAN)
==11959==    by 0x177E7C: __libc_csu_init (in /home/ubuntu/environment/HEAAN/HEAAN/run/TestHEAAN)
==11959==    by 0x5A2CB27: (below main) (libc-start.c:266)
==11959== Your program just tried to execute an instruction that Valgrind
==11959== did not recognise.  There are two possible reasons for this.
==11959== 1. Your program has a bug and erroneously jumped to a non-code
==11959==    location.  If you are running Memcheck and you just saw a
==11959==    warning about a bad jump, it's probably your program's fault.
==11959== 2. The instruction is legitimate but Valgrind doesn't handle it,
==11959==    i.e. it's Valgrind's fault.  If you think this is the case or
==11959==    you are not sure, please let us know and we'll try to fix it.
==11959== Either way, Valgrind will now raise a SIGILL signal which will
==11959== probably kill your program.
==11959== 
==11959== Process terminating with default action of signal 4 (SIGILL)
==11959==  Illegal opcode at address 0x177DAB
==11959==    at 0x177DAB: _ntl_IsFinite(double*) (ctools.cpp:130)
==11959==    by 0x144EDA: NTL::conv(NTL::RR&, double) (tools.h:404)
==11959==    by 0x14A4DE: NTL::ReallyComputePi(NTL::RR&) (RR.h:420)
==11959==    by 0x14A983: NTL::ComputePi(NTL::RR&) (RR.cpp:1666)

Line 130 of ctools.cpp, identified in the trace above, is the return statement of this function:

long _ntl_IsFinite(double *p)
{
   volatile double x = *p;
   *p = x;

   double y = x;
   double diff = y - x;
   return diff == 0.0;
}

I understand that the function is declared to return long while the return expression evaluates to a boolean, but won't that simply convert to 0 or 1? I am not sure why this crashes under Valgrind.
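
On the question of the return type: in C++ the comparison `diff == 0.0` yields a `bool`, and converting a `bool` to `long` gives exactly 0 or 1, so the return statement itself is well-defined. A minimal sketch follows, assuming default floating-point settings (no `-ffast-math`). `is_finite_sketch` and `check_is_finite_sketch` are hypothetical names, not part of NTL; the helper takes the value directly and omits the write-back through the pointer.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for _ntl_IsFinite: same subtraction trick, but it
// takes the value directly instead of reading and writing through a pointer.
long is_finite_sketch(double value)
{
   volatile double x = value;   // volatile, as in the original, so the loads are not optimised away
   double y = x;
   double diff = y - x;         // 0.0 for finite x; NaN for +/-Inf and for NaN
   return diff == 0.0;          // bool converts to long: true -> 1, false -> 0
}

// Small self-check covering finite and non-finite inputs.
void check_is_finite_sketch()
{
   assert(is_finite_sketch(1.5) == 1);
   assert(is_finite_sketch(std::numeric_limits<double>::infinity()) == 0);
   assert(is_finite_sketch(std::numeric_limits<double>::quiet_NaN()) == 0);
}
```

The source line in the trace only says where the rejected instruction lives; the bytes printed by vex are what Valgrind could not decode.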


Using Ubuntu 18.04.5 LTS

Thanks for your help!
Comment 1 Paul Floyd 2020-10-20 10:18:44 UTC
This is marked as Valgrind 3.13. Can you try the same with the latest Valgrind (3.16.1, or a build from source)?
Comment 2 shaveer.bajpeyi 2020-10-20 15:30:04 UTC
Hi, I updated to Valgrind-3.16.1 and was met with the same error:

==15739== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info
==15739== Command: ./TestHEAAN LogReg
==15739== 
==15739== For interactive control, run 'callgrind_control -h'.
vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0x75 0x48 0xEF 0xC9 0xC5 0xF9 0x2E 0xC1
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==15739== valgrind: Unrecognised instruction at address 0x177dab.
==15739==    at 0x177DAB: _ntl_IsFinite(double*) (ctools.cpp:130)
==15739==    by 0x144EDA: NTL::conv(NTL::RR&, double) (tools.h:404)
==15739==    by 0x14A4DE: NTL::ReallyComputePi(NTL::RR&) (RR.h:420)
==15739==    by 0x14A983: NTL::ComputePi(NTL::RR&) (RR.cpp:1666)
==15739==    by 0x10EDBB: _GLOBAL__sub_I_main (in /home/ubuntu/environment/HEAAN/HEAAN/run/TestHEAAN)
==15739==    by 0x177E7C: __libc_csu_init (in /home/ubuntu/environment/HEAAN/HEAAN/run/TestHEAAN)
==15739==    by 0x5A2CB27: (below main) (libc-start.c:266)
==15739== Your program just tried to execute an instruction that Valgrind
==15739== did not recognise.  There are two possible reasons for this.
==15739== 1. Your program has a bug and erroneously jumped to a non-code
==15739==    location.  If you are running Memcheck and you just saw a
==15739==    warning about a bad jump, it's probably your program's fault.
==15739== 2. The instruction is legitimate but Valgrind doesn't handle it,
==15739==    i.e. it's Valgrind's fault.  If you think this is the case or
==15739==    you are not sure, please let us know and we'll try to fix it.
==15739== Either way, Valgrind will now raise a SIGILL signal which will
==15739== probably kill your program.
==15739== 
==15739== Process terminating with default action of signal 4 (SIGILL)
==15739==  Illegal opcode at address 0x177DAB
==15739==    at 0x177DAB: _ntl_IsFinite(double*) (ctools.cpp:130)
==15739==    by 0x144EDA: NTL::conv(NTL::RR&, double) (tools.h:404)
==15739==    by 0x14A4DE: NTL::ReallyComputePi(NTL::RR&) (RR.h:420)
==15739==    by 0x14A983: NTL::ComputePi(NTL::RR&) (RR.cpp:1666)
==15739==    by 0x10EDBB: _GLOBAL__sub_I_main (in /home/ubuntu/environment/HEAAN/HEAAN/run/TestHEAAN)
==15739==    by 0x177E7C: __libc_csu_init (in /home/ubuntu/environment/HEAAN/HEAAN/run/TestHEAAN)
==15739==    by 0x5A2CB27: (below main) (libc-start.c:266)

Thanks
Comment 3 Paul Floyd 2020-11-23 10:10:33 UTC
According to https://defuse.ca/online-x86-assembler.htm, the first six of the unhandled bytes decode to

0:  62 f1 75 48 ef c9       vpxord zmm1,zmm1,zmm1

This is an AVX-512 opcode, see

https://en.wikipedia.org/wiki/AVX-512
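
The byte sequence can also be checked by hand. In 64-bit mode an instruction that starts with 0x62 carries a four-byte EVEX prefix, which is the encoding AVX-512 uses. The following is a simplified sketch, not Valgrind's decoder: `EvexFields` and `decode_evex` are hypothetical names, and it extracts only a few fields from the first five bytes.

```cpp
#include <cassert>
#include <cstdint>

// A few fields of the EVEX prefix (0x62 plus three payload bytes).
struct EvexFields {
   bool         is_evex;      // first byte is 0x62
   unsigned     opcode_map;   // 1 = 0F, 2 = 0F 38, 3 = 0F 3A
   unsigned     simd_prefix;  // 0 = none, 1 = 66, 2 = F3, 3 = F2
   bool         w;            // EVEX.W
   unsigned     vvvv;         // low 4 bits of the extra source register number
   unsigned     vector_bits;  // 128, 256 or 512, from EVEX.L'L (values 0..2)
   std::uint8_t opcode;       // opcode byte that follows the prefix
};

// Decodes the fields above from at least five instruction bytes.
// Simplification: ignores masking, broadcast/rounding and the high register bits.
EvexFields decode_evex(const std::uint8_t* b)
{
   assert(b != nullptr);
   EvexFields f{};
   f.is_evex     = (b[0] == 0x62);
   f.opcode_map  = b[1] & 0x03;
   f.w           = (b[2] & 0x80) != 0;
   f.vvvv        = (~(b[2] >> 3)) & 0x0F;   // stored inverted in the prefix
   f.simd_prefix = b[2] & 0x03;
   f.vector_bits = 128u << ((b[3] >> 5) & 0x03);
   f.opcode      = b[4];
   return f;
}
```

For the bytes in this report (62 F1 75 48 EF), the sketch gives opcode map 0F, SIMD prefix 66, W=0, a 512-bit vector length and opcode 0xEF, which is consistent with the vpxord zmm decoding shown above.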

AVX-512 support in Valgrind is still a work in progress.

Can you recompile without AVX-512, at least for testing?
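
One way to confirm whether a rebuild really avoids AVX-512 is to check the compiler's target macros. GCC and Clang predefine `__AVX512F__` when AVX-512 Foundation is enabled for the target. The sketch below assumes one of those compilers; `built_with_avx512f` is a hypothetical helper name, and how the NTL and HEAAN builds in this report were configured is not known from the report.

```cpp
// Reports whether this translation unit was compiled with AVX-512F enabled.
// GCC and Clang predefine __AVX512F__ in that case, for example with
// -mavx512f, or with -march=native on a CPU that supports AVX-512.
constexpr bool built_with_avx512f()
{
#if defined(__AVX512F__)
   return true;
#else
   return false;
#endif
}
```

If this reports true, rebuilding with `-mno-avx512f`, or without `-march=native`, may be enough for testing under Valgrind, assuming the compiler flags are what enabled AVX-512 here.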
Comment 4 Mark Wielaard 2021-02-28 21:57:36 UTC
AVX-512 support is a work in progress; please track bug #383010.

*** This bug has been marked as a duplicate of bug 383010 ***