Bug 458218 - memcheck "unhandled instruction bytes" but code seems fine
Status: RESOLVED DUPLICATE of bug 383010
Alias: None
Product: valgrind
Classification: Developer tools
Component: memcheck
Version: 3.18.1
Platform: RedHat Enterprise Linux Linux
Importance: NOR crash
Target Milestone: ---
Assignee: Julian Seward
Reported: 2022-08-23 17:43 UTC by Manaure
Modified: 2022-09-26 08:34 UTC (History)
Description Manaure 2022-08-23 17:43:41 UTC
SUMMARY

When I run the unit/ctest_fem_parproj unit test from the gkylzero code (https://github.com/ammarhakim/gkylzero/) with

valgrind --leak-check=full ./build/unit/ctest_fem_parproj test_1x_p1_periodic

I get the error
"vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFE 0x8 0x6F 0xA3 0x68 0x0 0x0 0x0"
along with messages pointing to lines of code that look perfectly fine (see below).

STEPS TO REPRODUCE
1. Download and install gkylzero (https://github.com/ammarhakim/gkylzero/) following the instructions in the README (essentially: install the dependencies and run the makefile; see the example scripts in /machines).
2. Run the fem_parproj unit test: valgrind --leak-check=full ./build/unit/ctest_fem_parproj test_1x_p1_periodic

OBSERVED RESULT
==4071452== Memcheck, a memory error detector
==4071452== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==4071452== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==4071452== Command: ./build/unit/ctest_fem_parproj test_1x_p1_periodic
==4071452==
Test test_1x_p1_periodic...                     vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFE 0x8 0x6F 0xA3 0x68 0x0 0x0 0x0
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
==4071452== valgrind: Unrecognised instruction at address 0x40ce64.
==4071452==    at 0x40CE64: gkyl_range_init (range.c:77)
==4071452==    by 0x41143D: gkyl_create_grid_ranges (rect_decomp.c:17)
==4071452==    by 0x404A9C: test_1x (ctest_fem_parproj.c:93)
==4071452==    by 0x403D01: test_do_run_ (acutest.h:1011)
==4071452==    by 0x401FE3: test_run_ (acutest.h:1182)
==4071452==    by 0x401FE3: main (acutest.h:1718)
==4071452== Your program just tried to execute an instruction that Valgrind
==4071452== did not recognise.  There are two possible reasons for this.
==4071452== 1. Your program has a bug and erroneously jumped to a non-code
==4071452==    location.  If you are running Memcheck and you just saw a
==4071452==    warning about a bad jump, it's probably your program's fault.
==4071452== 2. The instruction is legitimate but Valgrind doesn't handle it,
==4071452==    i.e. it's Valgrind's fault.  If you think this is the case or
==4071452==    you are not sure, please let us know and we'll try to fix it.
==4071452== Either way, Valgrind will now raise a SIGILL signal which will
==4071452== probably kill your program.
==4071452==
==4071452== Process terminating with default action of signal 4 (SIGILL)
==4071452==  Illegal opcode at address 0x40CE64
==4071452==    at 0x40CE64: gkyl_range_init (range.c:77)
==4071452==    by 0x41143D: gkyl_create_grid_ranges (rect_decomp.c:17)
==4071452==    by 0x404A9C: test_1x (ctest_fem_parproj.c:93)
==4071452==    by 0x403D01: test_do_run_ (acutest.h:1011)
==4071452==    by 0x401FE3: test_run_ (acutest.h:1182)
==4071452==    by 0x401FE3: main (acutest.h:1718)
==4071452==
==4071452== HEAP SUMMARY:
==4071452==     in use at exit: 1,152 bytes in 2 blocks
==4071452==   total heap usage: 2 allocs, 0 frees, 1,152 bytes allocated
==4071452==
==4071452== LEAK SUMMARY:
==4071452==    definitely lost: 0 bytes in 0 blocks
==4071452==    indirectly lost: 0 bytes in 0 blocks
==4071452==      possibly lost: 0 bytes in 0 blocks
==4071452==    still reachable: 1,152 bytes in 2 blocks
==4071452==         suppressed: 0 bytes in 0 blocks
==4071452== Reachable blocks (those to which a pointer was found) are not shown.
==4071452== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==4071452==
==4071452== For lists of detected and suppressed errors, rerun with: -s
==4071452== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Illegal instruction (core dumped)

EXPECTED RESULT
No errors and no SIGILL, just a message that the test passed, such as

Test test_1x_p1_periodic...                     [ OK ]
SUCCESS: All unit tests have passed.

which you get if you run the test without valgrind via
./build/unit/ctest_fem_parproj test_1x_p1_periodic

ADDITIONAL INFORMATION
The OS is Red Hat Linux, I believe. The command uname -r returns
4.18.0-372.19.1.el8_6.x86_64

And I compiled gkylzero with
cc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-10)

Line 77 of range.c, which the error points to, is the second of these two lines:

  int idxZero[GKYL_MAX_DIM];
  for (int i=0; i<ndim; ++i) idxZero[i] = 0;

where GKYL_MAX_DIM=6 and ndim<=6.
Comment 1 Paul Floyd 2022-08-30 12:54:17 UTC
This looks like

 0:  62 f1 fe 08 6f a3 68    vmovdqu64 xmm4,XMMWORD PTR [rbx+0x68]

so this is an EVEX-encoded AVX-512 instruction (the 0x62 prefix). Valgrind supports AVX and AVX2, but it doesn't have AVX-512 support.

Can you build your application without AVX-512?

Or else try the patches here https://bugs.kde.org/show_bug.cgi?id=383010 (and let us know if that works for you).
Comment 2 Paul Floyd 2022-09-26 08:34:56 UTC
No answer, so I'm assuming that this is a duplicate.

*** This bug has been marked as a duplicate of bug 383010 ***