Bug 271615 - unhandled instruction "popcnt" (arch=amd10h)
Summary: unhandled instruction "popcnt" (arch=amd10h)
Status: REOPENED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general
Version: 3.9.0
Platform: Gentoo Packages Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Julian Seward
URL:
Keywords:
Duplicates: 349891
Depends on:
Blocks:
 
Reported: 2011-04-24 14:36 UTC by Thomas Eschenbacher
Modified: 2020-11-25 18:26 UTC
CC List: 6 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
patch to add support for the SSE4.2 instruction "popcnt" for X86 (4.16 KB, patch)
2012-11-03 19:37 UTC, Thomas Eschenbacher

Description Thomas Eschenbacher 2011-04-24 14:36:56 UTC
Version:           unspecified (using KDE 4.6.2) 
OS:                Linux

Here is some output from an attempt to run my application with cachegrind:

[...]
# --5039-- Reading syms from /usr/lib/libfontconfig.so.1.4.4 (0x4279000)
[...]
vex x86->IR: unhandled instruction bytes: 0xF3 0xF 0xB8 0x14
==5039== valgrind: Unrecognised instruction at address 0x42836c8.
==5039== Your program just tried to execute an instruction that Valgrind
==5039== did not recognise.  There are two possible reasons for this.
==5039== 1. Your program has a bug and erroneously jumped to a non-code
==5039==    location.  If you are running Memcheck and you just saw a
==5039==    warning about a bad jump, it's probably your program's fault.
==5039== 2. The instruction is legitimate but Valgrind doesn't handle it,
==5039==    i.e. it's Valgrind's fault.  If you think this is the case or
==5039==    you are not sure, please let us know and we'll try to fix it.
==5039== Either way, Valgrind will now raise a SIGILL signal which will
==5039== probably kill your program.
==5039== 
==5039== Process terminating with default action of signal 4 (SIGILL)
==5039==  Illegal opcode at address 0x42836C8
==5039==    at 0x42836C8: FcCharSetCount (in /usr/lib/libfontconfig.so.1.4.4)

=> the offset within libfontconfig therefore seems to be
0x42836C8 - 0x4279000 = 0xA6C8

When disassembling this lib with "objdump -d" I see this instruction:

    a6c8:       f3 0f b8 14 01          popcnt (%ecx,%eax,1),%edx

So I conclude that support for the "popcnt" instruction is missing.


Reproducible: Didn't try




$ valgrind --version
valgrind-3.6.1

$ gcc --version
gcc (Gentoo 4.5.2 p1.1, pie-0.4.5) 4.5.2

$ cat /proc/cpuinfo 
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 5
model name      : AMD Athlon(tm) II X4 615e Processor
stepping        : 3
[...]
Comment 1 Tom Hughes 2011-08-23 10:22:48 UTC
This was implemented in VEX r1983.
Comment 2 Thomas Eschenbacher 2011-11-05 15:37:01 UTC
Sorry, but this is still not fixed.
I tried again with svn r12257 from today, which
announces itself as "Valgrind-3.8.0.SVN and LibVEX":

vex x86->IR: unhandled instruction bytes: 0xF3 0xF 0xB8 0xD2

4b16a20d:       f3 0f b8 d2             popcnt %edx,%edx
Comment 3 Tom Hughes 2011-11-05 16:43:59 UTC
Ah, that commit was only for 64-bit mode and you're in 32-bit mode.
Comment 4 Thomas Eschenbacher 2012-11-02 22:19:47 UTC
Hi, nearly one year has passed now, but the problem still persists; valgrind (v3.8.1 / 32-bit mode) is still practically unusable for me :-(

Any chance to get this fixed ?
Comment 5 Thomas Eschenbacher 2012-11-03 19:37:14 UTC
Created attachment 74970
patch to add support for the SSE4.2 instruction "popcnt" for X86
Comment 6 Thomas Eschenbacher 2012-11-03 19:42:20 UTC
OK, it seems that nobody else had time for this...

So I tried to fix it on my own and wrote a patch (see attachment), which seems to work for me.
=> Could one of the developers please review it?
Comment 7 Thomas Eschenbacher 2013-12-27 14:31:31 UTC
Apparently this patch has not been taken into v3.9.0 - why?
Please integrate it in the next release!
 (the patch file above can be still applied)
Comment 8 Philippe Waroquiers 2013-12-30 11:17:07 UTC
(In reply to comment #7)
> Apparently this patch has not been taken into v3.9.0 - why?
> Please integrate it in the next release!
>  (the patch file above can be still applied)

Personally, I cannot commit this patch as I do not know this part of Valgrind,
and so cannot judge whether it is OK or not.

But I think the patch has a better chance of being looked at (by others :) if there is a test case.

Philippe
Comment 9 Thomas Eschenbacher 2013-12-30 12:30:44 UTC
Ok, you want a test case, here is a simple one: ;-)

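1. create a simple program "poptest.c":
   /* The operand depends on run-time input, so the compiler
      cannot fold the popcount into a constant. */
   int main(int argc, char **argv)
   {
       return __builtin_popcount(argc + *argv[0]);
   }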

2. compile it:
   $> gcc -mpopcnt poptest.c -o poptest

3. verify that it contains a popcnt instruction:
   $> objdump -d poptest
   should give something like this:

   [...]
   0804842c <main>:
   804842c:       55                      push   %ebp
   804842d:       89 e5                   mov    %esp,%ebp
   804842f:       8b 45 0c                mov    0xc(%ebp),%eax
   8048432:       8b 00                   mov    (%eax),%eax
   8048434:       0f b6 00                movzbl (%eax),%eax
   8048437:       0f be d0                movsbl %al,%edx
   804843a:       8b 45 08                mov    0x8(%ebp),%eax
   804843d:       01 d0                   add    %edx,%eax
   804843f:       f3 0f b8 c0             popcnt %eax,%eax    <= here it is !
   8048443:       5d                      pop    %ebp
   8048444:       c3                      ret    
   [...]

4. and finally, try to analyze it with valgrind:
    $> valgrind ./poptest
Comment 10 Philippe Waroquiers 2013-12-31 16:55:49 UTC
(In reply to comment #9)
> Ok, you want a test case, here is a simple one: ;-)
That was quick.
Note, however, that the best way to provide a test is to
follow the conventions used by the existing tests;
see e.g. none/tests/x86/lzcnt32.* or none/tests/amd64/sse4-64.*

A better test might better motivate someone who knows SSE4 better than I do :)
Comment 11 Rhys Kidd 2015-07-22 04:55:38 UTC
*** Bug 349891 has been marked as a duplicate of this bug. ***