267630 – Add support for IBM Power ISA 2.06 -- stage 1

Bug 267630 - Add support for IBM Power ISA 2.06 -- stage 1

Summary: Add support for IBM Power ISA 2.06 -- stage 1

Status:	RESOLVED FIXED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	general (show other bugs)
Version:	3.7 SVN
Platform:	unspecified Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Julian Seward

URL:
Keywords:

Depends on:	259977
Blocks:
	Show dependency tree / graph

Reported:	2011-03-04 16:06 UTC by Maynard Johnson
Modified:	2011-04-16 01:03 UTC (History)
CC List:	1 user (show)

See Also:
Latest Commit:
Version Fixed In:

Attachments
Patch to add support for POWER7 instructions used by GLIBC (495.53 KB, patch) 2011-03-04 16:06 UTC, Maynard Johnson	Details
Version 2 of stage 1 patch for Power ISA 2.06 (496.63 KB, patch) 2011-03-29 01:39 UTC, Maynard Johnson	Details
Add support for new IR ops to memcheck (504 bytes, patch) 2011-03-30 23:26 UTC, Maynard Johnson	Details
Version 3 of stage 1 patch for Power ISA 2.06 (498.03 KB, patch) 2011-04-01 19:18 UTC, Maynard Johnson	Details
Version 4 of stage 1 patch for Power ISA 2.06 (497.27 KB, patch) 2011-04-12 19:13 UTC, Maynard Johnson	Details
Show Obsolete (4) View All Add an attachment

Note You need to log in before you can comment on or make changes to this bug.

Description Maynard Johnson 2011-03-04 16:06:29 UTC

Created attachment 57672 [details]
Patch to add support for POWER7 instructions used by GLIBC

The purpose of this bug is to add support for a subset of the new instructions in IBM Power ISA 2.06 (i.e., IBM POWER7 processor).  The patch attached to this bug is stage 1 of a set of patches that will be submitted over time.  This first patch implements instructions that are used by GLIBC.  These were deemed the most critical to support as soon as possible since application developers who compile and link their programs on an IBM POWER7 system (on RHEL 6 or SLES 11 SP1) will automatically link to power7-specific GLIBC libraries (e.g., libc, libm).  Doing so will almost always result in the program using some new POWER7 instructions.

NOTE: The attached patch is dependent on another patch that was submitted via bug Valgrind bug #259977 (https://bugs.kde.org/show_bug.cgi?id=259977).

Comment 1 Maynard Johnson 2011-03-04 17:11:23 UTC

The new shell script, tests/check_isa-2_06_cap, must be made executable before running the new testcases included with this patch.

Comment 2 Maynard Johnson 2011-03-04 17:32:47 UTC

The attached patch was created against a March 3, 2011 snapshot of valgrind svn.  The testsuite results for the POWER7-patched valgrind when I ran on a POWER7/SLES 11 SP1 system had 15 failing testcases.  When I ran the testsuite for the pristine snapshot on a POWER6/SLES 10 SP2 system, there were 17 failing testcases.  While not a perfect apples-to-apples comparison, it's a good sign that I had fewer test failures on the patched tree.  I'm quite confident that my patch has not caused any regressions.  I do plan on scrubbing the testsuite failures at some point, but getting the new support added is higher priority at this time.

Comment 3 Julian Seward 2011-03-28 12:08:17 UTC

Some prelim comments/queries:

* there's a whole bunch of renaming of FPRn to VSXn in the guest
  state.  What's the deal?  How does this relate to existing 
  Altivec and FP support?  There's no lossage of support for
  older processors, eg 7447 in old PPC Mac Minis, right?

* +      Iop_Cnt64,  /* count ones: I64 -> I64 */
  +      Iop_I64UtoF64, /* IRRoundingMode(I32) x unsigned I64 -> F64 */
  +      Iop_I64UtoF32, /* IRRoundingMode(I32) x unsigned I64 -> F32 */

  I prefer to avoid new primops wherever possible.  Cnt64 you
  can avoid by copying gen_POPCOUNT in guest_amd64_toIR.c.  The other
  two .. I guess they are unavoidable.

* Does the recent s390x merge simplify/impact on this at all?

* The patch consists mostly of testcases + expected outputs.
  I like that :-)

Comment 4 Christian Borntraeger 2011-03-28 12:13:16 UTC

Am 28.03.2011 12:08, schrieb Julian Seward:
> https://bugs.kde.org/show_bug.cgi?id=267630
> 
> 
> 
> 
> 
> --- Comment #3 from Julian Seward <jseward acm org>  2011-03-28 12:08:17 ---
> Some prelim comments/queries:
> 
> * there's a whole bunch of renaming of FPRn to VSXn in the guest
>   state.  What's the deal?  How does this relate to existing 
>   Altivec and FP support?  There's no lossage of support for
>   older processors, eg 7447 in old PPC Mac Minis, right?
> 
> * +      Iop_Cnt64,  /* count ones: I64 -> I64 */
>   +      Iop_I64UtoF64, /* IRRoundingMode(I32) x unsigned I64 -> F64 */
>   +      Iop_I64UtoF32, /* IRRoundingMode(I32) x unsigned I64 -> F32 */
[...]
> * Does the recent s390x merge simplify/impact on this at all?

Probably not, since the unsigned conversions (called convert from/to logical)
were added with the z196 processor.

Since gcc 4.6 was now released (and is the first gcc with z196 support) 
I am currently cooking up full support for the z196 processor.  I will also
need unsigned int <-> float conversion. (all combinations of 32/64 unsigned
integers with 32/64/128 floats). In fact, I am currently working on these.
So we will need I*UtoF* for s390x as well.

Christian

Comment 5 Christian Borntraeger 2011-03-28 12:23:45 UTC

For reference, this is the current state of my patch converning libvex_ir.h.
ir_defs.c and mc_translate.c are also impacted.

--- valgrind-base//VEX/pub/libvex_ir.h  2011-03-14 13:35:18.067655000 +0100
+++ valgrind-ibm//VEX/pub/libvex_ir.h   2011-03-25 15:31:06.626104000 +0100
@@ -618,12 +618,19 @@ typedef
       Iop_I32StoF64, /*                       signed I32 -> F64 */
       Iop_I64StoF64, /* IRRoundingMode(I32) x signed I64 -> F64 */
 
+      Iop_I32UtoF32, /* IRRoundingMode(I32) x unsigned I32 -> F32 */
       Iop_I32UtoF64, /*                       unsigned I32 -> F64 */
+      Iop_I64UtoF32, /* IRRoundingMode(I32) x unsigned I64 -> F32 */
+      Iop_I64UtoF64, /* IRRoundingMode(I32) x unsigned I64 -> F64 */
 
       Iop_F32toI16S, /* IRRoundingMode(I32) x F32 -> signed I16 */
       Iop_F32toI32S, /* IRRoundingMode(I32) x F32 -> signed I32 */
       Iop_F32toI64S, /* IRRoundingMode(I32) x F32 -> signed I64 */
 
+      Iop_F32toI32U, /* IRRoundingMode(I32) x F32 -> unsigned I32 */
+      Iop_F32toI64U, /* IRRoundingMode(I32) x F32 -> unsigned I64 */
+      Iop_F64toI64U, /* IRRoundingMode(I32) x F64 -> unsigned I64 */
+
       Iop_I16StoF32, /*                       signed I16 -> F32 */
       Iop_I32StoF32, /* IRRoundingMode(I32) x signed I32 -> F32 */
       Iop_I64StoF32, /* IRRoundingMode(I32) x signed I64 -> F32 */
@@ -653,11 +660,15 @@ typedef
 
       Iop_I32StoF128, /*                signed I32  -> F128 */
       Iop_I64StoF128, /*                signed I64  -> F128 */
+      Iop_I32UtoF128, /*              unsigned I32  -> F128 */
+      Iop_I64UtoF128,/*               unsigned I64  -> F128 */
       Iop_F32toF128,  /*                       F32  -> F128 */
       Iop_F64toF128,  /*                       F64  -> F128 */
 
       Iop_F128toI32S, /* IRRoundingMode(I32) x F128 -> signed I32  */
       Iop_F128toI64S, /* IRRoundingMode(I32) x F128 -> signed I64  */
+      Iop_F128toI32U, /* IRRoundingMode(I32) x F128 -> unsigned I32  */
+      Iop_F128toI64U, /* IRRoundingMode(I32) x F128 -> unsigned I64  */
       Iop_F128toF64,  /* IRRoundingMode(I32) x F128 -> F64         */
       Iop_F128toF32,  /* IRRoundingMode(I32) x F128 -> F32         */

Comment 6 Julian Seward 2011-03-28 12:37:18 UTC

(In reply to comment #4)
> So we will need I*UtoF* for s390x as well.

Sure.  I looked over the existing set and it's a bit ad-hoc, so
making it more complete is no bad thing.

Comment 7 Maynard Johnson 2011-03-28 21:23:21 UTC

(In reply to comment #3)
> Some prelim comments/queries:
> 
> * there's a whole bunch of renaming of FPRn to VSXn in the guest
>   state.  What's the deal?  How does this relate to existing 
>   Altivec and FP support?  There's no lossage of support for
>   older processors, eg 7447 in old PPC Mac Minis, right?
As described on page 273, "7.1.1 Overview of the
Vector-Scalar Extension", in the Power ISA 2.06 document (http://www.power.org/resources/downloads/PowerISA_V2.06B_V2_PUBLIC.pdf), the FP registers and vector registers have been combined into one unified register file -- the VSX register set.  This patch creates this unified register file and remaps all existing FP and vector register accesses to this new data structure.  This has absolutely no impact on the behavior of the existing FP and vector support in valgrind, other than to store the data in the new register file.  It was cleaner to remove the old stuff and do the remapping than to have the redundant separate set of FP and vector registers.

I had tested the patch on some non-POWER7 systems before I posted it, and I know the results of regtest were good.  But I realized today that I apparently missed something. My earlier testing must have been done either without the new testcase stuff applied or on a newer distro (or toolchain).  When I applied the full code+testcase patch (attached to this bug) on a POWER5/SLES 10 system today, it failed to compile because that version of gcc did not support the '-mvsx' option needed by the new testcase.  :-/   I hacked in a HAS_VSX macro to the configure and used that in the none/tests/ppc[32|64]/Makefile.am to decide whether or not to pass '-mvsx' to gcc.  After doing so, I ran the testsuite and compared it to an unmodified svn checkout, and the results were identical.

> 
> * +      Iop_Cnt64,  /* count ones: I64 -> I64 */
>   +      Iop_I64UtoF64, /* IRRoundingMode(I32) x unsigned I64 -> F64 */
>   +      Iop_I64UtoF32, /* IRRoundingMode(I32) x unsigned I64 -> F32 */
> 
>   I prefer to avoid new primops wherever possible.  Cnt64 you
>   can avoid by copying gen_POPCOUNT in guest_amd64_toIR.c.  The other
>   two .. I guess they are unavoidable.

I'll take a look at gen_POPCOUNT and, hopefully be able to leverage that code.  I wasn't aware of its existence.  Thanks for the tip.

So, aside from probably updating the patch with code from gen_POPCOUNT, I have the testcase compilation issue I need to fix.  Would you like to me fix these two issues now and post a new patch?

Thanks.
> 
> * Does the recent s390x merge simplify/impact on this at all?
> 
> * The patch consists mostly of testcases + expected outputs.
>   I like that :-)

Comment 8 Julian Seward 2011-03-28 22:34:26 UTC

(In reply to comment #7)
> As described on page 273, "7.1.1 Overview of the
> Vector-Scalar Extension", in the Power ISA 2.06 document
> (http://www.power.org/resources/downloads/PowerISA_V2.06B_V2_PUBLIC.pdf), the
> FP registers and vector registers have been combined into one unified register
> file -- the VSX register set.  [...]

k, fine.  Thx for the clarification.

> I'll take a look at gen_POPCOUNT and, hopefully be able to leverage
> that code.

Should be easy.
 
> the testcase compilation issue I need to fix.  Would you like to me fix these
> two issues now and post a new patch?

Yes, pls respin.

Comment 9 Maynard Johnson 2011-03-29 01:39:40 UTC

Created attachment 58412 [details]
Version 2 of stage 1 patch for Power ISA 2.06

The updates to the patch fixes problems with testcase compilation and changes the implementation of popcntd so as not to add a new IR operation.  This patch was respun against today's svn.

Just a reminder . . . this patch is dependent on the patch attached to valgrind bug #259977.

Comment 10 Maynard Johnson 2011-03-30 23:26:47 UTC

Created attachment 58470 [details]
Add support for new IR ops to memcheck

It came to my attention today that memcheck had a problem running the new testcase provided with this POWER7 support patch.  It was a simple change to memcheck/mc_translate.c, so I've attached a patch for just that change instead of respinning the whole POWER7 support patch.

Comment 11 Maynard Johnson 2011-03-31 18:30:07 UTC

(In reply to comment #9)
> Created an attachment (id=58412) [details]
> Version 2 of stage 1 patch for Power ISA 2.06
> 
> The updates to the patch fixes problems with testcase compilation and changes
> the implementation of popcntd so as not to add a new IR operation.  This patch
> was respun against today's svn.
I missed something when re-implementing popcntd.  Since it's no longer implemented as a native instruction, we don't have to bail out of VEX/priv/guest_ppc_toIR.c:disInstr_PPC_WRK if allow_VX is false.

Julian, if you're in the middle of reviewing this patch, maybe you'd prefer that I attach another small patch to the bug (as with the memcheck patch).  But if you'd like, I can re-spin the entire patch with this fix + the memcheck patch and mark the individual memcheck as obsolete.

> 
> Just a reminder . . . this patch is dependent on the patch attached to valgrind
> bug #259977.

Comment 12 Julian Seward 2011-04-01 18:58:33 UTC

(In reply to comment #11)
> Julian, if you're in the middle of reviewing this patch, maybe you'd
> prefer that I attach another small patch to the bug (as with the
> memcheck patch).  But if you'd like, I can re-spin the entire patch
> with this fix + the memcheck patch and mark the individual memcheck
> as obsolete.

If it's easy for you to do, I'd prefer a respin of the entire patch;
no big deal if not though.

Comment 13 Maynard Johnson 2011-04-01 19:18:01 UTC

Created attachment 58505 [details]
Version 3 of stage 1 patch for Power ISA 2.06

As mentioned in comment #11 of this bug and as requested by Julian in his response, this patch is a respin of version 2 that includes the aforementioned memcheck fix, plus the "complete" fix to the rework of the popcnt insn.

Comment 14 Maynard Johnson 2011-04-11 20:19:46 UTC

Julian,
On March 30, I sent an email to you describing a problem I had when running the 32-bit test_isa_2_06_part1 under memcheck.  I have a patch for that problem (adds 32-bit "Iop_Not64" support to VEX/priv/host_ppc_isel.c).  Would you prefer a new, separate patch attachment just for that fix or do you want me to respin the entire "stage 1 POWER7" patch?

Comment 15 Julian Seward 2011-04-12 00:11:17 UTC

(In reply to comment #14)
> (adds 32-bit "Iop_Not64" support to VEX/priv/host_ppc_isel.c).  Would you
> prefer a new, separate patch attachment just for that fix

Please file a new bug report w/ the separate patch just for that fix.
It's much easier to track stuff when each bug report tracks just
one issue -- addressing multiple problems in one bug report leads
to confusion.

Comment 16 Maynard Johnson 2011-04-12 14:18:50 UTC

(In reply to comment #15)
> (In reply to comment #14)
> > (adds 32-bit "Iop_Not64" support to VEX/priv/host_ppc_isel.c).  Would you
> > prefer a new, separate patch attachment just for that fix
> 
> Please file a new bug report w/ the separate patch just for that fix.
> It's much easier to track stuff when each bug report tracks just
> one issue -- addressing multiple problems in one bug report leads
> to confusion.
I can certainly do that, but I just wanted to be clear that this "Iop_Not64" fix addresses a problem with the stage 1 POWER7 patch associated with this bugzilla.  If you still want a new bug opened, then I should probably wait until you commit the stage 1 POWER7 patch, right?

Comment 17 Julian Seward 2011-04-12 14:28:07 UTC

(In reply to comment #16)
> bugzilla.  If you still want a new bug opened, then I should probably wait
> until you commit the stage 1 POWER7 patch, right?

Really .. create the new bug right now.  You can use the "Depends on"
and "Blocks" fields to indicate the connection between it and this
one (you only need to fill in one end of the link, so to speak.)

Comment 18 Maynard Johnson 2011-04-12 19:13:51 UTC

Created attachment 58856 [details]
Version 4 of stage 1 patch for Power ISA 2.06

This version of the patch is essentially the same as Version 3 except that the changes to coregrind/m_machine.c to detect VSX/ISA 2.06 capabilities has been changed to use the new internal setjmp/longjmp (VG_MINIMAL_SETJMP).

Comment 19 Julian Seward 2011-04-14 19:25:13 UTC

(In reply to comment #18)
> Created an attachment (id=58856) [details]
> Version 4 of stage 1 patch for Power ISA 2.06

Weirdness.

When I build on a PPC970 with Fedora 10 and the following (default)
toolchain components, there are no build problems, neither for
Valgrind itself nor for the regression tests.

  $ cat /etc/issue
  Fedora release 10 (Cambridge)
  Kernel \r on an \m (\l)

  $ uname -a
  Linux g5 2.6.27.30-170.2.82.fc10.ppc64 #1 SMP
  Mon Aug 17 08:11:20 EDT 2009 ppc64 ppc64 ppc64 GNU/Linux

  $ gcc --version
  gcc (GCC) 4.3.2 20081105 (Red Hat 4.3.2-7)

  $ as --version
  GNU assembler version 2.18.50.0.9-8.fc10 20080822

However, when I build with a from-source build of gcc-4.6.0 (vanilla FSF)

  $ gcc --version
  gcc (GCC) 4.6.0

then building of the regression tests fails as shown below.  Do you
also see that?  Any ideas?  We need to have working support for
gcc-4.6.0 in the next Valgrind release.


gcc -DHAVE_CONFIG_H -I. -I../../..  -I../../.. -I../../../include -I../../../coregrind -I../../../include -I../../../VEX/pub -DVGA_ppc64=1 -DVGO_linux=1 -DVGP_ppc64_linux=1  -Winline -Wall -Wshadow -g -m32 -mvsx -Winline -Wall -O -g -mregnames -maltivec -m32 -DHAS_ALTIVEC -Wno-long-long  -Wno-pointer-sign -fno-stack-protector -MT jm_insns-jm-insns.o -MD -MP -MF .deps/jm_insns-jm-insns.Tpo -c -o jm_insns-jm-insns.o `test -f 'jm-insns.c' || echo './'`jm-insns.c
jm-insns.c: In function ‘test_int_two_args’:
jm-insns.c:4631:22: warning: variable ‘zap_hi32’ set but not used [-Wunused-but-set-variable]
/tmp/ccm388fz.s: Assembler messages:
/tmp/ccm388fz.s:5424: Error: Unrecognized opcode: `stxvw4x'
/tmp/ccm388fz.s:5447: Error: Unrecognized opcode: `xxlxor'
/tmp/ccm388fz.s:5477: Error: Unrecognized opcode: `stxvw4x'
/tmp/ccm388fz.s:5483: Error: Unrecognized opcode: `lxvw4x'
/tmp/ccm388fz.s:5485: Error: Unrecognized opcode: `stxvw4x'
/tmp/ccm388fz.s:5488: Error: Unrecognized opcode: `stxvw4x'
/tmp/ccm388fz.s:5490: Error: Unrecognized opcode: `stxvw4x'
/tmp/ccm388fz.s:5504: Error: Unrecognized opcode: `lxvw4x'
/tmp/ccm388fz.s:5529: Error: Unrecognized opcode: `lxvw4x'
/tmp/ccm388fz.s:5545: Error: Unrecognized opcode: `stxvw4x'
/tmp/ccm388fz.s:5559: Error: Unrecognized opcode: `stxvw4x'
/tmp/ccm388fz.s:5578: Error: Unrecognized opcode: `lxvw4x'
/tmp/ccm388fz.s:5618: Error: Unrecognized opcode: `lxvw4x'
/tmp/ccm388fz.s:6248: Error: Unrecognized opcode: `stxvw4x'
/tmp/ccm388fz.s:6251: Error: Unrecognized opcode: `stxvw4x'
[..zillions more omitted..]

Comment 20 Maynard Johnson 2011-04-14 20:15:40 UTC

(In reply to comment #19)
> (In reply to comment #18)
> > Created an attachment (id=58856) [details] [details]
> > Version 4 of stage 1 patch for Power ISA 2.06
> 
> Weirdness.
> 
> When I build on a PPC970 with Fedora 10 and the following (default)
> toolchain components, there are no build problems, neither for
> Valgrind itself nor for the regression tests.
> 
>   $ cat /etc/issue
>   Fedora release 10 (Cambridge)
>   Kernel \r on an \m (\l)
> 
>   $ uname -a
>   Linux g5 2.6.27.30-170.2.82.fc10.ppc64 #1 SMP
>   Mon Aug 17 08:11:20 EDT 2009 ppc64 ppc64 ppc64 GNU/Linux
> 
>   $ gcc --version
>   gcc (GCC) 4.3.2 20081105 (Red Hat 4.3.2-7)
> 
>   $ as --version
>   GNU assembler version 2.18.50.0.9-8.fc10 20080822
> 
> However, when I build with a from-source build of gcc-4.6.0 (vanilla FSF)
> 
>   $ gcc --version
>   gcc (GCC) 4.6.0
> 
> then building of the regression tests fails as shown below.  Do you
> also see that?  Any ideas?  We need to have working support for
> gcc-4.6.0 in the next Valgrind release.
> 
> 
> gcc -DHAVE_CONFIG_H -I. -I../../..  -I../../.. -I../../../include
> -I../../../coregrind -I../../../include -I../../../VEX/pub -DVGA_ppc64=1
> -DVGO_linux=1 -DVGP_ppc64_linux=1  -Winline -Wall -Wshadow -g -m32 -mvsx

Bug # 270794 contains a patch to remove the '-mvsx' from the build command for jm-insns.c.  That was mistakenly done when I updated the none/tests/ppc{32|64}/Makefile.am to add the new ISA 2.06 testcase.  But from the compile errors shown, I suspect you'll run into similar errors when trying to rightly compile the new ISA 2.06 testcase.  And I don't know off-hand why that would be.  Please let me know if my suspicion is correct -- and I'll do some digging on my end, too.

> -Winline -Wall -O -g -mregnames -maltivec -m32 -DHAS_ALTIVEC -Wno-long-long 
> -Wno-pointer-sign -fno-stack-protector -MT jm_insns-jm-insns.o -MD -MP -MF
> .deps/jm_insns-jm-insns.Tpo -c -o jm_insns-jm-insns.o `test -f 'jm-insns.c' ||
> echo './'`jm-insns.c
> jm-insns.c: In function ‘test_int_two_args’:
> jm-insns.c:4631:22: warning: variable ‘zap_hi32’ set but not used
> [-Wunused-but-set-variable]
> /tmp/ccm388fz.s: Assembler messages:
> /tmp/ccm388fz.s:5424: Error: Unrecognized opcode: `stxvw4x'
> /tmp/ccm388fz.s:5447: Error: Unrecognized opcode: `xxlxor'
> /tmp/ccm388fz.s:5477: Error: Unrecognized opcode: `stxvw4x'
> /tmp/ccm388fz.s:5483: Error: Unrecognized opcode: `lxvw4x'
> /tmp/ccm388fz.s:5485: Error: Unrecognized opcode: `stxvw4x'
> /tmp/ccm388fz.s:5488: Error: Unrecognized opcode: `stxvw4x'
> /tmp/ccm388fz.s:5490: Error: Unrecognized opcode: `stxvw4x'
> /tmp/ccm388fz.s:5504: Error: Unrecognized opcode: `lxvw4x'
> /tmp/ccm388fz.s:5529: Error: Unrecognized opcode: `lxvw4x'
> /tmp/ccm388fz.s:5545: Error: Unrecognized opcode: `stxvw4x'
> /tmp/ccm388fz.s:5559: Error: Unrecognized opcode: `stxvw4x'
> /tmp/ccm388fz.s:5578: Error: Unrecognized opcode: `lxvw4x'
> /tmp/ccm388fz.s:5618: Error: Unrecognized opcode: `lxvw4x'
> /tmp/ccm388fz.s:6248: Error: Unrecognized opcode: `stxvw4x'
> /tmp/ccm388fz.s:6251: Error: Unrecognized opcode: `stxvw4x'
> [..zillions more omitted..]

Comment 21 Maynard Johnson 2011-04-14 20:23:57 UTC

(In reply to comment #20)
> (In reply to comment #19)
> > (In reply to comment #18)
> > > Created an attachment (id=58856) [details] [details] [details]
> > > Version 4 of stage 1 patch for Power ISA 2.06
> > 
> > Weirdness.
> > 
> > When I build on a PPC970 with Fedora 10 and the following (default)
> > toolchain components, there are no build problems, neither for
> > Valgrind itself nor for the regression tests.
> > 
> >   $ cat /etc/issue
> >   Fedora release 10 (Cambridge)
> >   Kernel \r on an \m (\l)
> > 
> >   $ uname -a
> >   Linux g5 2.6.27.30-170.2.82.fc10.ppc64 #1 SMP
> >   Mon Aug 17 08:11:20 EDT 2009 ppc64 ppc64 ppc64 GNU/Linux
> > 
> >   $ gcc --version
> >   gcc (GCC) 4.3.2 20081105 (Red Hat 4.3.2-7)
> > 
> >   $ as --version
> >   GNU assembler version 2.18.50.0.9-8.fc10 20080822
> > 
> > However, when I build with a from-source build of gcc-4.6.0 (vanilla FSF)
> > 
> >   $ gcc --version
> >   gcc (GCC) 4.6.0
> > 
> > then building of the regression tests fails as shown below.  Do you
> > also see that?  Any ideas?  We need to have working support for
> > gcc-4.6.0 in the next Valgrind release.
> > 
> > 
> > gcc -DHAVE_CONFIG_H -I. -I../../..  -I../../.. -I../../../include
> > -I../../../coregrind -I../../../include -I../../../VEX/pub -DVGA_ppc64=1
> > -DVGO_linux=1 -DVGP_ppc64_linux=1  -Winline -Wall -Wshadow -g -m32 -mvsx
> 
> Bug # 270794 contains a patch to remove the '-mvsx' from the build command for
> jm-insns.c.  That was mistakenly done when I updated the
> none/tests/ppc{32|64}/Makefile.am to add the new ISA 2.06 testcase.  But from
> the compile errors shown, I suspect you'll run into similar errors when trying
> to rightly compile the new ISA 2.06 testcase.  And I don't know off-hand why
> that would be.  Please let me know if my suspicion is correct -- and I'll do
> some digging on my end, too.
> 
> > -Winline -Wall -O -g -mregnames -maltivec -m32 -DHAS_ALTIVEC -Wno-long-long 
> > -Wno-pointer-sign -fno-stack-protector -MT jm_insns-jm-insns.o -MD -MP -MF
> > .deps/jm_insns-jm-insns.Tpo -c -o jm_insns-jm-insns.o `test -f 'jm-insns.c' ||
> > echo './'`jm-insns.c
> > jm-insns.c: In function ‘test_int_two_args’:
> > jm-insns.c:4631:22: warning: variable ‘zap_hi32’ set but not used
> > [-Wunused-but-set-variable]
> > /tmp/ccm388fz.s: Assembler messages:
> > /tmp/ccm388fz.s:5424: Error: Unrecognized opcode: `stxvw4x'

You need to update your binutils.  The latest release 2.21 should work.  And then rebuild gcc 4.6 and configure it with either --with-ld=/path/to/new/ld or by setting LD=/path/to/new/ld.

Sorry, I should have mentioned this requirement up front.

[snip]

Comment 22 Julian Seward 2011-04-15 13:45:45 UTC

(In reply to comment #21)
> > > /tmp/ccm388fz.s:5424: Error: Unrecognized opcode: `stxvw4x'
> 
> You need to update your binutils.

That's less than ideal.  If we had the same approach for x86
instruction set extensions, we'd be in a situation where the test
suite didn't compile on a large fraction of end-user machines.

I applied the 270794 patch on top of this one, and then edited the
configure.in test so it requires that both gcc accepts -mvsx and that
the assembler understands such instructions.  That makes it build with
gcc-4.6 on my 970.

I'll commit it all shortly.  You need to check that the vsx tests
are actually still built and work on a complete Power7 system, though.

Please also verify that all the new tests you introduced run without
asserting on "--tool=memcheck --track-origins=yes", both in 32- and
64-bit modes.  (Once this lands.)

Comment 23 Julian Seward 2011-04-15 13:51:33 UTC

Committed, r11697 and r2127.

Comment 24 Maynard Johnson 2011-04-15 17:08:53 UTC

Julian Seward wrote:
> https://bugs.kde.org/show_bug.cgi?id=267630

Julian,
First, I want to thank you for your reviews and your help in getting this initial POWER7 support patch into upstream valgrind.  :-)

I have a couple other comments below.

> 
> 
> 
> 
> 
> --- Comment #22 from Julian Seward <jseward acm org>  2011-04-15 13:45:45 ---
> (In reply to comment #21)
>>>> /tmp/ccm388fz.s:5424: Error: Unrecognized opcode: `stxvw4x'
>>
>> You need to update your binutils.
> 
> That's less than ideal.  If we had the same approach for x86
> instruction set extensions, we'd be in a situation where the test
> suite didn't compile on a large fraction of end-user machines.
> 
> I applied the 270794 patch on top of this one, and then edited the
> configure.in test so it requires that both gcc accepts -mvsx and that
> the assembler understands such instructions.  That makes it build with
> gcc-4.6 on my 970.

Generally speaking, most people would be using a distro-supplied toolchain where the compiler and assembler are properly matched up.  But with that said, I'm certainly OK with the addition of the assembler check.  Can't hurt to ensure against a mis-matched assembler and compiler.

> 
> I'll commit it all shortly.  You need to check that the vsx tests
> are actually still built and work on a complete Power7 system, though.

I checked it out.  Everything builds and runs fine on a POWER5 system with older toolchain.  But when I built on POWER7 with a new toolchain, the config test for VSX failed.  In config.log, I get:

	configure:7307: gcc -c -mvsx conftest.c >&5
	conftest.c: In function 'main':
	conftest.c:38:3: error: invalid 'asm': operand number missing after %-letter
	conftest.c:38:3: error: invalid 'asm': operand number missing after %-letter
	conftest.c:38:3: error: invalid 'asm': operand number missing after %-letter

The line in configure.in that it's complaining about is:
	__asm__ __volatile__("xsmaddadp %vs32, %vs32, %vs33" ::: "memory","cc");

The "%" should not be included, and we'd also need to add " -Wa,-regnames" to the compile line.  Alternatively, the line could be replaced with:

	__asm__ __volatile__("xsmaddadp 32, 32, 33" ::: "memory","cc");

I'll open a bug report for this.

With configure.in patched as above (either option), the configure runs as expected on POWER7 and valgrind builds OK, with one caveat.  There is a small patch needed for coregrind/m_dispatch/dispatch-ppc64-linux.S when building with binutils 2.21.  I had not opened a bug for this previously because I didn't want to distract from the main issue of the initial POWER7 support patch.  But now I'll open a bug report for that, too.

> 
> Please also verify that all the new tests you introduced run without
> asserting on "--tool=memcheck --track-origins=yes", both in 32- and
> 64-bit modes.  (Once this lands.)

Yes, I'll do that.  Thanks again!

-Maynard

>

Comment 25 Maynard Johnson 2011-04-15 18:56:15 UTC

(In reply to comment #22)
> 
[snip]
> Please also verify that all the new tests you introduced run without
> asserting on "--tool=memcheck --track-origins=yes", both in 32- and
> 64-bit modes.  (Once this lands.)
After applying the patches for bug # 270851 and bug # 270856, running memcheck with "--track-origins=yes" against both the 32-bit and 64-bit test_isa_2_06_part1 testcase pass with no problems.  Both of the above patches are needed to prevent the errors documented in the above-mentioned bugs.

Comment 26 Julian Seward 2011-04-15 23:19:01 UTC

(In reply to comment #24)
>     configure:7307: gcc -c -mvsx conftest.c >&5
>     conftest.c: In function 'main':
>     conftest.c:38:3: error: invalid 'asm': operand number missing after
> %-letter
>     conftest.c:38:3: error: invalid 'asm': operand number missing after
> %-letter
>     conftest.c:38:3: error: invalid 'asm': operand number missing after
> %-letter
> 
> The line in configure.in that it's complaining about is:
>     __asm__ __volatile__("xsmaddadp %vs32, %vs32, %vs33" ::: "memory","cc");
> 
> The "%" should not be included, and we'd also need to add " -Wa,-regnames" to
> the compile line.  Alternatively, the line could be replaced with:
> 
>     __asm__ __volatile__("xsmaddadp 32, 32, 33" ::: "memory","cc");

Yeah, I suspected something like that might happen :-)  I committed the
above line as r11699 (closing bug 271042) to fix it.


> expected on POWER7 and valgrind builds OK, with one caveat.  There is a small
> patch needed for coregrind/m_dispatch/dispatch-ppc64-linux.S when building with
> binutils 2.21.  I had not opened a bug for this previously because I didn't

and committed r11700 (closing bug 271043).