| Summary: | Constant folding improvements | ||
|---|---|---|---|
| Product: | [Developer tools] valgrind | Reporter: | Florian Krohm <flo2030> |
| Component: | vex | Assignee: | Florian Krohm <flo2030> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | CC: | sam |
| Priority: | NOR | ||
| Version First Reported In: | unspecified | ||
| Target Milestone: | --- | ||
| Platform: | Other | ||
| OS: | Linux | ||
| Latest Commit: | Version Fixed/Implemented In: | ||
| Sentry Crash Report: | |||
| Attachments: |
audit of constant folding and algebraic simplification
constant folding update |
||
1752297cb1 adds the checker for VEX constant folding. As expected no surprises. 25b9956679 updates the iropt tester to run both with and without constant
folding. For now only s390 is enabled because there are failures on the
other supported architectures. Here are a few samples:
ppc failures:
*** Incorrect result for operator MullU32 ppc64be, ppc64le
opnd 0: 00000001
opnd 1: fffffffe
result fold: 00000000fffffffe
result nofold: fffffffffffffffe
result expected: 00000000fffffffe
*** Incorrect result for operator DivU32E ppc64be, ppc64le, ppc32
opnd 0: 000000ff
opnd 1: 00000002
result fold: 80000000
result nofold: 00000000
result expected: 80000000
*** Incorrect result for operator DivS32E ppc64be, ppc64le, ppc32
opnd 0: 00000001
opnd 1: 00000002
result fold: 80000000
result nofold: 00000000
result expected: 80000000
amd64 failures:
*** Incorrect result for operator ClzNat64
opnd 0: 0000000000000000
result fold: 0000000000000040
result nofold: fffffffffffff870
result expected: 0000000000000040
*** Incorrect result for operator CtzNat64
opnd 0: 0000000000000000
result fold: 0000000000000040
result nofold: 00000000000173da
result expected: 0000000000000040
This is understood. The reason is that ClzNat64 is implemented via the BSF
insn which has undefined behaviour when the operand is zero. Clearly,
this is wrong and needs to be fixed.
See also https://bugs.kde.org/show_bug.cgi?id=507033
This error does not surface in normal valgrind operation because in IR
generation there is scaffolding to handle a 0 operand.
x86 failures:
*** Incorrect result for operator ClzNat32
opnd 0: 00000000
result fold: 00000020
result nofold: 7e0a7c0f
result expected: 00000020
*** Incorrect result for operator CtzNat32
opnd 0: 00000000
result fold: 00000020
result nofold: 81f58410
result expected: 00000020
See amd64 explanation.
The amd64 and x86 failures have been fixed in 7389eabaf31. iropt-test is now enabled for x86 and amd64 as well. Created attachment 184583 [details] constant folding update The fields with green background identify the IROps for which constant folding has been added. No support for IROps using 128 bit values. I will open a separate BZ for that. This concludes part #3 of the plan outlined in https://bugs.kde.org/show_bug.cgi?id=506211#c0 I'm calling it a day. Constant folding for integer IROps not yielding or operating on 128-bit values has been finished. There has also been moderate addition of algebraic simplification rules. One can keep adding rules but it is often not at all clear whether there is a benefit. So adding rules is best done when a particular inefficiency has been observed. As for part #4 in the original plan https://bugs.kde.org/show_bug.cgi?id=506211#c0 this cannot be done once you start thinking about it.. |
Created attachment 182684 [details] audit of constant folding and algebraic simplification Are there missed constant folding opportunities in VEX ir_opt.c? Yes. Attached is a spreadsheet that shows the status of constant folding and algebraic simplification for IROps that have integer operands and results. Function fold_Expr_WRK is the work horse. There are also fold_IRExpr_Binop and fold_IRExpr_Unop. I did not look at those. Here's the plan: 1) Write iropt-test similar to vbit-test using the IR injection machinery to verify that constant folding works as expected. 2) Verify existing constant folding :) 3) Fill in the constant folding gaps running iropt-test alongside. 4) Explore whether IR injection can be extended to allow symbolic verification of algebraic simplifications. E.g. verify that Iop_And32(x, 0) --> x without assigning selected constants to x thereby transforming the issue to constant folding (and having only partial coverage). Current status: initial version of iropt-test verified that constant folding for Iop_Not8 works :)