Bug 506211

Summary:	Constant folding improvements
Product:	[Developer tools] valgrind	Reporter:	Florian Krohm <flo2030>
Component:	vex	Assignee:	Florian Krohm <flo2030>
Status:	RESOLVED FIXED
Severity:	normal	CC:	sam
Priority:	NOR
Version First Reported In:	unspecified
Target Milestone:	---
Platform:	Other
OS:	Linux
Latest Commit:		Version Fixed/Implemented In:
Sentry Crash Report:
Attachments:	audit of constant folding and algebraic simplification constant folding update

Description Florian Krohm 2025-06-26 12:37:12 UTC

Created attachment 182684 [details]
audit of constant folding and algebraic simplification

Are there missed constant folding opportunities in VEX ir_opt.c?  Yes.
Attached is a spreadsheet that shows the status of constant folding and algebraic simplification for IROps that have integer operands and results. Function fold_Expr_WRK is the work horse. There are also fold_IRExpr_Binop and fold_IRExpr_Unop. I did not look at those. 

Here's the plan:
1) Write iropt-test similar to vbit-test using the IR injection machinery to verify that constant folding works as expected.
2) Verify existing constant folding :)
3) Fill in the constant folding gaps running iropt-test alongside.
4) Explore whether IR injection can be extended to allow symbolic verification of algebraic simplifications.
E.g. verify that Iop_And32(x, 0)  --> x   without assigning selected constants to x thereby transforming the issue to constant folding (and having only partial coverage).

Current status: initial version of iropt-test verified that constant folding for Iop_Not8 works :)

Comment 1 Florian Krohm 2025-07-12 16:08:16 UTC

1752297cb1 adds the checker for VEX constant folding.
As expected no surprises.

Comment 2 Florian Krohm 2025-08-22 17:51:34 UTC

25b9956679   updates the iropt tester to run both with and without constant
folding. For now only s390 is enabled because there are failures on the
other supported architectures. Here are a few samples:

ppc failures:

*** Incorrect result for operator MullU32    ppc64be, ppc64le
    opnd 0:          00000001
    opnd 1:          fffffffe
    result fold:     00000000fffffffe
    result nofold:   fffffffffffffffe
    result expected: 00000000fffffffe

*** Incorrect result for operator DivU32E    ppc64be, ppc64le, ppc32
    opnd 0:          000000ff
    opnd 1:          00000002
    result fold:     80000000
    result nofold:   00000000
    result expected: 80000000

*** Incorrect result for operator DivS32E    ppc64be, ppc64le, ppc32
    opnd 0:          00000001
    opnd 1:          00000002
    result fold:     80000000
    result nofold:   00000000
    result expected: 80000000

amd64 failures:
*** Incorrect result for operator ClzNat64
    opnd 0:          0000000000000000
    result fold:     0000000000000040
    result nofold:   fffffffffffff870
    result expected: 0000000000000040
*** Incorrect result for operator CtzNat64
    opnd 0:          0000000000000000
    result fold:     0000000000000040
    result nofold:   00000000000173da
    result expected: 0000000000000040

This is understood. The reason is that ClzNat64 is implemented via the BSF
insn which has undefined behaviour when the operand is zero. Clearly,
this is wrong and needs to be fixed.
See also https://bugs.kde.org/show_bug.cgi?id=507033
This error does not surface in normal valgrind operation because in IR
generation there is scaffolding to handle a 0 operand.

x86 failures:
*** Incorrect result for operator ClzNat32
    opnd 0:          00000000
    result fold:     00000020
    result nofold:   7e0a7c0f
    result expected: 00000020
*** Incorrect result for operator CtzNat32
    opnd 0:          00000000
    result fold:     00000020
    result nofold:   81f58410
    result expected: 00000020

See amd64 explanation.

Comment 3 Florian Krohm 2025-08-26 21:06:17 UTC

The amd64 and x86 failures have been fixed in 7389eabaf31.
iropt-test is now enabled for x86 and amd64 as well.

Comment 4 Florian Krohm 2025-08-30 11:48:30 UTC

Created attachment 184583 [details]
constant folding update

The fields with green background identify the IROps for which constant folding has
been added.
No support for IROps using 128 bit values. I will open a separate BZ for that.
This concludes part #3 of the plan outlined in https://bugs.kde.org/show_bug.cgi?id=506211#c0

Comment 5 Florian Krohm 2025-10-01 14:42:04 UTC

I'm calling it a day.
Constant folding for integer IROps not yielding or operating on 128-bit values has been
finished.  There has also been moderate addition of algebraic simplification rules.
One can keep adding rules but it is often not at all clear whether there is a benefit.
So adding rules is best done when a particular inefficiency has been observed.

As for part #4 in the original plan https://bugs.kde.org/show_bug.cgi?id=506211#c0
this cannot be done once you start thinking about it..