Bug 507033

Summary:	Remove Iop_Clz32/64 and Iop_Ctz32/64
Product:	[Developer tools] valgrind	Reporter:	Florian Krohm <flo2030>
Component:	vex	Assignee:	Florian Krohm <flo2030>
Status:	RESOLVED FIXED
Severity:	normal
Priority:	NOR
Version First Reported In:	unspecified
Target Milestone:	---
Platform:	Other
OS:	Linux
Latest Commit:		Version Fixed/Implemented In:
Sentry Crash Report:
Attachments:	Patch for x86 Patch for amd64 Patch for nanomips Patch for mips Patch for arm Patch for arm64 Patch for ppc Remove Iop_Clz32/64 and Iop_Ctz32/64

Description Florian Krohm 2025-07-14 16:45:58 UTC

From libvex_ir.h:

      /* Ctz64/Ctz32/Clz64/Clz32 are UNDEFINED when given arguments of zero.    
         You must ensure they are never given a zero argument.  As of           
         2018-Nov-14 they are deprecated.  Try to use the Nat variants          
         immediately below, if you can.                                         
      */

I'd say: time to get rid of them.
Announce removal in 3.25.2. Remove in 3.26.. (or whatever the process is)

Comment 1 Florian Krohm 2025-07-23 08:20:13 UTC

The semantics of Iop_Clz32/64 are identical to Iop_ClzNat32/64 unless the input value is zero.
And likewise for Iop_Ctz32/64.

So.... audit the code, check occurrences of above operators and if their use is guarded to ensure
that the input value is != 0 then the operator can be replaced with its "Nat" counterpart.
Easy as pie.
Spoiler alert: not always so easy as there are "interesting" uses of Iop_Clz ....

Comment 2 Florian Krohm 2025-07-23 10:38:57 UTC

Created attachment 183451 [details]
Patch for x86

All occurrences of Iop_Clz32 and Iop_Ctz were properly guarded against a zero input.
Remove those IROps from insn selection. Adjust vbit-test accordingly.

Comment 3 Florian Krohm 2025-07-23 13:07:42 UTC

Created attachment 183455 [details]
Patch for amd64

As expected all occurrences of Iop_Clz64 and Iop_Ctz64 were properly guarded against
 a zero input. Remove those IROps from insn selection. Adjust vbit-test accordingly.

Comment 4 Florian Krohm 2025-07-23 14:12:01 UTC

Created attachment 183457 [details]
Patch for nanomips

There are 2 occurrences of Iop_Clz32.

#1 for the CLZ insn:

      case nano_POOL32Axf4_CLZ: {  /* clz */
         DIP("clz r%u, r%u", rt, rs);
         putIReg(rt, unop(Iop_Clz32, getIReg(rs)));
         break;
      }

This what happens:

    CLZ insn --ir--> Iop_Clz32 --isel--> NMun_CLZ --emit--> CLZ

The CLZ insn has no special behaviour when the input is 0. Using Iop_Clz32
is not the right choice then as it has an undefined result for a 0 input.
This all works out, though, because in the end a CLZ insn will be emitted.
So we can simply change Iop_Clz64 --> Iop_ClzNat64 and be done here.

#2 for the CLO insn:

      case nano_POOL32Axf4_CLO: {  /* clo */
         DIP("clo r%u, r%u", rt, rs);
         t1 = newTemp(Ity_I1);
         assign(t1, binop(Iop_CmpEQ32, getIReg(rs), mkU32(0xffffffff)));
         putIReg(rt, IRExpr_ITE(mkexpr(t1),
                                mkU32(0x00000020),
                                unop(Iop_Clz32,
                                     unop(Iop_Not32, getIReg(rs)))));
         break;
      }

The special case is not needed because Clz32(0) == 32;

      case nano_POOL32Axf4_CLO: {  /* clo */
         DIP("clo r%u, r%u", rt, rs);
         putIReg(rt, unop(Iop_Clz32, unop(Iop_Not32, getIReg(rs)))));
         break;
      }

Done.

Comment 5 Florian Krohm 2025-07-23 15:22:27 UTC

Created attachment 183461 [details]
Patch for mips

There were multiple uses of Iop_Clz32 and Iop_Clz64 to implement CLZ, CLO,
DCLZ and DCLO. Sometimes but not always there was code to guard against a
0 input value. But it does not matter. Because in the end CLZ and DCLZ were
emitted which do the right thing. See also the comment about nanomips.

The patch removes any special handling for 0 and replaces Iop_Clz32 with
Iop_ClzNat32 and Iop_Clz64 with Iop_ClzNat64.

Comment 6 Florian Krohm 2025-07-23 16:12:31 UTC

Created attachment 183464 [details]
Patch for arm

Replace Iop_Clz32 with Iop_ClzNat32 and remove special handling for a 0 input value.

CLZ --ir--> Iop_ClzNat32 --isel--> ARMun_CLZ --emit--> CLZ

CLZ handles all input values correctly

Comment 7 Florian Krohm 2025-07-23 17:07:12 UTC

Created attachment 183468 [details]
Patch for arm64

Replace Iop_Clz64 with Iop_ClzNat64 for which eventually a CLZ insn will be emitted and that
handles all input values correctly.

Comment 8 Florian Krohm 2025-07-29 15:11:44 UTC

I'd like to move this along. Will check in the x86 / amd64 patches soonish.
If somebody with access to arm/64 and mips/nanomips could regtest the patches above that would be great.
Still waiting to head back from Carl on ppc.

Comment 9 Florian Krohm 2025-08-24 20:20:44 UTC

Created attachment 184410 [details]
Patch for ppc

Iop_Clz64 and Iop_ClzNat64 are already handled the same way, namely:

Iop_Clz64 / Iop_ClzNat64 --isel--> Pun_CLZ64 --emit-->cntlzd

where the cntlzd insn has "natural" semantics and a 0 operand is not special.

Comment 10 Florian Krohm 2025-08-24 20:43:06 UTC

Created attachment 184414 [details]
Remove Iop_Clz32/64 and Iop_Ctz32/64

The patch removes those IRops from libvex_ir.h and fixes the ripple.

What is left to do is to fix the insn emitters for x86 and amd64 so they handle a
0 operand properly i.e. "naturally".  This isn't a problem in valgrind operation because
in IR generation a 0 operand is handled as a special case. See also https://bugs.kde.org/show_bug.cgi?id=506211#c2

Comment 11 Florian Krohm 2025-08-26 20:54:05 UTC

Fixed in 7389eabaf3