507033 – Remove Iop_Clz32/64 and Iop_Ctz32/64

Bug 507033 - Remove Iop_Clz32/64 and Iop_Ctz32/64

Summary: Remove Iop_Clz32/64 and Iop_Ctz32/64

Status:	RESOLVED FIXED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	vex (other bugs)
Version First Reported In:	unspecified
Platform:	Other Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Florian Krohm

URL:
Keywords:

Depends on:
Blocks:

Reported:	2025-07-14 16:45 UTC by Florian Krohm
Modified:	2025-08-26 20:54 UTC (History)
CC List:	0 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Attachments
Patch for x86 (6.02 KB, patch) 2025-07-23 10:38 UTC, Florian Krohm	Details
Patch for amd64 (7.72 KB, patch) 2025-07-23 13:07 UTC, Florian Krohm	Details
Patch for nanomips (1.64 KB, patch) 2025-07-23 14:12 UTC, Florian Krohm	Details
Patch for mips (11.18 KB, patch) 2025-07-23 15:22 UTC, Florian Krohm	Details
Patch for arm (4.12 KB, patch) 2025-07-23 16:12 UTC, Florian Krohm	Details
Patch for arm64 (1.54 KB, patch) 2025-07-23 17:07 UTC, Florian Krohm	Details
Patch for ppc (2.56 KB, patch) 2025-08-24 20:20 UTC, Florian Krohm	Details
Remove Iop_Clz32/64 and Iop_Ctz32/64 (8.67 KB, patch) 2025-08-24 20:43 UTC, Florian Krohm	Details
View All Add an attachment

Note You need to log in before you can comment on or make changes to this bug.

Description Florian Krohm 2025-07-14 16:45:58 UTC

From libvex_ir.h:

      /* Ctz64/Ctz32/Clz64/Clz32 are UNDEFINED when given arguments of zero.    
         You must ensure they are never given a zero argument.  As of           
         2018-Nov-14 they are deprecated.  Try to use the Nat variants          
         immediately below, if you can.                                         
      */

I'd say: time to get rid of them.
Announce removal in 3.25.2. Remove in 3.26.. (or whatever the process is)

Comment 1 Florian Krohm 2025-07-23 08:20:13 UTC

The semantics of Iop_Clz32/64 are identical to Iop_ClzNat32/64 unless the input value is zero.
And likewise for Iop_Ctz32/64.

So.... audit the code, check occurrences of above operators and if their use is guarded to ensure
that the input value is != 0 then the operator can be replaced with its "Nat" counterpart.
Easy as pie.
Spoiler alert: not always so easy as there are "interesting" uses of Iop_Clz ....

Comment 2 Florian Krohm 2025-07-23 10:38:57 UTC

Created attachment 183451 [details]
Patch for x86

All occurrences of Iop_Clz32 and Iop_Ctz were properly guarded against a zero input.
Remove those IROps from insn selection. Adjust vbit-test accordingly.

Comment 3 Florian Krohm 2025-07-23 13:07:42 UTC

Created attachment 183455 [details]
Patch for amd64

As expected all occurrences of Iop_Clz64 and Iop_Ctz64 were properly guarded against
 a zero input. Remove those IROps from insn selection. Adjust vbit-test accordingly.

Comment 4 Florian Krohm 2025-07-23 14:12:01 UTC

Created attachment 183457 [details]
Patch for nanomips

There are 2 occurrences of Iop_Clz32.

#1 for the CLZ insn:

      case nano_POOL32Axf4_CLZ: {  /* clz */
         DIP("clz r%u, r%u", rt, rs);
         putIReg(rt, unop(Iop_Clz32, getIReg(rs)));
         break;
      }

This what happens:

    CLZ insn --ir--> Iop_Clz32 --isel--> NMun_CLZ --emit--> CLZ

The CLZ insn has no special behaviour when the input is 0. Using Iop_Clz32
is not the right choice then as it has an undefined result for a 0 input.
This all works out, though, because in the end a CLZ insn will be emitted.
So we can simply change Iop_Clz64 --> Iop_ClzNat64 and be done here.

#2 for the CLO insn:

      case nano_POOL32Axf4_CLO: {  /* clo */
         DIP("clo r%u, r%u", rt, rs);
         t1 = newTemp(Ity_I1);
         assign(t1, binop(Iop_CmpEQ32, getIReg(rs), mkU32(0xffffffff)));
         putIReg(rt, IRExpr_ITE(mkexpr(t1),
                                mkU32(0x00000020),
                                unop(Iop_Clz32,
                                     unop(Iop_Not32, getIReg(rs)))));
         break;
      }

The special case is not needed because Clz32(0) == 32;

      case nano_POOL32Axf4_CLO: {  /* clo */
         DIP("clo r%u, r%u", rt, rs);
         putIReg(rt, unop(Iop_Clz32, unop(Iop_Not32, getIReg(rs)))));
         break;
      }

Done.

Comment 5 Florian Krohm 2025-07-23 15:22:27 UTC

Created attachment 183461 [details]
Patch for mips

There were multiple uses of Iop_Clz32 and Iop_Clz64 to implement CLZ, CLO,
DCLZ and DCLO. Sometimes but not always there was code to guard against a
0 input value. But it does not matter. Because in the end CLZ and DCLZ were
emitted which do the right thing. See also the comment about nanomips.

The patch removes any special handling for 0 and replaces Iop_Clz32 with
Iop_ClzNat32 and Iop_Clz64 with Iop_ClzNat64.

Comment 6 Florian Krohm 2025-07-23 16:12:31 UTC

Created attachment 183464 [details]
Patch for arm

Replace Iop_Clz32 with Iop_ClzNat32 and remove special handling for a 0 input value.

CLZ --ir--> Iop_ClzNat32 --isel--> ARMun_CLZ --emit--> CLZ

CLZ handles all input values correctly

Comment 7 Florian Krohm 2025-07-23 17:07:12 UTC

Created attachment 183468 [details]
Patch for arm64

Replace Iop_Clz64 with Iop_ClzNat64 for which eventually a CLZ insn will be emitted and that
handles all input values correctly.

Comment 8 Florian Krohm 2025-07-29 15:11:44 UTC

I'd like to move this along. Will check in the x86 / amd64 patches soonish.
If somebody with access to arm/64 and mips/nanomips could regtest the patches above that would be great.
Still waiting to head back from Carl on ppc.

Comment 9 Florian Krohm 2025-08-24 20:20:44 UTC

Created attachment 184410 [details]
Patch for ppc

Iop_Clz64 and Iop_ClzNat64 are already handled the same way, namely:

Iop_Clz64 / Iop_ClzNat64 --isel--> Pun_CLZ64 --emit-->cntlzd

where the cntlzd insn has "natural" semantics and a 0 operand is not special.

Comment 10 Florian Krohm 2025-08-24 20:43:06 UTC

Created attachment 184414 [details]
Remove Iop_Clz32/64 and Iop_Ctz32/64

The patch removes those IRops from libvex_ir.h and fixes the ripple.

What is left to do is to fix the insn emitters for x86 and amd64 so they handle a
0 operand properly i.e. "naturally".  This isn't a problem in valgrind operation because
in IR generation a 0 operand is handled as a special case. See also https://bugs.kde.org/show_bug.cgi?id=506211#c2

Comment 11 Florian Krohm 2025-08-26 20:54:05 UTC

Fixed in 7389eabaf3