Bug 484426 - aarch64: 0.5 gets rounded to 0
Summary: aarch64: 0.5 gets rounded to 0
Status: RESOLVED FIXED
Alias: None
Product: valgrind
Classification: Developer tools
Component: general (show other bugs)
Version: 3.16.1
Platform: Debian stable Linux
: NOR normal
Target Milestone: ---
Assignee: Paul Floyd
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-03-25 02:07 UTC by git.53jvg
Modified: 2024-03-28 20:57 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Patch adding new Iops (13.49 KB, patch)
2024-03-27 08:33 UTC, Paul Floyd
Details

Note You need to log in before you can comment on or make changes to this bug.
Description git.53jvg 2024-03-25 02:07:46 UTC
SUMMARY
Running the following program with 0.5 as the argument outputs 1 when run without valgrind and 0 when run with valgrind

#include <iostream>
#include <cmath>

int main(int argc, char** argv)
{
        float value = std::stof(argv[1]);
        std::cout << std::round(value) << "\n";
}

I compiled the program with the following command: g++ ./main.cpp -o rounding_test
I also tried different optimization levels, but didn't find them to make a difference.

OBSERVED RESULT
When run under valgrind the program rounds 0.5 down to 0 instead of rounding up to 1. Also -0.5 seems to round to -0 instead of -1.

EXPECTED RESULT
Rounding 0.5 should match the normal behavior and round up to 1

SOFTWARE/OS VERSIONS
Debian GNU/Linux 11 (bullseye) aarch64
Raspberry Pi 4 Model B Rev 1.4
Comment 1 Paul Floyd 2024-03-25 06:26:28 UTC
I get the same with clang++17, FreeBSD 14 on an RPi 5.

The code compiles to a call to __builtin_round

│    0x211820 <_Z5roundB7v160006f+12>                                                                                                                                fcvt    d0, s0
│    0x211824 <_Z5roundB7v160006f+16>                                                                                                                                frinta  d0, d0
│    0x211828 <_Z5roundB7v160006f+20>                                                                                                                                fcvt    s0, d0

I think that means
convert float to double
round to int
convert double to float

Loaded into s0 I see
(gdb) p $s0
$4 = {f = 0.5, u = 1056964608, s = 1056964608}
(gdb) p /x $s0
$5 = {f = 0x3f000000, u = 0x3f000000, s = 0x3f000000}

After conversion to double
(gdb) p $d0
$8 = {f = 5.2220990168285998e-315, u = 1056964608, s = 1056964608}
(gdb) p /x $d0
$9 = {f = 0x3f000000, u = 0x3f000000, s = 0x3f000000}

Looks wrong

After the rounding
(gdb) p $d0
$10 = {f = 0, u = 0, s = 0}

And convert back to float
(gdb) p $s0
$12 = {f = 0, u = 0, s = 0}


Now, if I fix $d0 in gdb I get

(gdb) set $d0=0.5
(gdb) p $d0
$14 = {f = 0.5, u = 4602678819172646912, s = 4602678819172646912}
(gdb) p /x $d0
$15 = {f = 0x3fe0000000000000, u = 0x3fe0000000000000, s = 0x3fe0000000000000}

but that still rounds down to zero.

If I comple a double version, the __builtin_round becomes just a frinta, the value for $d0 is the same as when I set it in gdb and it still rounds down to 0.0.

So I see two problems, the first with fcvt and the second with frinta.
Comment 2 Paul Floyd 2024-03-25 07:20:32 UTC
Comment from code:

   * FRINTA, FRINTN are kludged .. they just round to nearest.  No special
     handling for the "ties" case.  FRINTX might be dubious too.

   * Ditto FCVTXN.  No idea what "round to odd" means.  This implementation
     just rounds to nearest.

"round to odd" does what the name says and rounds to odd integers. It's one of several rounding methods with the advantage that (like round to even) there is no bias, half of ties round up ("even point 5") and half round down ("odd point five").  So 0.5 rounds to 1.0. 

https://developer.arm.com/documentation/dui0801/g/A64-Floating-point-Instructions/FCVT?lang=en

says that depends on the FPCR

gdb says

fpcr           0x0                 [ Len=0 Stride=0 RMode=0 ]

RMode 0 means round to zero, which is probably just about the only thing ever used.

For float to double there should be no rounding (which I guess only applies to the ULP for floating point conversions).
Comment 3 Paul Floyd 2024-03-26 21:17:23 UTC
I think that I understand the problem. arm64 has two overlapping ways of doing floating point rounding.

The mnemonics are FRINT (Float Round to Int) and then one more letter for the kind of rounding.

The FPCR (Floating Point Control Register) has 2 bits for the rounding mode (00 nearest 01 towards +inf 10 towards -inf 11 towards zero)

I - use the current rounding mode
P - like rounding mode 01
M - like rounding mode 10
Z - like rounding mode 11

but then there is
N - to nearest, ties to even
A - to nearest, ties away from zero
X - exact, like I but causes inexect FPE or sets IneXact Computing bit in FPSR

Out problem is that we're trying to use FRINTI for everything in the generated code. That's not too bad for FRINTX (we don't support FPEs). But it's wrong for N and A.
Comment 4 Paul Floyd 2024-03-27 08:33:17 UTC
Created attachment 167836 [details]
Patch adding new Iops

Dunno why arm64 didn't add more FPCR rounding modes. Probably because of historical baggage from am32 where a) there are only 32 bits and b) fpcsr does the job of both control fpcr and status fpsr.

Anyway, this seems to work. Developed on FreeBSD but it should work OK on Linux.

Small bonus: frinta and frintn now generate unops and don't pfaff around with the rounding mode.
Comment 5 Paul Floyd 2024-03-27 08:35:38 UTC
The patch doesn't cover the vector versions.
Comment 6 Mark Wielaard 2024-03-28 16:51:07 UTC
This looks good to me. But I have not tested it. And I have to trust you on the emit code (too lazy to look up the instr enc)

It looks like none/tests/arm64/fp_and_simd.c already has tests for both frinti and frinta.
Did they just happen to work just fine with the existing code?
Comment 7 Paul Floyd 2024-03-28 17:00:52 UTC
(In reply to Mark Wielaard from comment #6)
> This looks good to me. But I have not tested it. And I have to trust you on
> the emit code (too lazy to look up the instr enc)
> 
> It looks like none/tests/arm64/fp_and_simd.c already has tests for both
> frinti and frinta.
> Did they just happen to work just fine with the existing code?

I haven't looked at the testcase to see exactly what values they are using. 

frinta only affects floating point values that are "integer.5". frintn only affects the result for "odd integer.5" values. So I guess that the testcase doesn't use "point five" values.
Comment 8 Paul Floyd 2024-03-28 20:57:51 UTC
commit dc30fbf673953fefb115d2cf441119ee28039c9c (HEAD -> master, origin/master, origin/HEAD)
Author: Paul Floyd <pjfloyd@wanadoo.fr>
Date:   Thu Mar 28 20:55:38 2024 +0100

    Bug 484426 - aarch64: 0.5 gets rounded to 0