SUMMARY Running the following program with 0.5 as the argument outputs 1 when run without valgrind and 0 when run with valgrind #include <iostream> #include <cmath> int main(int argc, char** argv) { float value = std::stof(argv[1]); std::cout << std::round(value) << "\n"; } I compiled the program with the following command: g++ ./main.cpp -o rounding_test I also tried different optimization levels, but didn't find them to make a difference. OBSERVED RESULT When run under valgrind the program rounds 0.5 down to 0 instead of rounding up to 1. Also -0.5 seems to round to -0 instead of -1. EXPECTED RESULT Rounding 0.5 should match the normal behavior and round up to 1 SOFTWARE/OS VERSIONS Debian GNU/Linux 11 (bullseye) aarch64 Raspberry Pi 4 Model B Rev 1.4
I get the same with clang++17, FreeBSD 14 on an RPi 5. The code compiles to a call to __builtin_round │ 0x211820 <_Z5roundB7v160006f+12> fcvt d0, s0 │ 0x211824 <_Z5roundB7v160006f+16> frinta d0, d0 │ 0x211828 <_Z5roundB7v160006f+20> fcvt s0, d0 I think that means convert float to double round to int convert double to float Loaded into s0 I see (gdb) p $s0 $4 = {f = 0.5, u = 1056964608, s = 1056964608} (gdb) p /x $s0 $5 = {f = 0x3f000000, u = 0x3f000000, s = 0x3f000000} After conversion to double (gdb) p $d0 $8 = {f = 5.2220990168285998e-315, u = 1056964608, s = 1056964608} (gdb) p /x $d0 $9 = {f = 0x3f000000, u = 0x3f000000, s = 0x3f000000} Looks wrong After the rounding (gdb) p $d0 $10 = {f = 0, u = 0, s = 0} And convert back to float (gdb) p $s0 $12 = {f = 0, u = 0, s = 0} Now, if I fix $d0 in gdb I get (gdb) set $d0=0.5 (gdb) p $d0 $14 = {f = 0.5, u = 4602678819172646912, s = 4602678819172646912} (gdb) p /x $d0 $15 = {f = 0x3fe0000000000000, u = 0x3fe0000000000000, s = 0x3fe0000000000000} but that still rounds down to zero. If I comple a double version, the __builtin_round becomes just a frinta, the value for $d0 is the same as when I set it in gdb and it still rounds down to 0.0. So I see two problems, the first with fcvt and the second with frinta.
Comment from code: * FRINTA, FRINTN are kludged .. they just round to nearest. No special handling for the "ties" case. FRINTX might be dubious too. * Ditto FCVTXN. No idea what "round to odd" means. This implementation just rounds to nearest. "round to odd" does what the name says and rounds to odd integers. It's one of several rounding methods with the advantage that (like round to even) there is no bias, half of ties round up ("even point 5") and half round down ("odd point five"). So 0.5 rounds to 1.0. https://developer.arm.com/documentation/dui0801/g/A64-Floating-point-Instructions/FCVT?lang=en says that depends on the FPCR gdb says fpcr 0x0 [ Len=0 Stride=0 RMode=0 ] RMode 0 means round to zero, which is probably just about the only thing ever used. For float to double there should be no rounding (which I guess only applies to the ULP for floating point conversions).
I think that I understand the problem. arm64 has two overlapping ways of doing floating point rounding. The mnemonics are FRINT (Float Round to Int) and then one more letter for the kind of rounding. The FPCR (Floating Point Control Register) has 2 bits for the rounding mode (00 nearest 01 towards +inf 10 towards -inf 11 towards zero) I - use the current rounding mode P - like rounding mode 01 M - like rounding mode 10 Z - like rounding mode 11 but then there is N - to nearest, ties to even A - to nearest, ties away from zero X - exact, like I but causes inexect FPE or sets IneXact Computing bit in FPSR Out problem is that we're trying to use FRINTI for everything in the generated code. That's not too bad for FRINTX (we don't support FPEs). But it's wrong for N and A.
Created attachment 167836 [details] Patch adding new Iops Dunno why arm64 didn't add more FPCR rounding modes. Probably because of historical baggage from am32 where a) there are only 32 bits and b) fpcsr does the job of both control fpcr and status fpsr. Anyway, this seems to work. Developed on FreeBSD but it should work OK on Linux. Small bonus: frinta and frintn now generate unops and don't pfaff around with the rounding mode.
The patch doesn't cover the vector versions.
This looks good to me. But I have not tested it. And I have to trust you on the emit code (too lazy to look up the instr enc) It looks like none/tests/arm64/fp_and_simd.c already has tests for both frinti and frinta. Did they just happen to work just fine with the existing code?
(In reply to Mark Wielaard from comment #6) > This looks good to me. But I have not tested it. And I have to trust you on > the emit code (too lazy to look up the instr enc) > > It looks like none/tests/arm64/fp_and_simd.c already has tests for both > frinti and frinta. > Did they just happen to work just fine with the existing code? I haven't looked at the testcase to see exactly what values they are using. frinta only affects floating point values that are "integer.5". frintn only affects the result for "odd integer.5" values. So I guess that the testcase doesn't use "point five" values.
commit dc30fbf673953fefb115d2cf441119ee28039c9c (HEAD -> master, origin/master, origin/HEAD) Author: Paul Floyd <pjfloyd@wanadoo.fr> Date: Thu Mar 28 20:55:38 2024 +0100 Bug 484426 - aarch64: 0.5 gets rounded to 0