Bug 463458

Summary: memcheck/tests/vcpu_fnfns fails when glibc is built for x86-64-v3 target
Product: [Developer tools] valgrind
Reporter: Alexander Kanavin <alex.kanavin>
Component: memcheck
Assignee: Julian Seward <jseward>
Status: RESOLVED FIXED
Severity: normal
CC: mark, pjfloyd, sam
Priority: NOR
Version: 3.20.0
Target Milestone: ---
Platform: Other
OS: Linux
Latest Commit:
Version Fixed In:

Description Alexander Kanavin 2022-12-25 15:03:13 UTC
The Yocto Project is transitioning its x86_64 builds to target x86-64-v3 (i.e. enabling AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE).

Presumably, one of the instruction set extensions enabled by that target slightly changes the floating point results of the libm math functions, so the test output no longer matches the expected output:

memcheck/tests/vcpu_fnfns.stdout.diff
************
--- vcpu_fnfns.stdout.exp	2018-03-09 12:34:56.000000000 +0000
+++ vcpu_fnfns.stdout.out	2022-12-25 14:43:29.195000000 +0000
@@ -305,7 +305,7 @@
   cosF(         -3.0170e-01) =          +9.5483e-01
   cosF(         -2.0180e-01) =          +9.7971e-01
   cosF(         -1.0190e-01) =          +9.9481e-01
-  cosF(         -1.9999e-03) =          +1.0000e-00
+  cosF(         -1.9999e-03) =          +1.0000e+00
   cosF(         +9.7900e-02) =          +9.9521e-01
   cosF(         +1.9780e-01) =          +9.8050e-01
   cosF(         +2.9770e-01) =          +9.5601e-01
@@ -536,7 +536,7 @@
   logD(+9.9999999900000e-02) = -2.3025850939940e+00
   logD(+1.9999999980000e-01) = -1.6094379134341e+00
   logD(+2.9999999970000e-01) = -1.2039728053259e+00
-  logD(+3.9999999960000e-01) = -9.1629073287415e-01
+  logD(+3.9999999960000e-01) = -9.1629073287416e-01
   logD(+4.9999999950000e-01) = -6.9314718155995e-01
   logD(+5.9999999940000e-01) = -5.1082562476599e-01
   logD(+6.9999999930000e-01) = -3.5667494493873e-01
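
For illustration, here is a minimal Python sketch (not taken from glibc or Valgrind) of how FMA alone can change the last digits of a result: a fused multiply-add rounds once, while a separate multiply followed by an add rounds twice. The fused result is emulated with exact rational arithmetic, and the function names and input values are made up for the demonstration. Whether this is the exact mechanism behind the differences above is an assumption.

```python
from fractions import Fraction


def unfused_madd(a, b, c):
    """a*b + c with two roundings: one after the multiply, one after the add."""
    return a * b + c


def fused_madd(a, b, c):
    """Emulated fused multiply-add: compute a*b + c exactly, then round once.

    Fraction arithmetic is exact, and converting a Fraction to float
    rounds to the nearest double.
    """
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Illustrative inputs: the exact product is 1 - 2**-54, which lies halfway
# between two doubles and rounds to 1.0 when the multiply is rounded on its own.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

print("unfused:", unfused_madd(a, b, c))  # the product was rounded to 1.0 first
print("fused:  ", fused_madd(a, b, c))    # keeps the -2**-54 remainder
```

With these inputs the unfused form yields exactly 0.0 and the fused form yields -2**-54, so any libm routine whose code path switches between the two can legitimately print a different last digit.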


More detailed information about the x86-64 feature level targets in GCC is here:
https://www.phoronix.com/news/GCC-11-x86-64-Feature-Levels

Comment 1 Mark Wielaard 2024-04-12 14:43:21 UTC
Confirmed. But this also seems to be fixed by the patch from https://bugs.kde.org/show_bug.cgi?id=481127

Comment 2 Mark Wielaard 2024-04-13 13:06:54 UTC
commit a5693c1203c3a26443af13182a8082c2e9152f6c
Author: Mark Wielaard <mark@klomp.org>
Date:   Sat Apr 13 14:33:19 2024 +0200

    amd64: Implement VFMADD213 for Iop_MAddF32 and Iop_MAddF64
    
    Speed up F32 and F64 FMA on amd64. Add priv/host_amd64_maddf.c
    implementing h_amd64_calc_MAddF32_fma4 and h_amd64_calc_MAddF64_fma4
    to be used instead of the generic variants h_generic_calc_MAddF32
    and h_generic_calc_MAddF64 when host has VEX_HWCAPS_AMD64_FMA4.
    Add fma3 and fma4 detection m_machine.c (machine_get_hwcaps).
    
    This patch also fixes the memcheck/tests/vcpu_fnfns and
    none/tests/amd64/fma testcases when run on a x86-64-v3 system.
    
    Patch contributed by Grazvydas Ignotas <notasas@gmail.com> and
    Bruno Lathuilière <bruno.lathuiliere@edf.fr>
    
    https://bugs.kde.org/show_bug.cgi?id=481127
    https://bugs.kde.org/show_bug.cgi?id=463463
    https://bugs.kde.org/show_bug.cgi?id=463458
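
The FMA3/FMA4 detection mentioned in the commit message comes down to testing two CPUID feature bits. The sketch below is not Valgrind's machine_get_hwcaps code; it only decodes the bits from raw ECX values supplied by the caller, and the names FMA3_BIT, FMA4_BIT and detect_fma are hypothetical. The bit positions are the ones documented in the Intel and AMD manuals.

```python
# CPUID feature bits (per the Intel and AMD architecture manuals).
FMA3_BIT = 1 << 12  # CPUID leaf 0x00000001, ECX bit 12
FMA4_BIT = 1 << 16  # CPUID leaf 0x80000001, ECX bit 16


def detect_fma(leaf1_ecx, ext_leaf1_ecx):
    """Return (has_fma3, has_fma4) decoded from raw CPUID ECX values.

    leaf1_ecx     -- ECX returned by CPUID leaf 0x00000001
    ext_leaf1_ecx -- ECX returned by CPUID leaf 0x80000001
    """
    has_fma3 = bool(leaf1_ecx & FMA3_BIT)
    has_fma4 = bool(ext_leaf1_ecx & FMA4_BIT)
    return has_fma3, has_fma4


# Made-up register values, for demonstration only.
print(detect_fma(1 << 12, 0))  # FMA3 only
print(detect_fma(0, 1 << 16))  # FMA4 only
```

A real detector would also need to confirm that the OS has enabled the AVX register state before using these instructions; that check is omitted here.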