Bug 463458

Summary:	memcheck/tests/vcpu_fnfns fails when glibc is built for x86-64-v3 target
Product:	[Developer tools] valgrind	Reporter:	Alexander Kanavin <alex.kanavin>
Component:	memcheck	Assignee:	Julian Seward <jseward>
Status:	RESOLVED FIXED
Severity:	normal	CC:	mark, pjfloyd, sam
Priority:	NOR
Version First Reported In:	3.20.0
Target Milestone:	---
Platform:	Other
OS:	Linux
Latest Commit:		Version Fixed/Implemented In:
Sentry Crash Report:

Description Alexander Kanavin 2022-12-25 15:03:13 UTC

The Yocto Project is transitioning its x86_64 builds to target x86-64-v3 (that is, with AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE and XSAVE enabled).

Presumably one of the instructions enabled by that target slightly changes the floating-point results of the libm math functions, so memcheck/tests/vcpu_fnfns no longer matches its expected output:

memcheck/tests/vcpu_fnfns.stdout.diff
************
--- vcpu_fnfns.stdout.exp	2018-03-09 12:34:56.000000000 +0000
+++ vcpu_fnfns.stdout.out	2022-12-25 14:43:29.195000000 +0000
@@ -305,7 +305,7 @@
   cosF(         -3.0170e-01) =          +9.5483e-01
   cosF(         -2.0180e-01) =          +9.7971e-01
   cosF(         -1.0190e-01) =          +9.9481e-01
-  cosF(         -1.9999e-03) =          +1.0000e-00
+  cosF(         -1.9999e-03) =          +1.0000e+00
   cosF(         +9.7900e-02) =          +9.9521e-01
   cosF(         +1.9780e-01) =          +9.8050e-01
   cosF(         +2.9770e-01) =          +9.5601e-01
@@ -536,7 +536,7 @@
   logD(+9.9999999900000e-02) = -2.3025850939940e+00
   logD(+1.9999999980000e-01) = -1.6094379134341e+00
   logD(+2.9999999970000e-01) = -1.2039728053259e+00
-  logD(+3.9999999960000e-01) = -9.1629073287415e-01
+  logD(+3.9999999960000e-01) = -9.1629073287416e-01
   logD(+4.9999999950000e-01) = -6.9314718155995e-01
   logD(+5.9999999940000e-01) = -5.1082562476599e-01
   logD(+6.9999999930000e-01) = -3.5667494493873e-01


More detail on the x86-64 feature-level targets in GCC is available here:
https://www.phoronix.com/news/GCC-11-x86-64-Feature-Levels

Comment 1 Mark Wielaard 2024-04-12 14:43:21 UTC

Confirmed, but this also seems to be fixed by the patch from https://bugs.kde.org/show_bug.cgi?id=481127

Comment 2 Mark Wielaard 2024-04-13 13:06:54 UTC

commit a5693c1203c3a26443af13182a8082c2e9152f6c
Author: Mark Wielaard <mark@klomp.org>
Date:   Sat Apr 13 14:33:19 2024 +0200

    amd64: Implement VFMADD213 for Iop_MAddF32 and Iop_MAddF64
    
    Speed up F32 and F64 FMA on amd64. Add priv/host_amd64_maddf.c
    implementing h_amd64_calc_MAddF32_fma4 and h_amd64_calc_MAddF64_fma4
    to be used instead of the generic variants h_generic_calc_MAddF32
    and h_generic_calc_MAddF64 when host has VEX_HWCAPS_AMD64_FMA4.
    Add fma3 and fma4 detection m_machine.c (machine_get_hwcaps).
    
    This patch also fixes the memcheck/tests/vcpu_fnfns and
    none/tests/amd64/fma testcases when run on a x86-64-v3 system.
    
    Patch contributed by Grazvydas Ignotas <notasas@gmail.com> and
    Bruno Lathuilière <bruno.lathuiliere@edf.fr>
    
    https://bugs.kde.org/show_bug.cgi?id=481127
    https://bugs.kde.org/show_bug.cgi?id=463463
    https://bugs.kde.org/show_bug.cgi?id=463458