463458 – memcheck/tests/vcpu_fnfns fails when glibc is built for x86-64-v3 target

Status:	RESOLVED FIXED

Alias:	None

Product:	valgrind
Classification:	Developer tools
Component:	memcheck
Version:	3.20.0
Platform:	Other Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Julian Seward

URL:
Keywords:

Depends on:
Blocks:

Reported:	2022-12-25 15:03 UTC by Alexander Kanavin
Modified:	2024-04-13 13:06 UTC
CC List:	3 users

See Also:
Latest Commit:
Version Fixed In:

Description Alexander Kanavin 2022-12-25 15:03:13 UTC

The Yocto Project is transitioning its x86_64 builds to target x86-64-v3, which enables AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE and XSAVE.

Presumably, one of the instructions enabled by that target slightly changes the floating-point results of the libm math functions:

memcheck/tests/vcpu_fnfns.stdout.diff
************
--- vcpu_fnfns.stdout.exp	2018-03-09 12:34:56.000000000 +0000
+++ vcpu_fnfns.stdout.out	2022-12-25 14:43:29.195000000 +0000
@@ -305,7 +305,7 @@
   cosF(         -3.0170e-01) =          +9.5483e-01
   cosF(         -2.0180e-01) =          +9.7971e-01
   cosF(         -1.0190e-01) =          +9.9481e-01
-  cosF(         -1.9999e-03) =          +1.0000e-00
+  cosF(         -1.9999e-03) =          +1.0000e+00
   cosF(         +9.7900e-02) =          +9.9521e-01
   cosF(         +1.9780e-01) =          +9.8050e-01
   cosF(         +2.9770e-01) =          +9.5601e-01
@@ -536,7 +536,7 @@
   logD(+9.9999999900000e-02) = -2.3025850939940e+00
   logD(+1.9999999980000e-01) = -1.6094379134341e+00
   logD(+2.9999999970000e-01) = -1.2039728053259e+00
-  logD(+3.9999999960000e-01) = -9.1629073287415e-01
+  logD(+3.9999999960000e-01) = -9.1629073287416e-01
   logD(+4.9999999950000e-01) = -6.9314718155995e-01
   logD(+5.9999999940000e-01) = -5.1082562476599e-01
   logD(+6.9999999930000e-01) = -3.5667494493873e-01
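
A last-digit difference like the logD line above is what one would expect if a multiply-then-add were replaced by a fused multiply-add (FMA), which rounds once instead of twice. That FMA is the responsible instruction is only presumed in this report. The following Python sketch illustrates the rounding effect under that assumption; the function names are hypothetical and are not part of Valgrind or glibc, and the fused operation is emulated with exact rational arithmetic rather than a hardware instruction.

```python
from fractions import Fraction


def unfused_madd(a: float, b: float, c: float) -> float:
    """Multiply, round to double, then add and round again (two roundings)."""
    return a * b + c


def fused_madd(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add: compute a*b + c exactly, round once.

    Fraction(float) is exact, and float(Fraction) rounds correctly,
    so the only rounding happens in the final conversion.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so that the low bits of a*b are lost by the intermediate
# rounding in the unfused version: a*b = 1 + 2**-26 + 2**-54 exactly.
a = b = 1.0 + 2.0 ** -27
c = -1.0

print(f"unfused: {unfused_madd(a, b, c):.17e}")
print(f"fused:   {fused_madd(a, b, c):.17e}")
```

The two printed values differ only in their trailing digits, which is the same kind of discrepancy that shows up between vcpu_fnfns.stdout.exp and vcpu_fnfns.stdout.out.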


More detailed information about the feature-level targets in GCC is available here:
https://www.phoronix.com/news/GCC-11-x86-64-Feature-Levels

Comment 1 Mark Wielaard 2024-04-12 14:43:21 UTC

Confirmed. It also seems to be fixed by the patch from https://bugs.kde.org/show_bug.cgi?id=481127

Comment 2 Mark Wielaard 2024-04-13 13:06:54 UTC

commit a5693c1203c3a26443af13182a8082c2e9152f6c
Author: Mark Wielaard <mark@klomp.org>
Date:   Sat Apr 13 14:33:19 2024 +0200

    amd64: Implement VFMADD213 for Iop_MAddF32 and Iop_MAddF64
    
    Speed up F32 and F64 FMA on amd64. Add priv/host_amd64_maddf.c
    implementing h_amd64_calc_MAddF32_fma4 and h_amd64_calc_MAddF64_fma4
    to be used instead of the generic variants h_generic_calc_MAddF32
    and h_generic_calc_MAddF64 when host has VEX_HWCAPS_AMD64_FMA4.
    Add fma3 and fma4 detection to m_machine.c (machine_get_hwcaps).
    
    This patch also fixes the memcheck/tests/vcpu_fnfns and
    none/tests/amd64/fma testcases when run on a x86-64-v3 system.
    
    Patch contributed by Grazvydas Ignotas <notasas@gmail.com> and
    Bruno Lathuilière <bruno.lathuiliere@edf.fr>
    
    https://bugs.kde.org/show_bug.cgi?id=481127
    https://bugs.kde.org/show_bug.cgi?id=463463
    https://bugs.kde.org/show_bug.cgi?id=463458