From the user manual: "Precision: There is no support for 80 bit arithmetic. Internally, Valgrind represents all such "long double" numbers in 64 bits, and so there may be some differences in results. Whether or not this is critical remains to be seen. Note, the x86/amd64 fldt/fstpt instructions (read/write 80-bit numbers) are correctly simulated, using conversions to/from 64 bits, so that in-memory images of 80-bit numbers look correct if anyone wants to see." I believe Julian doesn't plan to add 80-bit precision to Vex -- the argument is that it's something of a pain and programs that rely on 80-bit precision are not portable. But people hit this every so often and complain, so I'm creating this bug as a placeholder for the issue.
*** Bug 121029 has been marked as a duplicate of this bug. ***
*** Bug 147241 has been marked as a duplicate of this bug. ***
*** Bug 117742 has been marked as a duplicate of this bug. ***
*** Bug 201670 has been marked as a duplicate of this bug. ***
The following minimal, portable test case shows how limiting the precision of long doubles to 64 bits on a system that has 80-bit long doubles changes standard semantics:

#include <cassert>
#include <limits>

int main()
{
  // std::numeric_limits<long double>::min() is typically close to
  // 3.3621e-4932
  assert( std::numeric_limits<long double>::min() > 0.0L );
}

The assertion fires only under valgrind: the 80-bit LDBL_MIN (about 3.36e-4932) is far below the smallest 64-bit double subnormal (about 4.9e-324), so once the value is truncated to double precision it compares equal to zero. More discussion can be found on the mailing list: http://sourceforge.net/mailarchive/forum.php?thread_name=28392e8b0908120240lbf4c314qb4086e6c17731f7e%40mail.gmail.com&forum_name=valgrind-users
*** Bug 188984 has been marked as a duplicate of this bug. ***
(In reply to comment #6)

> *** Bug 188984 has been marked as a duplicate of this bug. ***

That bug has a small test case that is worth looking at.
Created attachment 43276 [details]
demonstrate valgrind accuracy problem with libm acos()

Even for programs that don't explicitly use variables of type 'long double' there are visible consequences of this omission. On x86, glibc's acos(x) basically computes fpatan(fsqrt(1-x*x),x) [where fpatan and fsqrt are the x87 instructions]. acos(x) = atan2(sqrt(1-x*x),x) is a trigonometric identity, and when the temporaries are stored with 'long double' precision it also gives accurate results for all 'double' arguments. When the subexpression 1-x*x is computed with only "double" precision, acos() becomes very inaccurate for values close to 1. In the following example, 18 LSBs differ when running on valgrind.

$ g++ -O2 ac.cc
$ ./a.out .999999; valgrind --log-file=/dev/null ./a.out .999999
acos( 0.999999) -> 1.4142136802445865e-03 [0X1.72BA46065AF1CP-10]
acos( 0.999999) -> 1.4142136802524064e-03 [0X1.72BA460663BFBP-10]
                 differing LSBs ^^^^^
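For anyone without the attachment, the core of the problem can be reproduced without calling acos() at all. The following is a minimal sketch (my own, not the attached ac.cc) that just compares the intermediate 1 - x*x at double and long double precision:

#include <stdio.h>

int main(void)
{
    volatile double x = 0.999999;                /* volatile: keep x a run-time value */
    double      d  = 1.0  - x * x;               /* 53-bit intermediate               */
    long double ld = 1.0L - (long double)x * x;  /* 64-bit intermediate (natively)    */
    printf("double      1 - x*x = %.17g\n", d);
    printf("long double 1 - x*x = %.20Lg\n", ld);
    return 0;
}

Near x = 1 the subtraction cancels most of the significant bits, so the extra significand bits of the 80-bit intermediate are what keep glibc's acos() accurate there. Run natively the two lines differ slightly; under valgrind they come out identical, because the long double expression is evaluated in double precision.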
As per comment #0, adding support for 80-bit floats is low priority, because (1) AIUI the majority of floating point code is portable and restricts itself to 64-bit values, and (2) doing 80-bit support will soak up a considerable amount of engineering effort. So it's not an easy case to make, and we are already extremely resource-constrained w.r.t. development effort. If anyone wants to hack up a patch to do this I would be at least willing to review it and provide feedback.
Another possibility is to add support for a mode where 80-bit float operations are executed natively, i.e. valgrind does not try to track uninitialized bits etc. in floats. Hopefully this would be simpler to implement. In my case this would be helpful because the problem I have is not that valgrind isn't catching use of uninitialized float values. The problem is that, due to valgrind rounding to 64 bits, programs run under valgrind behave differently from programs run natively: if they run long enough, different code paths are taken. It can happen that a memory error (such as a use after free) that occurs when running natively does not show up when running under valgrind because of this.
Just used a day to find this out the hard way. I am willing to spend another if it gets fixed. So, can a know-nothing-about-how-valgrind-works person implement support in a day? :)
Another instance/example of this bug hitting/confusing people can be found in the fedora bug tracker: https://bugzilla.redhat.com/show_bug.cgi?id=837650
*** Bug 130358 has been marked as a duplicate of this bug. ***
(In reply to comment #13)

> *** Bug 130358 has been marked as a duplicate of this bug. ***

That bug has a small Ada test case.
A Fortran example: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46703

Note that the 80-bit floating-point format is popular in the Fortran community (a.k.a. REAL*10).
I would like to ask that this issue be prioritized, as it's a show-stopper for using valgrind on programs linked with musl libc that perform floating point/decimal conversions with the printf/scanf/strtod families of functions.

musl always performs such computations in long double, and has very good reason to do so: on targets where FLT_EVAL_METHOD==2, it's impossible to use rounding with float or double in a predictable, portable manner, since the computations will actually take place in long double and the results will be double-rounded (rounded twice). By using long double and the float.h LDBL_* macros describing the long double representation, we're able to give correct results with a single unified implementation on all targets where long double has IEEE semantics (which is a documented requirement for musl). However, since valgrind changes the properties of long double so that they no longer match the compile-time properties reported by float.h, our (valid) assumptions are broken and the code does not behave as expected.

I have not performed testing (presently, I believe there may still be some other obstacles to using valgrind with musl) to determine how badly the results are affected, but it's conceivable that they could be completely wrong (as opposed to just lacking precision or correct rounding).
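To make the float.h mismatch concrete, here is a small probe (my own, not musl code), assuming an x86/x86-64 target where float.h reports LDBL_MANT_DIG == 64:

#include <float.h>
#include <stdio.h>

int main(void)
{
    /* 1 + 2^-63 is exactly representable when the full 64-bit significand
       is honoured; with 53-bit (double) arithmetic it rounds back to 1.
       volatile keeps the compiler from folding the sum at full precision
       at compile time. */
    volatile long double one  = 1.0L;
    volatile long double tiny = 0x1p-63L;
    printf("LDBL_MANT_DIG = %d\n", LDBL_MANT_DIG);
    printf("1 + 2^-63 > 1 at run time: %s\n", (one + tiny > one) ? "yes" : "no");
    return 0;
}

Run natively the answer is "yes"; if the long double arithmetic is silently truncated to 64-bit doubles it is "no", contradicting the compile-time LDBL_MANT_DIG that conversion code like musl's relies on.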
I'd like to echo Rich Felker's comment #16: we also have code that is sensitive to the available precision. We use std::numeric_limits<long double>::epsilon () to condition on that, but, as Tom Vercauteren (comment #5) puts it, valgrind silently using 64-bit doubles means the semantics are wrong. +20 votes
A question for my edification: if valgrind is silently replacing 80-bit floating-point variables with 64-bit ones, doesn't that change the memory layout? In the case of adjacent data members that are not 64-bit aligned, doesn't that change their offsets, and hence the validity of valgrind's results? (It's checking some other code base, with 64-bit floats, not my code base, with 80-bit floats.) Or does it keep the 80-bit variable in memory but only do 64-bit arithmetic and function calls?
Obviously it can't change the memory layout. In theory it could write doubles padded with 2 zero/junk bytes, but of course that would also break a lot of software (anything inspecting the representation). As far as I know what it really does is simply perform the arithmetic in doubles, but load/store the correct ld80 representation when requested to.
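To make that last point concrete, here is a small sketch (my own; it assumes x86/x86-64 with the usual little-endian 80-bit long double) that prints the stored image of a runtime-computed 1/3:

#include <stdio.h>
#include <string.h>

int main(void)
{
    volatile long double a = 1.0L, b = 3.0L;  /* volatile: force a run-time division */
    long double x = a / b;

    /* Print the 10 significant bytes of the 80-bit image, most significant
       first: 2 bytes of sign+exponent, then the explicit 64-bit significand. */
    unsigned char img[10];
    memcpy(img, &x, sizeof img);
    for (int i = 9; i >= 0; i--)
        printf("%02x", img[i]);
    printf("\n");
    return 0;
}

Either way the stored (fstpt) image is well-formed, as the manual says; but if the division was really carried out in double precision, the low 11 bits of the significand read back as zeros instead of continuing the 0101... pattern.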
I would actually like to get this fixed, at least for 64-bit processes. The resulting non-usefulness of the tools for some groups of users bothers me, but we are extremely resource constrained, and this is a significant amount of work. Would anybody be willing to help out by writing a proper test program? It would need to exercise all the relevant x87 instructions individually, in such a way that it makes clear when the implementation of an instruction is correct (no accuracy loss) vs when it is incorrect. It would have to be fairly convincing, along the lines of none/tests/amd64/sse4-64.c, perhaps. If such a test case did exist, I would be a lot more motivated to grapple with the compilation pipeline (VEX) aspects of the fix. Or, for that matter, to help out anybody who wanted to try fixing it themselves.
Do you have any objection if I forward this e-mail to the gfortran mailing list? There might be a user interested enough to help out in that group.

regards,
Bud Davis
(In reply to comment #21)

> Do you have any objection if I forward this e-mail to the gfortran mailing
> list? There might be a user interested enough to help out in that group.

The valgrind bugzilla is publicly accessible, so no problem. Note that it is better to forward the link to this bug, so that suggestions, help and so on are recorded here.
Hi Julian,

Testing 80-bit x87 FPU instructions: I would like to have a try at writing a basic version of this program.

Is the reason for the 64-bit restriction because it limits the x87 instructions to their latest, and probably final, versions? Which is fine.

I propose to write a stand-alone program, in C (GCC 4.9) with some inline asm, that exercises all the FPU-related instructions as documented in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, February 2014.

For each instruction the results would be compared to the known correct value as obtained from a prior run without valgrind. A range of values should be tried, along with some edge cases such as subnormals, infinities and NaN. Where applicable, each of the four rounding modes would be tested. The documented FPU flags and the floating-point exceptions will be checked (but not the Protected Mode exceptions or the Real-address mode exceptions). In the first cut anyway, the exceptions would be checked by simply examining the status word. The state of the FPU stack should be checked.

Let me know if you think this is useful and I'll make a start. It's quite a lot of work.

Regards
Jeremy
Re: the 64-bit remarks, is this about doing the tests only on 64-bit, or about only supporting correct x87 fpu emulation on x86_64 (and not 32-bit x86)? To address our usage case (programs using musl libc) it's highly desirable for both 32-bit and 64-bit apps to work.

While I agree having instruction-level tests would be nice, I question whether it's really essential to a first effort at fixing this. There are plenty of existing x87 software implementations (libgcc, Linux kernel [before it was removed], qemu, dosbox, etc.) that could be used as a guide for implementation, or even reused directly if there are no license problems. Tests could then be used to look for errors and tweak the behavior: first, high-level tests compiled from C sources testing floating point behavior, and later, instruction-level tests.
(In reply to comment #23)

> Is the reason for the 64-bit restriction because it limits the x87
> instructions to their latest, and probably final, versions?

AFAIK there has only ever been one version of the x87 instruction set. The restriction to 64-bit accuracy is because implementing 80 bit is a lot of hassle and it isn't necessary for the majority of (portable, IEEE754 compliant) code.

> I propose to write a stand-alone program, in C (GCC 4.9) with some inline
> asm, that exercises all the FPU-related instructions as documented in the
> Intel® 64 and IA-32 Architectures Software Developer’s Manual, February 2014.
>
> For each instruction the results would be compared to the known correct
> value as obtained from a prior run without valgrind. A range of values
> should be tried, along with some edge cases such as subnormals, infinities
> and NaN. Where applicable, each of the four rounding modes would be tested.

The rounding modes are only applicable to integer-fp conversions and to fp-fp format conversions (maybe). For normal math (+, -, etc) they are ignored, at least on the x86 and x86_64 implementation.

> The documented FPU flags and the floating-point exceptions will be checked

Don't bother with checking exceptions. V doesn't simulate FP exceptions.

> Let me know if you think this is useful and I'll make a start. It's quite a
> lot of work.

It does sound useful. I would recommend you study some of the other test programs, especially the 64-bit SSE4 test program, to see the general style. In general, for each insn and each test case, you need to do: get the FPU in a known state (FINIT); load operands; do the instruction; dump the FPU state.
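For concreteness, the per-instruction pattern described in the previous paragraph might look something like the sketch below. This is my own illustration, not part of any existing valgrind test; the names fpu_state and test_fsqrt are invented, and only one instruction with one operand is shown.

#include <stdio.h>

/* 108-byte legacy FNSAVE image: control/status/tag words, pointers,
   and the eight 80-bit stack registers. */
typedef struct { unsigned char bytes[108]; } fpu_state;

static void test_fsqrt(long double operand, fpu_state *out)
{
    __asm__ __volatile__(
        "finit            \n\t"   /* put the FPU in a known state */
        "fldt   %1        \n\t"   /* load the 80-bit operand      */
        "fsqrt            \n\t"   /* the instruction under test   */
        "fnsave %0        \n\t"   /* dump regs, flags and stack; this also
                                     re-initialises the FPU for the next test */
        : "=m"(*out)
        : "m"(operand)
        : "st", "st(1)", "st(2)", "st(3)",
          "st(4)", "st(5)", "st(6)", "st(7)");
}

int main(void)
{
    fpu_state s;
    test_fsqrt(2.0L, &s);
    /* A real test would compare the whole saved image against a reference
       captured on a native (non-valgrind) run; here we just show the first
       few bytes (control and status words). */
    for (int i = 0; i < 8; i++)
        printf("%02x ", s.bytes[i]);
    printf("...\n");
    return 0;
}

A real test would repeat this for every instruction over a table of operands (normals, subnormals, infinities, NaNs) and, where applicable, each rounding mode.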
(In reply to comment #24)

> x86)? To address our usage case (programs using musl libc) it's highly
> desirable for both 32-bit and 64-bit apps to work.

I propose only to implement this for 64 bit apps. I regard 32 bit x86 as legacy and in maintenance mode only. Even now, the 32 bit coverage is far behind the 64 bit case.

> While I agree having instruction-level tests would be nice,

Not nice -- essential. If we compile from source, (1) there's no guarantee that gcc will produce exactly the instruction you want to test, and no other junk, and (2) there's no way to test instructions that gcc doesn't generate.

> There are plenty of existing x87 software implementations (libgcc,
> Linux kernel [before it was removed], qemu, dosbox, etc.) that could
> be used as a guide for implementation

What I'm looking for is a way to verify that the implementation is correct and remains correct in future. How to actually implement this stuff is not a problem.
On Mon, May 19, 2014 at 09:16:09AM +0000, Julian Seward wrote:

> I propose only to implement this for 64 bit apps. I regard 32 bit x86
> as legacy and in maintenance mode only. Even now, the 32 bit coverage
> is far behind the 64 bit case.

Since the exact same code should be usable for both, I don't see any motivation for omitting use of it on 32-bit machines. Usage of the x87 fpu is actually much more common in 32-bit code.

> > While I agree having instruction-level tests would be nice,
>
> Not nice -- essential. If we compile from source, (1) there's no
> guarantee that gcc will produce exactly the instruction you want
> to test, and no other junk, and (2) there's no way to test
> instructions that gcc doesn't generate.

I agree completely with points 1 and 2, but I don't think that makes it essential. Statistically all the important instructions are extremely likely to get coverage, and the ones that compilers don't generate are not relevant to the vast vast majority of real-world code.

> > There are plenty of existing x87 software implementations (libgcc,
> > Linux kernel [before it was removed], qemu, dosbox, etc.) that could
> > be used as a guide for implementation
>
> What I'm looking for is a way to verify that the implementation is
> correct and remains correct in future. How to actually implement this
> stuff is not a problem.

It would be hard to be worse than the implementation right now, which is just completely wrong. That's why I'm saying an incremental approach to testing is worth considering.
(In reply to comment #27)

> On Mon, May 19, 2014 at 09:16:09AM +0000, Julian Seward wrote:
> > I propose only to implement this for 64 bit apps. I regard 32 bit x86
> > as legacy and in maintenance mode only. Even now, the 32 bit coverage
> > is far behind the 64 bit case.
>
> Since the exact same code should be usable for both, I don't see any
> motivation for omitting use of it on 32-bit machines. Usage of the x87
> fpu is actually much more common in 32-bit code.

No, the same code will not be usable, because the amd64 code has been completely refactored to support newer instruction encodings and the x86 code has not, so there are large differences between them, which means code cannot simply be copied from one to the other.

> > > While I agree having instruction-level tests would be nice,
> >
> > Not nice -- essential. If we compile from source, (1) there's no
> > guarantee that gcc will produce exactly the instruction you want
> > to test, and no other junk, and (2) there's no way to test
> > instructions that gcc doesn't generate.
>
> I agree completely with points 1 and 2, but I don't think that makes
> it essential. Statistically all the important instructions are
> extremely likely to get coverage, and the ones that compilers don't
> generate are not relevant to the vast vast majority of real-world code.

That's fine in theory, but in practice when somebody does hit one of those rare instructions we want it to fail hard, not just produce subtly wrong results - a bug which causes us to silently execute code incorrectly is at best a major pain to debug and at worst may never even get noticed.

> > > There are plenty of existing x87 software implementations (libgcc,
> > > Linux kernel [before it was removed], qemu, dosbox, etc.) that could
> > > be used as a guide for implementation
> >
> > What I'm looking for is a way to verify that the implementation is
> > correct and remains correct in future. How to actually implement this
> > stuff is not a problem.
>
> It would be hard to be worse than the implementation right now, which
> is just completely wrong. That's why I'm saying an incremental
> approach to testing is worth considering.

But a software emulation is not what valgrind does or is looking for - you seem to be very confused about what valgrind actually does. What valgrind does is to decompile the code, instrument it, and then turn it back into native code. It doesn't emulate the decompiled code using a software FP library.
On Mon, May 19, 2014 at 12:57:55PM +0000, Tom Hughes wrote:

> > > I propose only to implement this for 64 bit apps. I regard 32 bit x86
> > > as legacy and in maintenance mode only. Even now, the 32 bit coverage
> > > is far behind the 64 bit case.
> >
> > Since the exact same code should be usable for both, I don't see any
> > motivation for omitting use of it on 32-bit machines. Usage of the x87
> > fpu is actually much more common in 32-bit code.
>
> No, the same code will not be usable, because the amd64 code has been
> completely refactored to support newer instruction encodings and the x86
> code has not, so there are large differences between them, which means
> code cannot simply be copied from one to the other.

Perhaps I misunderstand you, but my idea is that you would not be modifying the instruction decoding/handling code at all, but rather replacing the backend it calls for floating point. Even if it's all inline right now, the new correct 80-bit code would be large enough that it should probably be factored (at least at the source level, even if it's still macros/inlines) and bolting it onto both the 32-bit and 64-bit versions should not be significantly more work...

> > I agree completely with points 1 and 2, but I don't think that makes
> > it essential. Statistically all the important instructions are
> > extremely likely to get coverage, and the ones that compilers don't
> > generate are not relevant to the vast vast majority of real-world code.
>
> That's fine in theory, but in practice when somebody does hit one of those
> rare instructions we want it to fail hard, not just produce subtly wrong
> results - a bug which causes us to silently execute code incorrectly is at
> best a major pain to debug and at worst may never even get noticed.

Right now it's producing not-so-subtly wrong results, just silently doing the wrong thing. So in principle it wouldn't be any worse than the status quo. Of course someone could introduce an even bigger error somewhere but that's always a possibility.

> But a software emulation is not what valgrind does or is looking for - you
> seem to be very confused about what valgrind actually does.

Yes, somewhat. I was under the impression that it was emulating the fpu with C code using doubles.

> What valgrind does is to decompile the code, instrument it, and then turn it
> back into native code. It doesn't emulate the decompiled code using a
> software FP library.

Then how is it throwing away proper 80-bit support? Just by generating gratuitous load/store at low precision? It seems like this should be much easier to fix...
> AFAIK there has only ever been one version of the x87 instruction set.

There have been some additions, the most recent being FISTTP, which came in with SSE3. The others (FCMOVE etc.) came in for early Pentiums, as did the accuracy and range improvements to the transcendentals. GCC uses FISTTP where available for the float to int truncating cast, but I guess it's preferable to hand code it in asm, using cpuid to check for SSE3 beforehand. The older changes must be present in 64-bit CPUs.

> Don't bother with checking exceptions. V doesn't simulate FP exceptions.

But does it set the cumulative exception flags in the status word?? If not, the test code could simply zero those flags when it dumps the state. The transcendentals, and load constants, are affected by rounding modes too.

> In general, for each insn and each test case, you need to do: get the FPU
> in a known state (FINIT); load operands; do the instruction; dump the FPU
> state.

Thanks. Yes, FNSAVE dumps it all, including the stack - and kindly does an FINIT afterwards to set up for the next instruction test.

When it's done, I'll send you:

fpu-64.c
fpu-64.stderr.exp  /* empty */
fpu-64.stdout.exp
fpu-64.vgtest

Can I assume C99 by the way?

Regards,
Jeremy
(In reply to comment #30)

> GCC uses FISTTP where available for the float to int truncating cast,
> but I guess it's preferable to hand code it in asm,

Definitely. Better not to assume gcc will generate any given instruction.

> > Don't bother with checking exceptions. V doesn't simulate FP exceptions.
>
> But does it set the cumulative exception flags in the status word??

No. V doesn't have any awareness of FP exceptions. It's as if it lives in a world where such things don't exist.

> Can I assume C99 by the way?

Yes.
Ping. Today in #musl we had another user who was experiencing 1.2==atof("1.2") evaluating to false. After spending a while trying to diagnose it, it turned out they were running under valgrind. Is something blocking fixing this issue still?
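For reference, the symptom keeps reducing to something like the following (a sketch of the report above, not musl's test suite; as far as I know glibc's strtod does its conversion with multiprecision integer arithmetic, so the mismatch shows up with libcs such as musl that do it in long double):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* musl's strtod/atof family does its decimal-to-binary arithmetic in
       long double; if that arithmetic silently loses its last 11
       significand bits, the converted value need not equal the
       compile-time constant 1.2. */
    double parsed = atof("1.2");
    printf("1.2 == atof(\"1.2\") : %s\n", (1.2 == parsed) ? "true" : "false");
    printf("parsed = %.17g\n", parsed);
    return 0;
}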
(In reply to Rich Felker from comment #32)

> Is something blocking fixing this issue still?

Lack of skilled manpower :-) Feel free to work on this issue!
I just ran into this while debugging Bezitopo. I wrote the isTooCurly method and found that it takes about 100 times as long when compiled by gcc as when compiled by clang. So I ran it in Valgrind (callgrind) to see what's different. It produced bizarre output.

I debugged the code in Valgrind and found that the line

precision=nextafterl(bigpart,2*bigpart)-bigpart;

in spiral.cpp produced 0 when bigpart is about 1e80. Dumping the variables internal to the cornu function showed numbers like 3.33236731613775325228e+4605 (garbage), but it appears unable to compute a number bigger than about 5e307.

Versions:
valgrind-3.15.0
Linux puma 5.3.0-7625-generic #27~1576774560~19.10~f432cd8-Ubuntu SMP Thu Dec 19 20:35:37 UTC x86_64 x86_64 x86_64 GNU/Linux
gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008
clang version 9.0.0-2 (tags/RELEASE_900/final)
Eoan Ermine.
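The nextafterl() observation reproduces in isolation. Here is a small sketch (my own, not Bezitopo's spiral.cpp) of just that subtraction, with the relevant magnitudes in comments; compile with -lm:

#include <math.h>
#include <stdio.h>

int main(void)
{
    volatile long double bigpart = 1e80L;  /* volatile: keep the arithmetic at run time */

    /* One long double ULP at 1e80 is about 6e60; one double ULP at the same
       magnitude is about 1.3e64.  If the subtraction below is performed in
       64-bit precision, the step returned by nextafterl() is swallowed and
       the result is 0 instead of roughly 6e60. */
    long double precision = nextafterl(bigpart, 2 * bigpart) - bigpart;
    printf("%Lg\n", precision);
    return 0;
}

The reported ceiling of about 5e307 is also consistent with values being squeezed through 64-bit doubles, whose range tops out near 1.8e308, rather than 80-bit long doubles, which reach about 1.2e4932.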
*** Bug 421262 has been marked as a duplicate of this bug. ***