## 8.6 Floating-point issues

The IEEE-754 standard defines the bit-level behavior of floating-point
arithmetic operations on all modern processors. This allows numerical
programs to be ported between different platforms with identical
results, in principle. In practice, there are often minor variations
caused by differences in the order of operations (depending on the
compiler and optimization level) but these are generally not
significant.

However, more noticeable discrepancies can be seen when porting
numerical programs between x86 systems and other platforms, because the
the x87 floating point unit (FPU) on x86 processors computes results
using extended precision internally (the values being converted to
double precision only when they are stored to memory). In contrast,
processors such as SPARC, PA-RISC, Alpha, MIPS and POWER/PowerPC work
with native double-precision values throughout.^{(26)} The
differences between these implementations lead to changes in rounding
and underflow/overflow behavior, because intermediate values have a
greater relative precision and exponent range when computed in extended
precision.^{(27)} In particular, comparisons involving extended
precision values may fail where the equivalent double precision values
would compare equal.

To avoid these incompatibilities, the x87 FPU also offers a hardware
double-precision rounding mode. In this mode the results of each
extended-precision floating-point operation are rounded to
double precision in the floating-point registers by the FPU.
It is important to note that the rounding
only affects the precision, not the exponent range, so the result is a
hybrid double-precision format with an extended range of exponents.

On BSD systems such as FreeBSD, NetBSD and OpenBSD, the hardware
double-precision rounding mode is the default, giving the greatest
compatibility with native double precision platforms. On x86 GNU/Linux
systems the default mode is extended precision (with the aim of
providing increased accuracy). To enable the double-precision rounding
mode it is necessary to override the default setting on per-process
basis using the FLDCW "floating-point load control-word" machine
instruction.^{(28)} A simple function which can be called to
execute this instruction is shown below. It uses the GCC extension
keyword `asm`

to insert the specified instruction in the
assembly language output:

void
set_fpu (unsigned int mode)
{
asm ("fldcw %0" : : "m" (*&mode));
}

The appropriate `mode`

setting for double-precision rounding is
`0x27F`

. The mode value also controls the floating-point exception
handling behavior and rounding-direction (see the Intel and AMD
processor reference manuals for details).

On x86 GNU/Linux, the function above can be called at the start of any
program to disable excess precision. This will then reproduce the
results of native double-precision processors, in the absence of
underflows and overflows.

The following program demonstrates the different rounding modes:

#include <stdio.h>
void
set_fpu (unsigned int mode)
{
asm ("fldcw %0" : : "m" (*&mode));
}
int
main (void)
{
double a = 3.0, b = 7.0, c;
#ifdef DOUBLE
set_fpu (0x27F); /* use double-precision rounding */
#endif
c = a / b;
if (c == a / b) {
printf ("comparison succeeds\n");
} else {
printf ("unexpected result\n");
}
return 0;
}

On x86 GNU/Linux systems the comparison `c == a / b`

can produce an
unexpected result if `c`

is taken from memory (double precision)
while `a / b`

is computed in extended precision, because the
fraction 3/7 has different representations in double and extended
precision.

$ gcc -Wall fptest.c
$ ./a.out
unexpected result

Setting the hardware rounding mode to double precision prevents this
from happening:

$ gcc -Wall -DDOUBLE fptest.c
$ ./a.out
comparison succeeds

Note that the floating-point control word affects the whole environment
of the process, including any C Library functions that are called. One
consequence of this is that `long double`

arithmetic is effectively
reduced to double precision, since it relies on extended precision operations.

The floating point control word only affects the behavior of the x87
FPU. Floating point operations computed with SSE and SSE2 instructions
are always carried out in native double precision. Thus, the combined
options

$ gcc -Wall -msse2 -mfpmath=sse ...

are often sufficient to remove the effects of extended-precision.
However, some operations (such as transcendental functions) are not
available in the SSE/SSE2 extensions and will still be computed on the
x87 FPU.