Curse of extended floating point arithmetic
In hindsight, Intel's decision of having 80bit extended-precision arithmetic internally in x87 was unfortunate. 64bit double-precision became pervasive, and the curse of double-rounding has bitten number of careless programmers.
And it finally bit me when I was implementing more reasonable flonum to rational conversion. To be precise, the issue had been already there, but that feature made me find a concrete example to exhibit the issue.
★ ★ ★
exact procedure tries to find the simplest rational
number within the range where a flonum covers. With that scheme,
1.0000000000000002 (the "next" floating point number above 1.0 in
double-precision flonum) becomes
gosh> (exact 1.0000000000000002) 3002399751580332/3002399751580331
However, users on i686 systems reported that converting the rational back to flonum yielded 1.0000000000000004. That is, 1 unit greater than the original number.
Let's see what was happening, exactly. First, save these big numerator and denominator in variables for the convenience.
gosh> (define x 3002399751580332) x gosh> (define y 3002399751580331) y
When we divide x by y with double precision arithmetic, what will we get?
gosh> (quotient&remainder (* x (expt 2 52)) y) 4503599627370497 1501199875790165 gosh> (* 1501199875790165 2) 3002399751580330 ; smaller than y=3002399751580331
Since the remainder is slightly less than the divisor,
we will round down the result and the mantissa will be
4503599627370497, or, in binary:
1000...0001 ^53th bit
In normalized double precision number, the
1 in MSB is hidden,
and the rest of 52 bits fit into the 52bit mantissa field.
The fact that the remainder is so close to the divisor makes this division susceptible to the rounding errors. Let's see the extended precision division. 80-bit extended floating point number does not use hidden bit and the fraction part is 63 bit wide.
gosh> (quotient&remainder (* x (expt 2 63)) y) 9223372036854778879 3002399751579307
The quotient, in binary notation, is as follows:
1000...000101111111111 ^53th bit
However, the remainder is greater than the half of divisor, so it is rounded up in the FPU register:
1000...000110000000000 ^53th bit
When this extended number is stored into a double-precision floating point variable, the bits beyond 53-th one is rounded. since it falls exactly the half point of two double-precision floating numbers, "round-to-even" rule is applied, and the result becomes...
1000...0010 ^53th bit
Alas, the trap of double-rounding.
★ ★ ★
We can set up the control register to force the FPU to round at double-precision for every operation. The way to do so depends on OS, though, so I've been avoided it. This time there seemed no choice but using it---I could've carried out the division in bignum, as we shown above, and manually fitted the calculated 52bits into the flonum's mantissa field, but it would be huge performance loss, given that almost all cases FPU would yield correct results.
I tried to find a way to test if the input falls in this kind of
particular cases---if I could do that cheaply, I'd first apply the test
then would do the expensive calculation only in the special cases---
but couldn't find any. I ended up having special C preprocessor macros
src/gauche/float.h to set up FPU registers depending
Post a comment