# 2020/07/22

## A curious case of rational to flonum conversion

Tanaka Akira filed a curious bug report in Ruby: https://bugs.ruby-lang.org/issues/17037

He computed `1 + (k/100)ε` in rational arithmetic, then converted it to a floating-point number, for k=0 to 100, where `ε` is `FLT_EPSILON`. If k=0 it's 1, and k=100 it's nextfloat(1), so somewhere inbetween the value switches from 1 to nextfloat(1). However, what he saw was that the value flips more than once:

```...
[31, 1.0, (450359962737049631/450359962737049600)]
[32, 1.0, (14073748835532801/14073748835532800)]
[33, 1.0000000000000002, (450359962737049633/450359962737049600)]
[34, 1.0000000000000002, (225179981368524817/225179981368524800)]
[35, 1.0, (90071992547409927/90071992547409920)]
[36, 1.0000000000000002, (112589990684262409/112589990684262400)]
[37, 1.0000000000000002, (450359962737049637/450359962737049600)]
[38, 1.0000000000000002, (225179981368524819/225179981368524800)]
[39, 1.0000000000000002, (450359962737049639/450359962737049600)]
[40, 1.0, (11258999068426241/11258999068426240)]
[41, 1.0000000000000002, (450359962737049641/450359962737049600)]
[42, 1.0000000000000002, (225179981368524821/225179981368524800)]
[43, 1.0000000000000002, (450359962737049643/450359962737049600)]
...
```

Intrigued, I ran the same computation in Gauche. To my surprise, I got exactly the same result (0.9.9):

```gosh> (dotimes (k 100)
(let1 d (+ 1 (* k 1/100 (exact (flonum-epsilon))))
(print `(,k ,(inexact d) ,d))))
(0 1.0 1)
(1 1.0 450359962737049601/450359962737049600)
(2 1.0 225179981368524801/225179981368524800)
...
(31 1.0 450359962737049631/450359962737049600)
(32 1.0 14073748835532801/14073748835532800)
(33 1.0000000000000002 450359962737049633/450359962737049600)
(34 1.0000000000000002 225179981368524817/225179981368524800)
(35 1.0 90071992547409927/90071992547409920)
(36 1.0000000000000002 112589990684262409/112589990684262400)
(37 1.0000000000000002 450359962737049637/450359962737049600)
(38 1.0000000000000002 225179981368524819/225179981368524800)
(39 1.0000000000000002 450359962737049639/450359962737049600)
(40 1.0 11258999068426241/11258999068426240)
(41 1.0000000000000002 450359962737049641/450359962737049600)
(42 1.0000000000000002 225179981368524821/225179981368524800)
(43 1.0000000000000002 450359962737049643/450359962737049600)
(44 1.0000000000000002 112589990684262411/112589990684262400)
(45 1.0000000000000002 90071992547409929/90071992547409920)
(46 1.0000000000000002 225179981368524823/225179981368524800)
(47 1.0000000000000002 450359962737049647/450359962737049600)
(48 1.0000000000000002 28147497671065603/28147497671065600)
(49 1.0000000000000002 450359962737049649/450359962737049600)
(50 1.0 9007199254740993/9007199254740992)
(51 1.0000000000000002 450359962737049651/450359962737049600)
(52 1.0000000000000002 112589990684262413/112589990684262400)
...
```

To my knowledge, Gauche and Ruby don't share code regarding this feature. Whatever the issue is, it should be in the logic.

What Gauche did was basically converting the numerator and the denominator of the rational number to doubles, then did a floating-point division:

```double dnumer = Scm_GetDouble(SCM_RATNUM_NUMER(obj));
double ddenom = Scm_GetDouble(SCM_RATNUM_DENOM(obj));

return dnumer/ddenom;
```

There's a case that dnumer and/or ddenom overflows that requires special handling, and also we have to make sure the floating-point division is done in IEEE double precision (as opposed to extended precision). I guess Ruby does the same, and that is the cause of the weird behavior. What's going on?

Let's look at the k=33. `1 + 33/100*ε` is `450359962737049633/450359962737049600`. This is closer to 1 than nextfloat(1), so it should be rounded down.

The numerator, `450359962737049633`, consists of the following bits:

```gosh> (format "~,,,5:b" 450359962737049633)
"1100,10000,00000,00000,00000,00000,00000,00000,00000,00000,00001,00001"
;;                                                             ^53bit
```

Note that it requires more than 53bits to represent it accurately. If we convert it to double, the last 6 bits are rounded---since it is greater than the midpoint, it is rounded up:

```1100,10000,00000,00000,00000,00000,00000,00000,00000,00000,00001,00001
↓
1100,10000,00000,00000,00000,00000,00000,00000,00000,00000,0001 * 2^6
```

Effectively, we now have `1 + 64/100*ε`. So we get 1.0000000000000002 rather than 1.0. We can say this is another case of double-rounding.

When k = 32, the last 6 bits falls on the exact midpoint, so it is rounded down by the even-rounding rule.

k=35, 40 and 50 are anomalies---the rational numbers are simplified, so their numerator happened to fit within 53 bits and we didn't lose the precision.

The fix I've implemented is as follows:

• Scale numerator by factor S so that numerator * S > 2^54 * denominator.
• Compute integer division. The quotient has at least 54 bits.
• Look at the 54-th bits.
• If it is 0, we can round down the quotient.
• If it is 1 and there's any '1' in the following bits, or the remainder of the division is not zero, we can round up the quotient.
• Otherwise, quotient is on the midpoint. We see the 53-th bit and round to even.
• Convert the quotient into double, then scale back.

If the result falls on the denormalized range, we have to adjust the significant digits, otherwise we'll get another double-rounding.

This solution involves lots of bignum allocations, so we want to optimize eventually.

Tags: Flonums, 0.9.10