Gauche Devlog

2017/11/18

Rounding 1.15

When you round 1.15 to the nearest tenths, what do you get?

The elementary math gives 1.2. Many programming language implementations disagree, however:

$ cat tmp.c
#include <stdio.h>
int main()
{
    printf("%5.1f\n", 1.15);
    return 0;
}
$ gcc tmp.c && ./a.out
  1.1

Is it a bug? No, it's working as intended. The intention, however, is different from what people naturally expect.

* * *

Decimal 1.15 can't be represented exactly with a binary floating point number; so internally, the runtime picks the closest binary floating point number to 1.15, which happens to be very slightly smaller than the actual 1.15. So, when you need to choose to round it to either 1.1 or 1.2---if you look at the actual number you have, you should say 1.1 is closer. (By the way, if you use 4.15 instead in the above example, you'll get 4.2. That's because the binary floating point number closest to 4.15 is slightly greater than that.)

You can use Gauche to check if that's really the case. The exact function tries to find the simplest rational number within the error boundary of the floating point number, but using real->rational you can get the exact number represented internally by the floating point number.

gosh> (exact 1.15)
23/20
gosh> (real->rational 1.15 0 0 #f)
2589569785738035/2251799813685248

And indeed, the exact one is smaller than the one you naturally expect from the notation:

gosh> (< 2589569785738035/2251799813685248 23/20)
#t

With 4.15, the exact one is greater than the closest simplified one:

gosh> (exact 4.15)
83/20
gosh> (real->rational 4.15 0 0 #f)
2336242306698445/562949953421312
gosh> (> 2336242306698445/562949953421312 83/20)
#t

So, if you take a point of view that a binary floating point number stands for the value it exactly represents (which is how they're treated inside the computer), you should round "the closest number to 1.15" to 1.1.

When users complain, we programmer tend to say "Floating point numbers have error. Use arbitrary precison arithmetic!" Well, floating point numbers themselves don't have error, per se. It has a precisely defined exact value, sign * mantissa * 2^exponent. It is an operation that has error, and in this case, it is the conversion from the decimal notation 1.15 to a binary floating point number.

* * *

But is it the only valid interpretation?

Another view is that when we treat a value noted "1.15", it is intended to be exactly 1.15 but we take the closest floating-point number as an approximation. The distinction is subtle but important---in the previous view, the intended value is 2589569785738035/2251799813685248 in floating-point number, and 1.15 is approximation. In the current view, the intended value is 1.15, and 2589569785738035/2251799813685248 in floating-point number is the approximation.

In this view, rounding "1.15" to the nearest tenths should result "1.2". (To be precise, we must also assume round-half-up or round-half-to-even rule). This usually fits user's expectation better. But it may be costly that we have to first obtain the optimal decimal representation of the given floating point number to decide which way to round.

* * *

We see both views are useful depending on circumstances. So we decided to support both.

The format procedure now supports floating-point number output directive, ~f. You can specify the field width and precision:

(format "~6,3f" 3.141592) ⇒ " 3.142"

If we need to round to the given precision, the default is to take the exact value of the floating-point number---the first view we discussed above. We call it effective rounding.

(format "~6,1f" 1.15)  ⇒ "   1.1"

However, if you need the latter view---we call it notational rounding---you can have it with : flag.

(format "~6,1:f" 1.15) ⇒ "   1.2"

Tags: 0.9.6, Flonums, format

2017/08/29

Pretty printer (and more) in REPL

It is daunting when you evaluate an expression on REPL and realize the result is a huge S-expr. Especially when you're running gosh inside Emacs with font-lock mode, since Emacs gets crawling trying to parse the huge output.

The reason I don't want to abbreviate the output was that I frequently copy the REPL output and reuse it---with Emacs, copying one S-expr is just a couple of keystrokes, no matter how big it is---but the big output dragging Emacs is irritating, nonetheless.

Gauche had several mechanisms to improve it for long time, but I finally put things together into a usable feature.

Sample session

Let me take you a little tour, for it is easier to see in examples. First, we need some interesting data.

gosh> ,u data.random
gosh> ,u gauche.generator
gosh> (define word (gmap string->symbol (strings-of (integers-poisson$ 12) (chars$ #[A-Z]))))
word
gosh> (define leaf? (samples$ '(#t #f #f #f)))
leaf?
gosh> (define (tree d) (cons (if (or (zero? d) (leaf?)) (word) (tree (- d 1)))
                             (if (or (zero? d) (leaf?)) '() (tree (- d 1)))))
tree

(tree N) would generate random nested list of max depth N. You can make several tries to find a reasonable size of the data.

gosh> (tree 5)
((HESUBSPMIBQBBWWZZ (((EHMZYLCL) QZKTHLZIKIXS)) NTAQUDHAXX (FMEBQP) PSHRSTW) ((UAYIBNNC (XAPYQBPOHSY) QFIZMITEWULRBMEO)) (WLQITJTZNBO (GJZNEKWBMLGCWKLPN) EINLIRVDLLGPQ) ((HZBDNGYBBQD)) YIQZWPL RELGWZEGSR)

Looks good, so let's save it.

gosh> (define t *1)
t

Now, all the tree in one line is hard to understand. Let's pretty-print it.

gosh> ,pm pretty #t
Current print mode:
  length :  50
   level :  10
  pretty :  #t
   width :  79
    base :  10
   radix :  #f
gosh> t
((HESUBSPMIBQBBWWZZ (((EHMZYLCL) QZKTHLZIKIXS)) NTAQUDHAXX (FMEBQP) PSHRSTW)
 ((UAYIBNNC (XAPYQBPOHSY) QFIZMITEWULRBMEO))
 (WLQITJTZNBO (GJZNEKWBMLGCWKLPN) EINLIRVDLLGPQ) ((HZBDNGYBBQD)) YIQZWPL
 RELGWZEGSR)

The ,pm toplevel command is an abbreviation of ,print-mode. Yes, setting print mode pretty to #t makes REPL pretty-prints the result.

The pretty printer tries to fit the S-expression within width. You can change it.

gosh> ,pm width 40
Current print mode:
  length :  50
   level :  10
  pretty :  #t
   width :  40
    base :  10
   radix :  #f
gosh> t
((HESUBSPMIBQBBWWZZ
  (((EHMZYLCL) QZKTHLZIKIXS)) NTAQUDHAXX
  (FMEBQP) PSHRSTW)
 ((UAYIBNNC (XAPYQBPOHSY)
   QFIZMITEWULRBMEO))
 (WLQITJTZNBO (GJZNEKWBMLGCWKLPN)
  EINLIRVDLLGPQ)
 ((HZBDNGYBBQD))
 YIQZWPL
 RELGWZEGSR)

It's still too long, so let's limit the length of the printed list:

gosh> ,pm length 3
Current print mode:
  length :   3
   level :  10
  pretty :  #t
   width :  #f
    base :  10
   radix :  #f
gosh> t
((HESUBSPMIBQBBWWZZ
  (((EHMZYLCL) QZKTHLZIKIXS)) NTAQUDHAXX
  ....)
 ((UAYIBNNC (XAPYQBPOHSY)
   QFIZMITEWULRBMEO))
 (WLQITJTZNBO (GJZNEKWBMLGCWKLPN)
  EINLIRVDLLGPQ)
 ....)

Lists (and vectors) longer than 3 elements are abbreviated using ellipses. You can also limit the number of nesting level:

gosh> ,pm level 3
Current print mode:
  length :   3
   level :   3
  pretty :  #t
   width :  40
    base :  10
   radix :  #f
gosh> t
((HESUBSPMIBQBBWWZZ (#) NTAQUDHAXX ....)
 ((UAYIBNNC # QFIZMITEWULRBMEO))
 (WLQITJTZNBO (GJZNEKWBMLGCWKLPN)
  EINLIRVDLLGPQ)
 ....)

The lists nested deeper than the current level are shown as #.

If you need to see everything, e.g. to copy & paste, you can use ,pa toplevel command (shorthand of ,print-all), which writes previous result without abbreviation.

gosh> ,pa
((HESUBSPMIBQBBWWZZ (((EHMZYLCL) QZKTHLZIKIXS)) NTAQUDHAXX (FMEBQP) PSHRSTW) ((UAYIBNNC (XAPYQBPOHSY) QFIZMITEWULRBMEO)) (WLQITJTZNBO (GJZNEKWBMLGCWKLPN) EINLIRVDLLGPQ) ((HZBDNGYBBQD)) YIQZWPL RELGWZEGSR)

You can also change the default base radix of integers by base. The radix mode switches whether radix prefix (#b, #x, #nr etc.) should be printed.

gosh> ,pm base 2
Current print mode:
  length :   3
   level :   3
  pretty :  #t
   width :  #f
    base :   2
   radix :  #f
gosh> 4753
1001010010001
gosh> ,pm base 16 radix #t
Current print mode:
  length :   3
   level :   3
  pretty :  #t
   width :  40
    base :  16
   radix :  #t
gosh> 4753
#x1291

Now, get back to the default.

gosh> ,pm default
Current print mode:
  length :  50
   level :  10
  pretty :  #f
   width :  79
    base :  10
   radix :  #f

You may notice that we have length=50 and level=10 as default. This prevents accidentally printing huge S-expression, while most useful data can be printed entirely.

Write controls

Common Lisp has several special (dynamic) variables such as *print-length* and *print-pretty* that affect how print (Scheme's write) works. Our REPL's print-mode imitates that, but instead of using individual dynamic parameters we have a packaged structure, <write-controls>. A new write-controls can be created by make-write-controls:

gosh> (make-write-controls)
#<write-controls (:length #f :level #f :base 10 :radix #f :pretty #f :width #f)>
gosh> (make-write-controls :length 10 :base 2)
#<write-controls (:length 10 :level #f :base 2 :radix #f :pretty #f :width #f)>

Write controls structure is immutable. If you want a controls that's only slightly different from existing controls, you can use write-controls-copy, to which you can give keyword arguments you want to change:

gosh> (write-controls-copy *1 :pretty #t)
#<write-controls (:length 10 :level #f :base 2 :radix #f :pretty #f :width #f)>

Gauche's output procedures such as write or display are extended to accept optional write controls.

Limitations

Currently, the pretty printer only handles lists, vectors and uniform vectors. Other objects (including objects with custom printer) are formatted by the system's default writer, so it is rendered as an unbreakable chunk. Ideally, we'd like to pretty-print such objects as well.

Pretty-printing Scheme code requires more features---it must recognize syntactic keywors and adjust indentation. Such feature will be pretty handy to format result of macro transformation, for example. We're planning to support it eventually.

Tags: 0.9.6, pretty-printing

2017/05/23

A heads-up for an incompatible fix in util.match

TL;DR: If you match records with inheritance using $ or struct match pattern, you need to change the code for 0.9.6.

We fixed a bug in the positional record matching pattern of match, existed in 0.9.5 and before. The fix actually breaks previously documented behavior, but we believe the previous behavior was incorrect and decided it's better to fix now.

Background

The $, or struct pattern allows you to extract slot values from objects using match (ref:match):

(define-class <point> ()
  ((x :init-keyword :x)
   (y :init-keyword :y)
   (z :init-keyword :z)))

(match (make <point> :x 1 :y 2 :z 3)
  [($ <point> a b c)
   (list a b c)])  => (1 2 3)

However, Gauche's object system isn't designed to access slots with their positions. You use slot names instead. In match, you can use object pattern (or @ in sort) to match with slot values, using slot names.

(match (make <point> :x 1 :y 2 :z 3)
  [(@ <point> (x a) (y b) (z c))
   (list a b c)])  => (1 2 3)

The reason we provided $ was for the compatibility of original Wright's match, which aimed at struct types provided in some Scheme impelementations. We didn't give much thought to it; just made the pattern match with the slot values of the order of class-slots (ref:class-slots). It works just fine with srfi:9 records:

(define-record-type pare (make-pare fst snd) pare?
  (fst get-fst)
  (snd get-snd))

(match (make-pare 1 2)
  [($ pare a b)
   (list a b)])  => (1 2)

Problem

Things got complicated when inheritance enters the picture. How the inherited slots are laid out depends on the implementation of metaclass (ref:compute-slots generic function), and because of multiple inheritance, the slot layout of class S doesn't necessarily a subsequence of the layout of class T that inheriting S. This is highly confusing, and we've always recommended using object match in such a case, in the manual.

However, srfi:99 records only allows single inheritance chain, and the default constructor takes initial value of inherited slot first. So it is a natural call to make positional match in the same way.

(define-record-type S make-S S?
  a b)

(define-record-type (T S) make-T T?   ;; inherit S
  c d)

(make-S 1 2)      ;; Initialize a=1, b=2

(make-T 1 2 3 4)  ;; Initialize a=1, b=2, c=3, d=4

;; Then, ($ T w x y z) should match with w=1, x=2, y=3, z=4.

It hadn't been so. The compute-slots method of <record> placed the direct slots first, followed by the inherited slots. It needs to do so to be consistent with that "fields in derived record types shadow fields of the same name in a parent record type", as defined in srfi-99.

Thus, ($ T w x y z) pattern in the above example matched w=3, x=4, y=1, and z=2. This wasn't inconsistent with the manual, which stated that positional match was done with the order of class-slots. It was an unintended artifact of implementation that was overlooked, unfortunately.

It also had a defect when duplicate slot names existed. When a subclass defines a slot with the same name as inherited slot, the standard compute-slots merges them into one, which is also CLOS's behaviro. However, srfi:99 record types allow subtype to have slots with the same name but as independent slots.

(define-record-type S #t #t
  a)

(define-record-type (T S) #t #t
  a)

(define t (make-T 1 2))

(T-a t)  ;=> 2    ; accesses T's a
(S-a t)  ;=> 1    ; accesses S's a in T

(slot-ref t 'a) ;=> 2  ; named access takes the subtype's slot

The existing implementation of positional matching needed to rely on named slot access, and didn't work on such record types.

Fix

We introduced a generic function to be specialized with metaclass, that handles positional access within match. We keep the underlying mechanism undocumented for now; changing the way of positional matching should be rare and based on well-established customs. The order of record types fits this criteria, and made to work as expected:

(define-record-type S make-S S?
  a b)

(define-record-type (T S) make-T T?   ;; inherit S
  c d)

(match (make-T 1 2 3 4)
  [($ T w x y z) (list w x y z)]) => (1 2 3 4)

Now it also works with record types having duplicate slot names:

(define-record-type S #t #t
  a)

(define-record-type (T S) #t #t
  a)

(match (make-T 1 2)
  [($ T x y) (list x y)])   ;=> (1 2);  was (2 2) before

We hope few have used positional match with inherited records--- the old behavior seems apparently wrong---so we decided to fix this now.

If you happen to have the code that relies on the previous behavior, and need to make it work with both versions, you can switch to use named match (object or @).

Tags: 0.9.6, util.match, gauche.record

2017/04/25

Method call optimization - skipping sort-applicable-methods

It's not that we don't sort methods. It's just that if we have only one applicable method, we don't need to call sort-applicable-methods at all. Obvious, right?

I thought it didn't matter, for that part is written in C (we call Scm_SortMethods) and it just returns without sorting at all when we have only one method. But lo and behold, it is quite effective. Here's the average of several runs, calling ref on a sparse vector 10M times.

Baseline:   3.065 real, 3.863 cpu
Optimized:  2.431 real, 2.793 cpu

Scm_SortMethods uses shell sort on C array to avoid allocation, and we have a setup overhead (convert list to array). That was a waste when we have only one method. And in many cases, we do.

Not bad for just a few lines of change.

https://github.com/shirok/Gauche/commit/556fee985246eed0dd0189b3763d662bf0b0075b

We could also optimize for the cases we have just two or three applicable methods, in which case we can use hard wired comparison instead of using general sorting. It depends on how often we get such cases; some benchmark is required.

Tags: 0.9.6, sort-applicable-methods

2017/04/22

Method call optimization - avoiding next-method

Gauche's object system is a direct descendant of stklos, which is based on TinyCLOS. Each method takes an implicit parameter, next-method, which is bound to a next-method object and can be used to invoke the less specific method in a method chain. It is similar to the super method in class-oriented object systems. However, the next-method object must remember the actual arguments used in the method's invocation. That is, when it is called without arguments, it should pass the original arguments to the next method:

gosh> (define-method foo ((a <number>)) `((number ,a)))
#<generic foo (2)>
gosh> (define-method foo ((a <real>)) (cons `(real ,a) (next-method)))
#<generic foo (2)>
gosh> (foo 3)
((real 3) (number 3))

Here, (next-method) in (foo <real>) passes 3 to the next method, (foo <number>). Note that the next-method object is a first-class object, so for example, it can be saved to a global variable:

gosh> (define nm #f)
nm
gosh> (define-method foo ((a <real>)) (set! nm next-method) nm)
#<generic foo (2)>
gosh> (foo 3)
#<next-method (#<method (foo <number>)>)0 (3)>
gosh> nm
#<next-method (#<method (foo <number>)>)0 (3)>

And invoked later:

gosh> (nm)
((number 3))

This means that a next-method object must be allocated for every invocation of a method. In many cases, though, next-method isn't used at all in the method body, so it's a waste. Can we do something?

An easy path would be to let programmers indicate whether the method want to use next-method or not, probably using a CL-style method qualifier. But it's kind of silly---whether the method uses next-method or not is already shown in the code; why the programmer need to bother to say it again?

Fortunately, during compilation, we track how many times each local variable is referenced. So at some point in the compiler passes, we know whether next-method is used in the method body or not. We can extract that information and save in the method object, and when the method is invoked we check it and avoid next-method creation if possible.

The define-method form is expanded as follows:

(define-method name ((arg spec) ...) <method-body>)

  |
  v

(add-method! name
             (make <method>
               ...
               :specializers (spec ...)
               :body (lambda (next-method arg ...)
                        <method-body>)))

The body of the method is just a plain lambda form; it has nothing special about being a method body. So the compiler can't treat method body differently.

Instead, we introduced a general mechanism to save a list of unused argument in each closure; it is a meta-information and saved along source-code information.

The <method> initializer looks that information to find out whether next-method argument is used. If not, it sets method-leaf flag of the method object.

The method invocation code looks at the flag and omits creation of next-method object.

Let's see how effective this optimization is. This is the benchmark script.

(use gauche.time)

;; method without next-method
(define-method foo (x) #t)

;; method with next-method (for comparison)
(define-method bar (x) (when x (next-method)))

($ time-these/report 10000000
   `((foo . ,(cut foo #f))
     (bar . ,(cut bar #f))))

On 0.9.5:

Benchmark: ran foo, bar, each for 10000000 times.
  foo: 2.021 real, 3.550 cpu (3.470 user + 0.080 sys)@2816901.41/s n=10000000
  bar: 2.099 real, 3.680 cpu (3.570 user + 0.110 sys)@2717391.30/s n=10000000

On HEAD:

Benchmark: ran foo, bar, each for 10000000 times.
  foo: 1.313 real, 1.740 cpu (1.730 user + 0.010 sys)@5747126.44/s n=10000000
  bar: 2.051 real, 3.670 cpu (3.570 user + 0.100 sys)@2724795.64/s n=10000000

In wall-clock time, it's about 35% speedup (actually, averaging multiple runs show it's more like 40%). It's also interesting that CPU time is halved---since GC runs in parallel, it indicates this modification generates a lot less garbage, hence less GC time.

Note: Since 0.9.6's built-in methods will be pre-compiled by the 0.9.5 compiler, you won't see the effect of this optimization on built-in methods in the 0.9.6 release. You'll need to rebuild the source with 0.9.6.

There are still lots of room of optimization in our method dispatch mechanism, and we'll explore it more in the comping releases.

Tags: 0.9.6, next-method

More entries ...