Gauche Devlog

< Another little improvement of online doc | Method call optimization - skipping sort-applicable-methods >

2017/04/22

Method call optimization - avoiding next-method

Gauche's object system is a direct descendant of stklos, which is based on TinyCLOS. Each method takes an implicit parameter, next-method, which is bound to a next-method object and can be used to invoke the less specific method in a method chain. It is similar to the super method in class-oriented object systems. However, the next-method object must remember the actual arguments used in the method's invocation. That is, when it is called without arguments, it should pass the original arguments to the next method:

gosh> (define-method foo ((a <number>)) `((number ,a)))
#<generic foo (2)>
gosh> (define-method foo ((a <real>)) (cons `(real ,a) (next-method)))
#<generic foo (2)>
gosh> (foo 3)
((real 3) (number 3))

Here, (next-method) in (foo <real>) passes 3 to the next method, (foo <number>). Note that the next-method object is a first-class object, so for example, it can be saved to a global variable:

gosh> (define nm #f)
nm
gosh> (define-method foo ((a <real>)) (set! nm next-method) nm)
#<generic foo (2)>
gosh> (foo 3)
#<next-method (#<method (foo <number>)>)0 (3)>
gosh> nm
#<next-method (#<method (foo <number>)>)0 (3)>

And invoked later:

gosh> (nm)
((number 3))

This means that a next-method object must be allocated for every invocation of a method. In many cases, though, next-method isn't used at all in the method body, so it's a waste. Can we do something?

An easy path would be to let programmers indicate whether the method want to use next-method or not, probably using a CL-style method qualifier. But it's kind of silly---whether the method uses next-method or not is already shown in the code; why the programmer need to bother to say it again?

Fortunately, during compilation, we track how many times each local variable is referenced. So at some point in the compiler passes, we know whether next-method is used in the method body or not. We can extract that information and save in the method object, and when the method is invoked we check it and avoid next-method creation if possible.

The define-method form is expanded as follows:

(define-method name ((arg spec) ...) <method-body>)

  |
  v

(add-method! name
             (make <method>
               ...
               :specializers (spec ...)
               :body (lambda (next-method arg ...)
                        <method-body>)))

The body of the method is just a plain lambda form; it has nothing special about being a method body. So the compiler can't treat method body differently.

Instead, we introduced a general mechanism to save a list of unused argument in each closure; it is a meta-information and saved along source-code information.

The <method> initializer looks that information to find out whether next-method argument is used. If not, it sets method-leaf flag of the method object.

The method invocation code looks at the flag and omits creation of next-method object.

Let's see how effective this optimization is. This is the benchmark script.

(use gauche.time)

;; method without next-method
(define-method foo (x) #t)

;; method with next-method (for comparison)
(define-method bar (x) (when x (next-method)))

($ time-these/report 10000000
   `((foo . ,(cut foo #f))
     (bar . ,(cut bar #f))))

On 0.9.5:

Benchmark: ran foo, bar, each for 10000000 times.
  foo: 2.021 real, 3.550 cpu (3.470 user + 0.080 sys)@2816901.41/s n=10000000
  bar: 2.099 real, 3.680 cpu (3.570 user + 0.110 sys)@2717391.30/s n=10000000

On HEAD:

Benchmark: ran foo, bar, each for 10000000 times.
  foo: 1.313 real, 1.740 cpu (1.730 user + 0.010 sys)@5747126.44/s n=10000000
  bar: 2.051 real, 3.670 cpu (3.570 user + 0.100 sys)@2724795.64/s n=10000000

In wall-clock time, it's about 35% speedup (actually, averaging multiple runs show it's more like 40%). It's also interesting that CPU time is halved---since GC runs in parallel, it indicates this modification generates a lot less garbage, hence less GC time.

Note: Since 0.9.6's built-in methods will be pre-compiled by the 0.9.5 compiler, you won't see the effect of this optimization on built-in methods in the 0.9.6 release. You'll need to rebuild the source with 0.9.6.

There are still lots of room of optimization in our method dispatch mechanism, and we'll explore it more in the comping releases.

Tags: 0.9.6, next-method

Post a comment

Name: