Gauche Devlog

< Rounding 1.15 | Generalized setter inlining >

2018/03/04

Procedure inliner improvement

This is about the compiler internal change in 0.9.6. There's no user-visible changes; it might be interesting if you're curious about compiler internals.

* * *

Some of Gauche's primitive procedures are inlined as VM instructions. For example, Gauche's VM has an instruction to reference an element of a vector with immediate offset. If vector-ref's second argument is a constant, the instruction is used:

gosh> (disasm (^x (vector-ref x 0)))
CLOSURE #<closure (#f x)>
=== main_code (name=#f, code=0x20e3160, size=3, const=0 stack=0):
signatureInfo: ((#f x))
     0 LREF0                    ; x
     1 VEC-REFI(0)              ; (vector-ref x 0)
     2 RET 

This tends to be pretty effective in performance.

You can also define a procedure as inlinable, using define-inline. We've been using it to optimize Gauche runtime.

However, up to 0.9.5, inlining is limited only when the inlinable procedure appears at the procedure position of the function application. In the following example, vector-ref wasn't inlined because it does not appear literally at the procedure position:

gosh> (define-inline (foo ref) (^x (ref x 0)))
foo
gosh> (disasm (^x ((foo vector-ref) x)))
CLOSURE #<closure (#f x)>
=== main_code (name=#f, code=0x2228f50, size=9, const=2 stack=14):
signatureInfo: ((#f x))
     0 LREF0-PUSH               ; x
     1 PRE-CALL(1) 7            ; (foo vector-ref)
     3 GREF-PUSH #<identifier user#vector-ref.2238300>; vector-ref
     5 GREF-CALL(1) #<identifier user#foo.2238340>; (foo vector-ref)
     7 TAIL-CALL(1)             ; ((foo vector-ref) x)
     8 RET 

It's been bothering us, since it is natural to expect that the effect of inlining and β-reduction cascades as follows:

((foo vector-ref) v)
 ↓
(((^[ref] (^x (ref x 0))) vector-ref) v)
 ↓
((^x (vector-ref x 0)) v)
 ↓
(vector-ref v 0)

It was the old inline interface that prevented this.

* * *

Gauche's compiler has 5 passes. Pass 1 expands macros and replaces input S expressions into a graph called IForm. Pass 2-4 works on IForm to optimize, and pass 5 walks final IForm to emit VM instructions.

It is pass 1 where vector-ref was expanded. Inlinable procedures have an attached inliner procedure, which takes an input S expression and compile-time environment Cenv, and returns expanded IForm:

    Inliner :: Sexpr, Cenv -> IForm

When the compiler see a form (fn arg ...) and fn has an inliner, the form (fn arg ...) is passed to it and the returned IForm becomes the result of pass1 of the original form.

If fn is defined with define-inline, the inliner procedure is a closure that keeps IForm of the original procedure. If the compiler sees this definition:

    (define-inline (vref v) (vector-ref v 0))

We process (v (vector-ref v 0)) with pass 1 and obtain its IForm. Let's denote it as #<IForm (^v (vector-ref v 0))> for now. The inliner of vref would be something like this:

    (^[sexpr cenv]
      (#Apply #<IForm (^v (vector-ref v 0))>
              (map (cut pass1 <> cenv) (cdr sexpr))))

Here, the pseudo function "#Apply" builds an IForm that applies the first argument on the list of IForms in the second argument. The subsequent optimization passes would eliminate binding of local variable v if possible.

We could've kept the body of vref in S-expression and just expand it much like legacy macro expansion:

    (^[sexpr cenv]
      (pass1 `(apply (^v (vector-ref v 0)) ,@(cdr sexpr)) cenv))

But such way would run into the hygiene problem. We have to rename vector-ref to avoid potential name conflict. Running through pass1 resolves all the identifier references, hence it addresses hygiene issue as well.

The problem is that the inliner's input is Sexpr and output is IForm. that means we can't apply an inliner on the result of another inliner. It was the reason that the foo case above didn't inlined as expected.

There are also other opportunities of inlining during pass 3. Once we do closure optimization and redundant local variable elimination, we might get a procedure call where the procedure has attached inliner. Since we're working on IForm in pass 3, we couldn't apply inliners in such case.

* * *

The solution is to change inliner protocol. Instead of taking input Sexpr, we can make the inliner takes IForms:

   Inliner :: IForm, [IForm] -> IForm

The first IForm is the body of inlinable procedure, and the second list of IForms are the IForm from argument expressions. The information of Cenv is already folded in IForm.

In pass 1, we process the procedure and the argument with pass1 before calling the inliner. In pass 3 we can just call the inliner.

Now the effect of inlining cascades.

;; Returns a procedure that takes the first element of the given sequence,
;; using REF as the accessor.
(define-inline (foo ref)
  (^v (ref v 0)))

;; Returns accessor accoring to the symbol TYPE.
(define-inline (bar type) 
  (cond 
   [(eq? type 'v) vector-ref]
   [(eq? type 'u8) u8vector-ref]))

;; Using foo and bar to access the first element of V, using an accessor
;; designated by TYPE:
(define-inline (baz type v)
  (let ((ref (bar type)))
    ((foo ref) v)))

If type is constant, the cond expression in bar is computed at the compile time, allowing that we inline everything down to a VM instruction:

gosh> (disasm (^v (baz 'v v)))
CLOSURE #<closure (#f v)>
=== main_code (name=#f, code=0x28783e0, size=3, const=0 stack=0):
signatureInfo: ((#f v))
     0 LREF0                    ; v
     1 VEC-REFI(0)              ; (ref v 0)
     2 RET 
gosh> (disasm (^v (baz 'u8 v)))
CLOSURE #<closure (#f v)>
=== main_code (name=#f, code=0x2878340, size=4, const=0 stack=1):
signatureInfo: ((#f v))
     0 LREF0-PUSH               ; v
     1 CONSTI(0) 
     2 UVEC-REF(1)              ; (ref v 0)
     3 RET 

* * *

This change allows us to write abstraction without worring that it taxes performance.

Tags: 0.9.6, Compiler, Inlining

Post a comment

Name: