Procedure inliner improvement
This is about the compiler internal change in 0.9.6. There's no user-visible changes; it might be interesting if you're curious about compiler internals.
＊ ＊ ＊
Some of Gauche's primitive procedures are inlined as VM instructions.
For example, Gauche's VM has an instruction to reference an element
of a vector with immediate offset. If
vector-ref's second argument
is a constant, the instruction is used:
gosh> (disasm (^x (vector-ref x 0))) CLOSURE #<closure (#f x)> === main_code (name=#f, code=0x20e3160, size=3, const=0 stack=0): signatureInfo: ((#f x)) 0 LREF0 ; x 1 VEC-REFI(0) ; (vector-ref x 0) 2 RET
This tends to be pretty effective in performance.
You can also define a procedure as inlinable, using
define-inline. We've been using it
to optimize Gauche runtime.
However, up to 0.9.5, inlining is limited only when the inlinable procedure
appears at the procedure position of the function application.
In the following example,
vector-ref wasn't inlined because
it does not appear literally at the procedure position:
gosh> (define-inline (foo ref) (^x (ref x 0))) foo gosh> (disasm (^x ((foo vector-ref) x))) CLOSURE #<closure (#f x)> === main_code (name=#f, code=0x2228f50, size=9, const=2 stack=14): signatureInfo: ((#f x)) 0 LREF0-PUSH ; x 1 PRE-CALL(1) 7 ; (foo vector-ref) 3 GREF-PUSH #<identifier user#vector-ref.2238300>; vector-ref 5 GREF-CALL(1) #<identifier user#foo.2238340>; (foo vector-ref) 7 TAIL-CALL(1) ; ((foo vector-ref) x) 8 RET
It's been bothering us, since it is natural to expect that the effect of inlining and β-reduction cascades as follows:
((foo vector-ref) v) ↓ (((^[ref] (^x (ref x 0))) vector-ref) v) ↓ ((^x (vector-ref x 0)) v) ↓ (vector-ref v 0)
It was the old inline interface that prevented this.
＊ ＊ ＊
Gauche's compiler has 5 passes. Pass 1 expands macros and replaces input S expressions into a graph called IForm. Pass 2-4 works on IForm to optimize, and pass 5 walks final IForm to emit VM instructions.
It is pass 1 where
vector-ref was expanded. Inlinable
procedures have an attached inliner procedure, which takes
an input S expression and compile-time environment Cenv, and
returns expanded IForm:
Inliner :: Sexpr, Cenv -> IForm
When the compiler see a form
(fn arg ...) and
an inliner, the form
(fn arg ...) is passed to it and
the returned IForm becomes the result of pass1 of the original form.
fn is defined with
define-inline, the inliner procedure
is a closure that keeps IForm of the original procedure. If the compiler
sees this definition:
(define-inline (vref v) (vector-ref v 0))
(v (vector-ref v 0)) with pass 1 and obtain
its IForm. Let's denote it as
#<IForm (^v (vector-ref v 0))> for now.
The inliner of
vref would be something like this:
(^[sexpr cenv] (#Apply #<IForm (^v (vector-ref v 0))> (map (cut pass1 <> cenv) (cdr sexpr))))
Here, the pseudo function "#Apply" builds an IForm that applies the
first argument on the list of IForms in the second argument.
The subsequent optimization passes would eliminate binding of
v if possible.
We could've kept the body of
vref in S-expression and just expand
it much like legacy macro expansion:
(^[sexpr cenv] (pass1 `(apply (^v (vector-ref v 0)) ,@(cdr sexpr)) cenv))
But such way would run into the hygiene problem. We have to
vector-ref to avoid potential name conflict. Running
through pass1 resolves all the identifier references, hence it addresses
hygiene issue as well.
The problem is that the inliner's input is Sexpr and output is IForm.
that means we can't apply an inliner on the result of another inliner.
It was the reason that the
foo case above didn't inlined as expected.
There are also other opportunities of inlining during pass 3. Once we do closure optimization and redundant local variable elimination, we might get a procedure call where the procedure has attached inliner. Since we're working on IForm in pass 3, we couldn't apply inliners in such case.
＊ ＊ ＊
The solution is to change inliner protocol. Instead of taking input Sexpr, we can make the inliner takes IForms:
Inliner :: IForm, [IForm] -> IForm
The first IForm is the body of inlinable procedure, and the second list of IForms are the IForm from argument expressions. The information of Cenv is already folded in IForm.
In pass 1, we process the procedure and the argument with pass1 before calling the inliner. In pass 3 we can just call the inliner.
Now the effect of inlining cascades.
;; Returns a procedure that takes the first element of the given sequence, ;; using REF as the accessor. (define-inline (foo ref) (^v (ref v 0))) ;; Returns accessor accoring to the symbol TYPE. (define-inline (bar type) (cond [(eq? type 'v) vector-ref] [(eq? type 'u8) u8vector-ref])) ;; Using foo and bar to access the first element of V, using an accessor ;; designated by TYPE: (define-inline (baz type v) (let ((ref (bar type))) ((foo ref) v)))
type is constant, the
cond expression in
is computed at the compile time, allowing that we inline everything
down to a VM instruction:
gosh> (disasm (^v (baz 'v v))) CLOSURE #<closure (#f v)> === main_code (name=#f, code=0x28783e0, size=3, const=0 stack=0): signatureInfo: ((#f v)) 0 LREF0 ; v 1 VEC-REFI(0) ; (ref v 0) 2 RET gosh> (disasm (^v (baz 'u8 v))) CLOSURE #<closure (#f v)> === main_code (name=#f, code=0x2878340, size=4, const=0 stack=1): signatureInfo: ((#f v)) 0 LREF0-PUSH ; v 1 CONSTI(0) 2 UVEC-REF(1) ; (ref v 0) 3 RET
＊ ＊ ＊
This change allows us to write abstraction without worring that it taxes performance.