Gauche Devlog

2018/12/24

Unbalanced unquotes undermine utility

tl;dr - Our design choice of quasirename having implicit quasiquoting was wrong, and we'll fix it.

* * *

In Common Lisp, backquotes and commas are handled by the reader. This means (1) every comma (and comma-atmark) must have a corresponding backquote that lexically surrounds it, and (2) once an S-expression is read, you never see a trace of the commas and backquotes.

Scheme took a different approach. Quasiquotes and unquotes are just shorthand for the forms (quasiquote form) and (unquote form). Their interpretation is left to the semantics of these forms.
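To see this for yourself, you can quote a quasiquoted form and take it apart as ordinary list data. This is a minimal sketch that should behave the same in any R7RS Scheme; the name datum is ours, chosen only for illustration.

```scheme
;; The reader turns `x, ,x and ,@x into plain lists headed by
;; quasiquote, unquote and unquote-splicing.  Quoting the whole
;; thing lets us inspect that structure without evaluating it.
(define datum '`(a ,b ,@c))

(car datum)            ; => quasiquote
(cadr (cadr datum))    ; => (unquote b)
(caddr (cadr datum))   ; => (unquote-splicing c)
```

Nothing here requires the unquotes to be meaningful at read time; they are just data until some form decides to interpret them.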

This opens tempting possibilities to expand usage of these forms. SCSH's extended process form ( https://scsh.net/docu/html/man-Z-H-3.html ) is one example. Its redirection forms are implicitly quasiquoted, and unquote forms in it are evaluated without a corresponding quasiquote.

(define *outfile* "output.txt")

;; Redirect output of my-program to the file named by the value of *outfile*
(run (my-program) (> ,*outfile*))

* * *

Gauche adopted explicit-renaming (ER) macros for the lower hygienic macro layer (ref:er-macro-transformer). While syntax-case provides pattern matching and syntactic wrapping all in one set, ER-macro provides a minimal mechanism that hides the underlying macro expansion system. In practice syntax-case is handy, but its features are inseparably tied to it. For example, you can't simply use its pattern matcher as a runtime library independent of the macro system. We prefer basic tools, each of which does one thing well, and building complicated systems by combining those orthogonal tools.

For the pattern matcher, we already have the mighty-powerful match (ref:util.match). On the other hand, constructing macro output is rather cumbersome with bare ER-macro, as we have to apply the rename procedure to every identifier we want to protect from name conflicts:

(define-syntax when-not
  (er-macro-transformer
    (^[form rename id=?]
      (match form
        [(_ test expr1 expr ...)
         `(,(rename 'if) (,(rename 'not) ,test)
            (,(rename 'begin) ,expr1 ,@expr))]
        [_ (error "malformed when-not:" form)]))))

So we introduced quasirename (ref:quasirename), which works like quasiquote but with renaming:

(define-syntax when-not
  (er-macro-transformer
    (^[form rename id=?]
      (match form
        [(_ test expr1 expr ...)
         (quasirename rename
           (if (not ,test) (begin ,expr1 ,@expr)))]
        [_ (error "malformed when-not:" form)]))))

Quasirename employs an implicit quasiquote. It replaces every identifier in the form with the result of applying the rename procedure to it, except for the unquoted (and unquote-spliced) portions, which expand to the value of the expression as is. The code can be written almost identically to a legacy macro; you just replace quasiquote with quasirename (and provide the rename procedure).

We were quite happy with it and started rewriting lots of macros using it. Then we realized its shortcomings.

* * *

When quasiquotes are nested, the corresponding unquotes should also be nested. The outermost quasiquote corresponds to the innermost unquote. This is implemented simply by keeping track of the nesting level (when you see a quasiquote, increment the level; when you see an unquote, decrement it; evaluate the unquotes that bring the level to zero and keep the rest as they are).

For example, suppose you have the following nested quasiquote forms:

(let ((a 'outer))
  `(let ((a 'inner))
     `(list ,',a ,,'a)))

When you unwrap the outer quasiquote form, you get:

   (let ((a 'inner))
     `(list ,'outer ,a))   

And when you unwrap the inner quasiquote you get:

      (list outer inner) 

The nested unquotes may look scary but the rule is simple.

  • Count the level of unquote from left to right.
  • If you want to evaluate the form at a particular level (except the innermost level), put ', (quote - unquote).
  • Or, if you want to keep the form untouched at that level and leave it to be evaluated at a higher level, put ,' (unquote - quote).

To make this mechanism work, however, every quasiquote form must know the levels of the unquotes in it. Unquote forms that don't have corresponding quasiquotes would trip up the enclosing quasiquote forms.
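The level counting can be sketched as a small walker over list data. This is only an illustration under simplifying assumptions: it handles lists and unquote but not unquote-splicing or vectors, and qq-walk and tiny-eval are hypothetical names we made up here, not Gauche procedures. It is not how Gauche's quasiquote is actually implemented.

```scheme
;; Walk TMPL, the body of a quasiquote.  LEVEL is the number of
;; quasiquotes currently surrounding TMPL (1 for the outermost body).
(define (qq-walk tmpl level eval-proc)
  (define (tagged? tag)
    (and (pair? tmpl)
         (eq? (car tmpl) tag)
         (pair? (cdr tmpl))
         (null? (cddr tmpl))))
  (cond
   ((tagged? 'quasiquote)               ; one level deeper
    (list 'quasiquote (qq-walk (cadr tmpl) (+ level 1) eval-proc)))
   ((tagged? 'unquote)
    (if (= level 1)
        (eval-proc (cadr tmpl))         ; level drops to zero: evaluate
        (list 'unquote                  ; otherwise keep the unquote
              (qq-walk (cadr tmpl) (- level 1) eval-proc))))
   ((pair? tmpl)
    (cons (qq-walk (car tmpl) level eval-proc)
          (qq-walk (cdr tmpl) level eval-proc)))
   (else tmpl)))

;; A toy evaluator, just enough for the examples: symbols are looked
;; up in ENV (an alist) and (quote x) yields x.
(define (tiny-eval env)
  (lambda (expr)
    (cond
     ((symbol? expr) (cdr (assq expr env)))
     ((and (pair? expr) (eq? (car expr) 'quote)) (cadr expr))
     (else (error "tiny-eval: unsupported expression:" expr)))))

(define ev (tiny-eval '((a . outer))))

;; Balanced nesting: the body of the outer quasiquote from the
;; example above, walked with a bound to outer.
(define balanced
  (qq-walk '(let ((a 'inner)) `(list ,',a ,,'a)) 1 ev))
;; balanced => (let ((a 'inner)) `(list ,'outer ,a))

;; Unbalanced: the unquote was meant for the implicitly quasiquoted
;; quasirename, but the walker has no way of knowing that.
(define unbalanced
  (qq-walk '(quasirename r (list ,a)) 1 ev))
;; unbalanced => (quasirename r (list outer))
```

In the second case the enclosing quasiquote consumes the unquote before quasirename ever sees it, because nothing in the data tells the walker that quasirename opens a new level.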

Using an implicit quasiquote in quasirename makes it very difficult, if not impossible, to write a quasiquote form that yields a quasirename form, or other combinations of nesting.

* * *

So, what shall we do?

One solution is to recognize quasirename as a built-in syntax just like quasiquote; let each one know about the other, and count nestings properly.

However, that would make quasirename an inherently unportable, Gauche-specific syntax. Furthermore, what if we want to add more implicitly quasiquoted forms in the future? Do we want to change every quasi-something form expander each time?

Another solution is to let quasirename require its second argument to be quasiquoted. That is, this should be the proper form:

(quasirename r
  `(form ...))

and the argument without quasiquote should be invalid.
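As a sketch of what that looks like in practice, here is the earlier when-not macro with the quasiquote written explicitly. It assumes a Gauche version whose quasirename accepts a quasiquoted argument (this entry is tagged 0.9.8), and it loads util.match for match.

```scheme
(use util.match)

(define-syntax when-not
  (er-macro-transformer
    (^[form rename id=?]
      (match form
        [(_ test expr1 expr ...)
         ;; The template is now an ordinary quasiquoted form, so its
         ;; unquotes are balanced and visible to any enclosing quasiquote.
         (quasirename rename
           `(if (not ,test) (begin ,expr1 ,@expr)))]
        [_ (error "malformed when-not:" form)]))))

(when-not (= 1 2) 'a 'b)   ; => b
```

The only change from the implicit version is the backquote in front of the template.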

For backward compatibility, we could keep accepting a form without the quasiquote for a while. The only incompatible case is existing code that intends to yield a quasiquoted form. Such code should be rewritten to use double quasiquotes:

(quasirename r
  ``(form ...))

Tags: 0.9.8, quasiquote, quasirename, macro
