Gauche Devlog

< Supporting general transformers: Step 1 | Constructing circular structures >


(cdr '#1='#1#) => ?

What should (cdr '#1='#1#) print?

This started from a blog post by zick (btw, zick is the author of Manga guide to Lisp.)

He reported Gauche segfaults after printing large number of single quotes. Here are results of other lisp/schemes.

  • CLISP (2.42) : stack overflow
  • Allegro CL (8.2) : (#1='#1#)
  • SBCL (1.0.11) : (#1='#1#)
  • MzScheme (v372) : #0=((quote . #0#))

I just fixed Gauche at r7170 so it now works just like MzScheme:

gosh> (cdr '#1='#1#)
#0=((quote . #0#))

The bug was trivial in hindsight. The write/ss uses two-pass algorithm: first it scans the structure to find shared objects, then it walks the structure again to print. In the second pass it detects the pattern such as (quote X) and using abbreviated notation:

(let/cc return
  (if (obj-is-shared? obj)
    (if (first-time-to-print? obj)
      (begin ... display #n= ...)
      (begin ... display #n# ... (return))))

  (if (and (pair? obj) (null? (cdr obj)) (eq? (car obj) 'quote))
    (begin (display #\') (write/ss (cadr obj)))
    (begin ...general print... ))

When Gauche tries to write cdr of #1=(quote #1#), the first pass marks the cell whose cdr is '() and whose car is a cell #1=(quote #1#) as a shared object (see the figure below). So it starts writing #0=(, then it recognizes the (quote X) pattern; the bug is that it just recurse to (cadr obj), skipping the check of whether (cdr obj) is shared object or not.


The fix is trivial that just add an extra check in the pattern recognition that cdr of (quote X) is not a shared object, which produces #0=((quote . #0#)).

However, Allegro and SBCL prints the structure differently (aside from their number counting from one): Are they just another notation of the same structure? As shown in the figure in zick's blog (I reproduce the figure below), they should be different.


I'm curious that what algorithm produces Allegro's result. I can't imagine one-pass algorithm, so my guess is that instead of having a separate code path for pass 1 and 2, they may use a single code walker (that handles (quote X) pattern detection) and calls it twice with different modes, once for shared-structure detection and again for output. Then it may fail to recognize that the initial object is the same as cdr of (quote X), because recursing into (cadr obj) will skip that particular cell.

Well, I can check it by looking at SBCL but this is so minor problem so I put it off for someday...

Tag: srfi-38

Post a comment