Gauche Devlog

2016/04/03

Did you mean...?

Recently I improved REPL info feature (browsing document on REPL) in a few ways. If you use Emacs, it is more convenient to jump to info directly; however, having document available on REPL is sometimes handy, for example, when you start gosh on ssh prompt to do some quick chores.

To view online doc on REPL, use info function, or toplevel ,info command:

gosh> ,info assoc
 -- Function: assq obj list
 -- Function: assv obj list
 -- Function: assoc obj list :optional key=
     [R7RS][SRFI-1] Each element in LIST must be a pair.  These
     procedures search a pair whose car matches OBJ (in the sense of
     'eq?' for 'assq', 'eqv?' for 'assv', and 'equal?' for 'assoc') from
     left to right, and return the leftmost matched pair if any.  If no
     pair matches, these return '#f'.

     If the optional argument of 'ascoc' is given, it is called instead
     of 'equal?' to check the equivalence of OBJ and each key.

Here are the new features:

  • You can now search not only function and syntax names, but also module name and class names:
gosh> ,info data.imap
 -- Module: data.imap
     This module provides a immutable data structure with O(log n)
     access and update operations (here, update means to return a new
     structure with requested changes).  The current implementation is
     based on the functional red-black tree.

gosh> ,info <tree-map>
 -- Builtin Class: <tree-map>
     Tree map class.  Tree maps are a data structure that maps key
     objects to value objects.  It's like hash tables except tree maps
     uses balanced tree internally.  Insertion and lookup is O(log n).

     Unlike hashtables, a tree map keeps the order of the keys, so it is
     easy to traverse entries in the order of keys, to find
     minimum/maximum keys, or to find a key closest to the given value.

     The '<tree-map>' class inherits '<sequence>' and
     '<ordered-dictionary>'.
  • If there are more than one entry with the same name, the system prompts you to choose:
gosh> ,info string-map
There are multiple entries for string-map:
 1. "R7RS base library"
 2. "SRFI-13 String mapping"
Select number, or q to cancel [1]: 
 -- Function: string-map proc str ...
 -- Function: string-for-each proc str ...
     [R7RS] These take different arguments from 'string-map' and
     'string-for-each' in SRFI-13 (*note SRFI-13 String mapping::), so
     provided only in 'scheme.base' module to avoid confusion.

  • If there are no entries for the given word, it suggests similar words if there's any:
gosh> ,info vectre
There is no entry for vectre.
Did you mean:
  vector=
  vector?
  vector

I hope these changes would make REPL more pleasant to use. BTW, is ,info intuitive enough? I'm also tempted to add alias ,doc, especially for clojure users.

Oh, by the way, when you're on Emacs, this little elisp code allows you to jump to info page with a few keystrokes on a word...

(defun gauche-info-index (topic)
  (interactive
   (list (read-string
          (concat "Gauche help topic : ")
          (current-word))))
  (switch-to-buffer-other-window (get-buffer-create "*info*"))
  (info "/usr/share/info/gauche-refe.info.gz")
  (Info-index topic))

(define-key global-map "\C-xH" 'gauche-info-index)

Tags: info, repl

2016/02/27

Comparators

A new comparator srfi, srfi:128, was finalized a couple of weeks ago. It is a rewrite of srfi:114, with simpler interface that appears to work with other parts of Scheme better (e.g. the comparison procedure of srfi-114, which returns -1, 0 or 1, doesn't seem Scheme-y, even though such interface is seen in other languages.)

I like srfi-128 better and I would've adopted it natively if Gauche hadn't already adopted srfi-114. At this moment, however, srfi-114 comparator is built-in and tightly coupled with Gauche's native datatypes such as built-in treemaps. Also, Gauche's internal compare API has been returning -1/0/1 scheme, so it goes better with srfi-114.

On the other hand, we expect future portable libraries will use srfi-128, and we want to encourage Gauche libraries to do so. We don't want to penalize code using srfi-128 with the overhead of adaptation layer.

So we chose to support both srfi-114 and srfi-128 natively. Srfi-114 comparator can emulate srfi-128's, and vice versa. Speed-sensitive internal code checks which flavor the passed comparator is, and uses optimized path. Most Scheme-level code would use high-level utilities such as =? and <? anyway, which hides the difference under the hood.

The only conflict in two srfis is the constructor, make-comparator. We already extended srfi-114 make-comparator to take the optional name argument, so it'll be tricky to distinguish two flavors with extra arguments. Besides, we do want to encourage new code to use srfi-128; so we decided to break the backward compatibility.

Gauche's make-comparator will be an extension of srfi-128, with optional name argument. We provide srfi-114 comparator under a different name make-comparator/compare. (The name could be confusing, but it reflects the compare generic function which is compatible to srfi-114's comparison procedure---thus, make-comparator with compare-like procedure.) If you import srfi-114, it imports make-comparator/compare as make-comparator and shadows the built-in one.

Gauche have supported subset of srfi-114 as built-in. Some of them are the same in srfi-128, some have different names, and some are newly introduced in srfi-128. Here's the correspondence.

srfi-114 srfi-128 notes
make-comparator *1
make-comparator
comparator? comparator?
comparator-comparison-procedure? comparator-ordered?
comparator-hash-function? comparator-hashable?
comparator-type-test-procedure comparator-type-test-predicate
comparator-equality-predicate comparator-equality-predicate
comparator-comparison-procedure *2
comparator-ordering-predicate *3
comparator-hash-function comparator-hash-function
comparator-test-type comparator-test-type
comparator-check-type comparator-check-type
comparator-hash comparator-hash
comparator-equal?
comparator-compare *4
default-comparator
make-default-comparator
default-hash *5
  • *1 Provided as make-comparator/compare.
  • *2 If srfi-128 comparator is given, returns a synthesized procedure that uses equality and ordering predicates.
  • *3 If srfi-114 comparator is given, returns a synthesized procedure that uses comparison procedure.
  • *4 If srfi-128 comparator is given, use its equality and ordering predicates.
  • *5 Same as Gauche's hash.

The change is currently undergone. We expect this to be the last major change before releasing 0.9.5.

Tags: Comparator, srfi-114, srfi-128

2016/01/29

Running code once and only once

It occasionally pops up---you have a piece of code you need to run once, but only once, even requested from multiple threads simultaneously. It's not a difficult problem, but can be boring enough to be worth to have a library procedure on some languages.

Every time I encounter an instance of such case it occurs to me to add a library procedure, then I think again to drop the idea, for it's just as trivial as the following in Scheme:

(define (run-once thunk)
  (let1 v (delay (thunk))
    (^[] (force v))))

(It may depend on implementations, but at least on Gauche, force is thread-safe and the above just works in multi-threaded environment.)

Do you think it is worth to have it in a library?

Tags: delay, multithread

2016/01/28

Automatic port selection for httpd

When you run a small http server temporarily, sometimes you want the server to pick an unused port number automatically, instead of your specifying one. For example, a test program may need to spawn a server, and hardcoding the port number will prevent you from running more than one instance of the test program simultaneously.

Giving 0 as the port number to make-server-socket or make-server-sockets let the system choose a port number for you. You need to know which port number is picked. With Gauche-makiki, you can use :startup-callback argument to examine server sockets to find out the port number. The following snippet shows echoing the port number to stdout after starting the server:

Note that server may open multiple sockets (e.g. one for ipv4, one for ipv6) so the callback receives a list of sockets. Their port numbers should be the same, though.

With this setup, you can spawn a server in a subprocess and talk to it, for example:

Tags: makiki, Tips

2016/01/02

Hygienic macro intro

(I just wrote the introduction of Macro chapter of Gauche manual. It may be useful for general introduction to the topic, so I put it here, too. For other Scheme implementation users: Keep in mind that ^ is lambda and (^x ...) is (lambda (x) ...) in Gauche.)


Lisp macro is a programmatic transformation of source code. A macro transformer is a procedure that takes a subtree of source code, and returns a reconstructed tree of source code.

The traditional Lisp macros take the input source code as an S-expression, and returns the output as another S-expression. Gauche supports that type of macro, too, with define-macro form. Here's the simple definition of when with the traditional macro.

(define-macro (when test . body)
  `(if ,test (begin ,@body)))

For example, if the macro is used as (when (zero? x) (print "zero") 'zero), the above macro transformer rewrites it to (if (zero? x) (begin (print "zero") 'zero)). So far, so good.

But what if the when macro is used in an environment where the names begin or if is bound?

(let ([begin list])
  (when (zero? x) (print "zero") 'zero))

The expanded result would be as follows:

(let ([begin list])
  (if (zero? x) (begin (print "zero") 'zero)))

This obviously won't work as the macro writer intended.

This is a form of variable capture. Note that, when Lisp people talk about variable capture of macros, it often means another form of capture, where the temporary variables inserted by a macro would unintentionally capture the variables passed to the macro. That kind of variable capture can be avoided easily by naming the temporary variables something that never conflict, using gensym.

On the other hand, the kind of variable capture in the above example can't be avoided by gensym, because (let ([begin list]) ...) part isn't under macro writer's control. As a macro writer, you can do nothing to prevent the conflict, just hoping the macro user won't do such a thing. Sure, rebinding begin is a crazy idea that nobody perhaps wants to do, but it can happen on any global variable, even the ones you define for your library.

Various Lisp dialects have tried to address this issue in different ways. Common Lisp somewhat relies on the common sense of the programmer---you can use separate packages to reduce the chance of accidental conflict but can't make the chance zero. (The Common Lisp spec says it is undefined if you locally rebind names of CL standard symbols; but it doesn't prevent you from locally rebinding symbols that are provided by user libraries.)

Clojure introduced a way to directly refer to the toplevel variables by a namespace prefix, so it can bypass whatever local bindings of the same name (also, it has a sophisticated quasiquote form that automatically renames free variables to refer to the toplevel ones). It works, as far as there are no local macros. With local macros, you need a way to distinguish different local bindings of the same name, as we see in the later examples. Clojure's way can only distinguish between local and toplevel bindings. It's ok for Clojure which doesn't have local macros, but in Scheme, we prefer uniform and orthogonal axioms---if functions can be defined locally with lexical scope, why not macros?

Let's look at the local macro with lexical scope. For the sake of explanation, suppose we have hypothetical local macro binding form, let-macro, that binds a local identifiers to a macro transformer. (We don't actually have let-macro; what we have is let-syntax and letrec-syntax, which have slightly different way to call macro transformers. But here let-macro may be easier to understand as it is similar to define-macro.)

(let ([f (^x (* x x))])
  (let-macro ([m (^[expr1 expr2] `(+ (f ,expr1) (f ,expr2)))])
    (let ([f (^x (+ x x))])
      (m 3 4))))    ; [1]

The local identifier m is bound to a macro transformer that takes two expressions, and returns a form. So, the (m 3 4) form [1] would be expanded into (+ (f 3) (f 4)). Let's rewrite the above expression with the expanded form. (After expansion, we no longer need let-macro form, so we don't include it.)

(let ([f (^x (* x x))])
  (let ([f (^x (+ x x))])
    (+ (f 3) (f 4))))  ; [2]

Now, the question. Which binding f in the expanded form [2] should refer? If we literally interpret the expansion, it would refer to the inner binding (^x (+ x x)). However, Scheme uniformly adopts lexical scoping---if the binding of m were ordinary let, the f in it would have referred to the outer binding (^x (* x x)), no matter where m is actually used.

In order to keep the consistency, we need some way to mark the names inserted by the macro transformer m---which are f and +---so that we can distinguish two f's (we can also mark + as free, which would refer to the toplevel binding.)

For example, if we would rewrite the entire form and renames corresponding local identifiers as follows:

(let ([f_1 (^x (* x x))])
  (let-macro ([m (^[expr1 expr2] `(+ (f_1 ,expr1) (f_1 ,expr2)))])
    (let ([f_2 (^x (+ x x))])
      (m 3 4))))

Then the naive expansion would correctly preserve scopes; that is, expansion of m refers f_1, which wouldn't conflict with inner name f_2:

(let ([f_1 (^x (* x x))])
  (let ([f_2 (^x (+ x x))])
    (+ (f_1 3) (f_1 4))))

(You may notice that this is similar to lambda calculus treating lexical bindings with higher order functions.)

The above example deal with avoiding f referred from the macro definition (which is, in fact, f_1) from being shadowed by the binding of f at the macro use (which is f_2).

Another type of variable capture (the one most often talked about, and can be avoided by gensym) is that a variable in macro use site is shadowed by the binding introduced by a macro definition. We can apply the same renaming strategy to avoid that type of capture, too. Let's see the following example:

(let ([f (^x (* x x))])
  (let-macro ([m (^[expr1] `(let ([f (^x (+ x x))]) (f ,expr1)))])
    (m (f 3))))

The local macro inserts binding of f into the expansion. The macro use (m (f 3)) also contains a reference to f, which should be the outer f, since the macro use is lexically outside of the let inserted by the macro.

We could rename f's according to its lexical scope:

(let ([f_1 (^x (* x x))])
  (let-macro ([m (^[expr1] `(let ([f_2 (^x (+ x x))]) (f_2 ,expr1)))])
    (m (f_1 3))))

Then expansion unambiguously distinguish two f's.

(let ([f_1 (^x (* x x))])
  (let ([f_2 (^x (+ x x))])
    (f_2 (f_1 3))))

This is, in principle, what hygienic macro is about (well, almost). In reality, we don't rename everything in batch. One caveat is in the latter example---we statically renamed f to f_2, but it is possible that the macro recursively calls itself, and we have to distinguish f's introduced in every individual expansion of m. So macro expansion and renaming should work together.

There are multiple strategies to implement it, and the Scheme standard doesn't want to bind implementations to single specific strategy. The standard only states the properties the macro system should satisfy, in two concise sentences:

If a macro transformer inserts a binding for an identifier (variable or keyword), the identifier will in effect be renamed throughout its scope to avoid conflicts with other identifiers.

If a macro transformer inserts a free reference to an identifier, the reference refers to the binding that was visible where the transformer was specified, regardless of any local bindings that surround the use of the macro.

It may not be obvious how to realize those properties, and the existing hygienic macro mechanisms (e.g. syntax-rules) hide the how part. That's probably one of the reason some people feel hygienic macros are difficult to grasp. It's like continuations---its description is concise but at first you have no idea how it works; then, through experience, you become familiarized yourself to it, and then you reread the original description and understand it says exactly what it is.

This introduction may not answer how the hygienic macro realizes those properties, but I hope it showed what it does and why it is needed. In the following chapters we introduce a couple of hygienic macro mechanisms Gauche supports, with examples, so that you can familiarize yourself to the concept.

Tag: macro

More entries ...