Gauche Devlog

2013/09/18

Macro system extension

I finally added the syntax-rules extensions (srfi:46) to Gauche, which makes Gauche's hygienic macro system compatible with R7RS (except for a few known bugs).

The current hygienic macro expander is written in C and is an ugly pile of spaghetti. Originally I planned to ditch the legacy code and write an explicit-renaming (ER) macro expander as the new basis of our hygienic macro system, then implement syntax-rules on top of it.

I like ER-macro since it's transparent about what it does for hygiene. It isn't necessarily the easiest one to use; destructuring the input form and then renaming identifiers explicitly would be cumbersome for day-to-day programming. But those things can be easily alleviated by combining other tools. For example, we can just use the util.match matcher to destructure the input form (instead of yet another pattern matcher tied to the macro system).
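
To make the combination concrete, here is a minimal sketch of an ER macro whose input form is destructured with util.match. It assumes a Gauche build that provides er-macro-transformer (at the time of this entry that work lived only on a branch); the macro name swap! and the temporary tmp are made up for illustration.

```scheme
(use util.match)

;; swap! is a hypothetical example macro, not part of Gauche.
(define-syntax swap!
  (er-macro-transformer
   (lambda (form rename compare)
     ;; util.match destructures the macro call (swap! a b).
     (match form
       [(_ a b)
        ;; Every identifier the macro inserts is renamed explicitly,
        ;; so it cannot capture or be captured by user variables.
        (let ([tmp (rename 'tmp)])
          `(,(rename 'let) ([,tmp ,a])
             (,(rename 'set!) ,a ,b)
             (,(rename 'set!) ,b ,tmp)))]))))

(define x 1)
(define y 2)
(swap! x y)   ; afterwards x is 2 and y is 1
```

The renaming is verbose, but nothing is hidden: you can read off exactly which identifiers are protected.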

In fact, in the er-macro branch in the repo I implemented an ER-macro expander to some extent. But it turned out I need some more time to replace the low-level macro layer completely. A major issue is keeping compatibility between ER-macro, which allows raw symbols inserted by the macro expander to capture symbols in macro calls, and the current syntax-rules implementation, which turns all symbols into identifiers. (The same issue is described by mjt here, in Japanese.)

Since I'd like to push out an R7RS-compatible release sooner, I just went into the legacy code and added some more spaghetti to make it work as srfi:46.

* * *

I realized this enhancement makes syntax-rules a lot more useful. I also adapted the define-values form to R7RS, which allows generic formals, as follows:

(define-values (x y . z) (values 1 2 3 4))

z => (3 4)

With R7RS syntax-rules it's not difficult to distinguish proper lists from improper lists (see Gauche:lib/gauche/defvalues.scm).
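
As a rough illustration of that distinction (this is not the code in defvalues.scm), the sketch below uses an R7RS-style pattern with a dotted tail after the ellipsis. The macro name formals-kind is invented for this example.

```scheme
;; formals-kind is a hypothetical helper that classifies a formals spec.
(define-syntax formals-kind
  (syntax-rules ()
    [(_ (x ...))          'proper]      ; (a b c) or ()
    [(_ (x y ... . rest)) 'improper]    ; (a b . c), tried after the proper case
    [(_ rest)             'rest-only])) ; a bare identifier

(formals-kind (a b c))    ; => proper
(formals-kind (a b . c))  ; => improper
(formals-kind a)          ; => rest-only
```

The clause order matters: proper lists are caught by the first clause, so the second clause only ever sees dotted lists.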

Tags: macro, r7rs, srfi-46

2013/08/01

.gaucherc

When gosh is started in the interactive REPL mode, it loads ~/.gaucherc if it exists. I suppose it may be handy if the user needs his own local setup, even though I personally haven't used the rc file yet---I guess it's a sort of traditional Unix culture.

Recently I realized this feature interferes with R7RS mode. The .gaucherc file is loaded into #<module user>, but what's visible from the user module differs greatly when gosh is invoked with the -r7 option. It'll be quite difficult to write a .gaucherc that works in both traditional Gauche mode and R7RS mode.

(Note: I say R7RS mode and Gauche mode, but there aren't really two separate modes, except for the planned reader compatibility modes. You can load an R7RS library from a standard Gauche program and load a Gauche library from a standard R7RS program, whether or not you start gosh with the -r7 option. The -r7 option merely specifies which environment you're in when the interactive REPL starts.)

I considered a few options:

  • If the -r7 option is given, try to load a different rc file, e.g. ~/.gaucherc-r7. This option is less appealing: it scatters more rc files in the home directory. Besides, I expect the things you want to do in an rc file are likely to need Gauche-specific features (e.g. add-load-path), and you can't use those easily from the R7RS environment. You would need to create a separate module, e.g. mysetup.scm, for the setup code, then (import (mysetup)) from .gaucherc-r7.
  • Let the rc file be loaded in a module other than user, say, a gauche.user module. Then you can use Gauche features in .gaucherc, regardless of the -r7 option. This is clean, but adding a new module just for the rc file seems a bit overkill. Besides, it is incompatible with the current version if a user defines something in .gaucherc and expects it to be visible from the user module.
  • Drop .gaucherc support. This is a tempting solution, for it makes things simpler. But who knows? Sometimes this kind of hook comes in handy unexpectedly.

Eventually I settled on a somewhat compromised design.

  • We load .gaucherc into #<module user>, as we have been doing.
  • When gosh is started with the -r7 option, the initial module will be #<module r7rs.user>, not #<module user>.

It looks like a bit of an ad-hoc solution, but let's give it a shot.
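
For reference, a .gaucherc under this design might look something like the sketch below. The directory and the procedure name are hypothetical; the point is only that the file is evaluated in #<module user>, so Gauche-specific forms such as add-load-path are available whether or not -r7 is given.

```scheme
;; ~/.gaucherc (hypothetical contents)
;; Evaluated in #<module user>, so Gauche-specific forms can be used.

;; Hypothetical directory holding personal libraries.
(add-load-path "/home/me/scheme/lib")

;; A personal helper defined in the user module (name made up).
(define (rc-hello)
  (print "gaucherc loaded"))
```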

Tags: gosh, r7rs, gaucherc

2013/07/06

It seems to be working. But typing S-expressions on a small keypad is such a pain.

[image]

Tag: iOS

2013/05/22

R7RS support

We don't have an official announcement yet, but it seems that R7RS is ratified. Yay! Great thanks to the WG members for their long and hard work to realize it.

I couldn't participate in discussions as much as I did for R6RS mainly due to time constraints, but another reason is that I was generally happy about the drafts, unlike what I felt during R6RS development.

I don't hate R6RS; it has some parts I like (e.g. the I/O system) and I expect them to be in R7RS-large. I just think R6RS was too ambitious; it tried so hard to plug all the loopholes that some of its parts were introduced prematurely, IMHO. R7RS-small isn't perfect, but it fixes some of the biggest shortcomings of R5RS and is "good enough" to move on. I believe that, in order to fix the remaining defects, it's better to wait for quasi-standard SRFIs that are adopted by most active implementations. The standard can come later, merely to codify the de facto and proven ways, as R7RS did for some SRFIs.

* * *

The development HEAD of Gauche already has some R7RS support. If you invoke gosh as gosh -r7, it starts the REPL in the R7RS environment. Currently it implicitly imports all the R7RS-small libraries. You can also load files containing a define-library form.
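
For readers who haven't seen one, a file with a define-library form looks roughly like this. The library name (example greet) and the procedure greet are made up for illustration.

```scheme
;; (example greet) is a hypothetical library name.
(define-library (example greet)
  (export greet)
  (import (scheme base))
  (begin
    ;; Build a greeting string for the given name.
    (define (greet name)
      (string-append "Hello, " name "!"))))
```

Once such a file has been loaded, the R7RS REPL should be able to pull it in with (import (example greet)).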

(The -r7 option only sets up the default behavior; it's not that there's a distinct R7RS language mode. You'll be able to use an R7RS library from Gauche code, and import a Gauche library from R7RS code. Aside from the reader mode described below, the difference between R7RS and Gauche is merely namespaces.)

However, it's not quite ready to load portable R7RS libraries yet. The biggest obstacle is the lexical syntax: the \xNN; style escape in strings and symbols is not supported yet, because of a backward compatibility problem. Gauche has been using the \xNN style (two digits fixed, no semicolon terminator). It doesn't generally appear in source code (the Unicode escape, \uNNNN, is preferred), but it appears in data files dumped by write. Changing it would break existing data files, which would be a disaster.

There are also a few minor reader incompatibilities. For example, Gauche treats a single quote as a delimiter, so abc'def is parsed as a symbol abc and a list (quote def). In R7RS, this is a reader error.

My plan is to provide a few reader modes:

  • Legacy Gauche: Completely backward compatible
  • r7rs-compatible: Accepts both formats, preferring r7rs when ambiguous
  • r7rs-strict: Rejects syntax that doesn't comply with r7rs

There are also a small number of unsupported library functions and syntaxes, which I'm implementing gradually in my spare time. See lib/r7rs.scm to check what isn't supported yet.

The high-level macro also needs to be enhanced to comply with R7RS. Internal define-syntax is yet to be supported.

* * *

The R7RS import form works differently from Gauche's import. Gauche's import works purely on in-memory module objects and doesn't involve loading files. R7RS import is rather similar to Gauche's use, which can be explained as require followed by (Gauche's) import.
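
The relationship can be sketched in Gauche code as follows, using util.match as an arbitrary example module. The expansion shown is an approximation of what use does, not its exact definition.

```scheme
;; Gauche's `use` loads the file if needed and imports the module.
(use util.match)

;; Roughly the same thing, spelled out as two separate steps:
(require "util/match")   ; load util/match.scm unless already loaded
(import util.match)      ; Gauche's import: make its exports visible

;; Either way, the module's exported names are now usable.
(define sum (match '(1 2) [(a b) (+ a b)]))   ; sum is 3
```

An R7RS (import ...) covers both steps at once, which is why it sits closer to use than to Gauche's import.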

I pondered a few options for some time: Overload the import form with dual functionality? Change Gauche's import so that it works like R7RS import? Finally I decided to implement completely separate forms. Gauche's import is mostly used in the define-module form, which isn't R7RS, so I expect there won't be much confusion. We can always rename Gauche's import to something like import-module in the future.

Tags: r7rs, import

2013/05/09

Export-time renaming

Recently I implemented the rename feature in the export form. With the import options (see Import options: part one and Import options: part two), this completes the infrastructure to support R[67]RS's modules on top of our module system.

The syntax of export-time renaming is the same as R[67]RS. If you have the following module:

(define-module example1
  (export (rename foo boo))
  (define foo 3))

Then the name foo in the example1 module can be referred to as boo from the modules that import it.

gosh> (import example1)
#<undef>
gosh> boo
3

* * *

In the course of this, I changed the ScmModule structure to manage the exported symbols.

A module is a map from names (symbols) to locations (global locations, or GLOCs). Essentially it's just a hashtable. Visibility (whether a name can be seen from others that import this module) is auxiliary information.

Initially ScmModule had a list of exported symbols. When an identifier was looked up, we scanned the export list of each imported module, and if we found a match, we looked up the hashtable to get the corresponding GLOC.

Obviously this didn't scale when the export list got longer. So several years ago I switched to putting a flag in each binding to indicate whether the symbol was exported. Then I only needed to look up a hashtable and check a flag. But what does the binding mean? Conceptually, it is the association of a name and a GLOC, which is a hash table entry in our implementation. There's no place to add a flag in a hash table entry itself. So I made GLOC carry the exported flag.

There I stepped into a shady area. If a GLOC can be shared among different modules, there might be a case where it's exported in one module but not in another. We didn't have such cases in the good old days, but the import options introduced GLOC-sharing cases. It hasn't been a problem so far, since import options operate only on exported symbols. Yet this kind of hack reeks, and may bite back down the road.

Then it comes to the export renaming. A GLOC can now have multiple names, and we need to choose which name to look up depending on whether we're searching exported symbols or not. A straightforward way is to have two tables, one for exported names, and another for internal names.

And if we have a separate table for exported names, then the mere fact that a name is registered in the table indicates that it is exported; we don't need an extra flag in GLOC. Yay!

So the flag is removed from GLOC, and a new table for exported names is added to ScmModule. The module-exports introspection API still returns a list of exported names for backward compatibility, but now it calculates the result list from the exported name table every time it is called.
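
The two-table arrangement can be modeled in a few lines of Scheme. This is only a toy model for illustration: the real ScmModule is a C structure, a "gloc" here is just a one-element list, and all the toy-* names are invented.

```scheme
;; Toy model: a module is a pair of hash tables.
(define (make-toy-module)
  (cons (make-hash-table 'eq?)     ; internal name -> gloc
        (make-hash-table 'eq?)))   ; exported name -> gloc
(define (internal-table m) (car m))
(define (export-table m)   (cdr m))

;; Defining a name creates a fresh gloc in the internal table.
(define (toy-define! m name value)
  (hash-table-put! (internal-table m) name (list value)))

;; Exporting registers the existing gloc under the external name.
;; Being present in the export table is what "exported" means.
(define (toy-export! m internal-name external-name)
  (hash-table-put! (export-table m) external-name
                   (hash-table-get (internal-table m) internal-name)))

;; Importers only ever consult the export table.
(define (toy-ref-exported m name)
  (let ([gloc (hash-table-get (export-table m) name #f)])
    (if gloc (car gloc) (error "not exported:" name))))

;; The export list is derived from the table on demand.
(define (toy-module-exports m)
  (hash-table-keys (export-table m)))

(define m (make-toy-module))
(toy-define! m 'foo 3)
(toy-define! m 'secret 42)
(toy-export! m 'foo 'boo)
```

Here (toy-ref-exported m 'boo) returns 3, while foo and secret are invisible to importers because they never entered the export table.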

There's one caveat, caused by the openness of Gauche modules.

In Gauche, a programmer can export symbols of existing modules at any time, using with-module. (During development, sometimes I do (with-module foo.internal (export-all)) so that I can call internal procedures of foo.internal easily.) This wasn't a problem before, since exporting an already exported symbol was just a no-op.

With export-time renaming, it's no longer true. An internal symbol foo may have been exported as boo, but now it can be exported as voo. How should this situation be handled?

  • Should the previous export be removed? I decided not. It's costly to search whether the internal symbol foo has been exported under another name. (We could have a reverse map, but it seems like unnecessary complexity.) Plus, any code that counts on the external name boo may break.
  • What if another symbol has already been exported as voo? This would also break code that counts on the previous voo, but the operation may be intentional (e.g. hot-patching). I assume such a case shouldn't happen in normal circumstances, but it may be needed in an emergency. So I issue a warning but allow the meaning of the external name voo to be updated to point to foo.
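
The policy in these two bullets can be sketched as follows. Again this is an illustrative model rather than Gauche's actual code: the export table is a plain hash table, glocs are one-element lists, and export-as! is a made-up name.

```scheme
;; exports: hash table mapping external names to glocs.
(define (export-as! exports external-name gloc)
  (let ([old (hash-table-get exports external-name #f)])
    ;; Rebinding an external name to a different gloc is allowed,
    ;; but it is announced with a warning.
    (when (and old (not (eq? old gloc)))
      (warn "external name ~s now refers to a different binding\n"
            external-name))
    (hash-table-put! exports external-name gloc)))

(define exports  (make-hash-table 'eq?))
(define foo-gloc (list 3))
(define bar-gloc (list 10))

(export-as! exports 'boo foo-gloc)  ; foo exported as boo
(export-as! exports 'voo bar-gloc)  ; bar exported as voo
(export-as! exports 'voo foo-gloc)  ; warns; voo now points to foo
```

Note that the earlier export boo is left alone, so foo ends up reachable under both boo and voo.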

* * *

I have already implemented the R7RS module system on top of this, and I'll describe it in the next entry.

Tags: Module, export
