Gauche Devlog


Another little improvement of online doc

Here's another little feature to be in 0.9.6. You can search info entries using regexp.

gosh> ,i #/^symbol-/
symbol->string           Symbols:62
symbol-append            Symbols:96
symbol-hash              Hashing:154
symbol-interned?         Symbols:54
symbol-sans-prefix       Symbols:87

Since the module builds a table that maps entry names to locations in the info documents, it's easy to pick out the entry names that match the given regexp.
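The idea can be sketched roughly as follows. This is not the actual implementation: the association list *entries* and the procedure search-entries are hypothetical names for illustration, and the location string of the make-queue entry is a made-up placeholder.

```scheme
;; Hypothetical stand-in for the table from entry names to locations.
(define *entries*
  '(("symbol->string" . "Symbols:62")
    ("symbol-append"  . "Symbols:96")
    ("symbol-hash"    . "Hashing:154")
    ("make-queue"     . "placeholder-location")))

;; Keep only the entries whose name matches the regexp RX.
(define (search-entries rx entries)
  (filter (^[entry] (rxmatch rx (car entry))) entries))

;; Names of the entries that start with "symbol-".
(map car (search-entries #/^symbol-/ *entries*))
```

With a table like this already in memory, a regexp search is just a filter over the keys.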

This raises an interesting question: We already have apropos which can search symbols. What's the difference?

On systems such as CL or Clojure, where docstrings are tied to symbols, it's reasonable to use apropos for searching, and documentation/doc for retrieving the actual document.

In Gauche, we chose not to adopt the docstring convention; instead, we provide a way to look up separately provided documentation. This allows you to browse the documentation of symbols that are not loaded into the current REPL, which is handy, since you often need to read the doc before finding out which module to import.

We can consider apropos more as an introspection tool into the current running process. With that view, there could be some options to enhance apropos---e.g. showing visibility of each binding from the current module, and if it's visible, why (e.g. this is visible because the current module imports this module which inherits this module, etc.)

Tags: Documentation, repl


Little improvement of online doc

As of 0.9.5, online document (info command on REPL) only shows the specified entry, instead of the entire info page that contains the entry.

gosh> ,info make-queue
 -- Function: make-queue
     Creates and returns an empty simple queue.

It is easier to read than the entire page, but has one drawback---from the entry alone, it is not clear which module I should import. If you're reading it in info, it's easy to look up which section you're in, and the section shows the module name. The online doc is out of such context.

I've been mulling it over and finally decided to go for a kind of brute-force solution: add the module name to every entry. In HEAD (which is to be 0.9.6), it will show as follows:

gosh> ,info make-queue
 -- Function: make-queue
     {data.queue} Creates and returns an empty simple queue.

It may be a bit annoying when you're reading it in the info document, for the module name is repeated in every entry. But it is more frustrating when necessary info isn't shown.

Tags: Documentation, repl


Splitting generators

(This entry is inspired by a private email exchange.)

Generators (ref:gauche.generator - Generators, srfi:121) are a very handy abstraction for constructing a data-flow pipeline. You can put together small parts, each of which modifies data streams locally, to carry out the whole computation. You can also concatenate (ref:gappend, ref:gconcatenate, ref:gflatten) or merge (ref:gmerge) generators.

But how about splitting? If you think in terms of data-flow, you sometimes want to split a stream. Say you have an incoming stream of integers and want two output streams, one with odd integers and another with even integers. Why don't we have gpartition?

Once you start writing it, it becomes immediately apparent that you need unbounded buffering. You split a generator of integers into odds and evens. You try to read from the odds. But what if the input generator yields even numbers consecutively? You have to save them up so that they can be retrieved later when you read from the "even" generator.

It's doable, using a pair of queues for example, but it'll be a bit messy.
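For comparison, here is a rough sketch of what the queue-based version might look like. The name gpartition/queues and the internal helper side are made up for this illustration; the queue operations come from data.queue.

```scheme
(use data.queue)

;; Hypothetical queue-based splitter, shown only for comparison.
(define (gpartition/queues pred input-gen)
  (let ([yes (make-queue)]    ; items satisfying pred, read ahead by the other side
        [no  (make-queue)])   ; items not satisfying pred, read ahead by the other side
    ;; Build a generator that serves items for which MINE? holds.
    ;; Items belonging to the other side are stashed in THEIRS.
    (define (side mine theirs mine?)
      (^[]
        (let loop ()
          (if (not (queue-empty? mine))
            (dequeue! mine)               ; serve buffered items first
            (let1 v (input-gen)
              (cond [(eof-object? v) v]   ; input exhausted
                    [(mine? v) v]
                    [else (enqueue! theirs v) (loop)]))))))
    (values (side yes no pred)
            (side no yes (complement pred)))))
```

Each output generator has to know about both queues and the shared input, which is the messiness mentioned above.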

In fact, there's a much simpler solution.

(use gauche.lazy)
(use gauche.generator)

(define (gpartition pred input-gen)
  (let1 input-lseq (generator->lseq input-gen)
    (values (list->generator (lfilter pred input-lseq))
            (list->generator (lfilter (complement pred) input-lseq)))))

That is, convert the input generator to a lazy sequence (lseq), create two lazy sequences from it, and convert them back to generators. The buffering is implicitly handled as the realized portion of the lazy sequence.

Let's split stream of integers into odds and evens:

gosh> (define-values (o e) (gpartition odd? (grange 0)))
gosh> (o)
1
gosh> (o)
3
gosh> (o)
5
gosh> (o)
7
gosh> (o)
9
gosh> (e)
0
gosh> (e)
2
gosh> (o)
11

However, if you do this, you might as well build the whole network using lseqs, without bothering to convert generators and lseqs back and forth.
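As a small sketch of the lseq-only style, assuming the rest of the pipeline consumes lazy sequences directly (split-lseq is a hypothetical name, not a library procedure):

```scheme
(use gauche.lazy)

;; Split one lazy sequence into two. Both results share the input,
;; so elements realized through one side are not recomputed for the other.
(define (split-lseq pred lseq)
  (values (lfilter pred lseq)
          (lfilter (complement pred) lseq)))

(define-values (odds evens) (split-lseq odd? (lrange 0)))

(take odds 3)    ; => (1 3 5)
(take evens 3)   ; => (0 2 4)
```

Since lseqs look like ordinary lists to the consumer, the usual list procedures such as take work on them without conversion.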

It is somewhat unfortunate that you have to choose. In languages like Haskell, there's no question that you'll go with lazy sequences from the start. In Gauche, and in Scheme in general, lazy streams incur a certain overhead compared to eager solutions, and you need to speculate, before starting to write code, which way would lead to a good balance between code clarity and performance.

Fully lazy streams (like srfi:40 and srfi:41) need to allocate a thunk for each element, unless the implementation has special optimizations. Generators can eliminate per-item allocation altogether. Lightweight lseqs (such as Gauche's ref:Lazy sequences and srfi:127) can be a good compromise, for they require allocating just one pair per item, and allocation of pairs is likely to be optimized in a wider range of implementations. At this moment, I'd suggest going with lseqs as a first choice, and switching to a full-generator design if you can't afford allocations at all.

Btw, with srfi-127, gpartition can be written as follows:

(use srfi-127)

(define (gpartition pred input-gen)
  (let1 input-lseq (generator->lseq input-gen)
    (values (lseq->generator (lseq-filter pred input-lseq))
            (lseq->generator (lseq-remove pred input-lseq)))))

Gauche has srfi-127 in the development HEAD.

Tags: generator, lseq, srfi-121, srfi-127


0.9.5, and what's next

Released 0.9.5, finally. Every release I say I'll make the next release quicker, but it slips every time, so I won't say it this time. (I did have a hard time putting the release notes together, so I do hope for shorter release notes next time.)

There are a few things I marked "after 0.9.5", so I guess I'll start with them. Here are the top priority items:

  • Finishing up the line-editing feature: It's "experimental" in 0.9.5, with a few known issues. Let's make it official. A completion feature would also be nice to have.
  • Rewriting internal legacy macros with ER-macros. I can only do this after 0.9.5, since the source must be compilable by the latest release. I'm also hoping to rewrite the syntax-rules expander in Scheme.
  • I'd like to look into better instruments for examining what's going on in the running program---e.g. a debugger and a profiler. The advantage of a dynamically typed language is that you can examine and modify the running program without stopping it. We need to take maximum advantage of it.
  • A better way to organize and maintain library packages written by others---a central site to register and track them, I'm thinking.

Other thoughts, for the longer term, but before 1.0:

  • Performance - I haven't visited this area for a while, but the technology in this field is advancing, so it might be a good time to revisit it once more. The current VM instructions were designed when ScmWord was 32 bits, and they're not likely to be optimal for 64-bit architectures; especially, lots of instruction space is wasted. I might redesign the VM instructions, with the possibility of adopting JIT.
  • Deployment - there are several experimental features---such as precompiling Gauche program, or statically linking Gauche runtime---but there's no standard workflow yet.

Tag: 0.9.5


Better error reporting during loading/compiling

As the release of 0.9.5 nears, we'd like to post some miscellaneous topics about the features introduced in 0.9.5 so that they can be referred back to.

If you've been using HEAD, you might have noticed a slightly different error message when you hit something wrong during loading or compiling. This is what it looks like (I added a random thingy in rfc.uri just to trigger an error):

gosh> (use rfc.http)
*** ERROR: unbound variable: error!
    While loading "../lib/rfc/uri.scm" at line 344
    While compiling "../lib/rfc/http.scm" at line 46: (define-module rfc.http (use srfi-11) (use srfi-13) (use rfc.822) (use rfc.uri) (use rfc.base64) (use ...
    While loading "../lib/rfc/http.scm" at line 76
    While compiling "(standard input)" at line 1: (use rfc.http)
Stack Trace:
  0  (eval expr env)
        at "../lib/gauche/interactive.scm":282

It shows that Gauche evaluated the form typed into the REPL, (use rfc.http), which caused ../lib/rfc/http.scm to be loaded, which triggered compilation of (define-module ...), which in turn loaded ../lib/rfc/uri.scm, which had a problem.

We implement this feature using compound conditions.

  1. load-from-port and compile catch an error, and re-raise a compound condition consisting of the original condition plus a mixin condition that holds the context of loading or compilation.
  2. The mixin condition, <load-condition-mixin> or <compile-error-mixin>, inherits <mixin-condition>.
  3. The standard error reporting procedure, report-error, prints the information of the thrown condition before the stack trace. If the thrown condition consists of a main condition (e.g. <error>) and several mixin conditions, it first shows the main condition, then lists the mixin condition info using the method report-mixin-condition. In the above example, the ERROR: line is the main condition, and each While ... line is generated by the report-mixin-condition method.

This mechanism is open to the user---if you want to add some context information to the error, all you need to do is:

  1. Define a condition as a subclass of <mixin-condition>.
  2. Capture the error while you're doing stuff and re-raise a condition with additional mixin condition.
    (guard (e [(condition? e)
               (raise ($ make-compound-condition e
                         $ make <my-mixin-condition> :key val ...))])
      body ...)
  3. Define report-mixin-condition specialized to your mixin condition.
    (define-method report-mixin-condition ((e <my-mixin-condition>) port)
      ;; write the context information held in e to port
      ...)

You can see how the <load-condition-mixin> and <compile-error-mixin> conditions are handled in src/libomega.scm.
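Putting the three steps together, a minimal sketch might look like the following. The condition type <job-context-mixin>, its accessor job-context-name, the wrapper call-with-job-context, and the message text are all hypothetical names chosen for this illustration; the sketch also assumes that a condition type defined with define-condition-type on top of <mixin-condition> is handled the same way as the built-in mixins.

```scheme
;; Step 1: a mixin condition that carries the name of the job being run.
(define-condition-type <job-context-mixin> <mixin-condition>
  job-context-mixin?
  (job-name job-context-name))

;; Step 2: run THUNK; if it raises a condition, re-raise it combined
;; with a mixin that records which job was running.
(define (call-with-job-context name thunk)
  (guard (e [(condition? e)
             (raise (make-compound-condition
                     e
                     (make-condition <job-context-mixin> 'job-name name)))])
    (thunk)))

;; Step 3: tell the error reporter how to print the extra context.
(define-method report-mixin-condition ((c <job-context-mixin>) port)
  (format port "    While running job ~s\n" (job-context-name c)))
```

With this in place, an error raised inside (call-with-job-context "nightly" (^[] (error "boom"))) that reaches the REPL should be reported with an additional "While running job" line after the main error message.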

Tags: Condition, 0.9.5
