2012/03/18
Consumers for generators
I've been trying out generators for
day-to-day scripts and have repeatedly seen a common pattern---
to repeat reading from a generator and process values until it
is exhausted.
I wrote a bunch of utilities
in gauche.generator
but they are mostly a sort of filters,
taking generators and yielding another generator. What I needed
is a kind of "terminator" of the chain of generators, where
the generated values were actually consumed.
Since a generator is just a thunk, I can use already existing
tools---actually, port-fold
, port-map
and port-for-each
expect thunks (in spite of their names), and we can pass
a generator directly.
(port-for-each (^v ...) generator-expr)
Probably I should rename them to indicate they are actually about generators, not ports.
(Another option is to convert a generator to a lazy sequence
by generator->lseq
, so that we can use all list-handling
utilities without wasting space for intermediate lists. But it
still has some overhead (not only CPU-wise, but mentally for
programmers who read and write the code). So I think it makes
sense to have specialized utilities for generators.)
For lists, we have for-each
. But sometimes it is handy
and more readable to use dolist
macro, taken from CL.
We can have its equivalent for generators:
(do-generator [v generator-expr] ... do something with v ...)
What if we want to run across multiple generators, or
even to mix generators and other sequences? Instead of
inventing more macros, we can extend the existing srfi-42
mechanism to handle generators.
(do-ec (:parallel (: x '(a b c)) (: y (gdrop-while skip-preamble (file->generator "foo")))) (do-something x y))
For the latter, we added :generator EC-qualifier (so the second
clause of :parallel
in the above example can also be written as
(:generator y ...)
). The generic
dispatcher of :
dispatches to the :generator
if
its argument is an applicable entity taking zero arguments.
Those changes have been pushed to the master.
Tags: Generator, gauche.generator, srfi-42
2012/02/08
Threads, build process, lazy sequences
Haven't updated this blog for a while (again). Here's a quick progress report.
Windows threads
On Windows/MinGW, threads are finally supported. It uses Windows thread (not pthread-win32.) Configure with --enable-threads=win32 on MinGW&MSYS environment.
./configure --enable-threads=win32
By 0.9.2, you might have checked the availability of threads by gauche.sys.pthreads
feature identifier in cond-expand
. This feature identifier isn't available on Windows threads build. We created a new feature identifier, gauche.sys.threads
, which is defined in both pthreads build and Windows threads build, and suitable for the Scheme programs that wants to switch behavior based on the availability of threads.
(cond-expand [gauche.sys.threads ... code using threads ...] [else ... code not using threads ...])
The feature id gauche.sys.pthreads
is still available on pthreads build, and a new feature id gauche.sys.wthreads
is available on Windows threads build, in case if you need finer distinction of the thread library (but there's hardly visible difference in Scheme level).
Better thread safety
Kirill Zorin is using Gauche in a way that is a good stress test for Gauche's thread support, and he has identified and fixed numerous thread-related bugs. Thanks, Kirill! I won't surprise if we find more, but I'm quite happy about the improvement.
Out-of-source-tree build
In the very early stage, we made it so that Gauche can be build in separate directory from the source tree, but it has been broken since then. It's not trivial to fix since Gauche needs multi-stage build, that is, one process generates a source file to be compiled, and running the compiled one generates another source file, and so on. Now I finsished updating various *.in files so that it could compile out of source tree again.
To build in a different directory, just call the 'configure' script of the source tree from the build directory, e.g.
$ ls Gauche $ mkdir Gauche-build $ cd Gauche-build $ ../Gauche/configure $ make
Lazy sequences
Finally documented lazy sequences and generators. Check out the chapter of "Lazy evaluation", and also "gauche.sequence" and "gauche.lazy" modules.
Tags: Threads, gauche.lazy, gauche.sequence
2011/12/27
Gc update
The development HEAD now uses bdwgc 7.2alpha6.
It is working on Linux, MinGW and MaxOSX, but a problem is reported on NetBSD.
I want to ship 0.9.3 with this version of gc, but I'll backport some of more recent fixes in gc whenever needed.
It is said that Windows thread support on MinGW is much improved in this gc version, so I'm also trying to make threads available on Gauche/MinGW.
2011/12/12
Action items for 0.9.3
I wish I could write solid articles explaining upcoming features one by one, but if I wait for having enough time to do so, it's not likely to happen in near future. So I just write this down as a casual memorandum.
There are three features I definitely want to put in; the first two are already coded and working, but lack documentation (and it is often the case that I find better ideas and/or flaws while writing the docs, so in a sense they're just half way through.) The third one is working on Linux but needs some more work to make sure it work on other platforms:
- Generators:
gauche.generator
module provides generic operations on generators, a thunk that yields a value at a time. For example,read-char
works as a generator (if we call it without arguments). We can have operations like filter, take, drop, append, zip, etc. on those generators. The places where generators themselves are useful may be limited, but they are basis of the lazy sequence, described next.
- Lazy sequence: Yes, Gauche now have a lazy list. From the Scheme code, it just look like an ordinary list, but the list's cdr is calculated as needed. It's different from
util.stream
, which provides a disjoint type for lazy streams. That is, you can usecar
/cdr
and all other list operations on lazy sequences. In fact, it responds#t
topair?
, so you cannot distinguish ordinary lists and lazy sequence.
- TLS/SSL support: In
tls
branch we have native TLS support working on Linux, using axTLS (thanks to Kirill Zorin). We no longer need to rely on externalstunnel
process.
Then, here's a list of features I did some work, but not sure 0.9.3 have full of it.
- List library reorganization: I'm tired writing
(use srfi-1)
in every Gauche code I write. There are already some srfi-1 procedures moved into core, for they are required by the compiler. Until 0.8.3 or so, we needed to keep the set of built-in procedures small, since they needed to be written in C. However, it's been quite some time that we can precompile Scheme procedures and link tolibgauche.so
, so there's not much reason we need to have useful list procedures into a separate library. I'm moving most srfi-1 list procedures into core (probably I keep list-as-set operations in srfi-1). I'll also reviseutil.list
, for it is just confusing to have a bunch of list libraries.
- PEG parser. It was originally written by Rui Ueyama several years ago, and I heavily modified it to run faster on Gauche. However, I didn't like the way it is optimized---it's ugly. The code has been already in Gauche distribution for some time, and actually
rfc.json
is written on top of it, but I refrained from making PEG API official. Now, with the lazy sequence support and some advanced optimizations in the compiler, I'm revisiting the PEG parser to make it simple yet efficient. In my local branch it is working, but the performance isn't satisfying. It looks like I have to work more on the compiler to optimize it. 0.9.3 will likely to contain this new code, but I'm not sure I make it official or not.
... and there are plenty of other ideas sitting on the shelf. I started to write them down but it got pretty long so I just keep them for the next time.
Tags: 0.9.3, srfi-1, LazySequence, peg
2011/11/20
Two-argument reverse
Again I kept silence for over 4 months. I released 0.9.2 in August. Since then, quite a few changes are made into the development HEAD, and I'd like to start explaining them as I approach to release 0.9.3.
Some of the changes are small in the code, but can have great impact---meaning, it may change how you code in Gauche. However, they take a lot of words to explain; I need some more time.
For today, I pick a rather small topic to get up to the speed. It may even seem too trivial to mention, but somehow, it matters to me. Maybe because it is related to how I approach to a program.
★ ★ ★
A common pattern to transform a list is to tail-recurse the input with consing the results, and at the end reverse the result list to return:
(let loop ([xs xs] [rs '()]) (if (null? xs) (reverse rs) (loop (cdr xs) (cons (do-something (car xs)) rs))))
It is not necessarily a good style. Olin Shivers commented in his srfi-1 reference implementation that, this style is just trading execution stack for cons cells, and in stack-based system you should rather use a natural (non-tail) recursion:
(let rec ([xs xs]) (if (null? xs) '() (cons (do-something (car xs)) (rec (cdr xs)))))
In Gauche, the latter is just as fast as the former to a list up to about a thousand elements. However, if the input exceeds the limit and recursion gets too deep, Gauche starts moving the stack frames to the heap (a kind of stack GC), so it gets worse than using intermediate lists and reverse at the end. So I typically use cons-then-reverse idiom in Gauche.
Now, it is also common to write a function that adds the result
to the existing pile of data.
If we do non-tail recursive calls, it is trivial; I just
replace ()
with the existing pile of data:
(let rec ([xs xs]) (if (null? xs) existing-pile-of-data (cons (do-something (car xs)) (rec (cdr xs)))))
If we use tail-recursive call and reverse, however,
we need to reverse the accumulated results, then append
the existing pile of data. Srfi-1 has a procedure
specifically for this pattern, append-reverse
:
(let loop ([xs xs] [rs '()]) (if (null? xs) (append-reverse rs previous-results) (loop (cdr xs) (cons (do-something (car xs)) rs))))
There's nothing wrong with this code;
the append-reverse
procedure is meant to avoid
unnecessary copying. Except that I always feel
something awkward in the name: I feel it somewhat
miss the point by thinking the operation as
"reverse, then append".
Look at this:
(reverse xs) == (fold cons '() xs) (append-reverse xs init) == (fold cons init xs)
It looks like reverse
is really a special
case of append-reverse
with the init
to be ()
. Thinking the operation as
"reverse, then append" is actually upside-down.
It may be more natural to think
the operation represented by append-reverse
as a fundamental one, and reverse
is a
shorthand for the special case.
Then, wouldn't it be reasonable to extend reverse
this way?
(define (reverse xs :optional (tail '())) (fold cons tail xs))
If you stick to the idea that reverse
is an operation
to reverse the order of the given list, this extension
seems contrived. However, how about thinking reverse
as a list constructor, which happens to take
initial content from xs but in the reverse order?
Actually, 0.9.2's reverse
is already extended to
take an optional tail argument, though it's not
documented.
Tradition has great inertia, so I don't expect this
extension gets widely accepted; after all, you can
use srfi-1's append-reverse
if you want this
functionality. But I did get a little piece of mind
with this extension.
Tags: reverse, srfi-1, append-reverse
Comments (0)