2010/11/16
Partial continuations and fun of stack fiddling
The 0.9.1 release will officially support partial continuations natively. The feature was added six months ago, but I finally managed to document it.
Partial continuations have been used in the continuation-based web application framework Kahua. It had a makeshift implementation of shift and reset in terms of full continuations, as shown in Gasbichler and Sperber's ICFP02 paper ( http://citeseer.ist.psu.edu/gasbichler02final.html ).
The implementation needs to copy full stack for every reset and shift, but it just worked, and it didn't seem to be a significant bottleneck. So I wasn't bothered to implement partial continuations natively.
It had come from unexpected corner. It was just a byproduct of something else.
In May I was working on a small project using Gauche, and it seemed I could write it neatly if I passed continuations among threads back and forth.
The implementation then, however, did not allow me to do that; a full continuation captured by call/cc grabs the chain of activation records; most of the elements in it were independent from the thread and the VM it was executed in. I had made so, expecting some day I'd want to invoke a continuation in different thread. However, there were a pitfall: When Scheme code is invoked from C routine, a reference to the C stack frame is recorded in the first activation record of VM. It is used to prevent the Scheme code from invoking a continuation which had become obsolete (since the C stack, on top of which the continuation was captured, could have been popped at the time the continuation was invoked.)
★ ★ ★
I tend to lose track while thinking about continuations, so let me go over my thoughts step by step. This is the initial state: a C stack ('C') at the bottom.
C
A C routine calls a Scheme routine, and the Scheme routine calls other Scheme routines, pushing Scheme stacks ('S') into the chain of activation records.
C <- S <- S <- S
Sometimes a Scheme routine calls back to C routine, which may in turn calls other Scheme routines. So the chain of activation records may look like this:
C <- S <- S <- S <- C <- S <- S <- S
When call/cc is called, it creates a continuation object that keeps the pointer to this chain.
C <- S <- S <- S <- C <- S <- S <- S
^
#<continuation>
When the continuation is invoked, the VM will restore this chain to the VM stack (actually, we copy activation records when a continuation is captured, but when it is invoked we directly refer activation records on heap. So there's no copy back. But that's an implementation detail.)
Now, suppose we return from the Scheme routines to the intermediate C callback. From C's point of view, it looks like it is returned from one of several APIs such as Scm_Eval.
C <- S <- S <- S <- C
We further returns from this intermediate C frame:
C <- S <- S
What happens if we invoke the continuation captured above here? VM can restore the chain of Scheme activation frames, since we have total control over VM stack frames. However, we cannot restore C stack frame... we can't save C stack frame totaly portably. There's a nasty technique to save C stack frames (SCM does that) but even using that, many C routines are not written in a way that function calls return more than once.
So, Gauche used to raise an error when an invoked continuation contained a reference to a C stack frame that was not in the current executing environment.
This, off course, prevented a continuation captured on one thread to be invoked on another thread, since C stack frames of the latter were totally different from the former.
However, this was unnecessarily strict limitation. What we wanted to prevent was to return to an obsoleted C stack. Even a continuation contained a reference to an obsoleted C stack, if we would never returned to it---e.g. by invoking another continuation before returning to the obsoleted C stack---we could safely execute the continuation.
C <- S <- S
1) Captures a continuation (*)
C <- S <- S
*
2) Calls the continuation captured far above. It has
a reference to an obsoleted C stack frame (@)
C <- S <- S <- S <- C <- S <- S <- S
* @
3) Returns from Scheme routines:
C <- S <- S <- S <- C <- S
@
4) Before returning to the obsoleted C stack frame, invokes
the continuation captured by 1. It is safe.
C <- S <- S
*
4') If we don't invoke the continuation(*) and the Scheme
routine returns to the obsoleted C stack, we raise an error.
C <- S <- S <- S <- C
* @
So, instead of raising an error at the step 2, we just allowed to invoke the continuation, and only checks the validity of C stack frames when we return from Scheme to C.
★ ★ ★
While pondering about this change, an idea occurred to me. What if we just never record references to the C stack frames? Instead, the first Scheme frame directly called from C routine can just have dangling end of activation record chain (indicated by 'o' below).
o <- S <- S <- S
If a Scheme routine calls a C routine and it calls back to a Scheme routine, we make another chain.
o <- S <- S <- S (C-routine) o <- S <- S <- S
The intermediate C routine knows which Scheme routine it should return to, so we can return normally even the activation chain itself isn't continuous.
This idea didn't replace the existing continuations because of other issues after all. But I realized that if we call the above (C-routine) as reset, we just get a partial continuation.
Entering reset form:
o <- S <- S <- S <- reset
Executing some more expressions, then calls (shift k ...)
o <- S <- S <- S <- reset o <- S <- S <- S
We capture a partial coninuation bound to k.
o <- S <- S <- S <- k
The actual implementation was a bit more involved, but in a couple of days we had both continuations that can be passed to another thread, and a native support of partial continuations.
Executing the amb example in Gasbichler&Sperber's paper is 3.5 times faster in this native shift/reset compared to the version emulating them with full continuations.
Tags: Continuation, shift, reset
2010/11/12
Exact sqrt
Several months ago one of our users asked me why (sqrt 4) yields 2.0; if the argument is an exact square number, should the result be exact as well?
Yes, it should, although coercing the result to inexact numbers is allowed in R5RS, I believe. It was just as it was because I was lazy: "Fix it when it becomes a problem." It seemed that the time had come.
It turned out to be an interesting little exercise to implement a 'proper' sqrt that returns an exact result when the input is a square of an exact number. That includes rational numbers. Maybe it's a nice quiz for CS freshmen. If you haven't implemented one, just think how you'll do it.
Here are some results in the current Gauche (the long digit numbers are wrapped around):
gosh> (sqrt 144) 12 gosh> (sqrt 169/81) 13/9 gosh> (sqrt 340282366920938463463374607431768211456) 18446744073709551616 gosh> (sqrt 32317006071311007300714876688669951960444102669 71548403213034542752465513886789089319720141152291346368871 79609218980194941195591504909210950881523864482831206308773 67300996091750197750389652106796057638384067568276792218642 61975616183809433847617047058164585203630504288757589154106 58086075523991239303855219143333896683424206849747865645694 94856176035326322058077805659331026192708460314150258592864 17711672594360371846185735759835115230164590440369761323328 72312271256847108202097251571017269313234696785425806566979 35045997268352998638215525166389437335543602135433229604645 318478604952148193555853611059596230656) 17976931348623159077293051907890247336179769789423065727343 00811577326758055009631327084773224075360211201138798713933 57658789768814416622492847430639474124377767893424865485276 30221960124609411945308295208500576883815068234246288147391 31105408272371633505106845862982399472459384797163048353563 29624224137216
Let's forget about non-integer rationals and think about exact integers. I ended up handling three cases depending on the range of the input.
- If the input is less than 2^52, the integer can be represented exactly in IEEE double floating point number, and if FP sqrt yields an integer we know it's the result. FP sqrt which is fast on the modern hardware. (Edit: This part is corrected on 2013/03/20 02:14:15 UTC, thanks to Mark H Weaver. See this entry.)
- Otherwise we run Newton-Rhapson with exact arithmetic.
- If the input is representable by IEEE double (e.g. up to approx. 2.0^1024), we can get an approximated result by using FP sqrt, and we use an integer close to it as the initial approximation.
- Otherwise we use 2^floor((log2(input)+1)/2) as the initial approximation. That's because log2(input) may be calculated quickly.
Other than utilizing FP sqrt, this is pretty straightforward. If you know clever implementation of exact integer sqrt, let me know.
For exact rationals, I naively apply exact integer sqrt to both its numerator and its denominator. I guess that's faster than running Newton-Rhapson directly with exact rationals, for Gauche's rational artihmetic isn't optimized at all.
Oh, btw, R6RS's exact-integer-sqrt also returns the remainder if the input is not an exact square of an integer. That adds another little fun to the exercise.
Tags: sqrt, exact-integer-sqrt
2010/11/11
More control on redirection in run-process
Whew, these days I can only work on Gauche in smaller chunks of time spans, and there are lots of small changes that are not documented. Today I finally managed to document the new interface of run-process in gauche.process module. Now you can specify I/O redirections of the child process as flexible as you can in unix shells.
For example, you can consolidate subprocess's stdout and stderr:
(let1 p (run-process '(command arg ...)
:redirects '((> 1 out) (>& 2 1))
(begin0 (process->string (process-output p 'out))
(process-wait p)))
Here (> 1 out) specifies process's file descriptor 1 is fed to a pipe identified by a symbol 'out. Another end of the pipe is retrieved by process-output. (>& 2 1) makes process's file descriptor 2 refer to the same sink as its file descriptor 1.
If you do (> 1 "out"), then the output is sent to a file named "out". You can also use (>> 1 "out") to append, just like the shell.
Using (<< fd "data") or (<< fd #u8(data ...)) allows you to feed the immediate data to the child process. A slight variation (<<< fd obj) uses written representation of obj as the process's input.
It is a upper-compatible extension to the existing mechanism. The current :input, :output and :error redirection arguments are interpreted like the following redirection specification:
:input x = :redirects '((< 0 x)) :output x = :redirects '((> 1 x)) :error x = :redirects '((> 2 x))
The current :input :pipe etc are interpreted as follows:
:input :pipe = :redirects '((< 0 stdin)) :output :pipe = :redirects '((> 1 stdout)) :error :pipe = :redirects '((> 2 stderr))
That is, they'll create pipes identified by symbols stdin, stdout and stderr, respectively. If you call (process-input process) or (process-output process), omitting the pipe name, then stdin and stdout are assumed respectively. So the existing code keeps working.
See the updated manual entry in svn repo for the full spec.
The next step will be to handle multiple processes and piping among them. Something like scsh's process forms. Originally I planned to port scsh's process operations, but now I'm thinking a bit different, more functional (combinatorial) approach than using macros and effectively introducing mini-DSL like scsh. That is, the base function creates a process runner closure, which can be combined and wired together by combinators, then you kick the whole thing to run. This idea is still vague and I'm not sure it can be in to the 0.9.1. But the I/O extension described here will be.
Tags: gauche.process, run-process, scsh
2010/09/19
Code is soft, interface is hard
I was expecting to push 0.9.1 out of the door sometime in June or July, which has long past. The development trunk has 0.9.1_pre1 version tag for months, and I'm using new features are actively without a problem. So what's pulling back the release?
There are couple of new API sets that I'm not sure are right. I like an API that are simple but powerful; easier to do typical things but allowing to do complicated things if you want.
Thanks to the flexibility of Scheme it is relatively easy to have simple calls that covers easy use cases, and later adding options that augment them to handle atypical situations.
However, that approach is susceptible to creeping featurism; ending up tons of options that modifies the behaviour of API in various ways, sometimes incompatibly each other. It forces a programmer to look up references constantly, and when a new atypical situation arises, it is likely that he can't deal with it with existing options and end up modifying the library code to support a new option.
So, it is often the case that difficult challenges are not introducing the first version of a new module that supports the most basic functionalities, but releasing a second version to support complicated use cases.
And while code is malleable, interface is not. Once you release a public interface and other people's writes code on top of that, it becomes very difficult to introduce incompatible changes. The difficulty is that, an interface design needs to be evaluated by the actual applications that uses it, but to have enough applications you have to make the interface widely available, hence making the change more difficult.
At this moment, the biggest issue I'm stuck at is in rfc.http. The original API covers simple use cases and works well for the job. But it has shown its shortcomings as I write more applications; especially HTTP becomes a substrate of more involved protocols.
What I want in the revised rfc.http module:
- Persistent connections: This hasn't been implemented, but the infrastructure is there, and I expect it can be introduced as a natural extension of the exiting API. The current API will be extended to take <http-connection> object in place of the 'server' argument.
- Secure connections: There are practical needs for this, so I hacked up an easy solution, though it may be extended later. Right now I call "stunnel" as external process and delegates SSL handling to it; at API level, a new keyword argument :secure switches the use of SSL. In future, if Gauche supports SSL as in-process extension, we can just switch to it. This option is orthogonal to other options, so it's acceptable.
- Parameterized http method: Current API has different functions for each http method, e.g. http-get, http-post etc. This is good if I just want to retrieve an html page or post an html form. However, when some other protocol uses http as a lower layer, it gets a bit awkward; it is more natural for the code to give http method as a parameter to a single http handling function. This motivates me to create a single point of entry (in the trunk, it is http-request), and redefine existing http-get etc. as convenience procedures on top of it.
- Flexible handling of message body: This is the biggest issue, and I haven't resolved it yet. I'll discuss it in a separate entry.
2010/06/18
div, mod, div0, mod0
In the process of implementing R6RS compatibility functions, I'm trying to represent R6RS's integer division div, mod, div0 and mod0 in terms of R5RS's integer division quotient, modulo and remainder, but I'm always confused with their behaviors when the divisor and/or the divident is/are negative. So I visualize them.
R5RS integer divisions:
R6RS integer divisions:
I have a mixed feeling towards R6RS but I like this particular change; div/mod looks simpler and easier to explain (besides, not restricting arguments to integers is an interesting generalization).
To represent div/mod/div0/mod0 in terms of quotient/modulo/remainder, it looks that I have to branch for each case of 4 combinations of signs of divisor and dividend.

Comments (0)