Gauche Devlog

2010/12/06

To quote or not to quote

I've been bitten by this twice, so I write it down to avoid another bite.

Short summary: The popular way to pass compiler command-line options by command/parameter substitution does not work with arguments including whitespaces.

Here I'm talking about the typical Makefile idioms such as the following:

  gcc `gauche-config -I` ...

Or like this:

  CFLAGS=`gauche-config -L`
  gcc $(CFLAGS) ...

The commands gauche-config -I and gauche-config -L produce the -I and -L flag(s) to give to the compiler, respectively. Typically there's one -I flag, but there may be more than one -L flags. So it should be interspersed into the command line. That is, we can't quote outside of substitution like this:

  gcc "`gauche-config -L`"

On Windows, Gauche may be installed under a path that contains whitespaces. In fact, Gauche Windows installer uses C:\Program Files\Gauche as the default. To pass such pathnames to -I and -L options, each option must already be quoted right after substitution.

So, initially I naively changed the output of gauche-config to quote the pathname:

  -I"c:\Program Files\Gauche\lib\gauche-0.9\0.9.1_pre2\include"

Here came a twist. The modern way to compile extension modules is to use gauche-package compile command, which takes care of gory details and makes Makefile simpler; you just list the source files and the script takes care of compiling and linking, with proper options. Internally it uses gauche.config module to obtain the same information as output of gauche-config -I and gauche-config -L.

I used Windows installer to install Gauche under C:\Program Files\ and compiled Gauche-gl using it with MinGW/MSYS. Everything worked smoothly. I was satisfied.

Then, a few days later I was testing 0.9.1 prerelease on my Linux box, and found some extension modules didn't compile. Their makefile didn't use gauche-package, but directly invoked gcc with `guache-config -I` for the arguments.

I remembered I had been tripped with the same problem a few years ago and had given up. This time I wanted to solve it once for all. I thought my quoting scheme was wrong, and fiddled with gauche-config output for some time. I couldn't managed it to work. I carefully read the man page of bash. And leaned this:

  • Quote processing is done before command/parameter substitution.
  • The result of command/parameter substitution is subject of word splitting, unless the argument (before substitution) is quoted.
  • Word splitting honor neither quotes, nor escaping (such as backslash). - it is simple string splitting with the characters specified by $IFS.

This means that, as far as we use shell-level command-line substitution, the output of gauche-config cannot contain whitespaces inside each argument. Tools using the same scheme, such as pkg-config, have the same limitation, and in fact it is documented in pkg-config manual.

If we can have intermediate step to preprocess the command line, and passes the quoted pathnames to the shell, it works. It is what gauche-package does. An alternative way could be to invoke shell within Makefile, and let make consturct the command line:

  INCLUDES = $(shell gauche-config -I)
  gcc $(INCLUDES) ...

With this, shell sees already expanded options, so it can process quotes correctly.

However, there may be a case that an extention can't use gauche-package compile because it requires special build process, and also it can't use GNU make. For the backward compatibility, I keep gauche-config -I and gauche-config -L not to quote pathnames. Hence, they are inherently unsafe way to construct a command line.

So, what should be the proper way for the extension makefile and gauche-config to handle pathnames with spaces? I don't know yet. For the time being, I added --incdirs and --archdirs options to gauche-config. They return pathnames separated by colon (or semincolon on Windows), and gauche-package constructs command line arguments from them. I'm not satisfied with it, though.

Tags: extensions, makefile, gauche-config, gauche-package

2010/11/19

Some improvements of constant propagation

Most of this feature was implemented long time ago (December 2009 - April 2010) but have no appropriate place to describe except in release notes, which haven't yet come. It is probably worth to mention it here.

A constant propagation code in the compiler was overhauled and now it can precompute a lot wider range of the code. It can recognize constant bindings defined by define-constant, and many built-in functions are precomputed if it is side-effect free and arguments are constant expressions.

For example, in the following code (logior ...) is precomputed at the compile time and becomes a constant value, as most C programmers would expect:

(define-constant *flag-0* (ash 1 0))
(define-constant *flag-3* (ash 1 3))

(func (logior *flag-0* *flag-3*))

(Note that macros can't save this case, since a macro can only see the code it contains; it can't know whether *flag-0* is constant binding, if it is defined outside of the macro, unless you put all the constant bindings to macro expansion phase, which has its own inconvenience.)

If you are unsure, you can use disasm to check.

gosh> (disasm (lambda () (func (logior *flag-0* *flag-3*))))
CLOSURE #<closure #f>
main_code (name=#f, code=0x20f7820, size=4, const=1, stack=4):
args: #f
     0 CONSTI-PUSH(9) 
     1 GREF-TAIL-CALL(1) #<identifier user#func>; (func (logior *flag-0* *flag-3*))
     3 RET 

Precomputation isn't limited to numeric computations.

gosh> (define-constant *data* '(#(a b c) #(d e f)))
*data*
gosh> (disasm (lambda () (vector-ref (cadr *data*) 1)))
CLOSURE #<closure #f>
main_code (name=#f, code=0x1024b70, size=2, const=1, stack=0):
args: #f
     0 CONST-RET e

Currently only built-in SUBRs (procedures directly implemented in C) are subject of precomputation. Note that some elementary functions are defined in Scheme to handle complex numbers, on top of real-only SUBR version.

gosh> (use math.const)
#<undef>

;; this doesn't fold constants
gosh> (disasm (lambda () (func (/ (sqrt pi) 2))))
CLOSURE #<closure #f>
main_code (name=#f, code=0x21d4ea0, size=12, const=3, stack=11):
args: #f
     0 PRE-CALL(1) 6
     2 CONST-PUSH 3.141592653589793
     4 GREF-CALL(1) #<identifier user#sqrt>; (sqrt pi)
     6 PUSH 
     7 CONSTI(2) 
     8 NUMDIV2                  ; (/ (sqrt pi) 2)
     9 PUSH-GREF-TAIL-CALL(1) #<identifier user#func>; (func (/ (sqrt pi) 2))
    11 RET 

;; ... but this does:
gosh> (disasm (lambda () (func (/ (%sqrt pi) 2))))
CLOSURE #<closure #f>
main_code (name=#f, code=0x21cf540, size=5, const=2, stack=4):
args: #f
     0 CONST-PUSH 0.8862269254527579
     2 GREF-TAIL-CALL(1) #<identifier user#func>; (func (/ (%sqrt pi) 2))
     4 RET 

I hope this restriction is removed soon, since at least for standard functions, programmer shouldn't care if they are implemented in Scheme or as SUBRs. I'm not sure I can do that by 0.9.1 release, though. Anyway, most of the time you don't need to care, but when you're writing a performance sensitive code it may worth to play with disasm a bit to find out what the compiler can do for you.

(Added 2010/11/20 20:03:14 UTC): Oh, by the way, constant folding is done only when the bindings of built-in procedures haven't been altered at the time of compilation. This is a subtle issue; if you value dynamism, that you can change any parts of the system anytime, then you might expect that the argument of func should be recalculated when you redefine / or %sqrt some later time. R6RS made library modules closed, that is, you cannot alter its exported bindings afterwards. That's a possible solution. I like Gauche to be a bit more flexible, though. I think it is reasonable to expect programmers to consider the risk of overriding existing bindings. Yes you can do that, but if you do so, do it boldly. At the beginning of the module write redefinitions explicitly and prominently so that not only the compilers but the readers of your program will also notice you're attempting some trick.

★ ★ ★

This precomputation once caused an interesting event. I had a function like the following, which worked fine in 0.9 but stopped working after I put the new constant folding code.

(define (fn ...)
  (define tree '(root))

  (define (populate! node)
    ... a code that destcructively modifies cdr of node ...)

  (populate! tree)
  (cdr tree))

I represented a tree in which each node was something like (<name> <node> ...). The variable tree began with an empty root node, the populate! function grew the tree, and finally fn returned a list of root's children.

The above code is buggy, since it destructively modifies a literal list. Gauche doesn't check mutation on a literal pairs, since doing so would be costly.

On 0.9 the code seemed working, since fn was actually called only once during program execution. So mutated literal won't affect other parts.

With the improved constant folding, the compiler deduced that the return value of fn was a cdr of constant, thus computable at compile time. It eliminated all the code in fn as dead code and made fn a constant function that merely returned ().

Ultimately the compiler should warn if it can detect the code may mutate a constant. Doing so generally is difficult, but there may be some obvious cases easy to catch.

★ ★ ★

Actually, the real fun begins when this extended constant folding meets procedure inlining. Then there are some interesting code transformation happening; such that conditional expressions can be eliminated because the compile knows the condition always be satisfied (or never satisified).

Unfortunately inlining feature is still picky and I hesitate to expose it publicly yet. I hope I can write about it soon.

Tag: compiler

2010/11/16

Partial continuations and fun of stack fiddling

The 0.9.1 release will officially support partial continuations natively. The feature was added six months ago, but I finally managed to document it.

Partial continuations have been used in the continuation-based web application framework Kahua. It had a makeshift implementation of shift and reset in terms of full continuations, as shown in Gasbichler and Sperber's ICFP02 paper ( http://citeseer.ist.psu.edu/gasbichler02final.html ).

The implementation needs to copy full stack for every reset and shift, but it just worked, and it didn't seem to be a significant bottleneck. So I wasn't bothered to implement partial continuations natively.

It had come from unexpected corner. It was just a byproduct of something else.

In May I was working on a small project using Gauche, and it seemed I could write it neatly if I passed continuations among threads back and forth.

The implementation then, however, did not allow me to do that; a full continuation captured by call/cc grabs the chain of activation records; most of the elements in it were independent from the thread and the VM it was executed in. I had made so, expecting some day I'd want to invoke a continuation in different thread. However, there were a pitfall: When Scheme code is invoked from C routine, a reference to the C stack frame is recorded in the first activation record of VM. It is used to prevent the Scheme code from invoking a continuation which had become obsolete (since the C stack, on top of which the continuation was captured, could have been popped at the time the continuation was invoked.)

★ ★ ★

I tend to lose track while thinking about continuations, so let me go over my thoughts step by step. This is the initial state: a C stack ('C') at the bottom.

  C

A C routine calls a Scheme routine, and the Scheme routine calls other Scheme routines, pushing Scheme stacks ('S') into the chain of activation records.

  C <- S <- S <- S

Sometimes a Scheme routine calls back to C routine, which may in turn calls other Scheme routines. So the chain of activation records may look like this:

  C <- S <- S <- S <- C <- S <- S <- S

When call/cc is called, it creates a continuation object that keeps the pointer to this chain.

  C <- S <- S <- S <- C <- S <- S <- S
                                     ^
                               #<continuation>

When the continuation is invoked, the VM will restore this chain to the VM stack (actually, we copy activation records when a continuation is captured, but when it is invoked we directly refer activation records on heap. So there's no copy back. But that's an implementation detail.)

Now, suppose we return from the Scheme routines to the intermediate C callback. From C's point of view, it looks like it is returned from one of several APIs such as Scm_Eval.

  C <- S <- S <- S <- C

We further returns from this intermediate C frame:

  C <- S <- S

What happens if we invoke the continuation captured above here? VM can restore the chain of Scheme activation frames, since we have total control over VM stack frames. However, we cannot restore C stack frame... we can't save C stack frame totaly portably. There's a nasty technique to save C stack frames (SCM does that) but even using that, many C routines are not written in a way that function calls return more than once.

So, Gauche used to raise an error when an invoked continuation contained a reference to a C stack frame that was not in the current executing environment.

This, off course, prevented a continuation captured on one thread to be invoked on another thread, since C stack frames of the latter were totally different from the former.

However, this was unnecessarily strict limitation. What we wanted to prevent was to return to an obsoleted C stack. Even a continuation contained a reference to an obsoleted C stack, if we would never returned to it---e.g. by invoking another continuation before returning to the obsoleted C stack---we could safely execute the continuation.

  C <- S <- S

1) Captures a continuation (*)

  C <- S <- S
            *

2) Calls the continuation captured far above.  It has
 a reference to an obsoleted C stack frame (@)

  C <- S <- S <- S <- C <- S <- S <- S 
            *         @

3) Returns from Scheme routines:

  C <- S <- S <- S <- C <- S
                      @

4) Before returning to the obsoleted C stack frame, invokes
 the continuation captured by 1.  It is safe.

  C <- S <- S
            *

4') If we don't invoke the continuation(*) and the Scheme
 routine returns to the obsoleted C stack, we raise an error.

  C <- S <- S <- S <- C
            *         @

So, instead of raising an error at the step 2, we just allowed to invoke the continuation, and only checks the validity of C stack frames when we return from Scheme to C.

★ ★ ★

While pondering about this change, an idea occurred to me. What if we just never record references to the C stack frames? Instead, the first Scheme frame directly called from C routine can just have dangling end of activation record chain (indicated by 'o' below).

  o <- S <- S <- S

If a Scheme routine calls a C routine and it calls back to a Scheme routine, we make another chain.

  o <- S <- S <- S   (C-routine)   o <- S <- S <- S

The intermediate C routine knows which Scheme routine it should return to, so we can return normally even the activation chain itself isn't continuous.

This idea didn't replace the existing continuations because of other issues after all. But I realized that if we call the above (C-routine) as reset, we just get a partial continuation.

 Entering reset form:

  o <- S <- S <- S <- reset

 Executing some more expressions, then calls (shift k ...)

  o <- S <- S <- S <- reset  o <- S <- S <- S

 We capture a partial coninuation bound to k.

                             o <- S <- S <- S <- k

The actual implementation was a bit more involved, but in a couple of days we had both continuations that can be passed to another thread, and a native support of partial continuations.

Executing the amb example in Gasbichler&Sperber's paper is 3.5 times faster in this native shift/reset compared to the version emulating them with full continuations.

Tags: Continuation, shift, reset

2010/11/12

Exact sqrt

Several months ago one of our users asked me why (sqrt 4) yields 2.0; if the argument is an exact square number, should the result be exact as well?

Yes, it should, although coercing the result to inexact numbers is allowed in R5RS, I believe. It was just as it was because I was lazy: "Fix it when it becomes a problem." It seemed that the time had come.

It turned out to be an interesting little exercise to implement a 'proper' sqrt that returns an exact result when the input is a square of an exact number. That includes rational numbers. Maybe it's a nice quiz for CS freshmen. If you haven't implemented one, just think how you'll do it.

Here are some results in the current Gauche (the long digit numbers are wrapped around):

gosh> (sqrt 144)
12
gosh> (sqrt 169/81)
13/9
gosh> (sqrt 340282366920938463463374607431768211456)
18446744073709551616
gosh> (sqrt 32317006071311007300714876688669951960444102669
71548403213034542752465513886789089319720141152291346368871
79609218980194941195591504909210950881523864482831206308773
67300996091750197750389652106796057638384067568276792218642
61975616183809433847617047058164585203630504288757589154106
58086075523991239303855219143333896683424206849747865645694
94856176035326322058077805659331026192708460314150258592864
17711672594360371846185735759835115230164590440369761323328
72312271256847108202097251571017269313234696785425806566979
35045997268352998638215525166389437335543602135433229604645
318478604952148193555853611059596230656)
17976931348623159077293051907890247336179769789423065727343
00811577326758055009631327084773224075360211201138798713933
57658789768814416622492847430639474124377767893424865485276
30221960124609411945308295208500576883815068234246288147391
31105408272371633505106845862982399472459384797163048353563
29624224137216

Let's forget about non-integer rationals and think about exact integers. I ended up handling three cases depending on the range of the input.

  • If the input is less than 2^52, the integer can be represented exactly in IEEE double floating point number, and if FP sqrt yields an integer we know it's the result. FP sqrt which is fast on the modern hardware. (Edit: This part is corrected on 2013/03/20 02:14:15 UTC, thanks to Mark H Weaver. See this entry.)
  • Otherwise we run Newton-Rhapson with exact arithmetic.
    • If the input is representable by IEEE double (e.g. up to approx. 2.0^1024), we can get an approximated result by using FP sqrt, and we use an integer close to it as the initial approximation.
    • Otherwise we use 2^floor((log2(input)+1)/2) as the initial approximation. That's because log2(input) may be calculated quickly.

Other than utilizing FP sqrt, this is pretty straightforward. If you know clever implementation of exact integer sqrt, let me know.

For exact rationals, I naively apply exact integer sqrt to both its numerator and its denominator. I guess that's faster than running Newton-Rhapson directly with exact rationals, for Gauche's rational artihmetic isn't optimized at all.

Oh, btw, R6RS's exact-integer-sqrt also returns the remainder if the input is not an exact square of an integer. That adds another little fun to the exercise.

Tags: sqrt, exact-integer-sqrt

2010/11/11

More control on redirection in run-process

Whew, these days I can only work on Gauche in smaller chunks of time spans, and there are lots of small changes that are not documented. Today I finally managed to document the new interface of run-process in gauche.process module. Now you can specify I/O redirections of the child process as flexible as you can in unix shells.

For example, you can consolidate subprocess's stdout and stderr:

(let1 p (run-process '(command arg ...) 
                     :redirects '((> 1 out) (>& 2 1))
  (begin0 (process->string (process-output p 'out))
          (process-wait p)))

Here (> 1 out) specifies process's file descriptor 1 is fed to a pipe identified by a symbol 'out. Another end of the pipe is retrieved by process-output. (>& 2 1) makes process's file descriptor 2 refer to the same sink as its file descriptor 1.

If you do (> 1 "out"), then the output is sent to a file named "out". You can also use (>> 1 "out") to append, just like the shell.

Using (<< fd "data") or (<< fd #u8(data ...)) allows you to feed the immediate data to the child process. A slight variation (<<< fd obj) uses written representation of obj as the process's input.

It is a upper-compatible extension to the existing mechanism. The current :input, :output and :error redirection arguments are interpreted like the following redirection specification:

:input x   = :redirects '((< 0 x))
:output x  = :redirects '((> 1 x))
:error x   = :redirects '((> 2 x))

The current :input :pipe etc are interpreted as follows:

:input :pipe   = :redirects '((< 0 stdin))
:output :pipe  = :redirects '((> 1 stdout))
:error :pipe   = :redirects '((> 2 stderr))

That is, they'll create pipes identified by symbols stdin, stdout and stderr, respectively. If you call (process-input process) or (process-output process), omitting the pipe name, then stdin and stdout are assumed respectively. So the existing code keeps working.

See the updated manual entry in svn repo for the full spec.

The next step will be to handle multiple processes and piping among them. Something like scsh's process forms. Originally I planned to port scsh's process operations, but now I'm thinking a bit different, more functional (combinatorial) approach than using macros and effectively introducing mini-DSL like scsh. That is, the base function creates a process runner closure, which can be combined and wired together by combinators, then you kick the whole thing to run. This idea is still vague and I'm not sure it can be in to the 0.9.1. But the I/O extension described here will be.

Tags: gauche.process, run-process, scsh

More entries ...