2020/05/30
C API and promise
This ate up my whole afternoon, so I'm writing it down so that I won't fall into it again.
I ran into a really weird bug. We have a parameter (say P). P has a default value, but the value may not be available at initialization time. Basically, what we want is to delay the evaluation of EXPR below until the value of P is actually taken:
(define P (make-parameter EXPR))
Simply wrapping EXPR with delay didn't cut it, for the users of P expected it to contain a value that wasn't a promise. We couldn't go to every place where P was used and wrap it with force.
So we added a special flag to the parameter, which applies force on the value whenever the value is taken. The feature isn't available from the Scheme world, though. It's only available through the C API, for we're not yet sure whether such a feature is a good idea.
Anyway, P got such a flag, so we could also say (P (delay EXPR)) to alter the value of P, with the actual computation of EXPR delayed. And it seemed to work.
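For comparison, here is a minimal user-level sketch of the same idea in portable R7RS Scheme. It is not the built-in flag described above: the names compute-default, evaluated? and P-value are made up for illustration, and callers have to go through P-value instead of calling P directly, which is exactly the burden the flag is meant to remove.

```scheme
(import (scheme base) (scheme lazy))

;; Flag that lets us observe when the default expression actually runs.
(define evaluated? #f)

;; Hypothetical stand-in for EXPR: a default that is expensive or not
;; yet available when the parameter is created.
(define (compute-default)
  (set! evaluated? #t)
  42)

;; The parameter holds a promise, so nothing is computed at definition time.
(define P (make-parameter (delay (compute-default))))

;; Hypothetical accessor: forces a promise, passes any other value through.
(define (P-value)
  (let ((v (P)))
    (if (promise? v) (force v) v)))
```

With this, compute-default runs only on the first call to (P-value), and (parameterize ((P (delay 7))) (P-value)) yields 7 without touching the default.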
However, we ran into an issue when some code took the value of P through the C API. The internals of parameter objects are a bit complicated, but you can assume there's a C API that retrieves the value of a given parameter.
Through the C API, however, P's value looked like #<closure ...>, whereas when I took P's value from the Scheme world, it returned the value of EXPR.
I started tracking it down, and it was like a rabbit hole. The Scheme interface eventually calls the internal Scheme procedure %primitive-parameter-ref, which directly calls the C API Scm_PrimitiveParameterRef. I inserted a debug stub to show the result of the C call. The C API returned the mysterious closure, yet in the Scheme world it returned the desired value.
Does the Gauche runtime intercept the return value on its way from the C world to the Scheme world? Nope. It's returned directly to the Scheme world. I had no idea where this #<closure...> came from, nor how the value changed to the desired one.
Furthermore, I found that if I evaluated (P) a second time, the C API returned the desired value. But no code was called to actually replace P's value!
I poked around the C stub generators, VM code, parameter code, ... in vain.
Finally, I opened up the source of Scm_Force, the C API for force. And BANG! The answer was there.
The C runtime doesn't like call/cc. C procedures return either exactly once or never. So, when you call back into Scheme code from C, you have to choose one of these two strategies:
- Restrict the called Scheme code to return at most once. If a continuation captured within the Scheme code is invoked again later and tries to return to the C code again, an error is thrown.
- Split your C code into two parts, before the callback (A) and after the callback (B). Both A and B are ordinary C functions. A arranges for B to be called after the Scheme callback returns. Effectively, you write it in continuation-passing style. With this, a continuation captured within the Scheme callback can be re-invoked, which just calls B again.
Most of the Gauche runtime in C adopts the latter strategy, so that call/cc works seamlessly. By convention, the C API functions that use this strategy are named Scm_VM***. The caller of such a C API can't expect to get the final result as the C return value, since such a function may need more computation (the Scheme code and the B part) to get the final result.
Scm_Force is that type of function, too. I just forgot to name it Scm_VMForce.
Scm_PrimitiveParameterRef casually called Scm_Force when the parameter had the delayed evaluation flag, expecting it to return the final value. But in fact, Scm_Force can only be used in conjunction with the Scheme VM to obtain the final result.
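The symptom can be reproduced with a toy model in plain Scheme. This is only a sketch of the A/B split under my own simplifying assumptions, not Gauche's actual implementation: the record types, toy-vm-force, toy-vm-run and buggy-parameter-ref are all hypothetical names. The force-like procedure (part A) hands back a "pending" request instead of a value, and only the VM loop runs the thunk and the rest of the work (part B).

```scheme
(import (scheme base))

;; A request handed back to the toy VM: "call THUNK, then pass the result to K".
(define-record-type <pending>
  (make-pending thunk k)
  pending?
  (thunk pending-thunk)
  (k pending-k))

;; Toy promise: PAYLOAD is a thunk until forced, and the value afterwards.
(define-record-type <toy-promise>
  (make-toy-promise done? payload)
  toy-promise?
  (done? toy-promise-done? set-toy-promise-done!)
  (payload toy-promise-payload set-toy-promise-payload!))

;; Part A: returns the value if already computed; otherwise returns a
;; pending request whose K is part B (cache the result and return it).
(define (toy-vm-force p)
  (if (toy-promise-done? p)
      (toy-promise-payload p)
      (make-pending (toy-promise-payload p)
                    (lambda (v)
                      (set-toy-promise-payload! p v)
                      (set-toy-promise-done! p #t)
                      v))))

;; The toy VM loop: keeps servicing pending requests until a plain value appears.
(define (toy-vm-run result)
  (if (pending? result)
      (toy-vm-run ((pending-k result) ((pending-thunk result))))
      result))

;; The mistake: treating the return value of part A as the final value.
(define (buggy-parameter-ref p)
  (toy-vm-force p))
```

A direct caller of buggy-parameter-ref sees the pending object the first time, while a caller that goes through toy-vm-run sees the value. Once the VM has run the request, later direct calls return the value too, which matches what I observed on the second (P).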
Tag: BugStories
2020/05/22
Next release will be 0.9.10
We haven't reached what we expected for 1.0, but we've got quite a few useful changes so we're gonna put 0.9.10 out. Some notable features to be included:
- All libraries of R7RS-large Red and Tangerine edition. (Exact complex numbers aren't in yet.)
- Immutable pairs are supported natively.
- PEG parser library will finally be documented and "official".
- Native string cursor support, thanks to @pclouds.
- Input editing feature is enhanced to include online key-binding help. It's probably still too early to make it the default, but we'll probably add a command-line switch to make it easier to test.
Tag: 0.9.10
2020/01/23
Remaining tasks for 1.0
In the last entry I listed a couple of items for 1.0. After that I remembered a few more, so I put them down here.
- Enhancement of extended pairs. Currently we only use extended pairs to keep source information (source code location and the macro input). One big drawback of extended pairs is that it is costly to check whether a given pair is extended (occupies 4 words) or not (occupies 2 words). Currently we need to look up the allocation region table to see the size of the object. If we can make the check faster, we can let the modifiers (set-car! and set-cdr!) check whether the pair has some special attributes (while the cost of car and cdr stays the same). It opens up a few interesting possibilities:
  - We can implement immutable pairs. Just raise an error in the modifiers.
  - We can implement typed lists, that is, a list whose elements are guaranteed to have a certain type. A constructor, say, (typed-cons type car cdr), can check that car is of the given type and cdr is a typed list of the same type, and the modifiers can check the type constraint.
- Integrate the pattern matcher into the core. Most of my code now imports util.match. It's so fundamental that I think it should be supported as a built-in.
  - An interesting possibility is to support pattern matching in formal arguments by default.
- Support FFI. Currently there's a C-wrapper ( https://bitbucket.org/nkoguro/c-wrapper/src/default/ ) by Naoki Koguro. It'd be a good time to incorporate it.
Depending on how quick I can work on them, we might have another 0.9.x release.
Tag: 1.0
2019/12/14
0.9.9 is out
What's left for 1.0?
The goal of 0.9.x releases has been to stabilize the binary API, so that after 1.0 release we can be sure that extension libraries won't break during 1.0.x series.
By now, we feel the API is pretty stable; we just want to make sure the exposed structures carry enough information that they won't hinder future development of various runtime analysis tools such as debuggers.
Other than that, the items on the table so far are:
- Full support of R7RS-Large Red and Tangerine Edition
  - This includes support of exact complex numbers
- Tweaking continuation frame handling. We expect it to open up several pending features, such as the issue of lost stack trace ( https://github.com/shirok/Gauche/issues/521 ), support of continuation marks (srfi-157), integration of stack trace and call trace, and better inspector-debugger.
Well, it doesn't seem a lot, but you never know.
So we hope the next release will be 1.0, but it can be 0.9.10 if we're sidetracked by some unforeseen issues. Let's see.
Tag: 0.9.9
2019/12/08
Definition is now a compile-time construct
As of 0.9.9, toplevel definitions insert bindings to the current module at compile time. Of course the value of the binding isn't known at compile time, so first the variable is marked as uninitialized. At runtime, the actual value is calculated and bound to the variable.
This is for consistent behavior with modern module systems (the issue https://github.com/shirok/Gauche/issues/549 was the trigger of this change). Basically, definitions in the same toplevel must first introduce their names into the same scope, and only then proceed to calculate the values. R6RS is clear about it, while R7RS allows implementations some leeway.
If you happen to use the value of a variable before it is bound to an actual value, you'll get an error saying the variable is not initialized.
This doesn't make any difference if each definition is a toplevel form on its own. However, if multiple toplevel definitions are enclosed in a form such as begin, define-module or define-library, you'll see the difference.
This may affect an idiom that was popular until the R5RS era:
(define orig-error error)
(define (error . args)
  (write args)
  (newline)
  (apply orig-error args))
The intention is to save the original error procedure in orig-error, then redefine error to show the arguments and then call the original error.
In R5RS, where there's one toplevel, the (define (error ...) ...) is understood as a reassignment to the original variable, so it works.
However, since R6RS, we have multiple toplevels as separate lexical scopes, and we have ambiguity.
(import (scheme base)      ; imports 'error' binding (in R6RS, it's (rnrs))
        (scheme write))
(define orig-error error)  ; which 'error' should we refer to?
(define (error . args)
  (write args)
  (newline)
  (apply orig-error args))
With the lexical scoping rule (toplevel definitions are treated as if in letrec* bindings; see R6RS section 10, for example), the error in (define orig-error error) must refer to the error defined by the definition that follows it. And it's a violation to take the value of a variable that hasn't been calculated, hence the code above is invalid. In R7RS, it's implementation dependent.
Actually, R6RS also prohibits defining a toplevel variable that conflicts with imported names, so the above code can't work in that sense, either. In R7RS, it's implementation dependent, so portable code can't do that.
The proper way is to use renaming import:
(import (except (scheme base) error)
        (rename (scheme base) (error r7rs:error))
        (scheme write))
(define (error . args)
  (write args)
  (newline)
  (apply r7rs:error args))
Now, Gauche has been rather permissive about this kind of implementation-dependent behavior. And the above orig-error example still works if the file is loaded, in which case Gauche processes each toplevel form one by one, so when it sees (define orig-error error) it doesn't know yet whether error will be defined in this scope or not. So it refers to the imported error.
However, if the file is included (which effectively wraps all the forms in begin), or you write similar code within define-library or define-module, all the definitions are compiled at once and then executed. In that case you'll see an "uninitialized variable" error in 0.9.9.
We strongly recommend avoiding such ambiguous code. However, in case you're using existing code that happens to rely on the old behavior, you can switch back to it by defining an environment variable GAUCHE_LEGACY_DEFINE.