2024/06/28
Running prebuilt Gauche on GitHub workflow
The setup-gauche
action installs Gauche on GitHub workflow runners for you
(Using Gauche in GitHub Actions). Until now, it downloaded the source tarball and compiled it, which took time.
Especially if your repo is a small library, compiling Gauche
every time you push to the repo feels like a waste of time.
Now, setup-gauche
can use a prebuilt binary on ubuntu-latest
and
macos-latest
platforms. Just give prebuilt-binary: true
as the parameter:
name: Build and test

on: [push, pull_request]

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
    - uses: actions/checkout@v3
    - uses: practical-scheme/setup-gauche@v5
      with:
        prebuilt-binary: true
    - name: Install dependencies
      run: |
        sudo apt install -y gettext
    - name: Build and check
      run: |
        ./configure
        make
        make -s check
Installing the prebuilt binary takes around 10 seconds, which is a huge time saving.
Note that the prebuilt binary is provided for the latest Gauche release
only. Other parameters of setup-gauche
are ignored if you use
the prebuilt binary.
(You may have noticed that the repository name is now under
practical-scheme
instead
of shirok
--I created the practical-scheme
organization and am gradually
moving Gauche repositories there, for easier maintenance. The URL
is redirected from shirok
so you don't need to update immediately,
but just FYI.)
The following is for those who are curious about what happens behind the scenes.
Prebuilt binaries are prepared in a different repository: https://github.com/practical-scheme/setup-gauche-binary
It has a GitHub action that fetches the latest release tarball, builds it on a GitHub runner, and uploads the result as assets of the repo's release. That ensures the binary runs on GitHub runners.
2024/01/17
Caching formatter procedure
Lisp's format
procedure is very un-Schemy. Instead of having a set of
composable, orthogonal, do-one-thing-well procedures, format
introduces
a mini-language that's syntactically and semantically separate from the base
language. It is not extensible, and it is loaded with obscure features from the past.
Yet it is handy for typical trivial tasks, and that's why Gauche (and other Schemes,
plus a couple of SRFIs) offer it.
(And to be honest, there's some pleasure in tinkering with such mini-language implementations.)
Aside from the non-composability, another glaring drawback of format
is
that it needs to interpret the mini language (format string) at runtime.
Most format
calls have a literal format string, and it is a waste of time
to parse it every time format
is called. An obvious optimization
is to recognize the literal format string and translate the call to format
into simpler procedures at compile time. I believe most CL implementations do so.
However,
Gauche, as well as some other Scheme implementations and SRFI-48, allows the
port argument to be omitted. It is convenient,
but it makes compile-time transformation difficult. If the first
argument of format
is a non-literal expression (as is the case
if you're passing a port), it is difficult for the compiler to tell
whether the format string is a constant, even if the second argument is a literal
string that looks like a format string. If the first expression yields
a string at runtime, that is the format string and the literal
string is an argument to be shown.
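The ambiguity can be seen in a small sketch. It assumes Gauche's format with the destination omitted, which returns the formatted string; the procedure name show and its parameter x are hypothetical, chosen only for illustration.

```scheme
;; `show` and `x` are hypothetical names used only for this illustration.
;; The compiler sees a literal string in the second position, but cannot
;; tell whether it is the format string until `x` is known at run time.
(define (show x)
  (format x "done~%"))

;; x is a destination (#f means "return a string"), so "done~%" is the
;; format string and ~% becomes a newline.
(define as-destination (show #f))          ; "done\n"

;; x is itself a string, so x is the format string and the literal
;; "done~%" is merely an argument displayed by ~a.
(define as-format-string (show "[~a]"))    ; "[done~%]"
```

The same call site therefore treats its literal string in two different ways, which is why a purely compile-time rewrite is not safe here.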
Despite these difficulties, we can still take advantage of literal format strings by caching the result of compiling the format string at run time.
It is not exactly the same as memoization. With memoization it is difficult to control the amount of memoized results, and we only want to cache literal format strings, which need to be identified at compile time.
So, we implemented a hybrid solution. The compiler macro attached
to format
checks whether a possible format string is a literal, and if so,
it transforms the call into an internal procedure that takes an extra
argument. The extra argument contains the position of the possible literal
format string, and a mutable box. The following is the core part of
the compile-time transformation:
(define-syntax make-format-transformer
  (er-macro-transformer
   (^[f r c]
     (match f
       [(_ shared?)
        (quasirename r
          `(er-macro-transformer
            (^[f r c]
              (define (context-literal pos)
                `(,',shared? ,pos ,(box #f)))
              (match f
                [(_ (? string?) . _)
                 (quasirename r
                   `(format-internal ',(context-literal 0) (list ,@(cdr f))))]
                [(_ _ (? string?) . _)
                 (quasirename r
                   `(format-internal ',(context-literal 1) (list ,@(cdr f))))]
                [(_ _ _ (? string?) . _)
                 (quasirename r
                   `(format-internal ',(context-literal 2) (list ,@(cdr f))))]
                [_ f]))))]))))
(NB: The shared?
flag is used to share the routine between format
and
format/ss
. We need to check for a literal string in the first, second and third
position, for Gauche's format
allows two optional arguments before the
format string.)
At run-time, the internal function can see if the literal string is
indeed a format string. If so, it computes a formatter procedure
based on the format string, and stores it in the mutable box. Subsequent
calls will use the computed formatter procedure, skipping parsing and
compiling the format string. The caching occurs per-call-site, much like
the global variable lookup (we cache the <gloc>
object, the result
of lookup, in the code vector).
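The per-call-site idea can be sketched in isolation. The following is a simplified illustration, not Gauche's actual implementation: compile-template, call-cached and site-cache are hypothetical names, and the "compiler" is a trivial stand-in that just counts how often it runs. It assumes the built-in box procedures (box, unbox, set-box!) of recent Gauche releases.

```scheme
;; Hypothetical stand-in for an expensive format-string compiler.
;; It counts invocations so the effect of the cache is observable.
(define compile-count 0)
(define (compile-template prefix)
  (set! compile-count (+ compile-count 1))
  (lambda (arg) (string-append prefix (x->string arg))))

;; CACHE is a box holding either #f or an already compiled procedure.
;; On the first call the box is filled; later calls skip compilation.
(define (call-cached cache prefix arg)
  (let ((proc (or (unbox cache)
                  (let ((p (compile-template prefix)))
                    (set-box! cache p)
                    p))))
    (proc arg)))

;; One box per call site. Here it is an ordinary variable; the macro
;; described above embeds the box in the transformed call instead.
(define site-cache (box #f))

(define first-result  (call-cached site-cache "n=" 1))  ; compiles, then caches
(define second-result (call-cached site-cache "n=" 2))  ; reuses the cached procedure
```

After both calls, compile-count is 1, since only the first call had to compile. Because each call site owns its own box, the number of cached results is bounded by the number of call sites in the program, unlike a global memoization table keyed by string.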
The format-internal
procedure checks optional arguments, and calls
format-2
. Its first argument can be a mutable box introduced
by the above macro, if we do know the format string is literal.
(define (format-2 formatter-cache shared? out control fmtstr args)
  (let1 formatter (if formatter-cache
                    (or (unbox formatter-cache)
                        (rlet1 f (formatter-compile fmtstr)
                          (set-box! formatter-cache f)))
                    (formatter-compile fmtstr))
    (case out
      [(#t)
       (call-formatter shared? #t formatter (current-output-port) control args)]
      [(#f)
       (let1 out (open-output-string)
         (call-formatter shared? #f formatter out control args)
         (get-output-string out))]
      [else (call-formatter shared? #t formatter out control args)])))
A micro benchmark shows that it's effective. In real code the effect may not be as prominent, but it does remove the worry that you're wasting time parsing format strings.
(define (run p)
  (dotimes [n 1000000]
    (format p "n=~7d 1/n=~8,6f\n" n (/. n))))

(define (main _)
  (time (call-with-output-file "/dev/null" run))
  0)
With caching off:
;(time (call-with-output-file "/dev/null" run))
; real  19.796
; user  19.790
; sys    0.000
With caching on:
;(time (call-with-output-file "/dev/null" run))
; real  10.313
; user  10.310
; sys    0.000
Tag: format
2023/09/30
Pipeworks
Ports are a very handy abstraction of data sources and sinks. In Gauche libraries, you can find many utilities that read from an input port or write to an output port, and other utilities (e.g. conversion from/to strings) are built on top of them.
While they are useful, it becomes tricky when you want to compose those
utilities. Suppose you have a procedure f
that writes to an output port,
and a procedure g
that reads from an input port. You want to feed the
output of f
to g
while making
f
and g
run concurrently, so some threading is involved. You can write
such a pipe using procedural ports but it is cumbersome to do so for
every occasion. I want something that's as easy as a Unix pipe.
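For contrast, here is the simplest way to connect such procedures without any threading, using only the standard string ports. This is a sketch of the problem rather than of the solution: f and g are hypothetical stand-ins for a port writer and a port reader.

```scheme
;; f and g are hypothetical stand-ins for a port writer and a port reader.
(define (f out)
  (display "hello, pipe" out))

(define (g in)
  (string-length (read-line in)))

;; Sequential composition: f runs to completion and its whole output is
;; held in a string before g reads a single character.
(define result
  (call-with-input-string (call-with-output-string f) g))
```

This works for small data, but g cannot start until f has finished, and the entire output must fit in memory. A real pipe lets both sides run concurrently, which is where the threading comes in.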
So I initially started writing a pipe utility using procedural ports. Then I realised I also wanted a device dual to it; while a pipe moves data from an output port to an input port, the co-pipe, or pump, pulls data from an input port and pushes it to an output port. An example is running a subprocess and feeding its error output to your current output port. When you invoke a subprocess (ref:gauche.process), you can get its error output from an input port. So you need to read it actively and feed the data to your current output port.
Then you might want to peek at the error output to find out whether a specific error message appears. So your contraption actively reads from an input port and feeds the data to an output port, and you can read whatever data flows through it from another input port to monitor it.
There are many variations, and after mulling over them for some time, I wrote a library that abstracts any of such configurations. I call the device plumbing (draft:control.plumbing).
You can also create an output port that feeds the data to multiple outputs, or gather multiple input ports into one input port. Refer to the manual to see what you can do.
Tags: 0.9.13, control.plumbing
2023/09/29
Real numerical functions
Scheme defines a set of elementary functions that can handle complex numbers.
In Gauche, complex elementary functions are built on top
of real-domain functions. Up to 0.9.12, we had real-only versions
with names such as %sin
or %exp
. As the percent prefix
suggests, they are not meant to be used directly; sin
or exp
are built on top of them.
However, sometimes you want to use real-only versions to avoid overhead
of type testing and dispatching complex numbers. srfi:94 defines
real-domain functions, so we decided to adopt them. Now you have real-sin
,
real-exp
etc. (draft:real-exp) as built-in.
Note that scheme.flonum
also provides "flonum-only"
versions of elementary functions, e.g. flsin
(ref:scheme.flonum).
They won't even accept exact numbers. Since it is in R7RS-large,
you may want to use them for portable code.
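The three flavors can be compared side by side. This is a minimal sketch assuming Gauche 0.9.13 or later, where real-sin is built in; the variable names are only for illustration.

```scheme
;; Assumes Gauche 0.9.13 or later (real-sin is built in from that release).
(use scheme.flonum)

;; Generic version: accepts any number, including complex ones.
(define generic (sin 1))

;; Real-domain version: accepts real numbers, exact or inexact.
(define real-only (real-sin 1))

;; Flonum-only version: the argument must already be a flonum.
(define flonum-only (flsin 1.0))
```

All three compute the same value for this input; the difference lies in which argument types they accept and how much type dispatching they have to do.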
Although the names %sin
etc. are undocumented and not meant to be
directly used, they were visible by default, so some existing code
relies on them. It takes some effort to rewrite all occurrences
of such functions with the new real-sin
etc, so we provide
a compatibility module, compat.real-elementary-functions
. Just
using it in your code provides compatibility names. If you want
to make your code work on both 0.9.12 and 0.9.13, you can use
cond-expand
:
(cond-expand
 ((library compat.real-elementary-functions)
  (use compat.real-elementary-functions))
 (else))
Tags: 0.9.13, NumericFunctions
2023/09/28
Pretty print indentation
Yet another small thing that's good to have. You can now specify a base indentation for the pretty printer (ref:pprint). It is applied to the second and subsequent lines.
gosh> (pprint (make-list 100 'abc) :indent 20) (abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc) #<undef>
More precisely, when the pretty printer spills data to another line, it inserts "a newline + whitespace * indent", plus any indent computed by the pretty printer.
The benefit is easiest to see when the pretty printer is used inside
format
. When pretty printing is triggered by the ~:w
directive,
it sets the base indentation to the column where it starts printing. Hence
the entire pretty print is indented to align nicely:
gosh> (format #t "Long list: ~:w\n" (make-list 100 'abc)) Long list: (abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc) #<undef>
Since pretty printing is built into the core printer (pprint
is just
a simple interface to use), other output routines such as write
can
also use base indentation. You can set the indent
slot of a write-controls object.
gosh> (write (make-list 100 'abc) (make-write-controls :pretty #t :indent 20 :width 79)) (abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc abc)#<undef>
Tags: 0.9.13, pretty-printing