Gauche Devlog

2026/01/06

Extension package registry

We just renewed the Gauche homepage. The changes are mostly cosmetic, but one notable addition is the Extension Packages page.

It's been on our todo list for a very long time to create some system to track Gauche extension packages. It is trivial to create a site where users can put the info. What's not trivial is how to keep the info updated.

It's a burden to the user if we ask them to keep updating such info whenever they update their extension package.

If a user lists their website/email for the package but then moves away from Gauche development, and the site/email eventually becomes inactive and goes away, we don't know what to do with the entry; it would also be difficult for somebody else to take over the project.

Should anybody be able to update the package's info? Limiting it to the original authors becomes an issue if they go inactive and out of touch. Allowing it may cause a security issue if someone replaces the distribution URL with a malicious one.

To vet the users entering info, we need some mechanism of user registration and authentication, which adds another database to maintain.

These implications kept us from implementing an official mechanism to provide an extension package registry.


Things have changed in the last decade or so.

First, distributed VCSs and their hosting services have become the norm. Instead of having personal websites to serve extension package tarballs and documents, developers can put their repository on one of those services and make it public.

Second, recent Gauche provides a standard framework for building extensions. One important aspect of it is package.scm in the source tree, which keeps meta information about the package, including version number, authors, "official" repository URL, dependencies, etc.

So, once we know the package's repository URL, we can get its meta information!

The author updates package.scm as the development proceeds, because it is a part of the source. No need to update information on the registry separately.

Anybody can create an account on those services, but the service gives each account a certain identity and a place to interact with others. Sure, people move away eventually, but they rarely bother to remove their repositories, and it's easy to inherit an abandoned project.

We already have an official way to state such a transfer of control in package.scm (the superseded-by slot). If the successor can contact the original author/maintainer/committer, the package.scm in the original repository can have a superseded-by field pointing to the new repository. It is not mandatory, but it makes clear where the "official" successor is.

In other words, we can use the existing public repositories as the metadata database, and merely maintain pointers to them by ourselves.
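
To make the pointer idea concrete, here is a minimal sketch in Python of how a batch job might follow superseded-by pointers once the metadata has been pulled. The dict layout, the example.org URLs, and the resolve_successor name are all assumptions made for illustration; this is not how the actual registry code is written.

```python
# Hypothetical metadata, as if already extracted from each repository's
# package.scm. The URLs and the dict layout are illustrative assumptions.
METADATA = {
    "https://example.org/alice/foo": {
        "version": "1.0",
        "superseded-by": "https://example.org/bob/foo",
    },
    "https://example.org/bob/foo": {
        "version": "2.1",
        "superseded-by": None,
    },
}


def resolve_successor(url, metadata):
    """Follow superseded-by pointers until reaching a package without one."""
    seen = set()
    while True:
        if url in seen:
            raise ValueError(f"superseded-by cycle at {url}")
        seen.add(url)
        entry = metadata.get(url)
        # Stop at unknown repositories and at packages with no successor.
        if entry is None or not entry.get("superseded-by"):
            return url
        url = entry["superseded-by"]


print(resolve_successor("https://example.org/alice/foo", METADATA))
```

The registry itself only needs the starting URLs; everything else in this sketch comes from the repositories.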


So, how to manage those pointers? We don't have thousands of extension packages updated daily, so we don't need a sophisticated database server for it.

We decided to piggyback on the public DVCS service again. The Gauche package repository index repo on GitHub maintains the list of package URLs under its packages directory. If you want your packages to be listed, just fork it, add your package, and send a pull request. (If you don't want to use GitHub, just send a patch via email.)

Which repository was added when, and at whose request, is recorded in the commits of that repo.

Currently, pulling metadata and reflecting it on the webpage is done by an occasional batch process. We'll adjust the frequency as we go. If we ever get very popular and receive tons of new package registration requests, we might need to upgrade the system, but until then, this will be the solution with the least maintenance cost.


To be in the registry, your extension package needs package.scm. I scanned through the existing list on the wiki (WiLiKi:Gauche:Packages) and added the packages for which I could find a public repository containing package.scm.

If your extension is old enough not to have package.scm, a convenient way to create one is to run gauche-package populate in the top directory of the source tree. It gives you a template package.scm with some fields filled in with whatever information it can find.

Tag: Extensions

2025/12/11

Alternative external formats for arrays and complex numbers

Gauche now recognizes a few different external formats for arrays and complex numbers. For writing, the format can be selected through a <write-controls> object, or through print-mode in the REPL.

Array literals

Gauche has been supporting srfi:25 arrays, but the SRFI does not define an external representation. Gauche has used the srfi:10 #, mechanism to write arrays in a form that can be read back, but it is not very user-friendly.

gosh$ (tabulate-array (shape 0 4 0 4) *)
#,(<array> (0 4 0 4) 0 0 0 0 0 1 2 3 0 2 4 6 0 3 6 9)

Now we have this:

gosh$ (tabulate-array (shape 0 4 0 4) *)
#2a((0 0 0 0)
    (0 1 2 3)
    (0 2 4 6)
    (0 3 6 9))

The #2a(...) notation is defined in srfi:163 Enhanced Array Literals, and in its simplest form, it is also compatible with Common Lisp's array literals. From 0.9.16, it is the default output format for arrays.

You can also make Gauche report the length of each dimension:

gosh$ ,pm array dimensions
Current print mode:
       length :  50         pretty :  #t     bytestring :  #f
        level :  10           base :  10          array : dimensions
        width :  80   radix-prefix :  #f        complex : rectangular
string-length : 256  exact-decimal :  #f
gosh$ (tabulate-array (shape 0 4 0 4) *)
#2a:4:4((0 0 0 0)
        (0 1 2 3)
        (0 2 4 6)
        (0 3 6 9))

The reader recognizes all of those formats.
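
As a rough illustration of how the two notations relate to the underlying data, the following Python sketch renders a row-major list with given dimension lengths in both forms. The array_literal helper is made up for this example and only handles zero-based arrays; it is not Gauche's printer, and it does not pretty-print across lines.

```python
from math import prod


def array_literal(dims, data, show_dims=False):
    """Render row-major `data` with dimension lengths `dims` as #Na(...) text."""

    def render(level, offset):
        if level == len(dims):
            return str(data[offset])
        # Number of elements spanned by one step along this dimension.
        stride = prod(dims[level + 1:])
        parts = [render(level + 1, offset + i * stride)
                 for i in range(dims[level])]
        return "(" + " ".join(parts) + ")"

    prefix = f"#{len(dims)}a"
    if show_dims:
        # Append :N for each dimension, as in #2a:4:4(...)
        prefix += "".join(f":{n}" for n in dims)
    return prefix + render(0, 0)


# Same contents as (tabulate-array (shape 0 4 0 4) *)
table = [i * j for i in range(4) for j in range(4)]
print(array_literal([4, 4], table))
print(array_literal([4, 4], table, show_dims=True))
```

The rank prefix and the optional dimension lengths are the only difference between the two outputs; the nested element lists are identical.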

Complex literals

There was an asymmetry in the input/output of complex literals. For reading, both the rectangular notation 1.4142135623730951+1.4142135623730951i and the polar notation 2@0.7853981633974483 were recognized, but printing always used the rectangular notation. Now you can choose the output format.

Gauche also extended the polar notation with a pi suffix, e.g. 2@0.25pi, to specify the phase as a multiple of pi.

The following session shows how a complex number is printed with different print modes:

gosh> (expt -16 1/4)
1.4142135623730951+1.4142135623730951i
gosh> ,pm complex polar
Current print mode:
       length :  50           base :  10  exact-decimal :  #f
        level :  10   radix-prefix :  #f          array : compact
       pretty :  #t  string-length : 256        complex : polar
        width :  79     bytestring :  #f
gosh> (expt -16 1/4)
2.0@0.7853981633974483
gosh> ,pm complex polar-pi
Current print mode:
       length :  50           base :  10  exact-decimal :  #f
        level :  10   radix-prefix :  #f          array : compact
       pretty :  #t  string-length : 256        complex : polar-pi
        width :  79     bytestring :  #f
gosh> (expt -16 1/4)
2.0@0.25pi

Furthermore, Gauche also supports the Common Lisp-style complex notation, #c(...). This is particularly useful for exchanging data between Gauche and CL programs.

gosh> ,pm complex vector
Current print mode:
       length :  50           base :  10  exact-decimal :  #f
        level :  10   radix-prefix :  #f          array : compact
       pretty :  #t  string-length : 256        complex : vector
        width :  79     bytestring :  #f
gosh> (expt -16 1/4)
#c(1.4142135623730951 1.4142135623730951)

The reader can read all the complex formats.
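
For readers who want to check the relationship between the notations numerically, here is a small Python sketch using the standard cmath module. The helper names from_polar_pi and to_polar_pi are made up for this example; they only mirror the arithmetic behind the polar-pi form, not Gauche's reader or printer.

```python
import cmath
import math


def from_polar_pi(magnitude, turns):
    """Value of magnitude@<turns>pi, i.e. the phase is turns * pi radians."""
    return cmath.rect(magnitude, turns * math.pi)


def to_polar_pi(z):
    """Return (magnitude, phase / pi) for a complex number z."""
    magnitude, phase = cmath.polar(z)
    return magnitude, phase / math.pi


z = from_polar_pi(2, 0.25)   # corresponds to 2@0.25pi
print(z)                     # both parts are close to the square root of 2
print(to_polar_pi(z))        # close to (2.0, 0.25), up to rounding
```

Because the conversion goes through floating-point trigonometry, the round trip is only accurate up to rounding error.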

Tags: 0.9.16, REPL, printer, array, complex

2025/04/13

Exact and repeating decimals

Novice programmers are often perplexed that most programming languages can't add 0.1 ten times "correctly":

s = 0
for i in range(10):
   s += 0.1
print(s)

# prints: 0.9999999999999999

"Floating point numbers are inexact, that's why," tells a tutor. "You should expect some errors."

Gauche isn't an exception, for decimal notation is read as inexact numbers:

gosh> (apply + (make-list 10 0.1))
0.9999999999999999

However, Scheme also has exact numbers. Numbers without a decimal point or exponent, or rational numbers, are read as exact numbers. You can also prefix decimal numbers with #e to make them exact. Using exact numbers, you can have an exact result.

gosh> (apply + (make-list 10 #e0.1))
1

The trick is that Gauche reads #e0.1 as the exact rational number 1/10, and performs the computation with exact rationals. This is revealed when the result is not a whole number:

gosh> (+ #e0.1 #e0.1)
1/5
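
The same contrast can be reproduced in Python, where the standard fractions module plays the role of Scheme's exact rationals. This is only an analogy to show the arithmetic; it says nothing about how Gauche implements it.

```python
from fractions import Fraction

# Binary floating point: 0.1 is not exactly representable.
inexact = sum([0.1] * 10)

# Exact rationals: Fraction("0.1") is exactly 1/10, like #e0.1.
exact = sum([Fraction("0.1")] * 10)

print(inexact)                              # 0.9999999999999999
print(exact)                                # 1
print(Fraction("0.1") + Fraction("0.1"))    # 1/5
```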

It is inconvenient, though, when you want to perform exact computation with decimal numbers, e.g. adding prices in dollars and cents. If you add $15.15 and $8.91, you want to see the result as 24.06 instead of 1203/50.

;; Inexact
gosh> (+ 15.15 8.91)
24.060000000000002

;; Exact
gosh> (+ #e15.15 #e8.91)
1203/50

So, we added a new REPL print mode, exact-decimal. If you set it to #t, Gauche tries to print exact non-integer results in decimal notation whenever possible.

gosh> ,pm exact-decimal #t
Current print mode:
        length :  50
         level :  10
        pretty :  #t
         width :  79
          base :  10
         radix :  #f
 string-length : 256
    bytestring :  #f
 exact-decimal :  #t

Let's see:

gosh> (+ #e15.15 #e8.91)
#e24.06

We can always have an exact decimal notation for rational numbers whose denominator has no prime factors other than 2 and 5.

gosh> 1/65536
#e0.0000152587890625

As long as we use only addition, subtraction, and multiplication on numbers written in exact decimal notation, the result is always representable in exact decimal notation.
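
The condition on the denominator can be checked mechanically. The following Python sketch tests whether a rational has a finite decimal expansion and, if so, renders it with the #e prefix. The function names are made up for this example, and the algorithm is just one straightforward way to do it, not necessarily what Gauche does internally.

```python
from fractions import Fraction


def has_finite_decimal(q):
    """True if the reduced denominator has no prime factors besides 2 and 5."""
    d = Fraction(q).denominator
    for p in (2, 5):
        while d % p == 0:
            d //= p
    return d == 1


def exact_decimal(q):
    """Render q in #e decimal notation; q must have a finite expansion."""
    q = Fraction(q)
    if not has_finite_decimal(q):
        raise ValueError(f"{q} has no finite decimal expansion")
    if q.denominator == 1:
        return str(q.numerator)
    # Find the smallest k such that the denominator divides 10**k.
    k = 1
    while 10 ** k % q.denominator != 0:
        k += 1
    scaled = abs(q.numerator) * (10 ** k // q.denominator)
    digits = str(scaled).rjust(k + 1, "0")
    sign = "-" if q < 0 else ""
    return f"#e{sign}{digits[:-k]}.{digits[-k:]}"


print(exact_decimal(Fraction("15.15") + Fraction("8.91")))  # #e24.06
print(exact_decimal(Fraction(1, 65536)))                    # #e0.0000152587890625
```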

But what if division is involved? Isn't it a shame that we have an exact value (as a rational number), but can't print it as a decimal exactly?

The decimal notation of a rational number whose denominator contains factors other than 2 and 5 is a repeating decimal. Hence if we have a notation for repeating decimals, we can cover such cases.

So, here it is. If a numeric literal contains # followed by one or more digits, we interpret the digits after # as repeating infinitely.

gosh> 0.#3
0.3333333333333333
gosh> 0.0#123
0.012312312312312312
gosh> 0.#5
0.5555555555555556
gosh> 0.1#9
0.2

(Note: If no digits follow #, it is the "insignificant digit" notation in R5RS.)

The above examples have a limited number of digits because they're inexact numbers (note that we didn't prefix them with #e). For exact numbers, we can represent any rational number exactly with this notation:

gosh> 1/3
#e0.#3
gosh> 1/7
#e0.#142857
gosh> (* 1/7 2)
#e0.#285714
gosh> (* #e0.#3 #e0.#142857)
#e0.#047619

Note that the length of the repetition can be arbitrarily long, so there are numbers that can't practically be printed in this notation. For the time being, we have a hard limit of 1024 digits for the length of the repetition. If the result exceeds this limit, we fall back to rational notation.

;; 1/2063 has a repeating cycle of 2062 digits
gosh> (/ 1 2063)
1/2063
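
The repeating part can be found by ordinary long division: once a remainder recurs, the digits produced since its first occurrence form the cycle. Below is a Python sketch of that idea which emits the # notation and falls back to rational notation when the cycle is too long. The repeating_decimal name and the max_cycle parameter are assumptions made for this example; Gauche's actual implementation may work differently.

```python
from fractions import Fraction


def repeating_decimal(q, max_cycle=1024):
    """Render an exact rational in #e notation, using # for the cycle."""
    q = Fraction(q)
    sign = "-" if q < 0 else ""
    integer, rem = divmod(abs(q.numerator), q.denominator)
    digits = []
    seen = {}  # remainder -> index in digits where it first appeared
    while rem != 0 and rem not in seen:
        seen[rem] = len(digits)
        digit, rem = divmod(rem * 10, q.denominator)
        digits.append(str(digit))
    if rem == 0:
        # The expansion terminates; no repeating part.
        frac = "".join(digits)
        return f"#e{sign}{integer}.{frac}" if frac else f"{sign}{integer}"
    start = seen[rem]
    if len(digits) - start > max_cycle:
        return str(q)  # cycle too long: fall back to rational notation
    prefix = "".join(digits[:start])
    cycle = "".join(digits[start:])
    return f"#e{sign}{integer}.{prefix}#{cycle}"


print(repeating_decimal(Fraction(1, 3)))                    # #e0.#3
print(repeating_decimal(Fraction(1, 7)))                    # #e0.#142857
print(repeating_decimal(Fraction(1, 3) * Fraction(1, 7)))   # #e0.#047619
print(repeating_decimal(Fraction(1, 2063)))                 # 1/2063
```

The number of distinct nonzero remainders is bounded by the denominator, so the loop always terminates.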

Tags: Numbers, Syntax

2024/06/28

Running prebuilt Gauche on GitHub workflow

The setup-gauche action installs Gauche on GitHub workflow runners for you (Using Gauche in GitHub Actions). However, it used to download the source tarball and compile it, which took time. Especially if your repo is a small library, compiling Gauche every time you push to the repo feels like a waste of time.

Now, setup-gauche can use a prebuilt binary on ubuntu-latest and macos-latest platforms. Just give prebuilt-binary: true as the parameter:

name: Build and test

on: [push, pull_request]

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    steps:
    - uses: actions/checkout@v3
    - uses: practical-scheme/setup-gauche@v5
      with:
        prebuilt-binary: true
    - name: Install dependencies
      run: |
        sudo apt install -y gettext
    - name: Build and check
      run: |
        ./configure
        make
        make -s check

Installing the prebuilt binary takes around 10 seconds, which is a huge time saving.

Note that the prebuilt binary is provided only for the latest Gauche release. Other parameters of setup-gauche are ignored if you use the prebuilt binary.

(You may have noticed that the repository now lives under practical-scheme instead of shirok. I created the practical-scheme organization and am gradually moving Gauche repositories there for easier maintenance. URLs under shirok are redirected, so you don't need to update immediately; this is just FYI.)


The following is for those who are curious about what happens behind the scenes.

Prebuilt binaries are prepared in a different repository: https://github.com/practical-scheme/setup-gauche-binary

It has a GitHub Actions workflow that fetches the latest release tarball, builds it on a GitHub runner, and uploads the result as an asset of the repo's release. That ensures the binary runs on GitHub runners.

Tags: github, CI

2024/01/17

Caching formatter procedure

Lisp's format procedure is very un-Schemy. Instead of having a set of composable, orthogonal, do-one-thing-well procedures, format introduces a mini-language that's syntactically and semantically separate from the base language. It is not extensible, and it is loaded with obscure features from the past. Yet it is handy for typical trivial tasks, and that's why Gauche (and other Schemes, plus a couple of SRFIs) offers it. (And to be honest, there's some pleasure in tinkering with such mini-language implementations.)

Aside from the non-composability, another glaring drawback of format is that it needs to interpret the mini-language (the format string) at runtime. Most format calls have a literal format string, and it is a waste of time to parse it every time format is called. An obvious optimization is to recognize the literal format string and translate the call to format into calls to simpler procedures at compile time. I believe most CL implementations do so.

However, Gauche, as well as some other Scheme implementations and SRFI-48, allows the port argument to be omitted. It is convenient, but it makes compile-time transformation difficult. If the first argument of format is a non-literal expression (as is the case when you pass a port), it is difficult for the compiler to tell whether the format string is a constant, even if the second argument is a literal string that looks like a format string. If the first expression yields a string at runtime, that string is the format string, and the literal string is an argument to be shown.

Despite these difficulties, we can still take advantage of literal format strings by caching the result of compiling the format string at run-time.

It is not exactly the same as memoization. With memoization it is difficult to control the amount of memoized results, and we only want to cache literal format strings, which need to be identified at compile time.

So, we implemented a hybrid solution. The compiler macro attached to format checks whether a possible format string is a literal, and if so, it transforms the call into a call to an internal procedure that takes an extra argument. The extra argument contains the position of the possible literal format string, and a mutable box. The following is the core part of the compile-time transformation:

(define-syntax make-format-transformer
  (er-macro-transformer
   (^[f r c]
     (match f
       [(_ shared?)
        (quasirename r
          `(er-macro-transformer
            (^[f r c]
              (define (context-literal pos) `(,',shared? ,pos ,(box #f)))
              (match f
                [(_ (? string?) . _)
                 (quasirename r
                   `(format-internal ',(context-literal 0) (list ,@(cdr f))))]
                [(_ _ (? string?) . _)
                 (quasirename r
                   `(format-internal ',(context-literal 1) (list ,@(cdr f))))]
                [(_ _ _ (? string?) . _)
                 (quasirename r
                   `(format-internal ',(context-literal 2) (list ,@(cdr f))))]
                [_ f]))))]))))

(NB: The shared? flag lets format and format/ss share the routine. We need to check for the literal string in the first, second, and third positions, because Gauche's format allows two optional arguments before the format string.)

At run-time, the internal function can see if the literal string is indeed a format string. If so, it computes a formatter procedure based on the format string, and stores it in the mutable box. Subsequent calls will use the computed formatter procedure, skipping parsing and compiling the format string. The caching occurs per call site, much like global variable lookup (we cache the <gloc> object, the result of the lookup, in the code vector).

The format-internal procedure checks optional arguments, and calls format-2. Its first argument can be a mutable box introduced by the above macro, if we do know the format string is literal.

(define (format-2 formatter-cache shared? out control fmtstr args)
  (let1 formatter (if formatter-cache
                    (or (unbox formatter-cache)
                        (rlet1 f (formatter-compile fmtstr)
                          (set-box! formatter-cache f)))
                    (formatter-compile fmtstr))
    (case out
      [(#t)
       (call-formatter shared? #t formatter (current-output-port) control args)]
      [(#f) (let1 out (open-output-string)
              (call-formatter shared? #f formatter out control args)
              (get-output-string out))]
      [else (call-formatter shared? #t formatter out control args)])))

A micro benchmark shows it's effective. In real code, the effect may not be so prominent, but it does remove the worry that you're wasting time parsing format strings.

(define (run p)
  (dotimes [n 1000000]
    (format p "n=~7d 1/n=~8,6f\n" n (/. n))))

(define (main _)
  (time (call-with-output-file "/dev/null" run))
  0)

With caching off:

;(time (call-with-output-file "/dev/null" run))
; real  19.796
; user  19.790
; sys    0.000

With caching on:

;(time (call-with-output-file "/dev/null" run))
; real  10.313
; user  10.310
; sys    0.000

Tag: format
