2011/02/03
Bitten by floating point numbers again
When I read about PHP hanging while parsing 2.2250738585072011e-308, I did what most language implementers must have done: typed the number into the Gauche prompt. Gauche wasn't vulnerable to this number. Good.
It was reported that the PHP problem came from using Intel's 80-bit extended precision arithmetic where IEEE double precision arithmetic should have been used. Like PHP, Gauche refers to Clinger's paper (PS), but ours is a rather naive implementation of Clinger's AlgorithmR using exact integer arithmetic, so extended-precision arithmetic is not involved at all.
So when I heard, shortly after, that Java hangs while reading 2.2250738585072012e-308, I didn't bother checking. Ah, the same problem again.
Well, it wasn't.
Jens Thiele reported that Gauche had the same issue as Java.
gosh> 2.2250738585072012e-308 ;; => hangs
Let's see what is happening. For simplicity we only consider positive numbers, but the discussion applies to negative numbers as well by reversing the comparisons (e.g. less than becomes greater than).
The floating point number reader must map the input real number to the closest IEEE double precision number ("IEEE double" from now on). In most cases, each IEEE double covers the region of real numbers from 1/2 LSB below to 1/2 LSB above the exact value it represents.
If the input falls on the exact center of two IEEE doubles, we use the round-to-even rule.
There are exceptions at the numbers 2^m. The adjacent IEEE double below 2^m is closer than the one above, since the exponent switches at this boundary and the numbers below it have greater precision. An IEEE double on such a boundary covers from only 1/4 LSB below to 1/2 LSB above the exact value it represents. Clinger's paper, and thus Gauche, correctly accounts for this boundary case.
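For instance, around 1.0 (= 2^0) the asymmetry can be checked at the gosh prompt; this is just an illustrative sketch, using exact->inexact of exact powers of two to get the doubles in question.
;; The next representable double above 1.0 is a full LSB (2^-52) away,
;; while the next one below is only half of that (2^-53) away.
(+ 1.0 (exact->inexact (expt 2 -52)))   ;; => 1.0000000000000002
(- 1.0 (exact->inexact (expt 2 -53)))   ;; => 0.9999999999999999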
However, there is one exception among these exceptions: the minimum normalized number, 2^(-1022). In this case the adjacent IEEE double below is exactly as far away as the one above, since there is no extra precision below it anymore.
Our implementation of AlgorithmR missed this condition, which created a gap in the coverage.
The number 2.2250738585072012e-308 falls into this gap. When the algorithm refines its approximation, it decides the input is too small to round up to 2^(-1022), so it decreases the estimate; then it finds the input is too large to round down, so it increases the estimate again. Repeat ad infinitum.
This anomaly occurs only here, since no denormalized numbers satisfy our boundary condition check for the 2^m cases.
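The values involved can be checked the same way (again an illustrative sketch; the decimal printouts are the shortest round-trip representations of the doubles):
;; Both neighbors of the smallest normalized double are a full LSB
;; (2^-1074) away, so there is no narrower coverage on the lower side here.
(exact->inexact (expt 2 -1022))                     ;; => 2.2250738585072014e-308
(exact->inexact (- (expt 2 -1022) (expt 2 -1074)))  ;; => 2.225073858507201e-308
;; 2.2250738585072012e-308 lies between these two and should simply round up
;; to 2^-1022; treating it as a 2^m boundary case is what caused the loop.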
The fix is just one line.
--- src/number.c    (revision 7350)
+++ src/number.c    (working copy)
@@ -3543,6 +3543,7 @@
     case -1: /* d2 < y */
         if (Scm_NumCmp(m, SCM_2_52) == 0
             && sign_d < 0
+            && k > -1074
             && Scm_NumCmp(Scm_Ash(d2, 1), y) > 0) {
             goto prevfloat;
         } else {
Tag: Flonums
2011/01/07
Queue of zero length
Does it make sense to have a queue whose maximum length is zero?
I thought it didn't, so when I wrote <mtqueue> (ref:util.queue) I defined the range of mtqueue-max-length to be one or more. (Zero means the queue has unlimited length, which is a bit kludgy in retrospect.) A queue with zero max length would always be both empty and full; it seemed it would be of no use.
Now it occurs to me that it may actually be useful as a synchronization device.
The multi-thread queue <mtqueue> can be used to synchronize consumer threads and producer threads. A consumer thread blocks on dequeue/wait! when there's no data in the queue, and unblocks when some data is put in the queue. A producer thread blocks on enqueue/wait! when the queue is full, and unblocks when data in the queue is consumed and there is some room again. So, an <mtqueue> with max-length == 1 can be used as a synchronizing variable, like MVar in Haskell.
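Here is a minimal sketch of that usage, assuming the documented util.queue and gauche.threads APIs (make-mtqueue, enqueue/wait!, dequeue/wait!, make-thread, thread-start!, thread-join!); names like mv are made up for illustration.
(use gauche.threads)
(use util.queue)

(define mv (make-mtqueue :max-length 1))   ;; a one-place, MVar-like slot

;; The producer blocks in enqueue/wait! if the single slot is already occupied.
(define producer
  (thread-start! (make-thread (lambda () (enqueue/wait! mv 'hello)))))

;; The consumer blocks in dequeue/wait! until a value becomes available.
(define consumer
  (thread-start! (make-thread (lambda () (dequeue/wait! mv)))))

(thread-join! producer)
(display (thread-join! consumer))   ;; prints hello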
If we had an <mtqueue> with max-length == 0, how would it work? A possible behavior would be as follows:
- A consumer thread would block on dequeue/wait! unless there's a producer thread waiting on enqueue/wait!.
- A producer thread would block on enqueue/wait! unless there's a consumer thread waiting on dequeue/wait!.
That is, it would allow passing data from the producer to the consumer directly, without putting the data into a buffer.
This can be useful. With a buffered queue, once a piece of data is put in the queue it just sits there until a consumer takes it; if something goes wrong in the consumer threads, say all of them are stuck in an extremely long computation, the queued data is held indefinitely. If you want to guarantee that a piece of data is either processed in a timely manner or times out, you need separate logic to watch the queue.
With a zero-length queue, the producer itself can set a timeout, so it is easy to time out when the consumers are busy.
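For illustration, a hypothetical sketch of how that could look; the :max-length 0 behavior is exactly what is being proposed here and does not exist in the current implementation, while the optional timeout arguments of enqueue/wait! are the existing util.queue ones.
;; Hypothetical: a zero-length queue used as a rendezvous point.
(define rendezvous (make-mtqueue :max-length 0))   ;; not supported as of 0.9.1

;; Hand the job directly to a waiting consumer; give up after 1 second
;; and return #f if no consumer shows up in time.
(enqueue/wait! rendezvous 'job 1.0 #f)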
This is a somewhat special interpretation of the behavior of <mtqueue>. Under the simple definition (enqueue/wait! blocks when the queue is full, and dequeue/wait! blocks when the queue is empty), the straightforward interpretation of a zero-length queue is that both enqueue and dequeue always block and never do anything useful. So it is arguable that we should provide a different data structure for this non-buffering synchronization.
Besides, the current implementation interprets zero max-length as "unlimited length"; supporting zero-length queues would be an incompatible change.
I'm still undecided, but for now I feel that non-buffering synchronization is a natural extension of the behavior of <mtqueue> with zero max-length, and that it is better to have it than to add a separate synchronization primitive. Since <mtqueue> was just introduced in 0.9.1, it may not be too late to change it.
However, I might be overlooking something, and I'm open to discussion. I'm also curious whether other implementations/languages have such a non-buffering synchronization primitive.
(Added on 2011/01/08 01:17:47 UTC): Rui pointed me to Java's SynchronousQueue. It is probably the same as what I expect from a zero-length mtqueue. In Java it appears to be the tradition to separate classes when the implementations differ, but in Gauche I think it's more natural to implement it as an extension of the existing <mtqueue>.
Tags: util.queue, mtqueue, threads
2010/12/16
Looking for alternative read-time constructor syntax
I rely on the srfi:10 read-time constructor #,(tag datum ...) a lot. It does have a dark corner (there's no clear semantics to ensure that a particular tag is available at the time it is read), but having a uniform syntax and a standard way to extend it is indispensable for practical applications.
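For example, here is a minimal sketch of defining and using a tag in Gauche; the tag name point and its record definition are made up for illustration, and define-reader-ctor is Gauche's hook for srfi-10 tags.
(use gauche.record)

;; A record type and a read-time constructor for it.
(define-record-type point #t #t x y)
(define-reader-ctor 'point (lambda (x y) (make-point x y)))

;; The reader now builds the object at read time:
#,(point 1 2)   ;; => a point record with x = 1 and y = 2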
So it is very unfortunate that R6RS made an incompatible choice: #,X is taken as (unsyntax X).
Although I don't plan to make Gauche fully conform to R6RS, I'd like to make it as compatible with R6RS as possible, and it is desirable to be able to read R6RS code. The plan is to switch the reader: in R6RS mode, #, is for unsyntax; otherwise, #, is for srfi-10. After all, you can write unsyntax without the abbreviation, but you cannot write a read-time constructor any other way.
However, there will be times when someone wants to write abbreviated unsyntax and read-time constructors in the same file. It won't hurt to have an alternative read-time constructor syntax for more flexibility.
Specifically, I'm thinking of making records (ref:gauche.record) printable by default, just like Common Lisp's structs, but using the existing srfi-10 syntax instead of inventing a new one. If records have a standard external representation, I expect the srfi-10 syntax to appear in data and code a lot more frequently. If Gauche adopts syntax-case, the demand for abbreviated unsyntax will also grow. I see a potential conflict here.
★ ★ ★
What would be a good choice for alternative syntax of read-time constructor? I don't have a concrete idea yet. I just record some ideas here for future reference and discussion.
- #.(tag datum ...) : Borrows from the read-time eval syntax of Common Lisp. I bet the chance that the Scheme standard adopts read-time evaluation is a lot smaller than the chance it adopts read-time constructors: the former opens a big can of worms about which environment the expression should be evaluated in. The similarity of read-time evaluation and read-time construction, however, could lead to more confusion than the other choices.
- #!ctor(tag datum ...) : The word ctor could be something else. This is valid syntax in R6RS, in which the #!ctor part is just treated as a comment and the whole expression is read as a list. I'm not sure whether that is a good thing or not, though. It is also more verbose than the other choices.
- #!(tag datum ...) : Some implementations (and past RnRS's) use #! as a prefix for special data, e.g. #!null. This choice can be seen as an extension of that. A disadvantage: if this appears at the top of a file, it can be mistaken for an interpreter line.
- #@(tag datum ...) : The character @ is kind of arbitrary. ChezScheme uses this prefix for fasl objects, which gives me a sort of "internal representation" feeling. Maybe too arbitrary.
- #$(name datum ...) : I think of this more as a dedicated syntax for records. It looks like Common Lisp's #S(...), and it would be more compact than #,(record name datum ...). Chicken uses #$ for a special purpose, so we would conflict with it.
Tags: R6RS, srfi-10, syntax, gauche.record
2010/12/13
Release and test
Whew, I finally put 0.9.1 out the door. There are several loose ends I couldn't manage to tie; for example, the new interface of rfc.http wasn't made public since I couldn't test it enough. We still had enough stuff to fill long release notes, though.
The bottleneck of the release cycle is testing. Yes, we have unit tests, but they mostly test internal consistency: that whatever you do inside Gauche, you won't get wrong results.
The kind of testing done before a release focuses on external consistency, that is, how Gauche interacts with the external world. Does it behave the same on different OSes? Does it work consistently when combined with other programs and when communicating over the network?
This time I happened to find a bug in the gauche-config program right after the release. The bug affected where extension packages are installed, and fixing things up after many extension packages had been placed in the wrong directory would be messy, so I replaced the release with the fixed one.
How can I prevent problems like this, and make sure to check other things that interact with the outside world? I had added some unit tests for the utility scripts gauche-install, gauche-config and gauche-package, but they were not enough to catch this error.
One idea is to have a script that automates packaging, installing, and checking integration with the external world. It should be automated, since release testing is taking longer and longer as more external programs interact with Gauche. I'm curious how other projects manage their release testing.
2010/12/11
New directory structure
Until 0.9, you had to recompile Gauche extensions for every new release of Gauche, even if it was only a micro version up (e.g. 0.8.12 -> 0.8.13). That was because the C API could change between them.
After 1.0, our goal is to keep API & ABI compatibility at least at the minor version level, so that extension modules compiled for version X.Y.z1 will work with X.Y.z2 where z2 >= z1. The 0.9.x series is a run-through rehearsal for the stable 1.0 release, to see whether this scheme really works after 1.0.
At the 0.9 release, however, I overlooked one thing: the full Gauche version number (major.minor.micro) was embedded in the pathnames where architecture-dependent files are installed; extensions' DSOs go in /usr/lib/gauche/site/X.Y.Z/${arch}/ (if Gauche was installed with --prefix=/usr). This won't work if we want to share extensions among different micro versions.
An easy way would be to keep using "0.9" for the 'site' directory throughout the 0.9.x series. That is, while Gauche "core" files would still go to /usr/lib/gauche/X.Y.Z/${arch}/, extension modules would install their files to /usr/lib/gauche/site/X.Y/${arch}/. Version 0.9's structure already fits this scheme, so the transition would be smooth.
However, this scheme mixes the Gauche version (X.Y.Z) and the ABI version (X.Y) somewhat: in /usr/lib/gauche/0.9/, the "0.9" part would mean the Gauche version, while in /usr/lib/gauche/site/0.9/, it would mean the ABI version. It might not matter in practice, but I didn't quite like such a mixture.
And there came another issue: the Gauche runtime DSO (libgauche.so) is placed in the system's common library directory, such as /usr/lib. It is common to append version numbers to the name (e.g. libgauche.so.0 and libgauche.so.0.9) and make the versionless name a symlink to the versioned one. However, the common assumption in this scheme is that binary compatibility is kept within the same major version; if something is compiled with libgauche.so.1.0, it is expected to work with libgauche.so.1.1 as well.
At this moment I want to reserve the possibility of changing the ABI between Gauche versions 1.0 and 1.1.
So I decided to name the Gauche runtime DSO libgauche-X.Y.so, where X.Y is the ABI version. Then it is clear which version is compatible with which DSO. It also allows installing more than one version of Gauche with different ABI versions.
If we name the DSO libgauche-X.Y.so, then why not version the other runtime directories as well? That is, we can install stuff under /usr/lib/gauche-X.Y/ and /usr/share/gauche-X.Y/. Then it is clear which ABI version the files are for, and it is easier to manage multiple versions of Gauche with different ABI versions (e.g. it's easy to delete files for old versions).
So, from 0.9.1, I adopt the following naming scheme:
- Gauche runtime DSO
  ${exec_prefix}/libgauche-X.Y.so
  ${exec_prefix}/libgauche-X.Y.so.0      (soname)
  ${exec_prefix}/libgauche-X.Y.so.0.Z    (realname)
- Architecture dependent files
  ${exec_prefix}/gauche-X.Y/X.Y.Z/${arch}/*     ;; core's *.so
  ${exec_prefix}/gauche-X.Y/X.Y.Z/include/*     ;; core's *.h
  ${exec_prefix}/gauche-X.Y/site/${arch}/*      ;; extensions' *.so
  ${exec_prefix}/gauche-X.Y/site/include/*      ;; extensions' *.h
- Architecture independent files
  ${datadir}/gauche-X.Y/X.Y.Z/*    ;; core's *.scm
  ${datadir}/gauche-X.Y/site/*     ;; extensions' *.scm
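As a concrete illustration (my own expansion, assuming the usual autoconf mapping of --prefix=/usr to /usr/lib and /usr/share, and with ${arch} and the micro part Z left symbolic), Gauche 0.9.1 with ABI version 0.9 would place files roughly like this:
  /usr/lib/libgauche-0.9.so.0.Z               (realname)
  /usr/lib/libgauche-0.9.so.0                 (soname, symlink)
  /usr/lib/libgauche-0.9.so                   (link-time name, symlink)
  /usr/lib/gauche-0.9/0.9.1/${arch}/          ;; core's *.so
  /usr/lib/gauche-0.9/0.9.1/include/          ;; core's *.h
  /usr/lib/gauche-0.9/site/${arch}/           ;; extensions' *.so
  /usr/share/gauche-0.9/0.9.1/                ;; core's *.scm
  /usr/share/gauche-0.9/site/                 ;; extensions' *.scm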
During the 0.9.x period, the old versionless directories are automatically added to the library search path, and symlinks are created from libgauche.so etc. to the versioned DSOs, so the extension modules you've installed for Gauche 0.9 should keep working after you install 0.9.1.
(Note for those who are chasing the development trunk: if you've installed extensions for 0.9.1_pre1 or 0.9.1_pre2, the official 0.9.1 release may not be able to find them, since I tweaked the structure at the last minute. Sorry.)
Tags: 0.9.1, DirectoryStructure