2013/03/19
When the inexact square root of an integer is exact
In Exact sqrt entry, I wrote that, for a natural number smaller than 2^53, we could use double-precision floating point sqrt
and take the answer exact if the result was an integer. As Mark H Weaver pointed out, it was wrong.
Suppose we have nonnegative integers N, M, m, and non-zero real number e, where M = m^2 < 2^53, N = (m*(1+e))^2 < 2^53.
If the absolute value of e isn't greater than 2^-53,
square root of N, which is m*(1+e), can be rounded to
m in the double-precision sqrt
calcluation.
The result becomes an integer, but not exact. That's the case
we want to exclude.
N - M = (2e+e^2)M. The maximum bound of (2e+e^2) is (2^-52 + 2^-106) when e = 2^-53, so if M is greater than 2^52, N - M can exceed 1 and we might incorrectly recognize N as a square number of m.
The greatest square number below 2^52 is (2^26-1)^2 = 4503599493152769, and (2^-52 + 2^-106)*4503599493152769 is 81129635996755064984528658366465/81129638414606681695789005144064, which is smaller than 1.
2^52 = 4503599627370496 is also a square number, and
calculating (sqrt (inexact (+ (expt 2 52) 1)))
yields
a rounded integer 67108864.0
.
But the lower bound, (-(2^52) + 2^-106)*4503599627370496 is
-18014398509481983/18014398509481984, which is grater than -1,
so applying inexact sqrt on 2^52-1 won't lead us to the rounded
integer. (Indeed, (sqrt (inexact (- (expt 2 52) 1)))
yields
67108863.99999999
.)
So we can use inexact square root when the input is smaller than 2^52.
Tags: Flonums, sqrt, exact-integer-sqrt
2013/03/16
Checking scripts
I write small scripts in Gauche, all the time. They're not so 'serious' as to require unit-tests and configure scripts and other scaffolds, but they're not one-time throwaway scripts, either. So it's annoying to find out a silly bug like misspelling in less-executed code paths later.
Here's a command I run to quickly check such scripts.
gosh -ugauche.test -l script-name -E "test-module 'user" -Eexit
Tag: Script
2013/03/12
And here comes random data generators
I just checked in data.random
, a collection of random
data generators and their combinators. The names of API functions
are not yet fixed, but I think the overall it's in a good shape.
(Since 0.9.4 is overdue, I might be going to release it without
making data.random
official. I'm not sure yet.)
Here's the code: http://gauche.git.sourceforge.net/git/gitweb.cgi?p=gauche/Gauche;a=blob;f=lib/data/random.scm;hb=HEAD
It provides a bunch of primitive random generators such as followings.
- uniform distribution
(integer size :optional (start 0))
returns a generator that produces random integer between start and start+size-1, uniformly.(integer-between lo hi)
returns a generator that produces random integer between lo and hi (both inclusive).int8
,uint8
etc. are preset generators to produce the range their name suggest.(char :optional cset)
returns a generator of random characters from a character set. When omitted, we use#[A-Za-z0-9]
as the default character set.- We also have
boolean
,real
,real-between
. - We want to have exact rational generators and complex generators, but I wonder how the range and distribution should be specified.
- nonuniform distribution
- For discrete sampling, we have geometric and poisson distribution.
- For continuous sampling, we have normal and exponential distribution.
Then, those generators can be combined to make more complex generators.
- random choice
(one-of generators)
returns a generator that picks one generator in generators randomly to produce the next value.(weighted-sample weight&generators)
allows you to specify weight of selection probability for each generators.
- aggregate data
(pair-of gen1 gen2)
,(tuple-of gen ...)
list-of
,vector-of
,string-of
- these combinators can be called in two different forms, e.g.(list-of sizer item-gen)
: sizer can be an integer, or an integer-generator, to give the length of the resulting list. item-gen is a generator to produce elements.(list-of item-gen)
: If sizer is omitted, we use some default generator to determine the length of the resulting list. Currently I use(poisson 4)
provisionally.
I also have permutation-of
and combination-of
, which takes
a list of items (not item generators).
What I like about the current shape is that those generators can be
combined using gauche.generator
framework as well; e.g. you
can have series of sum of two dice rolling by:
(gmap + (integer-between 1 6) (integer-between 1 6))
or apply a filter:
(gfilter (cut < 0 <> 1) (exponential 1))
or taking some values into a list:
(generator->list (poisson 5) 10)
Here are some elements about API I'm still pondering about:
- We have procedures that creates a generator (e.g.
integer
,real
,char
) and pre-created generators (e.g.fixnum
,int8
). Without the static typing support, this kind of layers could be confusing. Shall we use some naming convention to distinguish these two layers? - There's an idea rolling in my head to provide plural names as an
alias, e.g.
chars
forchar
. It plays nicely with the combinators, e.g.(list-of fixnums)
or(string-of 5 (chars))
. But I also feel this is just a superficial convenience; we double the number of exported names to get nothing added functionally. - The handling of omitted argument of
list-of
etc. is also different from Gauche's convention of optional arugments.
If you have data generator ideas to be thrown in to this module, let me know.
Now I'm writing a generative test framework, using this module as a data generators.
Tags: data.random, Generators
2013/03/11
Math fun
Recently I added a few math functions in the core of Gauche, for they were needed to implement some more domain-specific functions.
Just committed are gamma
and lgamma
,
Gamma function and natural logarithm of
absolute value of it.
(What prompted me to add them was the random variable of
Poisson distribution; the algorithm I found needed log(n!)
.
And the reason I was doing Poisson distribution thing was
that I was writing a general random data generator module and
I wanted to cover typical distributions. And the reason that
I was writing such a module was that they were needed for
generative test framework. It's like a domino effect; I'm writing
X, then finds X needs Y, then Y needs Z, etc...)
Some time ago I also added expt-mod
, that calculates modulus
exponentiation.
I used to be worried for adding too much stuff to the core library, and tried to put functions together into a module whenever I could attach a suitable name for the bunch. However, some functions are just fundamental enough that appear in different fields now and then, and the core library may be the suitable place for them, after all.
2013/02/23
Land of Lisp on Gauche
I've been working on translating Conrad Barski's "Land of Lisp" to Japanese, and it finally hit the bookstores in Japan (published from O'Reilly Japan). To celebrate that, I ported the code in the book to Gauche.
Since Gauche adopts quite a few Common Lisp idioms, the port is pretty straightforward. The code may serve as an example to show how to translate CL-style code into Gauche.
Here are some observations.
- Most of CL's
loop
use cases can be written using srfi:42.- The trick is that, in CL, you think "OK, I loop over this (list, vector,
integer, etc.) and I want to get this (list, sum, etc.)", while
in srfi-42, you first think what you want (ok, I want list so I use
list-ec
/ ok, I want to stop iteration when any of the element satisfies the condition, so I useany?-ec
), then you think about what to iterate.
- The trick is that, in CL, you think "OK, I loop over this (list, vector,
integer, etc.) and I want to get this (list, sum, etc.)", while
in srfi-42, you first think what you want (ok, I want list so I use
- CL has many idioms that take advantage of () being boolean false, and
car
/cdr
ofnil
is stillnil
. It is not a good idea to carry over that idiom to Scheme, or you'll be frustrated feeling that Scheme's distinction of#f
and()
gets in your way. It just needs a different mindset (complaining about it is like a C programmer complaining 0 isn't false in some languages).- For example, in CL, you can access to the value associated to the key
in alist by the following code, without worrying the case when key isn't
found:
(cdr (assoc key alist))
You'll be annoyed that in Scheme, assoc returns#f
when key isn't found, andcdr
barfs. In Gauche, recommended way is as follows:(assoc-ref alist key :optional default)
- Another common CL idiom is to build a list conditionally:
(append (list up down) (unless (zerop (mod pos *board-size*)) (list (1- up) (1- pos))) (unless (zerop (mod (1+ pos) *board-size*)) (list (1+ pos) (1+ down))))
This idiom uses the fact thatunless
returnsnil
when the condition is satisfied. In Gauche,unless
returns#<undef>
so this doesn't work. Instead of falling back to more verboseif
, you can usecond-list
.(cond-list [#t @ (list up down)] [(not (zero? (mod pos *board-size*))) @ (list (- up 1) (- pos 1))] [(not (zero? (mod (+ pos 1) *board-size*))) @ (list (+ pos 1) (+ down 1))])
- For example, in CL, you can access to the value associated to the key
in alist by the following code, without worrying the case when key isn't
found:
- The lazy evaluation stuff built in Chapter 18 (
lazy.lisp
) comes free in Gauche, asutil.stream
module. Thelazy.scm
file in Gauche version only shows the correspondence of two APIs. - I have a plan to make
format
CL-compatible. It's not done yet, so I had to rewrite some code using advanced formatting directives with the plain Scheme code. - The CL's
defstruct
code is translated todefine-record-type
straightforwardly, except that our record type can't set the default value when the constructor argument is omitted. (R6RS records can do that using protocol. I prefer easier way, though.)
Tag: CommonLisp
Comments (0)