# Random Numbers in C

## A Ciphers By Ritter Page

Random number generators (pseudo-random, of course) in C; mostly fast one-liners for "inline" use (to eliminate call / return overhead). If you just want to use these RNG's, it might be a good idea to start at the last message.

### Contents

• 1999-01-12 George Marsaglia: "This posting ends with 17 lines of C code that provide eight different in-line random number generators, six for random 32-bit integers and two for uniform reals in (0,1) and (-1,1)."
• 1999-01-12 Charles Bond: "Do you know if any theoretical work has been done since Knuth's book to justify SWB?"
• 1999-01-12 George Marsaglia: "It hasn't been very many years since I invented the subtract-with-borrow method, and developed theory for establishing the periods."
• 1999-01-12 Charles Bond: "I did not mean to imply that Knuth's subtractive generator was *the same* as your subtract with borrow, only that it was *similar* (high speed, no multiplications)."
• 1999-01-12 Mike Oliver: "Could you give us a pointer to information about why these RNGs [lagged Fibonacci generators using xor] are unsatisfactory and what sort of test they tend to fail?"
• 1999-01-13 Eric Backus: "I have a small problem with the definition of LFIB4 and SWB. In an attempt to make these a single line of C code, they both use "++c" in the same expression as they use "c". A C compiler is free to rearrange the order in which it calculates the intermediate terms of these expressions, so the expressions can produce different results depending on the compiler."
• 1999-01-15 George Marsaglia: "The generators MWC, KISS, LFIB4 and SWB seem to pass all tests. By themselves, CONG and SHR3 do not, but using CONG+SHR3 provides one of the fastest combinations that satisfy the DIEHARD battery of tests."
• 1999-01-15 Dmitri Zaykin: "Shouldn't these be...."
• 1999-01-01 Radford Neal: "This doesn't work either."
• 1999-01-15 Jeff Stout: "The preprocessor is just doing a stupid text substitution; it's the C compiler that is ambiguous about the interpretation."
• 1999-01-01 Radford Neal: "...although one can indeed create lots of junk using the C pre-processor, one cannot, in general, use it to create in-line procedures."
• 1999-01-16 Dmitri Zaykin: "It is probably true that the posted random number generators are better implemented as regular functions."
• 1999-01-16 Dmitri Zaykin: "These should work...."
• 1999-01-17 Ramin Sina: "Could someone please post a short example code on how to use these in practice."
• 1999-01-18 Ed Hook: "No -- 'SHR3' and 'LFIB4' are OK, but your version of 'SWB' still invokes the dreaded undefined behavior."
• 1999-01-19 Orjan Johansen: "Surely in an expression of the form y=..., the evaluation of ... is guaranteed to happen before the assignment."
• 1999-01-19 Horst Kraemer: "Sorry. No sequence point is needed between reading and writing to an lvalue."
• 1999-01-17 Herman Rubin: "I think it should be clarified, and probably written out in some more detail. But the procedure call overhead would be comparable to the computing cost...."
• 1999-01-17 Duncan Murdoch: "I don't know if that's true or not, but I don't think it is really something to worry about."
• 1999-01-18 Herman Rubin: "It certainly is. The cost of one procedure call could far exceed the cost of a uniform, or even many nonuniform, random variables."
• 1999-01-01 Radford Neal: "Even on a CISC machine like a VAX, I doubt the C procedure call overhead would be more than twice the time for the actual work, and on modern RISC machines the ratio will tend to be less."
• 1999-01-19 Herman Rubin: "The code discussed was not in making a call for an array, but for a single step in the generation procedure."
• 1999-01-19 Duncan Murdoch: "I'm not sure what you have in mind."
• 1999-01-19 Herman Rubin: "Can we afford range checking?" "Basic random tools should be as a bit stream, not as floating numbers, for many reasons."
• 1999-01-20 Duncan Murdoch: "...all languages that I know (including C) provide ways of forcing the order of evaluation: just put the things that need to be done first in an earlier statement."
• 1999-01-17 David W. Noon: "I don't mind translating all this stuff into assembler, but anybody not using an Intel 80486 or better running a 32-bit version of OS/2 will not be able to use my code."
• 1999-01-18 Dmitri Zaykin: "Another option is to re-write macros as C++ member functions."
• 1999-01-18 David W. Noon: "The problem with C++ is that VMT calling mechanisms are almost invariably slower than the calling mechanisms used by C, PL/I, FORTRAN, Pascal, etc."
• 1999-01-18 Dmitri Zaykin: "Well, in this particular case all that does not apply, since there are no virtual functions in the code."
• 1999-01-19 Duncan Murdoch: "...but just as with any other aspect of the program, speed isn't as important as correctness."
• 1999-01-20 Dmitri Zaykin: "To check if C-macros for these random number generators do indeed always produce faster, and maybe 'less bloated' code than inlined C++ member functions, I did a little experiment...."
• 1999-01-21 Dmitri Zaykin:
• 1999-01-21 John E. Davis: "Using your code (t.cc and t.c), my conclusion is the opposite...."
• 1999-01-21 Dmitri Zaykin: "I used egcs-1.1.1 compiler. It is an improvement over gcc-2.8.1."
• 1999-01-18 Mike McCarty:
• 1999-01-18 Herman Rubin: "Decent random number code cannot be portable."
• 1999-01-19 David W. Noon: "I meant portable across software platforms, not hardware."
• 1999-01-19 Herman Rubin: "It is by no means clear that HLL generated code will do this, as the machine instructions involved are different." "What was meant was the distance to the next one in a stream of random bits. In other words, generate a geometric (.5) random variable using only the number of bits required by information."
• 1999-01-19 David W. Noon: "C++ just isn't ideally suited to the task under current implementations of the language."
• 1999-01-20 Dmitri Zaykin: "Templates can be an alternative to C-macros in terms of genericity, not inlining."
• 1999-01-18 Mike McCarty: "There are no 'idiot-proof languages'."
• 1999-01-18 Herman Rubin: "A language has a built in inefficiency in every situation in which the optimal use of machine instructions cannot be addressed in that language."
• 1999-01-21 Mike McCarty: "The checking I had in mind costs *nothing* during the execution of the program."
• 1999-01-16 Dmitri Zaykin: "I see one more problem with the code."
• 1999-01-19 qscgz@my-dejanews.com: "DIEHARD-tests fail on...."
• 1999-01-20 George Marsaglia: "My offer of RNG's for C was an invitation to dance; I did not expect the Tarantella." "Numerous responses have led to improvements; the result is the listing below...."
```
Subject: Random numbers in C: Some suggestions.
Date: Tue, 12 Jan 1999 09:37:37 -0500
From: George Marsaglia <geo@stat.fsu.edu>
Message-ID: <369B5E30.65A55FD1@stat.fsu.edu>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis,sci.crypt,sci.physics
Lines: 243

This posting ends with  17  lines of
C code that provide eight different
in-line random number generators, six for
random 32-bit integers and two for uniform
reals in (0,1) and (-1,1).
Comments are interspersed with that
code. Various combinations of the six in-line
integer generators may be put in C expressions to
provide a wide variety of very fast, long period,
well-tested RNG's. I invite comments, feedback,
verifications and timings.

First, there is narrative giving background
for this posting; you may want to skip it.

Narrative:

Having had experience in producing and
testing for randomness in computers,
I am frequently asked to suggest good
random number generators (RNG's), test
RNG's, or comment on existing RNG's.  Many
recent queries have been in two areas:
(1) requests for implementations in C and
(2) comments on generators with immense periods,
particularly the Mersenne Twister.

This posting is mainly for category (1),
for which I suggest a set of C implementations
of RNG's I have developed.  C implementations
of my DIEHARD battery of tests will be
discussed elsewhere, and Narasimhan's GUI
version is expected to be released soon.

For (2), I merely repeat what I have said
in response to various queries: the Mersenne
Twister looks good, but it seems to be essentially
a lagged Fibonacci RNG using the exclusive-or
(xor) operation, and experience has shown that
lagged Fibonacci generators using xor provide
unsatisfactory 'randomness' unless the lags are
very long, and even for those with very long lags,
(and even for those using + or - rather than xor),
many people (I among them) are inclined to be
cautious about sequences based on such a simple
operation as: each new integer is the xor, (or sum,
or difference), of two earlier ones.  To be sure,
the resulting integers can be "twisted", but not,
I think, as simply or as safely as combining, say
by addition, with members of a sequence from a
(shorter period) generator that has itself passed
extensive tests of randomness.

I also reply that it does not take an immense
program (as, for example, in posted listings
of Twister) to produce a more satisfactory RNG
with an immense period, and give this example,
on which I will expand below: Inclusion of

#define SWB ( t[c+237]=(x=t[c+15])-(y=t[++c]+(x<y)) )

together with suitably initialized seeds in

static unsigned long x,y,t[256]; unsigned char c;

will allow you to put the string SWB in any C
expression and it will provide, in about 100 nanosecs,
a 32-bit random integer with period  2^7578. (Here
and below, ^ means exponent, except in C expressions,
where it means xor (exclusive-or).)

Now for the (1) part, in which I suggest a number
of C implementations and invite comment and feedback.
Most of these were previously developed and tested
via Fortran versions.  I list eight RNG's, each of
them by means of C's powerful #define device. This
provides  fast, compact implementation, allows
insertion of the required random variable directly
into an expression, and, finally, provides a good
selection of RNG's for use individually or in
combination.  The latter makes it possible to
further confirm what empirical results suggest:
combining two or more RNG's provides better,
(or no worse) randomness, and for encryption enthusiasts:
combination generators are harder to "crack".

For those wishing to try these eight RNG's:

At the top of your C program, include these
definitions and the static variables that follow.
Everything past this line is either C code or comment.
--------------------------------------------------

#define UL unsigned long
#define znew  ((z=36969*(z&65535)+(z>>16))<<16)
#define wnew  ((w=18000*(w&65535)+(w>>16))&65535)
#define MWC   (znew+wnew)
#define SHR3  (jsr=(jsr=(jsr=jsr^(jsr<<17))^(jsr>>13))^(jsr<<5))
#define CONG  (jcong=69069*jcong+1234567)
#define KISS  ((MWC^CONG)+SHR3)
#define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[++c+178])
#define SWB   (t[c+237]=(x=t[c+15])-(y=t[++c]+(x<y)))
#define UNI   (KISS*2.328306e-10)
#define VNI   ((long) KISS)*4.656613e-10
/*  Global static variables: */
static UL z=362436069, w=521288629, jsr=123456789, jcong=380116160;
static UL t[256];
static UL x=0,y=0; static unsigned char c=0;

/* Random seeds must be used to reset z,w,jsr,jcong and
the table t[256].  Here is an example procedure, using KISS: */

void settable(UL i1,UL i2,UL i3,UL i4)
{ int i; z=i1;w=i2,jsr=i3; jcong=i4;
for(i=0;i<256;i++)  t[i]=KISS;        }

/*  End of C code;  Only comments follow. Stick the above
17 lines in your simulation programs, initialize the table,
and have a variety of promising RNG's at your disposal.  */

/* You may want to use more complicated names for the
above simple 1-letter variable names: z,w,x,y,t,c,
to avoid clashing with favorites in your code.    */

/* Any one of KISS, MWC, LFIB4, SWB, SHR3, or CONG
can be used in an expression to provide a random
32-bit integer, and UNI in an expression will
provide a random uniform in (0,1), or VNI in (-1,1).
For example, for int i, float v; i=(MWC>>24); will
provide a random byte, while v=4.+3.*UNI; will
provide a uniform v in the interval 4. to 7.
For the super cautious, (KISS+SWB) in an expression
would provide a random 32-bit integer from
a sequence with period > 2^7700, and would only
add some 300 nanoseconds to the computing
time for that expression.                         */
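
/* [Editorial sketch, not part of the original post.]  A minimal,
   self-contained program showing the macros used inline, as the
   comment above describes.  It reseeds z,w,jsr,jcong directly;
   settable() also fills t[256], which only LFIB4 and SWB need.
   uint32_t stands in for UL, since the post assumes a 32-bit
   unsigned long, and SHR3 is written with comma operators so each
   shift is explicitly sequenced. */

#include <stdio.h>
#include <stdint.h>

#define znew  ((z=36969*(z&65535)+(z>>16))<<16)
#define wnew  ((w=18000*(w&65535)+(w>>16))&65535)
#define MWC   (znew+wnew)
#define SHR3  (jsr^=(jsr<<17), jsr^=(jsr>>13), jsr^=(jsr<<5))
#define CONG  (jcong=69069*jcong+1234567)
#define KISS  ((MWC^CONG)+SHR3)
#define UNI   (KISS*2.328306e-10)

static uint32_t z=362436069, w=521288629, jsr=123456789, jcong=380116160;

static int   demo_byte;   /* a random byte, 0..255                */
static float demo_v;      /* a uniform float in the interval 4..7 */

void demo(void)
{ z=12345; w=65435; jsr=34221; jcong=12345;  /* reseed */
  demo_byte = (MWC>>24);                     /* random byte, as described above */
  demo_v    = 4.+3.*UNI;                     /* uniform v in 4. to 7. */
}

int main(void)
{ demo();
  printf("byte=%d  v=%f\n", demo_byte, demo_v);
  return 0;
}
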

/* The KISS generator, (Keep It Simple Stupid), is
designed to combine the two multiply-with-carry
generators in MWC with the 3-shift register SHR3
and the congruential generator CONG, using
addition and exclusive-or. Period about 2^123. It
is one of my favorite generators.   */

/* The  MWC generator concatenates two 16-bit
multiply-with-carry generators, x(n)=36969x(n-1)+carry,
y(n)=18000y(n-1)+carry  mod 2^16, has  period  about
2^60 and seems to pass all tests of randomness. A favorite
stand-alone generator---faster than KISS, which contains it.*/
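
/* [Editorial sketch.]  The concatenation just described, written as
   an ordinary function rather than the MWC macro, with uint32_t for
   the post's 32-bit unsigned long and fresh names to avoid clashing
   with the listing's globals. */

#include <stdint.h>

static uint32_t mz = 362436069, mw = 521288629;  /* the two MWC states */

uint32_t mwc(void)
{ /* each state keeps the 16-bit multiplier value in its low half
     and the carry in its high half */
  mz = 36969*(mz & 65535) + (mz >> 16);
  mw = 18000*(mw & 65535) + (mw >> 16);
  return (mz << 16) + (mw & 65535);  /* concatenate the two 16-bit results */
}
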

/* SHR3 is a 3-shift-register generator with
period 2^32-1. It uses
y(n)=y(n-1)(I+L^17)(I+R^13)(I+L^5), with the
y's viewed as binary vectors, L the 32x32
binary matrix that shifts a vector left 1, and
R its transpose.  SHR3 seems to pass all except
the binary rank test, since 32 successive
values, as binary vectors, must be linearly
independent, while 32 successive truly random
32-bit integers, viewed as binary vectors, will
be linearly independent only about 29% of the time.   */
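
/* [Editorial sketch.]  The three xor-shifts of SHR3 as separate
   statements: the same transformation, but with the evaluation
   order pinned down by statement boundaries instead of nested
   assignments. */

#include <stdint.h>

uint32_t shr3_step(uint32_t y)
{ y ^= y << 17;   /* (I+L^17) */
  y ^= y >> 13;   /* (I+R^13) */
  y ^= y << 5;    /* (I+L^5)  */
  return y;       /* invertible, so a nonzero seed never reaches 0 */
}
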

/* CONG is a congruential generator with the
widely used 69069 as multiplier:
x(n)=69069x(n-1)+1234567.  It has period 2^32.
The leading half of its 32 bits seem to pass
all tests, but bits in the last half are too
regular.                               */

/* LFIB4 is an extension of the class that I have
previously defined as  lagged Fibonacci
generators: x(n)=x(n-r) op x(n-s), with the x's
in a finite set over which there is a binary
operation op, such as +,- on integers mod 2^32,
* on odd such integers, exclusive-or (xor) on
binary vectors. Except for those using
multiplication, lagged Fibonacci generators
fail various tests of randomness, unless the
lags are very long.  To see if more than two
lags would serve to overcome the problems of 2-
lag generators using +,- or xor, I have
developed the 4-lag generator LFIB4:
x(n)=x(n-256)+x(n-179)+x(n-119)+x(n-55) mod 2^32.
Its period is 2^31*(2^256-1), about 2^287, and
it seems to pass all tests---in particular,
those of the kind for which 2-lag generators
using +,-,xor seem to fail.  For even more
confidence in its suitability,  LFIB4 can be
combined with KISS, with a resulting period of
about 2^410: just use (KISS+LFIB4) in any C
expression.                               */
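
/* [Editorial sketch.]  LFIB4 as a function.  In the macro, t[++c+178]
   leaves unspecified when c is incremented relative to the other
   t[c+...] reads; hoisting the increment into its own statement makes
   the order definite, and the unsigned char casts make the mod-256
   wrap of the indices explicit (as written, c+58 etc. can index past
   t[255]).  Fresh names avoid clashing with the listing's globals. */

#include <stdint.h>

static uint32_t lt[256];       /* lag table; seed it first, e.g. with KISS */
static unsigned char lc = 0;   /* circular index, wraps mod 256 */

uint32_t lfib4(void)
{ lc++;                        /* sequence point before any use of lc */
  lt[lc] = lt[lc] + lt[(unsigned char)(lc+58)]
                  + lt[(unsigned char)(lc+119)]
                  + lt[(unsigned char)(lc+178)];
  return lt[lc];
}
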

/* SWB is a subtract-with-borrow generator that I
developed to give a simple method for producing
extremely long periods:
x(n)=x(n-222)-x(n-237)-borrow mod 2^32.
The 'borrow' is 0 unless set to 1 if computing
x(n-1) caused overflow in 32-bit integer
arithmetic. This generator has a very long
period, 2^7098(2^480-1), about 2^7578. It seems
to pass all tests of randomness, but,
suspicious of a generator so simple and fast
(62 nanosecs at 300MHz), I would suggest
combining SWB with KISS, MWC, SHR3, or CONG. */
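
/* [Editorial sketch.]  SWB as a function with explicit sequencing.
   Reading x(n)=x(n-222)-x(n-237)-borrow against a 256-entry circular
   buffer puts x(n-222) at offset 256-222=34 and x(n-237) at offset
   256-237=19 from the slot being written; the borrow is the x<y
   comparison left over from the previous step, taken before x and y
   are updated.  Fresh names avoid clashing with the listing's
   globals. */

#include <stdint.h>

static uint32_t sx = 0, sy = 0, st[256];  /* seed st[] before use */
static unsigned char sc = 0;              /* circular index, wraps mod 256 */

uint32_t swb(void)
{ uint32_t bro = (sx < sy);                 /* borrow from the last subtraction */
  sc++;                                     /* st[sc] will hold x(n) */
  sx = st[(unsigned char)(sc + 34)];        /* x(n-222) */
  sy = st[(unsigned char)(sc + 19)] + bro;  /* x(n-237) + borrow */
  return st[sc] = sx - sy;
}
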

/* Finally, because many simulations call for
uniform random variables in 0<v<1 or -1<v<1, I
use #define statements that permit inclusion of
such variates directly in expressions:  using
UNI will provide a uniform random real (float)
in (0,1), while VNI will provide one in (-1,1).  */

/* All of these: MWC, SHR3, CONG, KISS, LFIB4,
SWB, UNI and VNI, permit direct insertion of
the desired random quantity into an expression,
avoiding the time and space costs of a function
call. I call these in-line-define functions.
To use them, static variables z,w,jsr and
jcong should be assigned seed values other than
their initial values.  If LFIB4 or SWB are
used, the static table t[256] must be
initialized.  A sample procedure follows. */

/* A note on timing:  It is difficult to provide
exact time costs for inclusion of one of these
in-line-define functions in an expression.
Times may differ widely for different
compilers, as the C operations may be deeply
nested and tricky. I suggest these rough
comparisons, based on averaging ten runs of a
routine that is essentially a long loop:
for(i=1;i<10000000;i++) L=KISS; then with KISS
replaced with SHR3, CONG,... or KISS+SWB, etc.
The times on my home PC, a Pentium 300MHz, in
nanoseconds: LFIB4=64; CONG=90; SWB=100;
SHR3=110; KISS=209; KISS+LFIB4=252; KISS+SWB=310.     */
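
/* [Editorial sketch.]  The timing loop described above, using the
   portable clock() from <time.h>; the volatile sink keeps an
   optimizing compiler from deleting the loop.  Shown for CONG
   (renamed to avoid clashing with the listing); substitute any of
   the other generators to compare. */

#include <stdio.h>
#include <stdint.h>
#include <time.h>

#define TCONG (tjcong = 69069*tjcong + 1234567)
static uint32_t tjcong = 380116160;

double ns_per_call(void)
{ volatile uint32_t sink;
  clock_t t0, t1;
  long i;
  t0 = clock();
  for (i = 1; i < 10000000; i++) sink = TCONG;
  t1 = clock();
  (void)sink;
  return 1e9 * (double)(t1 - t0) / CLOCKS_PER_SEC / 1e7;
}

int main(void)
{ printf("CONG: roughly %.0f ns per call\n", ns_per_call());
  return 0;
}
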

Subject: Re: Random numbers in C: Some suggestions.
Date: Tue, 12 Jan 1999 07:13:47 -0800
From: Charles Bond <cbond@ix.netcom.com>
Message-ID: <369B66AB.6CDED9F8@ix.netcom.com>
References: <369B5E30.65A55FD1@stat.fsu.edu>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis,sci.crypt,sci.physics
Lines: 253

Thanks for the post. I want to comment on the SWB routine. I've been
using a similar routine in high speed simulations for years. Small
departures from statistically correct randomness are not a problem for
my application, but speed is. I believe Knuth briefly discussed the
method with guarded approval -- constrained by the concern that there
was no real theory behind it. Do you know if any theoretical work has
been done since Knuth's book to justify SWB?

George Marsaglia wrote:

> This posting ends with  17  lines of
> C code that provide eight different
> in-line random number generators, six for
> random 32-bit integers and two for uniform
> reals in (0,1) and (-1,1). [...]
> [the posting is quoted in full; see the original message above]

Subject: Re: Random numbers in C: Some suggestions.
Date: Tue, 12 Jan 1999 13:56:41 -0500
From: George Marsaglia <geo@stat.fsu.edu>
Message-ID: <369B9AE9.52C98810@stat.fsu.edu>
References: <369B66AB.6CDED9F8@ix.netcom.com>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis,sci.crypt,sci.physics
Lines: 59

Charles Bond wrote:

> Thanks for the post. I want to comment on the SWB routine. I've been
> using
> a similar routine in high speed simulations for years. ...

&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
Many years?  It hasn't been very many years
since I invented the subtract-with-borrow method,
and developed theory for establishing the periods.
&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&

>   . I believe Knuth briefly discussed the method with guarded
> approval -- constrained by the concern that there was no real theory
> behind it. Do you know if any theoretical work has been done since
> Knuth's book to justify SWB?

> &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
>    This business of providing 'theoretical support' for
>   RNG's tends to be overdone---perhaps
>   because of the influence of Knuth. His marvelous
>   expository and powerful mathematical skills have [...]
>   Many people do not understand that such theory
>   is based on the simple fact that congruential
>   random numbers "fall mainly in the planes",
>   that is,  points whose coordinates are successive
>   elements from the sequence lie on a lattice
>   with a huge unit-cell volume, (m^(n-1) in n-dimensions),
>   compared to the  unit-cell volume of 1 for truly random
>   integers.  So the "lattice test", as I called it,
>   applies to congruential generators, although the ideas have
>   been extended to the near-lattice-like patterns of
>   certain other kinds of generators.  But there seem
>   to be no such lattice-like patterns for many other
>   kinds of generators, and even if there were, it
>   is an easy matter to destroy such  patterns by
>   combining with generators having disparate mathematical
>   structures.
>
>   The quote from my Keynote Address at the 1984
>   Computer Science and Statistics: Symposium on the
>   Interface,  still applies:
>
>   "A random number generator is like sex:
>      When it's good, it's wonderful;
>      And when it's bad, it's still pretty good."
>
>   Add to that, in line with my recommendations
>   on combination generators;
>
>     "And if it's bad, try a twosome or threesome."
>
> George Marsaglia

Subject: Re: Random numbers in C: Some suggestions.
Date: Tue, 12 Jan 1999 16:09:29 -0800
From: Charles Bond <cbond@ix.netcom.com>
Message-ID: <369BE439.92E0E011@ix.netcom.com>
References: <369B9AE9.52C98810@stat.fsu.edu>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis,sci.crypt,sci.physics
Lines: 69

For the record, I did not mean to imply that Knuth's subtractive
generator was *the same* as your subtract with borrow, only that
it was *similar* (high speed, no multiplications). I gladly acknowledge
your claim on it. But you seem a little skeptical of it yourself, and I
was just curious.

Regards,

C. Bond

George Marsaglia wrote:

> Charles Bond wrote:
>
> > Thanks for the post. I want to comment on the SWB routine. I've been
> > using a similar routine in high speed simulations for years. ...
>
> &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
>      Many years?  It hasn't been very many years
>       since I invented the subtract-with-borrow method,
>       and developed theory for establishing the periods.
> &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&& [...]
> [the reply is quoted in full; see the previous message above]

Subject: Re: Random numbers in C: Some suggestions.
Date: Tue, 12 Jan 1999 15:54:10 -0800
From: Mike Oliver <oliver@math.ucla.edu>
Message-ID: <369BE0A2.CA197F7B@math.ucla.edu>
References: <369B5E30.65A55FD1@stat.fsu.edu>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis,sci.crypt,sci.physics
Lines: 17

George Marsaglia wrote:

> [...] and experience has shown that
> lagged Fibonacci generators using xor provide
> unsatisfactory 'randomness' unless the lags are
> very long, and even for those with very long lags,
> (and even for those using + or - rather than xor),

Could you give us a pointer to information about
why these RNGs are unsatisfactory and what sort
of test they tend to fail?

--
Disclaimer:  I could be wrong -- but I'm not.  (Eagles, "Victim of Love")

Finger for PGP public key, or visit http://www.math.ucla.edu/~oliver.
1500 bits, fingerprint AE AE 4F F8 EA EA A6 FB  E9 36 5F 9E EA D0 F8 B9

Subject: Re: Random numbers in C: Some suggestions.
Date: 13 Jan 1999 12:21:37 -0800
From: Eric Backus <ericb@labejb.lsid.hp.com>
Message-ID: <tun23mnctq.fsf@labejb.lsid.hp.com>
References: <369B5E30.65A55FD1@stat.fsu.edu>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis,sci.crypt,sci.physics
Lines: 65

George Marsaglia <geo@stat.fsu.edu> writes:

> This posting ends with  17  lines of
> C code that provide eight different
> in-line random number generators, six for
> random 32-bit integers and two for uniform
> reals in (0,1) and (-1,1).
> Comments are interspersed with that
> code. Various combinations of the six in-line
> integer generators may put in C expressions to
> provide a wide variety of very fast, long period,
> well-tested RNG's. I invite comments, feedback,
> verifications and timings.

> #define UL unsigned long
> #define znew  ((z=36969*(z&65535)+(z>>16))<<16)
> #define wnew  ((w=18000*(w&65535)+(w>>16))&65535)
> #define MWC   (znew+wnew)
> #define SHR3  (jsr=(jsr=(jsr=jsr^(jsr<<17))^(jsr>>13))^(jsr<<5))
> #define CONG  (jcong=69069*jcong+1234567)
> #define KISS  ((MWC^CONG)+SHR3)
> #define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[++c+178])
> #define SWB   (t[c+237]=(x=t[c+15])-(y=t[++c]+(x<y)))
> #define UNI   (KISS*2.328306e-10)
> #define VNI   ((long) KISS)*4.656613e-10
> /*  Global static variables: */
>  static UL z=362436069, w=521288629, jsr=123456789, jcong=380116160;
>  static UL t[256];
>  static UL x=0,y=0; static unsigned char c=0;
>
> /* Random seeds must be used to reset z,w,jsr,jcong and
> the table t[256]  Here is an example procedure, using KISS: */
>
>  void settable(UL i1,UL i2,UL i3,UL i4)
>  { int i; z=i1;w=i2,jsr=i3; jcong=i4;
>  for(i=0;i<256;i++)  t[i]=KISS;        }

Thank you for providing this extremely useful code.  (I'd like to make
use of it, however I see no copyright notice, can I assume you are
making it free for anyone to use?)

I have a small problem with the definition of LFIB4 and SWB.  In an
attempt to make these a single line of C code, they both use "++c" in
the same expression as they use "c".  A C compiler is free to
rearrange the order in which it calculates the intermediate terms of
these expressions, so the expressions can produce different results
depending on the compiler.

I might propose alternate expressions using the "," operator in
order to remove any ambiguity.  With a good compiler, these
expressions probably won't be any slower than your original ones:

#define LFIB4_ALT (t[c]=t[c]+t[c+58]+t[c+119]+t[c+179],t[c++])
#define SWB_ALT   (t[c+237]=(x=t[c+15])-(y=t[c+1]+(x<y)),t[c++ +237])

However, these are uglier and harder to understand than your original
expressions, and of course I might have made a mistake in interpreting
where the c++ should go.  Any comments?
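One way to sidestep the sequence-point question entirely is an ordinary function, in which every step is explicitly ordered. This is only a sketch (not Marsaglia's code); the casts make the mod-256 index wraparound explicit instead of relying on the width of `c`:

```c
static unsigned long t[256];       /* table of 32-bit words */
static unsigned char c = 0;

/* Function version of the LFIB4 step: same recurrence as the
   macro, but with an unambiguous order of evaluation. */
static unsigned long lfib4(void)
{
    t[c] = t[c] + t[(unsigned char)(c + 58)]
                + t[(unsigned char)(c + 119)]
                + t[(unsigned char)(c + 179)];
    return t[c++];                 /* yield the new value, then advance */
}
```

As with the macros, `t[]` must be seeded (e.g. via KISS, as in `settable`) before drawing.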

--
Eric Backus <eric_backus@hp.com>
http://labejb.lsid.hp.com/
(425) 335-2495

Subject: Random numbers for C: Improvements.
Date: Fri, 15 Jan 1999 11:41:47 -0500
From: George Marsaglia <geo@stat.fsu.edu>
Message-ID: <369F6FCA.74C7C041@stat.fsu.edu>
References: <369B5E30.65A55FD1@stat.fsu.edu>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis
Lines: 93

As I hoped, several suggestions have led to
improvements in the code for RNG's I proposed for
use in C. (See the thread "Random numbers for C: Some
suggestions" in previous postings.) The improved code
is listed below.

A question of copyright has also been raised.  Unlike
DIEHARD, there is no copyright on the code below. You
are free to use it in any way you want, but you may
wish to acknowledge the source, as a courtesy.

To avoid being cited by the style police, some have
suggested using typedef rather than #define in order
to replace unsigned long by UL.

Others have pointed out that one cannot be certain of
the way that a compiler will evaluate terms in a sum,
so using ++c in a term is dangerous.  They have
offered a version using the comma operator, which
ensures that the table index is incremented properly.
See LFIB4 and SWB below.

In my earlier work, done in Fortran, I had implemented
two 16-bit multiply-with-carry generators, say z and w,
as 32-bit integers, with the carry in the top 16 bits,
the output in the bottom 16.  They were combined by
(z<<16)+w.  (In Fortran, ishft(z,16)+w.) Such a
combination seemed to pass all tests. In the above-
mentioned post,  I used (z<<16)+(w&65535), and that
does not pass all tests.  So (z<<16)+w seems
preferable; it is used below, providing a MWC that
seems to pass all tests.

The generators MWC, KISS, LFIB4 and SWB seem to pass all tests.
By themselves, CONG and SHR3 do not, but using
CONG+SHR3 provides one of the fastest combinations that satisfy
the DIEHARD battery of tests.

Of course, one cannot have absolute confidence in any
generator. The choices LFIB4 and SWB have immense
periods, are very fast, and pass all tests in DIEHARD,
but I am hesitant to rely on them alone---primarily
because they come from such simple mod 2^32 arithmetic:
four adds in LFIB4 or one subtract-with-borrow in SWB.

The code below provides  a variety of in-line generators that
seem promising by themselves, and even more so in combination.
With them, one may feel fairly confident that combinations
will produce results consistent with the underlying probability theory.

All combinations seem to support the supplemented quote:

A random number generator is like sex;
When it's good, it's wonderful,
And when it's bad, it's still pretty good.

And when it's bad, try a twosome or threesome.

The C code follows; you may want to snip and save from
here--------------------------------------------------:

#define znew  (z=36969*(z&65535)+(z>>16))
#define wnew  (w=18000*(w&65535)+(w>>16))
#define MWC   ( (znew<<16)+wnew )
#define SHR3  (jsr=(jsr=(jsr=jsr^(jsr<<17))^(jsr>>13))^(jsr<<5))
#define CONG  (jcong=69069*jcong+1234567)
#define KISS  ((MWC^CONG)+SHR3)
#define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[c+178],t[++c])
#define SWB   (t[c+237]=(x=t[c+15])-(y=t[c]+(x<y)),t[++c])
#define UNI   (KISS*2.328306e-10)
#define VNI   ((long) KISS)*4.656613e-10
typedef unsigned long UL;

/*  Global static variables: */
static UL z=362436069, w=521288629, jsr=123456789, jcong=380116160;
static UL t[256];
static UL x=0,y=0; static unsigned char c=0;
/* Random seeds must be used to reset z,w,jsr,jcong and
the table t[256].  Here is an example procedure, using KISS: */

void settable(UL i1,UL i2,UL i3,UL i4)
{ int i; z=i1;w=i2,jsr=i3; jcong=i4;
for(i=0;i<256;i++)  t[i]=KISS;        }

/* Any one of KISS, MWC, LFIB4, SWB, SHR3, or CONG  can be used in
an expression to provide a random 32-bit integer, while UNI
provides a real in (0,1) and VNI a real in (-1,1).   */
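As a usage sketch of the code above (the seed values are arbitrary, and UL is taken here as `unsigned int` so the arithmetic stays 32-bit on today's common platforms; the post assumed a 32-bit `unsigned long`):

```c
/* Definitions repeated from the post above so this example
   compiles on its own. */
#define znew  (z=36969*(z&65535)+(z>>16))
#define wnew  (w=18000*(w&65535)+(w>>16))
#define MWC   ( (znew<<16)+wnew )
#define SHR3  (jsr=(jsr=(jsr=jsr^(jsr<<17))^(jsr>>13))^(jsr<<5))
#define CONG  (jcong=69069*jcong+1234567)
#define KISS  ((MWC^CONG)+SHR3)
#define UNI   (KISS*2.328306e-10)
typedef unsigned int UL;           /* assumed 32 bits wide */

static UL z=362436069, w=521288629, jsr=123456789, jcong=380116160;
static UL t[256];

void settable(UL i1, UL i2, UL i3, UL i4)
{ int i; z=i1; w=i2; jsr=i3; jcong=i4;
  for(i=0;i<256;i++) t[i]=KISS; }

/* Draw n 32-bit integers and return the last one. */
UL draw(int n)
{ UL k = 0; int i;
  for (i = 0; i < n; i++) k = KISS;
  return k; }
```

After seeding once with `settable`, `KISS` or `UNI` can then appear directly in any expression.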

Subject: Re: Random numbers for C: Improvements.
Date: 15 Jan 1999 14:02:47 -0500
From: zaykin@statgen.ncsu.edu (Dmitri Zaykin)
Message-ID: <790f49x60.fsf@poole.statgen.ncsu.edu>
References: <369F6FCA.74C7C041@stat.fsu.edu>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis
Lines: 11

George Marsaglia <geo@stat.fsu.edu> writes:

> #define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[c+178],t[++c])
> #define SWB   (t[c+237]=(x=t[c+15])-(y=t[c]+(x<y)),t[++c])

Shouldn't these be

#define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[c+179], ++c)
#define SWB   (t[c+237]=(x=t[c+15])-(y=t[c+1]+(x<y)), ++c)

Dmitri

Subject: Re: Random numbers for C: Improvements.
Date: 15 Jan 99 20:16:13 GMT
Message-ID: <99Jan15.151547edt.785@neuron.ai.toronto.edu>
References: <790f49x60.fsf@poole.statgen.ncsu.edu>
Newsgroups: sci.math,sci.math.num-analysis,sci.stat.math
Lines: 28

>George Marsaglia <geo@stat.fsu.edu> writes:
>
>> #define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[c+178],t[++c])
>> #define SWB   (t[c+237]=(x=t[c+15])-(y=t[c]+(x<y)),t[++c])
>

In article <790f49x60.fsf@poole.statgen.ncsu.edu>,
Dmitri Zaykin <zaykin@statgen.ncsu.edu> wrote:

>Shouldn't these be
>
>#define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[c+179], ++c)
>#define SWB   (t[c+237]=(x=t[c+15])-(y=t[c+1]+(x<y)), ++c)
>

This doesn't work either.  I believe that it is undefined whether the
comparison x<y uses the new or the old value of x.  The SHR3 macro
in the original source also suffers from this flaw.

I think one needs to face up to an unpleasant truth:  The #define
facility of C was poorly designed, and is incapable in general of
supporting the definition of "in-line" procedures.  It is far better
to simply write ordinary C procedures, and accept the fairly small

Subject: Re: Random numbers for C: Improvements.
Date: Fri, 15 Jan 1999 17:34:36 -0600
From: "Jeff Stout" <jstout@ncon.com>
Message-ID: <QfQn2.10695$bf6.2024@news1.giganews.com>
References: <99Jan15.151547edt.785@neuron.ai.toronto.edu>
Newsgroups: sci.math,sci.math.num-analysis,sci.stat.math
Lines: 27

<99Jan15.151547edt.785@neuron.ai.toronto.edu>...
>
>
>I think one needs to face up to an unpleasant truth:  The #define
>facility of C was poorly designed, and is incapable in general of
>supporting the definition of "in-line" procedures.  It is far better
>to simply write ordinary C procedures, and accept the fairly small
>
>
>

This is not a poor design of the macro facility of C, but a built-in
limitation of the C language itself.  The preprocessor is just doing
a stupid text substitution; it's the C compiler that is ambiguous about
the interpretation.

The C macro language can support "in-line" procedures and a lot
of other junk only true software wienies would ever use.

Jeff Stout

Subject: Re: Random numbers for C: Improvements.
Date: 16 Jan 99 04:43:12 GMT
Message-ID: <99Jan15.233424edt.6005@cortex.ai.toronto.edu>
References: <QfQn2.10695$bf6.2024@news1.giganews.com>
Newsgroups: sci.math,sci.math.num-analysis,sci.stat.math
Lines: 88

><99Jan15.151547edt.785@neuron.ai.toronto.edu>...
>>
>>I think one needs to face up to an unpleasant truth:  The #define
>>facility of C was poorly designed, and is incapable in general of
>>supporting the definition of "in-line" procedures.  It is far better
>>to simply write ordinary C procedures, and accept the fairly small

In article <QfQn2.10695$bf6.2024@news1.giganews.com>,
Jeff Stout <jstout@ncon.com> wrote:
>This is not a poor design of the macro facility of C, but a built-in
>limitation of the C language itself.  The preprocessor is just doing
>a stupid text substitution; it's the C compiler that is ambiguous about
>the interpretation.
>
>The C macro language can support "in-line" procedures and a lot
>of other junk only true software wienies would ever use.

In saying that the #define facility of C was poorly designed, I of
course meant that it does not do what is needed to write good programs.
The fact that it does stupid text substitution is a big part of this.

In the case under discussion, the only reason (I hope!) that obscure
and semantically undefined constructs were being used was in order to
try to define an in-line procedure using this inadequate facility.

Unfortunately, although one can indeed create lots of junk using the
C pre-processor, one cannot, in general, use it to create in-line
procedures.  One can't write a macro equivalent to the following,
for instance:

int first_zero (int *a)
{ int i;
for (i = 0; a[i]!=0; i++) ;
return i;
}
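For comparison, the procedure above needs no macro at all if the compiler can inline it; a sketch assuming a C99-style `inline` (discussed later in this thread):

```c
/* An inline function can do what the macro cannot: declare a
   local variable and return a value computed by a loop. */
static inline int first_zero(const int *a)
{
    int i;
    for (i = 0; a[i] != 0; i++)
        ;
    return i;
}
```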

Furthermore, in many cases where one CAN write in-line procedures
using C macros, 99% of the people who try to do so get it wrong.  For
example, the following procedure CAN be written as a macro:

void badexample (int a)
{
while (f(a)==0) ;
g(a);
}

but NOT as follows:

#define badexample(a) while (f(a)==0) ; g(a);

nor as follows:

#define badexample(a) { while (f(a)==0) ; g(a); }

The following gets closer:

#define badexample(a) do { while (f(a)==0) ; g(a); } while (0)

Note: NO semicolon at the end.  The necessity of this trick is left
as an exercise for the reader.
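The exercise has a short answer, sketched here with stand-in `f` and `g` (hypothetical, just to make the example compile): the expansion must behave as a single statement, so that the caller's semicolon completes it and an `if`/`else` still parses.

```c
static int calls_to_g = 0;
static int f(int a) { (void)a; return 1; }       /* stand-in for f */
static void g(int a) { (void)a; calls_to_g++; }  /* stand-in for g */

#define badexample(a) do { while (f(a)==0) ; g(a); } while (0)

void demo(int ready, int x)
{
    if (ready)
        badexample(x);  /* the caller's ';' completes the while(0)    */
    else                /* with the plain { ... } version, this else  */
        g(x);           /* would be a syntax error: "else without if" */
}
```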

Really, though, one needs to define the macro as follows:

#define badexample(a) do { int fy7653_4xq = a; \
while (f(fy7653_4xq)==0) ; \
g(fy7653_4xq); \
} while (0)

Here, fy7653_4xq is assumed to be sufficiently obscure that it is
unlikely to conflict with the name of a variable that occurs in a.

Not illustrated in this example, but commonly overlooked, is the
frequent necessity to enclose references to the parameters of the
macro in parentheses.

Failure to understand these arcane necessities will result in macros
that behave just like the procedure they are trying to mimic around
95% of the times they are called, and which produce completely obscure
bugs the other 5% of the time.
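The parenthesization point deserves its own standard illustration (not from the original post):

```c
#define BAD_SQUARE(x)  x * x        /* unparenthesized: wrong        */
#define SQUARE(x)      ((x) * (x))  /* parenthesized: what was meant */

/* BAD_SQUARE(2 + 1) expands to 2 + 1 * 2 + 1, which is 5, not 9. */
```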

Subject: Re: Random numbers for C: Improvements.
Date: 16 Jan 1999 16:14:45 -0500
From: zaykin@statgen.ncsu.edu (Dmitri Zaykin)
Message-ID: <7ogny6htm.fsf@poole.statgen.ncsu.edu>
References: <QfQn2.10695$bf6.2024@news1.giganews.com>
Newsgroups: sci.math,sci.math.num-analysis,sci.stat.math
Lines: 15

"Jeff Stout" <jstout@ncon.com> writes:

> The C macro language can support "in-line" procedures and a lot
> of other junk only true software wienies would ever use.

It is probably true that the posted random number generators are
better implemented as regular functions. C compilers should attempt to
inline "short" functions into their callers when certain optimization
flags are enabled.

Also, the new C standard will support the "inline" keyword.  Some
compilers (e.g. gcc called without "-ansi" switch) already recognize
it.

Dmitri
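As a sketch of the function route (assuming `unsigned int` is 32 bits; the `static` here could become C99 `static inline`):

```c
typedef unsigned int UL;           /* assumed 32 bits wide */

static UL jsr = 123456789u, jcong = 380116160u;

/* SHR3 as a function: each xorshift step is clearly sequenced. */
static UL shr3(void)
{
    jsr ^= jsr << 17;
    jsr ^= jsr >> 13;
    jsr ^= jsr << 5;
    return jsr;
}

/* CONG, the congruential step. */
static UL cong(void)
{
    return (jcong = 69069u * jcong + 1234567u);
}
```

The sum `cong() + shr3()` then plays the role of the CONG+SHR3 combination recommended earlier in the thread.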

Subject: Re: Random numbers for C: Improvements.
Date: 16 Jan 1999 15:19:24 -0500
From: zaykin@statgen.ncsu.edu (Dmitri Zaykin)
Message-ID: <7pv8f55tf.fsf@poole.statgen.ncsu.edu>
References: <99Jan15.151547edt.785@neuron.ai.toronto.edu>
Newsgroups: sci.math,sci.math.num-analysis,sci.stat.math
Lines: 29

These should work (aside the possibility of clashing with local names)

#define SHR3  (jsr=jsr^(jsr<<17), jsr=jsr^(jsr>>13), jsr=jsr^(jsr<<5))
#define LFIB4 (t[c] = t[c]+t[c+58]+t[c+119]+t[c+179], t[++c])
#define SWB   (x = t[c+15], t[c+237] = x-(y=t[c+1]+(x<y)), t[++c])

Dmitri

> >George Marsaglia <geo@stat.fsu.edu> writes:
> >
> >> #define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[c+178],t[++c])
> >> #define SWB   (t[c+237]=(x=t[c+15])-(y=t[c]+(x<y)),t[++c])
> >
>
> In article <790f49x60.fsf@poole.statgen.ncsu.edu>,
> Dmitri Zaykin <zaykin@statgen.ncsu.edu> wrote:
>
> >Shouldn't these be
> >
> >#define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[c+179], ++c)
> >#define SWB   (t[c+237]=(x=t[c+15])-(y=t[c+1]+(x<y)), ++c)
> >
>
> This doesn't work either.  I believe that it is undefined whether the
> comparison x<y uses the new or the old value of x.  The SHR3 macro
> in the original source also suffers from this flaw.

Subject: Re: Random numbers for C: Improvements.
Date: 17 Jan 1999 08:06:29 PST
From: Ramin Sina <rsina@concentric.net>
References: <7pv8f55tf.fsf@poole.statgen.ncsu.edu>
Newsgroups: sci.math,sci.math.num-analysis,sci.stat.math
Lines: 34

Dmitri Zaykin wrote:

> These should work (aside the possibility of clashing with local names)
>
> #define SHR3  (jsr=jsr^(jsr<<17), jsr=jsr^(jsr>>13), jsr=jsr^(jsr<<5))
> #define LFIB4 (t[c] = t[c]+t[c+58]+t[c+119]+t[c+179], t[++c])
> #define SWB   (x = t[c+15], t[c+237] = x-(y=t[c+1]+(x<y)), t[++c])
>
> Dmitri
>
>
> > >George Marsaglia <geo@stat.fsu.edu> writes:
> > >
> > >> #define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[c+178],t[++c])
> > >> #define SWB   (t[c+237]=(x=t[c+15])-(y=t[c]+(x<y)),t[++c])
> > >
> >
> >

Could someone please post a short example code on how to use these in
practice. I am lost in the discussion in this thread, but I would like
very much to understand how to use these, for example to generate 1000
uniform random numbers between A and B (say 5 and 10).

Thanks very much,
Ramin Sina

--
--------------------------------------------------------
Ramin Sina                         rsina@concentric.net
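A minimal sketch of the usual recipe (the macros are repeated from the "Improvements" post above; UL is `unsigned int` here so the arithmetic stays 32-bit on common modern platforms): scale UNI, which lies in (0,1), into (A,B) as A + (B-A)*UNI.

```c
typedef unsigned int UL;           /* the 1999 post assumed 32-bit unsigned long */

#define znew  (z=36969*(z&65535)+(z>>16))
#define wnew  (w=18000*(w&65535)+(w>>16))
#define MWC   ( (znew<<16)+wnew )
#define SHR3  (jsr=(jsr=(jsr=jsr^(jsr<<17))^(jsr>>13))^(jsr<<5))
#define CONG  (jcong=69069*jcong+1234567)
#define KISS  ((MWC^CONG)+SHR3)
#define UNI   (KISS*2.328306e-10)

static UL z=362436069, w=521288629, jsr=123456789, jcong=380116160;

/* Fill out[0..n-1] with uniforms in (a,b), e.g. a=5, b=10. */
void fill_uniform(double *out, int n, double a, double b)
{
    int i;
    for (i = 0; i < n; i++)
        out[i] = a + (b - a) * UNI;
}
```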

Subject: Re: Random numbers for C: Improvements.
Date: 18 Jan 1999 17:34:52 GMT
From: hook@nas.nasa.gov (Ed Hook)
Message-ID: <77vrbs$pmj$1@sun500.nas.nasa.gov>
References: <7pv8f55tf.fsf@poole.statgen.ncsu.edu>
Newsgroups: sci.math,sci.math.num-analysis,sci.stat.math
Lines: 25

In article <7pv8f55tf.fsf@poole.statgen.ncsu.edu>,
zaykin@statgen.ncsu.edu (Dmitri Zaykin) writes:
|> These should work (aside the possibility of clashing with local names)
|>
|> #define SHR3  (jsr=jsr^(jsr<<17), jsr=jsr^(jsr>>13), jsr=jsr^(jsr<<5))
|> #define LFIB4 (t[c] = t[c]+t[c+58]+t[c+119]+t[c+179], t[++c])
|> #define SWB   (x = t[c+15], t[c+237] = x-(y=t[c+1]+(x<y)), t[++c])
|>
|> Dmitri
|>

No -- 'SHR3' and 'LFIB4' are OK, but your version of 'SWB'
still invokes the dreaded undefined behavior. In the second
calculation, there's no sequence point to separate the use
of 'y' to determine the value of 'x<y' and the assignment to 'y'.
One might argue that this is not particularly relevant, since
we're trying to simulate randomness anyhow ... except that, once
undefined behavior rears its ugly head, the code is allowed to
simply dump core and die ...

--
Ed Hook                              |       Copula eam, se non posit
MRJ Technology Solutions, Inc.       |         acceptera jocularum.
NAS, NASA Ames Research Center       | I can barely speak for myself, much
Internet: hook@nas.nasa.gov          |         less for my employer

Subject: Re: Random numbers for C: Improvements.
Date: 19 Jan 1999 13:10:13 GMT
From: orjanjo@math.ntnu.no (Orjan Johansen)
Message-ID: <78207l$1l8$1@due.unit.no>
References: <77vrbs$pmj$1@sun500.nas.nasa.gov>
Newsgroups: sci.math,sci.math.num-analysis,sci.stat.math
Lines: 21

In article <77vrbs$pmj$1@sun500.nas.nasa.gov>,
Ed Hook <hook@nas.nasa.gov> wrote:
>In article <7pv8f55tf.fsf@poole.statgen.ncsu.edu>,
>               zaykin@statgen.ncsu.edu (Dmitri Zaykin) writes:
[snip]
>|> #define SWB   (x = t[c+15], t[c+237] = x-(y=t[c+1]+(x<y)), t[++c])
[snip]
>   No -- 'SHR3' and 'LFIB4' are OK, but your version of 'SWB'
>   still invokes the dreaded undefined behavior. In the second
>   calculation, there's no sequence point to separate the use
>   of 'y' to determine the value of 'x<y' and the assignment to 'y'.
[snip]

I am not a programmer by trade, but I have a little knowledge of C.
Surely in an expression of the form y=..., the evaluation of ... is
guaranteed to happen before the assignment.  As far as I understand,
what is not guaranteed is that any assignments in ... would happen
before the one to y.

Greetings,
Ørjan.

Subject: Re: Random numbers for C: Improvements.
Date: Tue, 19 Jan 1999 18:29:34 GMT
From: horst.kraemer@snafu.de (Horst Kraemer)
Message-ID: <36a4cb5e.39101741@news.snafu.de>
References: <77vrbs$pmj$1@sun500.nas.nasa.gov>
Newsgroups: sci.math,sci.math.num-analysis,sci.stat.math
Lines: 39

On 18 Jan 1999 17:34:52 GMT, hook@nas.nasa.gov (Ed Hook) wrote:

> In article <7pv8f55tf.fsf@poole.statgen.ncsu.edu>,
>                zaykin@statgen.ncsu.edu (Dmitri Zaykin) writes:
> |> These should work (aside the possibility of clashing with local names)
> |>
> |> #define SHR3  (jsr=jsr^(jsr<<17), jsr=jsr^(jsr>>13), jsr=jsr^(jsr<<5))
> |> #define LFIB4 (t[c] = t[c]+t[c+58]+t[c+119]+t[c+179], t[++c])
> |> #define SWB   (x = t[c+15], t[c+237] = x-(y=t[c+1]+(x<y)), t[++c])
> |>
> |> Dmitri
> |>
>
>    No -- 'SHR3' and 'LFIB4' are OK, but your version of 'SWB'
>    still invokes the dreaded undefined behavior. In the second
>    calculation, there's no sequence point to separate the use
>    of 'y' to determine the value of 'x<y' and the assignment to 'y'.

Sorry. No sequence point is needed between reading and writing an
lvalue.

y = y - 1

would then produce undefined behaviour, too ....

What produces undefined behaviour in C is _modifying_ an lvalue more
than once between the last and next sequence point. There is only one
modification of y in the expression

t[c+237] = x-(y=t[c+1]+(x<y))

Regards
Horst
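Whatever one concludes about the macro, an ordinary function settles the question, since every read and write is explicitly ordered. A sketch (the casts make the mod-256 wraparound of the indices explicit):

```c
typedef unsigned int u32;          /* assumed 32 bits wide */

static u32 t[256], x = 0, y = 0;
static unsigned char c = 0;

/* One subtract-with-borrow step, matching the corrected macro:
   x is the new t[c+15], and the borrow (x<y) uses the previous y. */
static u32 swb(void)
{
    u32 borrow;
    x = t[(unsigned char)(c + 15)];
    borrow = (x < y);
    y = t[(unsigned char)(c + 1)] + borrow;
    t[(unsigned char)(c + 237)] = x - y;
    return t[++c];
}
```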

Subject: Re: Random numbers for C: Improvements.
Date: 17 Jan 1999 13:22:21 -0500
From: hrubin@b.stat.purdue.edu (Herman Rubin)
Message-ID: <77t9ot$15ha@b.stat.purdue.edu>
References: <99Jan15.151547edt.785@neuron.ai.toronto.edu>
Newsgroups: sci.math,sci.math.num-analysis,sci.stat.math
Lines: 49

In article <99Jan15.151547edt.785@neuron.ai.toronto.edu>,
>>George Marsaglia <geo@stat.fsu.edu> writes:

>>> #define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[c+178],t[++c])
>>> #define SWB   (t[c+237]=(x=t[c+15])-(y=t[c]+(x<y)),t[++c])

>In article <790f49x60.fsf@poole.statgen.ncsu.edu>,
>Dmitri Zaykin <zaykin@statgen.ncsu.edu> wrote:

>>Shouldn't these be

>>#define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[c+179], ++c)
>>#define SWB   (t[c+237]=(x=t[c+15])-(y=t[c+1]+(x<y)), ++c)

>This doesn't work either.  I believe that it is undefined whether the
>comparison x<y uses the new or the old value of x.  The SHR3 macro
>in the original source also suffers from this flaw.

>I think one needs to face up to an unpleasant truth:  The #define
>facility of C was poorly designed, and is incapable in general of
>supporting the definition of "in-line" procedures.  It is far better
>to simply write ordinary C procedures, and accept the fairly small

I think it should be clarified, and probably written out in some
more detail.  But the procedure call overhead would be comparable
to the computing cost; C, and all other languages, have such great
built-in inefficiencies that what is needed is something written
from the standpoint of mathematics and efficiency.

But even the use of a comparison is a costly operation, if the
result is to be used promptly.  Any form of random, pseudo-random,
or quasi-random numbers should be done in good integer form,
not present now on many machines because of bad design coming
from bad languages which have thrown out the natural properties
of computers, and attempts to keep stupid people from making
mistakes.  An idiot-proof language is only for idiots.

--
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
hrubin@stat.purdue.edu         Phone: (765)494-6054   FAX: (765)494-0558

Subject: Re: Random numbers for C: Improvements.
Date: Sun, 17 Jan 1999 19:10:01 GMT
From: dmurdoch@pair.com (Duncan Murdoch)
Message-ID: <36a23371.10418803@newshost.uwo.ca>
References: <77t9ot$15ha@b.stat.purdue.edu>
Newsgroups: sci.stat.math
Lines: 27

On 17 Jan 1999 13:22:21 -0500, hrubin@b.stat.purdue.edu (Herman Rubin)
wrote:
>  But the procedure call overhead would be comparable
>to the computing cost;

I don't know if that's true or not, but I don't think it is really
something to worry about.  The uniform value coming out of these
macros is unlikely to be the end of the computation; you'll almost
certainly do far more costly things with it after it's generated.  So
halving or doubling the speed of the generator won't have nearly so
much impact on the overall calculation.

On the other hand, if it's programmed in a way that is sometimes
incorrect in hard to detect ways, it might invalidate the whole
calculation.  That seems like a much higher cost to pay.

>An idiot-proof language is only for idiots.

No, if such a thing existed, it would also reduce the occasional
lapses of non-idiots.  Programmers are fallible; if tools can prevent
some common errors (even if it means doubling the computation time),
then those tools are worth using.

I'd rather not be the first one to finish, if my answer is wrong!

Duncan Murdoch

Subject: Re: Random numbers for C: Improvements.
Date: 18 Jan 1999 08:56:33 -0500
From: hrubin@b.stat.purdue.edu (Herman Rubin)
Message-ID: <77veih$11ts@b.stat.purdue.edu>
References: <36a23371.10418803@newshost.uwo.ca>
Newsgroups: sci.stat.math
Lines: 63

In article <36a23371.10418803@newshost.uwo.ca>,
Duncan Murdoch <dmurdoch@pair.com> wrote:
>On 17 Jan 1999 13:22:21 -0500, hrubin@b.stat.purdue.edu (Herman Rubin)
>wrote:
>>  But the procedure call overhead would be comparable
>>to the computing cost;

>I don't know if that's true or not, but I don't think it is really
>something to worry about.  The uniform value coming out of these
>macros is unlikely to be the end of the computation; you'll almost
>certainly do far more costly things with it after it's generated.  So
>halving or doubling the speed of the generator won't have nearly so
>much impact on the overall calculation.

It certainly is.  The cost of one procedure call could far exceed
the cost of a uniform, or even many nonuniform, random variables.
If procedure calls have to be made in the generation, I would not
be surprised to have a factor of 10 or more.  BTW, I would not
use any pseudo-random procedure in any case; XORing the numbers
with physical random numbers, which need not be as good as one
wants the final result to be, should be done, even if those
numbers should be reused, and this should be carefully done.

The fundamental output of a random number generator should be a
bit stream; this can be converted easily to anything, but the
converse is false.
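The bit-stream view can be sketched as follows (the word source here is a stand-in xorshift, not any particular generator from this thread): buffer one 32-bit word at a time and deal out k bits on demand.

```c
typedef unsigned int u32;          /* assumed 32 bits wide */

static u32 state = 2463534242u;

/* Stand-in 32-bit word source; any generator could sit here. */
static u32 next_word(void)
{
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

static u32 bitbuf = 0;
static int bits_left = 0;

/* Deal out the next k bits (1 <= k <= 32) of the stream. */
u32 next_bits(int k)
{
    u32 r = 0;
    int got;
    for (got = 0; got < k; got++) {
        if (bits_left == 0) { bitbuf = next_word(); bits_left = 32; }
        r = (r << 1) | (bitbuf & 1);
        bitbuf >>= 1;
        bits_left--;
    }
    return r;
}
```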

>On the other hand, if it's programmed in a way that is sometimes
>incorrect in hard to detect ways, it might invalidate the whole
>calculation.  That seems like a much higher cost to pay.

This is why it is important that those who understand the mathematics
of the algorithm do it; I do not believe it pays to have a library
routine written by one person.  HLL algorithms fail in rather odd
ways, and algorithms for generating random numbers, unless they have
huge errors, can only be checked by direct proof of correctness.

>>An idiot-proof language is only for idiots.

>No, if such a thing existed, it would also reduce the occasional
>lapses of non-idiots.  Programmers are fallible; if tools can prevent
>some common errors (even if it means doubling the computation time),
>then those tools are worth using.

On the contrary, these tools can cause the programmer to make
errors which cannot be easily detected.  And it does not mean
just a factor of two, but far more.  The current design of
computers, driven by the languages, has increased the time to
do multiple precision arithmetic by a factor of possibly 100
relative to what it would have been if "old" architecture was
in place.

>I'd rather not be the first one to finish, if my answer is wrong!

>Duncan Murdoch

--
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
hrubin@stat.purdue.edu         Phone: (765)494-6054   FAX: (765)494-0558

Subject: Re: Random numbers for C: Improvements.
Date: 18 Jan 99 18:38:33 GMT
Message-ID: <99Jan18.133644edt.6004@cortex.ai.toronto.edu>
References: <77veih$11ts@b.stat.purdue.edu>
Newsgroups: sci.stat.math
Lines: 60

In article <77veih$11ts@b.stat.purdue.edu>,
Herman Rubin <hrubin@b.stat.purdue.edu> wrote:

>... The cost of one procedure call could far exceed
>the cost of a uniform, or even many nonuniform, random variables.
>If procedure calls have to be made in the generation, I would not
>be surprised to have a factor of 10 or more.

This might have been the case with some old languages such as Algol
60, but not with C as implemented on modern machines.  Even on a CISC
machine like a VAX, I doubt the C procedure call overhead would be
more than twice the time for the actual work, and on modern RISC
machines the ratio will tend to be less.

> BTW, I would not
>use any pseudo-random procedure in any case; XORing the numbers
>with physical random numbers, which need not be as good as one
>wants the final result to be, should be done, even if those
>numbers should be reused, and this should be carefully done.

I'd agree with this.  It just makes the work inside the procedure
greater, however, reducing the significance of the procedure call
overhead.

>>>An idiot-proof language is only for idiots.
>
>>No, if such a thing existed, it would also reduce the occasional
>>lapses of non-idiots.  Programmers are fallible; if tools can prevent
>>some common errors (even if it means doubling the computation time),
>>then those tools are worth using.
>
>On the contrary, these tools can cause the programmer to make
>errors which cannot be easily detected.

Although such errors caused by use of high-level languages are
possible, they are minor compared to the numerous ways one can go
wrong when writing assembly-language programs (eg, failing to be
completely consistent in conventions for whether procedures do or do
not save register contents).

>The current design of
>computers, driven by the languages, has increased the time to
>do multiple precision arithmetic by a factor of possibly 100
>relative to what it would have been if "old" architecture was
>in place.

I find this hard to believe.  It's true that high level languages
provide no way to access operations such as 32-bit by 32-bit to 64-bit
multiplication, but that would just force you to implement multiple
precision arithmetic with 16-bit chunks rather than 32-bit chunks, for
a slowdown of "only" a factor of four.  Even accounting for additional
general inefficiencies, it's hard to see how it can be any more than a
factor of 10 slower.  If you're claiming that the hardware itself
would be designed differently if people still wrote in assembler, I'd
say that if anything the effect goes the other way.  The tendency with
RISC machines has been to simplify the hardware (making it faster) at
the expense of funny machine languages, which compilers can cope with,
but which are less convenient for humans writing assembly code.
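The 16-bit-chunk idea looks like this in C (a sketch assuming `unsigned int` is 32 bits): a 32x32-to-64-bit product assembled from four 16x16-to-32-bit partial products, which is exactly the workaround forced on a language with no double-width multiply.

```c
typedef unsigned int u32;          /* assumed 32 bits wide */

/* Portable 32x32 -> 64-bit multiply built from 16-bit halves;
   the result is returned as high and low 32-bit words. */
void mul32x32(u32 a, u32 b, u32 *hi, u32 *lo)
{
    u32 al = a & 0xffffu, ah = a >> 16;
    u32 bl = b & 0xffffu, bh = b >> 16;

    u32 p0 = al * bl;              /* four partial products */
    u32 p1 = al * bh;
    u32 p2 = ah * bl;
    u32 p3 = ah * bh;

    /* carry-save the middle column, then assemble the two words */
    u32 mid = (p0 >> 16) + (p1 & 0xffffu) + (p2 & 0xffffu);
    *lo = (p0 & 0xffffu) | (mid << 16);
    *hi = p3 + (p1 >> 16) + (p2 >> 16) + (mid >> 16);
}
```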

Subject: Re: Random numbers for C: Improvements.
Date: 19 Jan 1999 10:25:10 -0500
From: hrubin@b.stat.purdue.edu (Herman Rubin)
Message-ID: <78284m$m0e@b.stat.purdue.edu>
References: <99Jan18.133644edt.6004@cortex.ai.toronto.edu>
Newsgroups: sci.stat.math
Lines: 119

In article <99Jan18.133644edt.6004@cortex.ai.toronto.edu>,
>In article <77veih$11ts@b.stat.purdue.edu>,
>Herman Rubin <hrubin@b.stat.purdue.edu> wrote:

>>... The cost of one procedure call could far exceed
>>the cost of a uniform, or even many nonuniform, random variables.
>>If procedure calls have to be made in the generation, I would not
>>be surprised to have a factor of 10 or more.

>This might have been the case with some old languages such as Algol
>60, but not with C as implemented on modern machines.  Even on a CISC
>machine like a VAX, I doubt the C procedure call overhead would be
>more than twice the time for the actual work, and on modern RISC
>machines the ratio will tend to be less.

There are more problems than that; the question is whether
procedure calls have to be made in the internal computation
of the pseudo-random numbers.  In that case, they would require
a huge number of register saves and restores.  The example considered
using a subroutine call for the generation of each random variable.

The code discussed was not making a call for an array, but
one call for each single step in the generation procedure.

>> BTW, I would not
>>use any pseudo-random procedure in any case; XORing the numbers
>>with physical random numbers, which need not be as good as one
>>wants the final result to be, should be done, even if those
>>numbers should be reused, and this should be carefully done.

>I'd agree with this.  It just makes the work inside the procedure
>greater, however, reducing the significance of the procedure call
>overhead.
This is best reduced by using buffers.
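Rubin's buffering remedy can be sketched as follows; the names refill and next_rand are hypothetical, and the generator inside is Marsaglia's CONG from earlier in the thread. One procedure call refills a whole array, and each individual draw is cheap:

```c
#include <stddef.h>

#define BUF_LEN 256

/* Hypothetical buffered generator: the call overhead is paid once
   per BUF_LEN draws instead of once per draw. */
static unsigned long buf[BUF_LEN];
static size_t pos = BUF_LEN;        /* forces a refill on first use */
static unsigned long jcong = 380116160UL;

static void refill(void)
{
    size_t i;
    for (i = 0; i < BUF_LEN; i++)   /* CONG, from Marsaglia's post */
        buf[i] = (jcong = 69069UL * jcong + 1234567UL);
    pos = 0;
}

static unsigned long next_rand(void)
{
    if (pos == BUF_LEN)
        refill();                   /* the only procedure call, amortized */
    return buf[pos++];
}
```

This is also the natural place to XOR in physical random numbers, as Rubin suggests: mix them into the buffer during refill.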

>>>>An idiot-proof language is only for idiots.

>>>No, if such a thing existed, it would also reduce the occasional
>>>lapses of non-idiots.  Programmers are fallible; if tools can prevent
>>>some common errors (even if it means doubling the computation time),
>>>then those tools are worth using.

>>On the contrary, these tools can cause the programmer to make
>>errors which cannot be easily detected.

>Although such errors caused by use of high-level languages are
>possible, they are minor compared to the numerous ways one can go
>wrong when writing assembly-language programs (eg, failing to be
>completely consistent in conventions for whether procedures do or do
>not save register contents).

It depends on how much is done; saving dozens of registers is
certainly expensive.  That is the right order of magnitude, as the
number of registers on many machines runs from 32 to 256, and
possibly more.  What should be used very often instead of calls,
where transfer of control is needed, are jumps with return, or
even quite liberal use of gotos.

>>The current design of
>>computers, driven by the languages, has increased the time to
>>do multiple precision arithmetic by a factor of possibly 100
>>relative to what it would have been if "old" architecture was
>>in place.

>I find this hard to believe.  It's true that high level languages
>provide no way to access operations such as 32-bit by 32-bit to 64-bit
>multiplication, but that would just force you to implement multiple
>precision arithmetic with 16-bit chunks rather than 32-bit chunks, for
>a slowdown of "only" a factor of four.

It is 16-bit chunks instead of the size chunks the floating processors
can handle, and also the much smaller parallelism and far slower
multipliers, as well as the problem of handling the much larger
number of registers.  What will fit in registers for floating is
likely to involve a large number of memory references in fixed.
There is also likely to be lots of shifting, to extract the smaller
chunks.

I did look into it in detail on the Cray, and I needed to use
20 instructions to get a product of two 48-bit numbers.

>general inefficiencies, it's hard to see how it can be any more than a
>factor of 10 slower.  If you're claiming that the hardware itself
>would be designed differently if people still wrote in assembler, I'd
>say that if anything the effect goes the other way.

At this stage, this would not be the case.  It would probably have
been best if, at the time it was decided to go beyond Fortran,
the versatility of the machine instructions of the day had been
put into the HLL's.  Among the features omitted were replacements
producing lists (NOT structs) of results, and the ability to use
the same hardware entities directly with different "type"
operations.  The simplest procedures, from the standpoint of
computational complexity, for generating non-uniform random variables
from uniform output use bit operations, not floating arithmetic.

The tendency with
>RISC machines has been to simplify the hardware (making it faster) at
>the expense of funny machine languages, which compilers can cope with,
>but which are less convenient for humans writing assembly code.

If we had real RISC, it would not have floating point, as floating
can be fairly efficiently done with fixed point.  Instead, we would
have fixed-point operations yielding multiple results, and flexible
hardware.  Forced normalization floating point arithmetic is quite
difficult to use for multiple precision, or much of anything else.
The IEEE committee was notified of these problems.

--
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
hrubin@stat.purdue.edu         Phone: (765)494-6054   FAX: (765)494-0558

Subject: Re: Random numbers for C: Improvements.
Date: Tue, 19 Jan 1999 03:23:26 GMT
From: dmurdoch@pair.com (Duncan Murdoch)
Message-ID: <36a5f74f.11285673@newshost.uwo.ca>
References: <77veih\$11ts@b.stat.purdue.edu>
Newsgroups: sci.stat.math
Lines: 41

On 18 Jan 1999 08:56:33 -0500, hrubin@b.stat.purdue.edu (Herman Rubin)
wrote:

>In article <36a23371.10418803@newshost.uwo.ca>,
>Duncan Murdoch <dmurdoch@pair.com> wrote:
>>On 17 Jan 1999 13:22:21 -0500, hrubin@b.stat.purdue.edu (Herman Rubin)
>>wrote:

>>On the other hand, if it's programmed in a way that is sometimes
>>incorrect in hard to detect ways, it might invalidate the whole
>>calculation.  That seems like a much higher cost to pay.
>
>This is why it is important that those who understand the mathematics
>of the algorithm do it; I do not believe it pays to have a library
>routine written by one person.  HLL algorithms fail in rather odd
>ways, and algorithms for generating random numbers, unless they have
>huge errors, can only be checked by direct proof of correctness.

Implementations of high-level languages often have bugs in them, but
the language definitions are mostly specific about what will be
calculated from a given input.  That's also true about machine
languages, but it's certainly easier to verify when the language is
closer to the original mathematical specification.

>>>An idiot-proof language is only for idiots.
>
>>No, if such a thing existed, it would also reduce the occasional
>>lapses of non-idiots.  Programmers are fallible; if tools can prevent
>>some common errors (even if it means doubling the computation time),
>>then those tools are worth using.
>
>On the contrary, these tools can cause the programmer to make
>errors which cannot be easily detected.

I'm not sure what you have in mind.  What I had in mind are tools like
strong type checking and optional range checking (which you don't get
in machine languages).  What sort of tools are you thinking of?  Can
you give examples where a tool designed to prevent errors actually
introduces them?

Duncan Murdoch

Subject: Re: Random numbers for C: Improvements.
Date: 19 Jan 1999 11:27:03 -0500
From: hrubin@b.stat.purdue.edu (Herman Rubin)
Message-ID: <782bon\$qjg@b.stat.purdue.edu>
References: <36a5f74f.11285673@newshost.uwo.ca>
Newsgroups: sci.stat.math
Lines: 121

In article <36a5f74f.11285673@newshost.uwo.ca>,
Duncan Murdoch <dmurdoch@pair.com> wrote:
>On 18 Jan 1999 08:56:33 -0500, hrubin@b.stat.purdue.edu (Herman Rubin)
>wrote:

>>In article <36a23371.10418803@newshost.uwo.ca>,
>>Duncan Murdoch <dmurdoch@pair.com> wrote:
>>>On 17 Jan 1999 13:22:21 -0500, hrubin@b.stat.purdue.edu (Herman Rubin)
>>>wrote:

>>>On the other hand, if it's programmed in a way that is sometimes
>>>incorrect in hard to detect ways, it might invalidate the whole
>>>calculation.  That seems like a much higher cost to pay.

>>This is why it is important that those who understand the mathematics
>>of the algorithm do it; I do not believe it pays to have a library
>>routine written by one person.  HLL algorithms fail in rather odd
>>ways, and algorithms for generating random numbers, unless they have
>>huge errors, can only be checked by direct proof of correctness.

>Implementations of high-level languages often have bugs in them, but
>the language definitions are mostly specific about what will be
>calculated from a given input.  That's also true about machine
>languages, but it's certainly easier to verify when the language is
>closer to the original mathematical specification.

The problems are rather that the HLLs are not designed around what
the programmer can think of.

>>>>An idiot-proof language is only for idiots.

>>>No, if such a thing existed, it would also reduce the occasional
>>>lapses of non-idiots.  Programmers are fallible; if tools can prevent
>>>some common errors (even if it means doubling the computation time),
>>>then those tools are worth using.

>>On the contrary, these tools can cause the programmer to make
>>errors which cannot be easily detected.

>I'm not sure what you have in mind.  What I had in mind are tools like
>strong type checking and optional range checking (which you don't get
>in machine languages).

Can we afford range checking?  And range checking could be added
to a machine language program; making it mandatory at least triples
the number of instructions needed for a memory reference, and one
of those is a conditional transfer.  If we want it, it should be
put in as a hardware interrupt in a more complicated instruction,
where the efficiency loss will only be in the additional arguments.
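Rubin's cost estimate can be made concrete in source form; checked_get below is a hypothetical stand-in for the code a compiler emits around each reference when range checking is on, not anything from the thread:

```c
#include <stdio.h>
#include <stdlib.h>

static unsigned long t[256];

/* What optional range checking expands to: a compare and a
   conditional branch wrapped around the bare load t[i], roughly
   tripling the instructions per memory reference. */
static unsigned long checked_get(size_t i)
{
    if (i >= 256) {                 /* the extra compare + branch */
        fprintf(stderr, "index %lu out of range\n", (unsigned long)i);
        abort();
    }
    return t[i];                    /* the reference itself */
}
```

With hardware support of the kind Rubin describes, the compare and branch would disappear into the memory-reference instruction and trap only on violation.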

Basic random tools should be as a bit stream, not as floating
numbers, for many reasons.  If given a type, it should be integer,
or fixed point "real", not available in any of the common languages.
Floating point CANNOT do the job without major complications.
Efficient conversion is not available in most languages, and is
made worse with forced normalization.  Without that, efficient
conversion is much easier, and multiple precision is much easier.
This hardware defect was promoted by the languages; the operations
needed did not occur in the languages, and thus those who looked
at the programs being written did not recognize that they were
considered useful.

>What sort of tools are you thinking of?

Badly needed tools are more operations, not functions.  The power
operation is a good example; even the possibility of using types
in C++ does not do the job.  Good Fortran compilers expanded small
integer exponents at compile time, and some even used this plus the
square root operation when the exponent was an integer plus .5.  The C
power function always uses ln and exp.  Also, pack and unpack are
needed, not what occurs in the languages.  And even the frexp
function in C shows the inefficiency of not having a list of
results; the insistence on a store may more than double the cost
of using the operation.
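The compile-time expansion credited here to good Fortran compilers corresponds to exponentiation by squaring, which needs only multiplies; a minimal sketch (ipow is my name for it, not a standard function):

```c
/* Exponentiation by squaring: the multiply-only expansion a good
   compiler applies to small integer exponents, instead of routing
   x**n through ln and exp as a generic pow() does. */
static double ipow(double x, unsigned n)
{
    double r = 1.0;
    while (n) {
        if (n & 1) r *= x;   /* fold in the low bit of the exponent */
        x *= x;              /* square for the next bit */
        n >>= 1;
    }
    return r;
}
```

For a fixed small n the loop unrolls to two or three multiplies, e.g. x*x*x for n == 3, which is both faster and exact for values where ln/exp would round.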

One can have very complicated instructions, and still the hardware
efficiency of RISC, by putting much of the decoding in the unit
where the instruction is transferred.  There were machines where
the first four bits were all that the instruction decoder looked
at, while certain other parts of the CPU looked at other bits
in parallel with this.

>Can
>you give examples where a tool designed to prevent errors actually
>introduces them?

One good example of this occurs in calls to procedures generating
random numbers.  As these calls do not have any arguments, or
the arguments are fixed locations, an "optimizing" compiler will
take them out of a loop!

The elaborate unnatural precedence structure in languages such as
C leads to errors.  I have run into this with the precedence of
"+" and "<<" as "<<" is used for multiplication by a power of 2.
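The precedence trap with "+" and "<<" is easy to reproduce; a small illustration (my example, not from the thread):

```c
/* "<<" binds more loosely than "+", so a shift intended as
   multiplication by a power of 2 silently absorbs a following
   addition. */
unsigned shift_demo(unsigned x)
{
    unsigned wrong = x << 2 + 1;    /* parses as x << (2 + 1) */
    unsigned right = (x << 2) + 1;  /* the intended 4*x + 1 */
    return wrong - right;           /* nonzero when they differ */
}
```

For x == 3 the unparenthesized form gives 3 << 3 == 24 rather than the intended 13; modern compilers warn about this, but the expression is silently legal.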

It was a real problem to get C to respect the users' parentheses.
Failure to do so can cause major loss of precision.  Some modifications
of C allow binary floating point numbers; it is rare that one should
use a decimal floating point number in a subroutine.  This has only
recently been introduced in C; it was possible to do this with an
illegal cast, but on little-endian machines, this was not easy.
--
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
hrubin@stat.purdue.edu         Phone: (765)494-6054   FAX: (765)494-0558

Subject: Re: Random numbers for C: Improvements.
Date: Wed, 20 Jan 1999 02:31:55 GMT
From: dmurdoch@pair.com (Duncan Murdoch)
Message-ID: <36a63ccd.18681450@newshost.uwo.ca>
References: <782bon\$qjg@b.stat.purdue.edu>
Newsgroups: sci.stat.math
Lines: 43

On 19 Jan 1999 11:27:03 -0500, hrubin@b.stat.purdue.edu (Herman Rubin)
wrote:

>>I'm not sure what you have in mind.  What I had in mind are tools like
>>strong type checking and optional range checking (which you don't get
>>in machine languages).
>
>Can we afford range checking?  And range checking could be added
>to a machine language program; making it mandatory at least triples
>the number of instructions needed for a memory reference, and one
>of those is a conditional transfer.  If we want it, it should be
>put in as a hardware interrupt in a more complicated instruction,
>where the efficiency loss will only be in the additional arguments.

I think Java is the only language I've come across that makes range
checking mandatory.  Most implementations make it optional:  you
supply the same source code, the compiler produces different machine
code depending on how you set the option.  That could possibly be done
in an assembler, but I've never heard of anyone trying it.

The normal use for range checking is in debugging.  If you have to
change your source code between debugging and running, there's a big
chance of introducing new bugs.

>The elaborate unnatural precedence structure in languages such as
>C leads to errors.  I have run into this with the precedence of
>"+" and "<<" as "<<" is used for multiplication by a power of 2.

Someone else will have to argue for C.  I think it's hardly better
than assembler, myself.

>It was a real problem to get C to respect the users' parentheses.

I think it's pretty common for compilers to rearrange expressions to
make them more convenient to calculate, under the incorrect assumption
that mathematically equivalent expressions will generate equal
results.  But all languages that I know (including C) provide ways of
forcing the order of evaluation:  just put the things that need to be
done first in an earlier statement.  This is exactly the way you do it
in assembler.
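The technique of forcing order with separate statements matters most in floating point, where addition is not associative; a small illustration with hypothetical helper names:

```c
/* Binding each step to a variable in its own statement pins the
   evaluation order, instead of hoping the compiler honors the
   grouping inside one large expression. */
double sum_small_last(double big, double s1, double s2)
{
    double small_sum = s1 + s2;     /* combine the small terms first */
    return (big + small_sum) - big;
}

double sum_small_first(double big, double s1, double s2)
{
    double t = big + s1;            /* s1 is rounded away at this scale */
    return (t + s2) - big;
}
```

With big = 1e16 and s1 = s2 = 1.0, the first function returns 2.0 and the second returns 0.0: mathematically equivalent groupings, different results, which is exactly why the order sometimes must be forced.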

Duncan Murdoch

Subject: Re: Random numbers for C: Improvements.
Date: Sun, 17 Jan 1999 14:33:32 -0500
From: dwnoon@compuserve.com (David W. Noon)
Message-ID: <cUwZzZKf1ety-pn2-vB2YCWNn5fgR@mfs-pci-bqh-vty163.as.wcom.net>
References: <77t9ot\$15ha@b.stat.purdue.edu>
Newsgroups: sci.math.num-analysis,sci.stat.math,sci.math
Lines: 62

On Sun, 17 Jan 1999 18:22:21, hrubin@b.stat.purdue.edu (Herman Rubin)
wrote:

> In article <99Jan15.151547edt.785@neuron.ai.toronto.edu>,
> >>George Marsaglia <geo@stat.fsu.edu> writes:
>
> >>> #define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[c+178],t[++c])
> >>> #define SWB   (t[c+237]=(x=t[c+15])-(y=t[c]+(x<y)),t[++c])
>
>
> >In article <790f49x60.fsf@poole.statgen.ncsu.edu>,
> >Dmitri Zaykin <zaykin@statgen.ncsu.edu> wrote:
>
> >>Shouldn't these be
>
> >>#define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[c+179], ++c)
> >>#define SWB   (t[c+237]=(x=t[c+15])-(y=t[c+1]+(x<y)), ++c)
>
>
> >This doesn't work either.  I believe that it is undefined whether the
> >comparison x<y uses the new or the old value of x.  The SHR3 macro
> >in the original source also suffers from this flaw.
>
> >I think one needs to face up to an unpleasant truth:  The #define
> >facility of C was poorly designed, and is incapable in general of
> >supporting the definition of "in-line" procedures.  It is far better
> >to simply write ordinary C procedures, and accept the fairly small
>
> I think it should be clarified, and probably written out in some
> more detail.  But the procedure call overhead would be comparable
> to the computing cost; C, and all other languages, have such great
> built-in inefficiencies that what is needed is something written
> from the standpoint of mathematics and efficiency.
>
> But even the use of a comparison is a costly operation, if the
> result is to be used promptly.  Any form of random, pseudo-random,
> or quasi-random numbers should be done in good integer form,
> not present now on many machines because of bad design coming
> from bad languages which have thrown out the natural properties
> of computers, and attempts to keep stupid people from making
> mistakes.  An idiot-proof language is only for idiots.

This is very admirable, but it is basically saying that the code
should be in assembler. There goes portability.

I don't mind translating all this stuff into assembler, but anybody
not using an Intel 80486 or better running a 32-bit version of OS/2
will not be able to use my code.

The only other language I know of that has sufficiently powerful macro
facilities to force inlining like C but with better syntax, is
portable across many platforms, and produces efficient object code is
PL/I. If anybody besides me can use such code, I will produce it.

So, who wants it?

Regards

Dave

Subject: Re: Random numbers for C: Improvements.
Date: 18 Jan 1999 00:16:24 GMT
From: zaykin@statgen.ncsu.edu (Dmitri Zaykin)
Message-ID: <5tFXN12Xwb3.7WA43126hM3916@statgen.ncsu.edu>
References: <cUwZzZKf1ety-pn2-vB2YCWNn5fgR@mfs-pci-bqh-vty163.as.wcom.net>
Newsgroups: sci.math.num-analysis,sci.stat.math,sci.math
Lines: 78

dwnoon@compuserve.com (David W. Noon) writes:
> The only other language I know of that has sufficiently powerful macro
> facilities to force inlining like C but with better syntax, is
> portable across many platforms, and produces efficient object code is
> PL/I. If anybody besides me can use such code, I will produce it

Another option is to re-write macros as C++ member functions. If all
the code is in the body of the class or "inline" keyword is explicitly
given, the compiler will attempt to inline the functions.

Below is my version of it and an example of usage. I've put "//"
comments in places where I thought changes to the C code are
necessary. Two advantages of this approach: (1) No
global variables are introduced. (2) It is easy to run several
"independent" random sequences by creating several variables of Rnd
type and seeding them differently (something that would not be
straightforward to do in C).

Dmitri

#include <limits.h> // ULONG_MAX and UCHAR_MAX defined there

class Rnd {
    Rnd() {}
    typedef unsigned long Unlo;
    Unlo z, w, jsr, jcong, t[UCHAR_MAX+1], x, y;
    unsigned char c;
    Unlo znew() { return (z = 36969UL*(z & 65535UL)+(z >> 16)); } // +UL
    Unlo wnew() { return (w = 18000UL*(w & 65535UL)+(w >> 16)); } // +UL
public:
    Rnd (Unlo i1, Unlo i2, Unlo i3, Unlo i4)
            : z(i1), w(i2), jsr(i3), jcong(i4), x(0), y(0), c(0) {
        for(int i=0; i<UCHAR_MAX+1; i++) t[i] = Kiss();
    }
    Unlo Mwc() { return (znew() << 16) + wnew(); }
    Unlo Shr3 () { // was: jsr=(jsr=(jsr=jsr^(jsr<<17))^(jsr>>13))^(jsr<<5)
        jsr=jsr^(jsr<<17);
        jsr=jsr^(jsr>>13);
        return (jsr=jsr^(jsr<<5));
    }
    Unlo Cong() { return (jcong = 69069UL*jcong + 1234567UL); } // +UL
    Unlo Kiss() { return (Mwc() ^ Cong()) + Shr3(); }
    Unlo Swb () { // was: t[c+237]=(x=t[c+15])-(y=t[c]+(x<y)),t[++c]
        x = t[(c+15) & 0xFF];
        t[(c+237) & 0xFF] = x - (y = t[(c+1) & 0xFF] + (x < y));
        return t[++c];
    }
    Unlo Lfib4() { // was: t[c]=t[c]+t[c+58]+t[c+119]+t[c+178],t[++c]
        t[c]=t[c]+t[(c+58) & 0xFF]+t[(c+119) & 0xFF]+t[(c+179) & 0xFF];
        return t[++c];
    }
    double Uni() { return Kiss() * 2.328306e-10; }
    double Vni() { return long(Kiss()) * 4.656613e-10; }
    double operator () () { return Uni(); }
    Unlo operator () (Unlo n) {
        return n == 1 ? 0 : Kiss() / (ULONG_MAX/n + 1);
    }
    double operator () (double Min, double Max) { return Min+Uni()*(Max-Min); }
};

// example of usage

#include <time.h>
#include <iostream.h>

int main()
{
    unsigned i, seed=time(0);
    Rnd rn (seed, 2*seed, 3*seed, 4*seed);

    for(i=0; i<5; i++) cout << rn(5) << endl;      // [0, 1, 2, 3, 4]
    for(i=0; i<5; i++) cout << rn() << endl;       // (0, ..., 1)
    for(i=0; i<5; i++) cout << rn.Vni() << endl;   // (-1, ..., 1)
    for(i=0; i<5; i++) cout << rn(10, 20) << endl; // (10, ..., 20)
    for(i=0; i<5; i++) cout << rn.Lfib4() << endl; // trying Lfib4
    for(i=0; i<5; i++) cout << rn.Swb() << endl;   // trying Swb
}

Subject: Re: Random numbers for C: Improvements.
Date: Mon, 18 Jan 1999 17:07:26 -0500
From: dwnoon@compuserve.com (David W. Noon)
Message-ID: <cUwZzZKf1ety-pn2-NxnsgFONxOGO@localhost>
References: <5tFXN12Xwb3.7WA43126hM3916@statgen.ncsu.edu>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis
Lines: 97

On Mon, 18 Jan 1999 00:16:24, zaykin@statgen.ncsu.edu (Dmitri Zaykin)
wrote:

> dwnoon@compuserve.com (David W. Noon) writes:
> > The only other language I know of that has sufficiently powerful macro
> > facilities to force inlining like C but with better syntax, is
> > portable across many platforms, and produces efficient object code is
> > PL/I. If anybody besides me can use such code, I will produce it
>
> Another option is to re-write macros as C++ member functions. If all
> the code is in the body of the class or "inline" keyword is explicitly
> given, the compiler will attempt to inline the functions.
>
> Below is my version of it and an example of usage. I've put "//"
> comments in places where I thought changes to the C code are
> necessary. Two advantages of this approach: (1) No
> global variables are introduced. (2) It is easy to run several
> "independent" random sequences by creating several variables of Rnd
> type and seeding them differently (something that would not be
> straightforward to do in C).
>
> Dmitri
>
> #include <limits.h> // ULONG_MAX and UCHAR_MAX defined there
>
> class Rnd {
>     Rnd() {}
>     typedef unsigned long Unlo;
>     Unlo z, w, jsr, jcong, t[UCHAR_MAX+1], x, y;
>     unsigned char c;
>     Unlo znew() { return (z = 36969UL*(z & 65535UL)+(z >> 16)); } // +UL
>     Unlo wnew() { return (w = 18000UL*(w & 65535UL)+(w >> 16)); } // +UL
>  public:
>     Rnd (Unlo i1, Unlo i2, Unlo i3, Unlo i4)
>             : z(i1), w(i2), jsr(i3), jcong(i4), x(0), y(0), c(0) {
>         for(int i=0; i<UCHAR_MAX+1; i++) t[i] = Kiss();
>     }
>     Unlo Mwc() { return (znew() << 16) + wnew(); }
>     Unlo Shr3 () { // was: jsr=(jsr=(jsr=jsr^(jsr<<17))^(jsr>>13))^(jsr<<5)
>         jsr=jsr^(jsr<<17);
>         jsr=jsr^(jsr>>13);
>         return (jsr=jsr^(jsr<<5));
>     }
>     Unlo Cong() { return (jcong = 69069UL*jcong + 1234567UL); } // +UL
>     Unlo Kiss() { return (Mwc() ^ Cong()) + Shr3(); }
>     Unlo Swb () { // was: t[c+237]=(x=t[c+15])-(y=t[c]+(x<y)),t[++c]
>         x = t[(c+15) & 0xFF];
>         t[(c+237) & 0xFF] = x - (y = t[(c+1) & 0xFF] + (x < y));
>         return t[++c];
>     }
>     Unlo Lfib4() { // was: t[c]=t[c]+t[c+58]+t[c+119]+t[c+178],t[++c]
>         t[c]=t[c]+t[(c+58) & 0xFF]+t[(c+119) & 0xFF]+t[(c+179) & 0xFF];
>         return t[++c];
>     }
>     double Uni() { return Kiss() * 2.328306e-10; }
>     double Vni() { return long(Kiss()) * 4.656613e-10; }
>     double operator () () { return Uni(); }
>     Unlo operator () (Unlo n) {
>         return n == 1 ? 0 : Kiss() / (ULONG_MAX/n + 1);
>     }
>     double operator () (double Min, double Max) { return Min+Uni()*(Max-Min); }
> };
>
> // example of usage
>
> #include <time.h>
> #include <iostream.h>
>
> int main()
> {
>     unsigned i, seed=time(0);
>     Rnd rn (seed, 2*seed, 3*seed, 4*seed);
>
>     for(i=0; i<5; i++) cout << rn(5) << endl;      // [0, 1, 2, 3, 4]
>     for(i=0; i<5; i++) cout << rn() << endl;       // (0, ..., 1)
>     for(i=0; i<5; i++) cout << rn.Vni() << endl;   // (-1, ..., 1)
>     for(i=0; i<5; i++) cout << rn(10, 20) << endl; // (10, ..., 20)
>     for(i=0; i<5; i++) cout << rn.Lfib4() << endl; // trying Lfib4
>     for(i=0; i<5; i++) cout << rn.Swb() << endl;   // trying Swb
> }

The problem with C++ is that VMT calling mechanisms are almost
invariably slower than the calling mechanisms used by C, PL/I,
FORTRAN, Pascal, etc. [You have an extra parameter (this) to pass,
even if you don't use it, and you have to do a VMT lookup to find the
method.] So, while you do have a way to inline with some elegance, it
can be a chore to get to the inlined code from the main program.

If you want to use C++, use a template instead of a class. It's much
faster, usually, but still usually slower than C, PL/I or FORTRAN --
and _way_ slower than assembler. For PRNG speed _is_ important, as
PRN's are usually generated in bulk.

Regards

Dave

Subject: Re: Random numbers for C: Improvements.
Date: 18 Jan 1999 17:57:49 -0500
From: zaykin@statgen.ncsu.edu (Dmitri Zaykin)
Message-ID: <7sod842aa.fsf@poole.statgen.ncsu.edu>
References: <cUwZzZKf1ety-pn2-NxnsgFONxOGO@localhost>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis
Lines: 18

dwnoon@compuserve.com (David W. Noon) writes:
> The problem with C++ is that VMT calling mechanisms are almost
> invariably slower than the calling mechanisms used by C, PL/I,
> FORTRAN, Pascal, etc. [You have an extra parameter (this) to pass,
> even if you don't use it, and you have to do a VMT lookup to find the
> method.]

Well, in this particular case all that does not apply, since there are
no virtual functions in the code. As C++ FAQ says, "The compiler
creates a v-table for each class that has at least one virtual
function."

(http://www.cerfnet.com/~mpcline/c++-faq-lite/virtual-functions.html)

Regular member functions and overloaded operators are resolved at
compile time (in this case, inlined), so it is as good as C macros.

Dmitri

Subject: Re: Random numbers for C: Improvements.
Date: Tue, 19 Jan 1999 03:39:07 GMT
From: dmurdoch@pair.com (Duncan Murdoch)
Message-ID: <36a3fc77.12606540@newshost.uwo.ca>
References: <cUwZzZKf1ety-pn2-NxnsgFONxOGO@localhost>
Newsgroups: sci.stat.math
Lines: 20

On Mon, 18 Jan 1999 17:07:26 -0500, dwnoon@compuserve.com (David W.
Noon) wrote:

>If you want to use C++, use a template instead of a class. It's much
>faster, usually, but still usually slower than C, PL/I or FORTRAN --
>and _way_ slower than assembler. For PRNG speed _is_ important, as
>PRN's are usually generated in bulk.

...but just as with any other aspect of the program, speed isn't as
important as correctness.

If I can write tricky routines in assembler that are 10 times faster
than clear ones, then in a given length of computational time my Monte
Carlo integration will get an extra half digit of precision.

On the other hand, if I mess up the implementation because I was just
a little bit too clever for my own good, it might be that none of the
digits are right.

Duncan Murdoch

Subject: Re: Random numbers for C: Improvements.
Date: 20 Jan 1999 21:32:39 -0500
From: zaykin@statgen.ncsu.edu (Dmitri Zaykin)
Message-ID: <7sod5s6d4.fsf@poole.statgen.ncsu.edu>
References: <cUwZzZKf1ety-pn2-NxnsgFONxOGO@localhost>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis
Lines: 131

To check if C-macros for these random number generators do indeed
always produce faster, and maybe "less bloated" code than inlined C++
member functions, I did a little experiment with timing/code size
using the GNU compiler (egcs-1.1.1 g++/gcc) on Solaris 2.5.1. With
this compiler, it is clearly not the case.

(1) Code size (conclusion: C++ code smaller)

inlined member functions   C-macros              C-macros
g++ -Winline -O2 -s        g++ -Winline -O2 -s   gcc -Winline -O2 -s
8908                       9820                  9532

(2) Timing in 10 experiments (conclusion: C++ code faster)

inlined member functions   C-macros              C-macros
g++ -Winline -O2 -s        g++ -Winline -O2 -s   gcc -Winline -O2 -s
11330000                   15030000              14500000
11330000                   15040000              14470000
11340000                   15030000              14490000
11340000                   15040000              14520000
11340000                   15030000              14500000
11320000                   15030000              14490000
11340000                   15040000              14480000
11340000                   15030000              14510000
11340000                   15030000              14500000
11340000                   15030000              14500000

//-----------------------------------------------------
// C++ code:

#include <stdio.h>
#include <time.h>
#include <limits.h>

class Rnd {
    Rnd() {}
    typedef unsigned long Unlo;
    typedef unsigned char Uc;
    Unlo z, w, jsr, jcong, t[UCHAR_MAX+1], x, y, a, b;
    unsigned char c;
    Unlo znew() { return (z = 36969UL*(z & 65535UL)+(z >> 16)); }
    Unlo wnew() { return (w = 18000UL*(w & 65535UL)+(w >> 16)); }
public:
    Rnd (Unlo i1, Unlo i2, Unlo i3, Unlo i4, Unlo i5, Unlo i6)
            : z(i1), w(i2), jsr(i3), jcong(i4), x(0), y(0),
              a(i5), b(i6), c(0) {
        for(int i=0; i<UCHAR_MAX+1; i++) t[i] = Kiss();
    }
    Unlo Mwc() { return (znew() << 16) + wnew(); }
    Unlo Shr3 () {
        jsr=jsr^(jsr<<17);
        jsr=jsr^(jsr>>13);
        return (jsr=jsr^(jsr<<5));
    }
    Unlo Cong() { return (jcong = 69069UL*jcong + 1234567UL); }
    Unlo Kiss() { return (Mwc() ^ Cong()) + Shr3(); }
    Unlo Swb () {
        x = t[(Uc)(c+15)];
        t[(Uc)(c+237)] = x - (y = t[(Uc)(c+1)] + (x < y));
        return t[++c];
    }
    Unlo Lfib4() {
        t[c]=t[c]+t[(Uc)(c+58)]+t[(Uc)(c+119)]+t[(Uc)(c+179)];
        return t[++c];
    }
    Unlo Fib() { b=a+b; return (a=b-a); }
    double Uni() { return Kiss() * 2.328306e-10; }
    double Vni() { return long(Kiss()) * 4.656613e-10; }
    double operator () () { return Uni(); }
    Unlo operator () (Unlo n) {
        return n == 1 ? 0 : Kiss() / (ULONG_MAX/n + 1);
    }
    double operator () (double Min, double Max) { return Min+Uni()*(Max-Min); }
};

int main()
{
    unsigned long i, xx=0, seed=time(0);
    long spent;
    Rnd rn (seed, 2*seed, 3*seed, 4*seed, 5*seed, 6*seed);

    spent=clock();
    for(i=0; i<77777777; i++) xx += (rn.Kiss() + rn.Swb());
    printf ("%ld \t", clock()-spent);

    printf("\n");
}

/*****************************************************************/
/* C-macros */

#include <stdio.h>
#include <time.h>
#include <limits.h>

#define znew   (z=36969UL*(z&65535UL)+(z>>16))
#define wnew   (w=18000UL*(w&65535UL)+(w>>16))
#define MWC    ((znew<<16)+wnew )
#define SHR3  (jsr^=(jsr<<17), jsr^=(jsr>>13), jsr^=(jsr<<5))
#define CONG  (jcong=69069UL*jcong+1234567UL)
#define FIB   ((b=a+b),(a=b-a))
#define KISS  ((MWC^CONG)+SHR3)
#define UC    (unsigned char)
#define LFIB4 (c++,t[c]=t[c]+t[UC(c+58)]+t[UC(c+119)]+t[UC(c+178)])
#define SWB   (x = t[UC(c+15)], t[UC(c+237)] = x-(y=t[UC(c+1)]+(x<y)), t[++c])
#define UNI   (KISS*2.328306e-10)
#define VNI   ((long) KISS)*4.656613e-10
typedef unsigned long Un;
static Un z=362436069UL, w=521288629UL, jsr=123456789UL, jcong=380116160UL;
static Un a=224466889UL, b=7584631UL, t[256];
static Un x=0,y=0; static unsigned char c=0;
void settable(Un i1,Un i2,Un i3,Un i4,Un i5, Un i6)
{ int i; z=i1;w=i2,jsr=i3; jcong=i4; a=i5; b=i6;
for(i=0;i<256;i=i+1)  t[i]=KISS;
}

int main()
{
    unsigned long i, xx=0, seed=time(0);
    long spent;

    settable (seed, 2*seed, 3*seed, 4*seed, 5*seed, 6*seed);

    spent=clock();
    for(i=0; i<77777777; i++) xx += (KISS + SWB);
    printf ("%ld \t", spent=clock()-spent);

    printf("\n");
    return 0;
}

Subject: Re: Random numbers for C: Improvements.
Date: 21 Jan 1999 04:40:53 GMT
From: zaykin@statgen.ncsu.edu (Dmitri Zaykin)
Message-ID: <124123g26m11754.1356cjx442854@statgen.ncsu.edu>
References: <cUwZzZKf1ety-pn2-NxnsgFONxOGO@localhost>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis
Lines: 131

To check if C-macros for these random number generators do indeed
always produce faster, and maybe "less bloated" code than inlined C++
member functions, I did a little experiment with timing/code size
using the GNU compiler (egcs-1.1.1 g++/gcc) on Solaris 2.5.1. With
this compiler, it is clearly not the case.

(1) Executable size (conclusion: C++ code smaller)

inlined member functions   C-macros              C-macros
g++ -Winline -O2 -s        g++ -Winline -O2 -s   gcc -Winline -O2 -s
8908                       9820                  9532

(2) Timing in 10 experiments (conclusion: C++ code faster)

inlined member functions   C-macros              C-macros
g++ -Winline -O2 -s        g++ -Winline -O2 -s   gcc -Winline -O2 -s
11330000                   15030000              14500000
11330000                   15040000              14470000
11340000                   15030000              14490000
11340000                   15040000              14520000
11340000                   15030000              14500000
11320000                   15030000              14490000
11340000                   15040000              14480000
11340000                   15030000              14510000
11340000                   15030000              14500000
11340000                   15030000              14500000

//-----------------------------------------------------
// C++ code:

#include <stdio.h>
#include <time.h>
#include <limits.h>

class Rnd {
Rnd() {}
typedef unsigned long Unlo;
typedef unsigned char Uc;
Unlo z, w, jsr, jcong, t[UCHAR_MAX+1], x, y, a, b;
unsigned char c;
Unlo znew() { return (z = 36969UL*(z & 65535UL)+(z >> 16)); }
Unlo wnew() { return (w = 18000UL*(w & 65535UL)+(w >> 16)); }
public:
Rnd (Unlo i1, Unlo i2, Unlo i3, Unlo i4, Unlo i5, Unlo i6)
: z(i1), w(i2), jsr(i3), jcong(i4), x(0), y(0),
a(i5), b(i6), c(0) {
for(int i=0; i<UCHAR_MAX+1; i++) t[i] = Kiss();
}
Unlo Mwc() { return (znew() << 16) + wnew(); }
Unlo Shr3 () {
jsr=jsr^(jsr<<17);
jsr=jsr^(jsr>>13);
return (jsr=jsr^(jsr<<5));
}
Unlo Cong() { return (jcong = 69069UL*jcong + 1234567UL); }
Unlo Kiss() { return (Mwc() ^ Cong()) + Shr3(); }
Unlo Swb () {
x = t[(Uc)(c+15)];
t[(Uc)(c+237)] = x - (y = t[(Uc)(c+1)] + (x < y));
return t[++c];
}
Unlo Lfib4() {
c++;  /* advance c first, so the lags match the LFIB4 macro */
t[c]=t[c]+t[(Uc)(c+58)]+t[(Uc)(c+119)]+t[(Uc)(c+178)];
return t[c];
}
Unlo Fib() { b=a+b; return (a=b-a); }
double Uni() { return Kiss() * 2.328306e-10; }
double Vni() { return long(Kiss()) * 4.656613e-10; }
double operator () () { return Uni(); }
Unlo operator () (Unlo n) {
return n == 1 ? 0 : Kiss() / (ULONG_MAX/n + 1);
}
double operator () (double Min, double Max) { return Min+Uni()*(Max-Min); }
};

int main()
{
unsigned long i, xx=0, seed=time(0);
long spent;
Rnd rn (seed, 2*seed, 3*seed, 4*seed, 5*seed, 6*seed);

spent=clock();
for(i=0; i<77777777; i++) xx += (rn.Kiss() + rn.Swb());
printf ("%ld \t", clock()-spent);

printf("\n");
}

/*****************************************************************/
/* C-macros */

#include <stdio.h>
#include <time.h>
#include <limits.h>

#define znew   (z=36969UL*(z&65535UL)+(z>>16))
#define wnew   (w=18000UL*(w&65535UL)+(w>>16))
#define MWC    ((znew<<16)+wnew )
#define SHR3  (jsr^=(jsr<<17), jsr^=(jsr>>13), jsr^=(jsr<<5))
#define CONG  (jcong=69069UL*jcong+1234567UL)
#define FIB   ((b=a+b),(a=b-a))
#define KISS  ((MWC^CONG)+SHR3)
#define UC    (unsigned char)
#define LFIB4 (c++,t[c]=t[c]+t[UC(c+58)]+t[UC(c+119)]+t[UC(c+178)])
#define SWB   (x = t[UC(c+15)], t[UC(c+237)] = x-(y=t[UC(c+1)]+(x<y)), t[++c])
#define UNI   (KISS*2.328306e-10)
#define VNI   ((long) KISS)*4.656613e-10
typedef unsigned long Un;
static Un z=362436069UL, w=521288629UL, jsr=123456789UL, jcong=380116160UL;
static Un a=224466889UL, b=7584631UL, t[256];
static Un x=0,y=0; static unsigned char c=0;
void settable(Un i1,Un i2,Un i3,Un i4,Un i5, Un i6)
{ int i; z=i1; w=i2; jsr=i3; jcong=i4; a=i5; b=i6;
for(i=0;i<256;i=i+1)  t[i]=KISS;
}

int main()
{
unsigned long i, xx=0, seed=time(0);
long spent;

settable (seed, 2*seed, 3*seed, 4*seed, 5*seed, 6*seed);

spent=clock();
for(i=0; i<77777777; i++) xx += (KISS + SWB);
printf ("%ld \t", clock()-spent);

printf("\n");
return 0;
}


Subject: Re: Random numbers for C: Improvements.
Date: 21 Jan 1999 18:01:49 GMT
From: davis@space.mit.edu (John E. Davis)
Message-ID: <slrn7aeqsa.bkc.davis@aluche.mit.edu>
References: <124123g26m11754.1356cjx442854@statgen.ncsu.edu>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis
Lines: 38

On 21 Jan 1999 04:40:53 GMT, Dmitri Zaykin <zaykin@statgen.ncsu.edu>
wrote:
>(1) Executable size (conclusion: C++ code smaller)
[...]
>(2) Timing in 10 experiments (conclusion: C++ code faster)

Using your code (t.cc and t.c), my conclusion is the opposite:

Script started on Thu Jan 21 12:54:02 1999
$ gcc -Winline -O2 -s t.c
$ ls -l a.out
-rwxr-xr-x   1 davis    asc         8752 Jan 21 12:54 a.out*
$ ./a.out
21700000
$ g++ -Winline -O2 -s t.cc
t.cc: In method `Rnd::Rnd(long unsigned int, long unsigned int, long unsigned int, long unsigned int, long unsigned int, long unsigned int)':
t.cc:29: warning: can't inline call to `long unsigned int Rnd::Kiss()'
t.cc:20: warning: called from here
$ ls -l a.out
-rwxr-xr-x   1 davis    asc         8880 Jan 21 12:55 a.out*
$ ./a.out
25250000
$ uname -a
SunOS wiwaxia 5.6 Generic_105181-08 sun4u sparc SUNW,Ultra-2
$ gcc -v
gcc version 2.8.1
$ g++ -v
gcc version 2.8.1
$ exit
exit

script done on Thu Jan 21 12:56:40 1999
--
John E. Davis                   Center for Space Research/AXAF Science Center
617-258-8119                    One Hampshire St., Building NE80-6019
http://space.mit.edu/~davis     Cambridge, MA  02139-4307

Subject: Re: Random numbers for C: Improvements.
Date: 21 Jan 1999 16:19:55 -0500
From: zaykin@statgen.ncsu.edu (Dmitri Zaykin)
Message-ID: <7btjsuxvo.fsf@poole.statgen.ncsu.edu>
References: <slrn7aeqsa.bkc.davis@aluche.mit.edu>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis
Lines: 11

davis@space.mit.edu (John E. Davis) writes:
> Using your code (t.cc and t.c), my conclusion is the opposite:
>
> gcc version 2.8.1

I used the egcs-1.1.1 compiler. It is an improvement over gcc-2.8.1:

http://egcs.cygnus.com/egcs-1.1/egcs-1.1.1.html

Dmitri

Subject: Re: Random numbers for C: Improvements.
Date: 18 Jan 1999 21:12:35 GMT
From: jmccarty@sun1307.spd.dsccc.com (Mike McCarty)
Message-ID: <780843$ikj$1@relay1.dsccc.com>
References: <cUwZzZKf1ety-pn2-vB2YCWNn5fgR@mfs-pci-bqh-vty163.as.wcom.net>
Newsgroups: sci.math.num-analysis,sci.stat.math,sci.math
Lines: 23

In article <cUwZzZKf1ety-pn2-vB2YCWNn5fgR@mfs-pci-bqh-vty163.as.wcom.net>,
David W. Noon <dwnoon@compuserve.com> wrote:

[snip]

)This is very admirable, but it is basically saying that the code
)should be in assembler. There goes portability.
)
)I don't mind translating all this stuff into assembler, but anybody
)not using an Intel 80486 or better running a 32-bit version of OS/2
)will not be able to use my code.
)
)The only other language I know of that has sufficiently powerful macro
)facilities to force inlining like C but with better syntax, is
)portable across many platforms, and produces efficient object code is
)PL/I. If anybody besides me can use such code, I will produce it

You haven't heard of C++?
--
----
char *p="char *p=%c%s%c;main(){printf(p,34,p,34);}";main(){printf(p,34,p,34);}
This message made from 100% recycled bits.
I don't speak for Alcatel      <- They make me say that.

Subject: Re: Random numbers for C: Improvements.
Date: 18 Jan 1999 19:27:13 -0500
From: hrubin@odds.stat.purdue.edu (Herman Rubin)
Message-ID: <780jh1$11ii@odds.stat.purdue.edu>
References: <780843$ikj$1@relay1.dsccc.com>
Newsgroups: sci.math.num-analysis,sci.stat.math,sci.math
Lines: 45

In article <780843$ikj$1@relay1.dsccc.com>,
Mike McCarty <jmccarty@sun1307.spd.dsccc.com> wrote:
>In article <cUwZzZKf1ety-pn2-vB2YCWNn5fgR@mfs-pci-bqh-vty163.as.wcom.net>,
>David W. Noon <dwnoon@compuserve.com> wrote:

[snip]

>)This is very admirable, but it is basically saying that the code
>)should be in assembler. There goes portability.

Decent random number code cannot be portable.  Anything which involves
bit handling, as this should, suffers from the problem.  And besides
the problem of portability of code, there is the portability of results.
The order of generating two non-uniform random numbers, or the change
of size of a buffer, can do this with ease.

>)I don't mind translating all this stuff into assembler, but anybody
>)not using an Intel 80486 or better running a 32-bit version of OS/2
>)will not be able to use my code.

>)The only other language I know of that has sufficiently powerful macro
>)facilities to force inlining like C but with better syntax, is
>)portable across many platforms, and produces efficient object code is
>)PL/I. If anybody besides me can use such code, I will produce it

I doubt that PL/I can do the job.  But I would welcome trying it.

Let me give you an important tool; this is to generate the distance
to the next bit in a stream of random bits, move the pointer, and
interrupt if the stream becomes empty.  This is a TOOL; for some
purposes, it will be used more frequently than additions.

To illustrate what one can do with it, suppose we want to generate
random variables with density 4*x*(1-x) on (0,1).  Generate two
random variables A and B as above.  Let K be A/2, rounded up, and
let N be K+B.  Then change the N-th bit to the right of the binary
point of a uniform (0,1) random variable X to the opposite of the
K-th to get the result.  BTW, I believe you can see that to do this
with a floating point representation of X is quite difficult.

--
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
hrubin@stat.purdue.edu         Phone: (765)494-6054   FAX: (765)494-0558

Subject: Re: Random numbers for C: Improvements.
Date: Tue, 19 Jan 1999 18:09:39 -0500
From: dwnoon@compuserve.com (David W. Noon)
Message-ID: <cUwZzZKf1ety-pn2-deRYWonQPOXL@localhost>
References: <780jh1$11ii@odds.stat.purdue.edu>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis
Lines: 86

On Tue, 19 Jan 1999 00:27:13, hrubin@odds.stat.purdue.edu (Herman
Rubin) wrote:

> >In article <cUwZzZKf1ety-pn2-vB2YCWNn5fgR@mfs-pci-bqh-vty163.as.wcom.net>,
> >David W. Noon <dwnoon@compuserve.com> wrote:
>
> [snip]
>
> >)This is very admirable, but it is basically saying that the code
> >)should be in assembler. There goes portability.
>
> Decent random number code cannot be portable.  Anything which involves
> bit handling, as this should, suffers from the problem.  And besides
> the problem of portability of code, there is the portability of results.
> The order of generating two non-uniform random numbers, or the change
> of size of a buffer, can do this with ease.

I meant portable across software platforms, not hardware.

I would hope your code executes on any 32-bit Intel system the same
way, whether that system is running Windows NT, OS/2, Linux, any UNIX
variant or whatever.

> >)I don't mind translating all this stuff into assembler, but anybody
> >)not using an Intel 80486 or better running a 32-bit version of OS/2
> >)will not be able to use my code.
>
> >)The only other language I know of that has sufficiently powerful macro
> >)facilities to force inlining like C but with better syntax, is
> >)portable across many platforms, and produces efficient object code is
> >)PL/I. If anybody besides me can use such code, I will produce it
>
> I doubt that PL/I can do the job.  But I would welcome trying it.

It can certainly handle the code that you have posted so far, and with
ease.

> Let me give you an important tool; this is to generate the distance
> to the next bit in a stream of random bits,

Surely the distance to the next bit is always 1 position. Did you mean
something else?

I think you meant the next "interesting" bit, but I don't know what

> move the pointer, and
> interrupt if the stream becomes empty.  This is a TOOL; for some
> purposes, it will be used more frequently than additions.

But I live outside North America. I'm not allowed to use such tools.
[At least I _think_ you are alluding to cryptography.]

> To illustrate what one can do with it, suppose we want to generate
> random variables with density 4*x*(1-x) on (0,1).  Generate two
> random variables A and B as above.

You mean using the code you posted, or some equivalent?

> Let K be A/2, rounded up, and

Rounded up to an integer? Since the extrema of your distribution were
0 and 1, this would make the integer K almost always 1, and very
occasionally 0 (when A = 0).

> let N be K+B.

Is N also to be an integer? If not, what does "N-th bit to the right"
mean? If so, since B is potentially fractional is N rounded, forced up
or truncated to an integer value?

> Then change the N-th bit to the right of the binary
> point of a uniform (0,1) random variable X to the opposite of the
> K-th to get the result.  BTW, I believe you can see that to do this
> with a floating point representation of X is quite difficult.

Yes, it means denormalizing the floating point number, possibly
causing underflow if X is very small.

Do you want all of this implemented too? The little-endian byte
arrangement of the Intel could make it rather ugly, but it is
possible.

Regards

Dave

Subject: Re: Random numbers for C: Improvements.
Date: 19 Jan 1999 20:30:00 -0500
From: hrubin@odds.stat.purdue.edu (Herman Rubin)
Message-ID: <783bio$12vk@odds.stat.purdue.edu>
References: <cUwZzZKf1ety-pn2-deRYWonQPOXL@localhost>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis
Lines: 117

In article <cUwZzZKf1ety-pn2-deRYWonQPOXL@localhost>,
David W. Noon <dwnoon@compuserve.com> wrote:
>On Tue, 19 Jan 1999 00:27:13, hrubin@odds.stat.purdue.edu (Herman
>Rubin) wrote:

>> >In article <cUwZzZKf1ety-pn2-vB2YCWNn5fgR@mfs-pci-bqh-vty163.as.wcom.net>,
>> >David W. Noon <dwnoon@compuserve.com> wrote:

[snip]

>> >)This is very admirable, but it is basically saying that the code
>> >)should be in assembler. There goes portability.

>> Decent random number code cannot be portable.  Anything which involves
>> bit handling, as this should, suffers from the problem.  And besides
>> the problem of portability of code, there is the portability of results.
>> The order of generating two non-uniform random numbers, or the change
>> of size of a buffer, can do this with ease.

>I meant portable across software platforms, not hardware.

>I would hope your code executes on any 32-bit Intel system the same
>way, whether that system is running Windows NT, OS/2, Linux, any UNIX
>variant or whatever.

It is by no means clear that HLL generated code will do this, as the
machine instructions involved are different.  I cannot see that there
is any more problem with the current assemblers, or with more
intelligently designed ones.  The design changes should be for
efficiency on the part of the programmer, not the machine.

>> >)I don't mind translating all this stuff into assembler, but anybody
>> >)not using an Intel 80486 or better running a 32-bit version of OS/2
>> >)will not be able to use my code.

>> >)The only other language I know of that has sufficiently powerful macro
>> >)facilities to force inlining like C but with better syntax, is
>> >)portable across many platforms, and produces efficient object code is
>> >)PL/I. If anybody besides me can use such code, I will produce it

>> I doubt that PL/I can do the job.  But I would welcome trying it.

>It can certainly handle the code that you have posted so far, and with
>ease.

>> Let me give you an important tool; this is to generate the distance
>> to the next bit in a stream of random bits,

What was meant was the distance to the next one in a stream of
random bits.  In other words, generate a geometric (.5) random
variable using only the number of bits required by information.

>Surely the distance to the next bit is always 1 position. Did you mean
>something else?

>I think you meant the next "interesting" bit, but I don't know what

>> move the pointer, and
>> interrupt if the stream becomes empty.  This is a TOOL; for some
>> purposes, it will be used more frequently than additions.

>But I live outside North America. I'm not allowed to use such tools.
>[At least I _think_ you are alluding to cryptography.]

What does this have to do with cryptography?  There are many uses for
this instruction in the computer literature, although none of the others
I have seen would use it that much.  It is used for locating the next
record, where the positions of records are stored in single bits.

>> To illustrate what one can do with it, suppose we want to generate
>> random variables with density 4*x*(1-x) on (0,1).  Generate two
>> random variables A and B as above.

>You mean using the code you posted, or some equivalent?

>> Let K be A/2, rounded up, and

>Rounded up to an integer? Since the extrema of your distribution were
>0 and 1, this would make the integer K almost always 1, and very
>occasionally 0 (when A = 0).

With the correction, A and B are positive integers; there is no
real problem.

>> let N be K+B.

>Is N also to be an integer? If not, what does "N-th bit to the right"
>mean? If so, since B is potentially fractional is N rounded, forced up
>or truncated to an integer value?

>> Then change the N-th bit to the right of the binary
>> point of a uniform (0,1) random variable X to the opposite of the
>> K-th to get the result.  BTW, I believe you can see that to do this
>> with a floating point representation of X is quite difficult.

>Yes, it means denormalizing the floating point number, possibly
>causing underflow if X is very small.

In fact, it cannot, unless the mantissa of the floating point number
is exceptionally long.  On most machines, it will have to be done
in the integer registers, anyhow.  And they may not be long enough;
but one does not always need that accurate random numbers.  At worst,
a bit past the end of the register will be targeted to be 1, and the
mantissa bits all cleared.

>Do you want all of this implemented too? The little-endian byte
>arrangement of the Intel could make it rather ugly, but it is
>possible.

It will not be as bad as that, but infinite precision methods are
not well suited to short registers.
--
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
hrubin@stat.purdue.edu         Phone: (765)494-6054   FAX: (765)494-0558

Subject: Re: Random numbers for C: Improvements.
Date: Tue, 19 Jan 1999 18:09:38 -0500
From: dwnoon@compuserve.com (David W. Noon)
Message-ID: <cUwZzZKf1ety-pn2-FJ7g8dchLbd1@localhost>
References: <780843\$ikj\$1@relay1.dsccc.com>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis
Lines: 48

On Mon, 18 Jan 1999 21:12:35, jmccarty@sun1307.spd.dsccc.com (Mike
McCarty) wrote:

> In article <cUwZzZKf1ety-pn2-vB2YCWNn5fgR@mfs-pci-bqh-vty163.as.wcom.net>,
> David W. Noon <dwnoon@compuserve.com> wrote:
>
> [snip]
>
> )This is very admirable, but it is basically saying that the code
> )should be in assembler. There goes portability.
> )
> )I don't mind translating all this stuff into assembler, but anybody
> )not using an Intel 80486 or better running a 32-bit version of OS/2
> )will not be able to use my code.
> )
> )The only other language I know of that has sufficiently powerful macro
> )facilities to force inlining like C but with better syntax, is
> )portable across many platforms, and produces efficient object code is
> )PL/I. If anybody besides me can use such code, I will produce it
>
> You haven't heard of C++?

Of course I have. I earn my living writing C++, including the
reprehensible MFC when the client demands it.

C++ just isn't ideally suited to the task under current
implementations of the language. I have already suggested (in another
post in this thread) templates as a substitute for method calls, to
try and get some speed improvement, but generally C++ compilers
produce code that is too bloated and too slow for high-volume number
crunching. [You might note the expression "produces efficient object
code" in my original post, which you have quoted above.]

You might ask yourself why "low level" coders use their C/C++
compilers in C mode. It is definitely not for more elegant syntax!

FORTRAN doesn't have a macro facility in present implementations.
Neither does Pascal or ALGOL 68. I have compilers for all of these,
but they won't do slick inlining.

So, what does that leave? COBOL? No, it lacks macros too.

That leaves the systems programmer's two faithful friends: PL/I and
assembler.

Regards

Dave

Subject: Re: Random numbers for C: Improvements.
Date: 20 Jan 1999 07:49:11 GMT
From: zaykin@statgen.ncsu.edu (Dmitri Zaykin)
Message-ID: <7841pn$1ha$1@uni00nw.unity.ncsu.edu>
References: <cUwZzZKf1ety-pn2-FJ7g8dchLbd1@localhost>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis
Lines: 15

In sci.stat.math David W. Noon <dwnoon@compuserve.com> wrote:
> I have already suggested (in another post in this thread) templates
> as a substitute for method calls, to try and get some speed
> improvement (...)

There are no calls for inlined methods.

Templates can be an alternative to C-macros in terms of
genericity, not inlining. C++ templates do not have to be
inlined. They do not offer anything in terms of speed
except as a substitute for C++ inheritance (again, as a
faster tool for doing generic stuff) which is not an
issue here.

Dmitri

Subject: Re: Random numbers for C: Improvements.
Date: 18 Jan 1999 21:11:32 GMT
From: jmccarty@sun1307.spd.dsccc.com (Mike McCarty)
Message-ID: <780824$ie4$1@relay1.dsccc.com>
References: <77t9ot$15ha@b.stat.purdue.edu>
Newsgroups: sci.math,sci.math.num-analysis,sci.stat.math
Lines: 51

In article <77t9ot$15ha@b.stat.purdue.edu>,
Herman Rubin <hrubin@b.stat.purdue.edu> wrote:

[snip]

)I think it should be clarified, and probably written out in some
)more detail.  But the procedure call overhead would be comparable
)to the computing cost; C, and all other languages, have such great
)built-in inefficiencies that what is needed is something written
)from the standpoint of mathematics and efficiency.

What do you mean by this? *No* language has built in inefficiencies. I
defy you to find anywhere in the ANSI/ISO C definition *any* statement
about something being no more than so-and-so efficient.

Is assembler a language?

The efficiency of emitted code is a matter of quality of
implementation, not of language. I wrote a compiler several years ago,
intended for sale (but the project was cancelled) which automatically
inlined procedures (or functions) which were called only once. The
first target language for sale (it was to have different front ends)
was C. C++ has a specific means for requesting that a function be
expanded inline in every invocation.

Now some *processors* are more efficient than others, but I don't think
you mean that.

)But even the use of a comparison is a costly operation, if the
)result is to be used promptly.  Any form of random, pseudo-random,
)or quasi-random numbers should be done in good integer form,

An amazing statement, given that there are machines available today
which do floating arithmetic faster than integer.

)not present now on many machines because of bad design coming
)from bad languages which have thrown out the natural properties
)of computers, and attempts to keep stupid people from making
)mistakes.  An idiot-proof language is only for idiots.

This last statement makes no sense to me at all. There are no
"idiot-proof languages". Some languages have more checking built into
them, but this in no way should affect the quality of the generated
code.

Mike
--
----
char *p="char *p=%c%s%c;main(){printf(p,34,p,34);}";main(){printf(p,34,p,34);}
This message made from 100% recycled bits.
I don't speak for Alcatel      <- They make me say that.

Subject: Re: Random numbers for C: Improvements.
Date: 18 Jan 1999 19:14:30 -0500
From: hrubin@odds.stat.purdue.edu (Herman Rubin)
Message-ID: <780ip6$11h8@odds.stat.purdue.edu>
References: <780824$ie4$1@relay1.dsccc.com>
Newsgroups: sci.math,sci.math.num-analysis,sci.stat.math
Lines: 88

In article <780824$ie4$1@relay1.dsccc.com>,
Mike McCarty <jmccarty@sun1307.spd.dsccc.com> wrote:
>In article <77t9ot$15ha@b.stat.purdue.edu>,
>Herman Rubin <hrubin@b.stat.purdue.edu> wrote:

[snip]

>)I think it should be clarified, and probably written out in some
>)more detail.  But the procedure call overhead would be comparable
>)to the computing cost; C, and all other languages, have such great
>)built-in inefficiencies that what is needed is something written
>)from the standpoint of mathematics and efficiency.

>What do you mean by this? *No* language has built in inefficiencies. I
>defy you to find anywhere in the ANSI/ISO C definition *any* statement
>about something being no more than so-and-so efficient.

A language has a built in inefficiency in every situation in which
the optimal use of machine instructions cannot be addressed in that
language.  It has inefficiencies in all situations where workarounds
are needed.

>Is assembler a language?

Currently, a very difficult one to use.  This is because the computer
people have deliberately tried to make it so.  There is no reason why
a user cannot reconfigure the "mnemonics" (they are generally atrocious)
so that they can be easily written.

>The efficiency of emitted code is a matter of quality of
>implementation, not of language. I wrote a compiler several years ago,
>intended for sale (but the project was cancelled) which automatically
>inlined procedures (or functions) which were called only once. The
>first target language for sale (it was to have different front ends)
>was C. C++ has a specific means for requesting that a function be
>expanded inline in every invocation.

More than this is needed.  The implementations do not take into
account what the intelligent programmer knows; they will not allow
this input.

>Now some *processors* are more efficient than others, but I don't think
>you mean that.

No, I do not.

>)But even the use of a comparison is a costly operation, if the
>)result is to be used promptly.  Any form of random, pseudo-random,
>)or quasi-random numbers should be done in good integer form,

>An amazing statement, given that there are machines available today
>which do floating arithmetic faster than integer.

This is a design catastrophe, partly due to the bad languages.  There
is no such thing as floating point architecture, but architecture
which does a combination of operations for badly designed floating
point arithmetic.  With almost no more cost, this arithmetic could
be made available for fixed point.  BTW, there is great need for
fixed point non-integer arithmetic.  And if increased precision is
needed, it is necessary to emulate fixed point with floating, which
is quite clumsy.

In addition, the fundamental random input should be a bit stream.
Try using such.  The hardware to do it in reasonable time is not
present on any machine.

>)not present now on many machines because of bad design coming
>)from bad languages which have thrown out the natural properties
>)of computers, and attempts to keep stupid people from making
>)mistakes.  An idiot-proof language is only for idiots.

>This last statement makes no sense to me at all. There are no
>"idiot-proof languages". Some languages have more checking built into
>them, but this in no way should affect the quality of the generated
>code.

I suggest that this checking be made strictly voluntary.  I know
what I want the computer to do, and how to use machine instructions
to do it, IF I can get a list of the machine instructions.  The
language has no such capability.

Checking is extremely expensive.  It is now largely a matter of
hardware; user hardware-style interrupts are needed.
--
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
hrubin@stat.purdue.edu         Phone: (765)494-6054   FAX: (765)494-0558

Subject: Re: Random numbers for C: Improvements.
Date: 21 Jan 1999 19:07:19 GMT
From: jmccarty@sun1307.spd.dsccc.com (Mike McCarty)
Message-ID: <787tt7$gr$1@relay1.dsccc.com>
References: <780ip6$11h8@odds.stat.purdue.edu>
Newsgroups: sci.math,sci.math.num-analysis,sci.stat.math
Lines: 34

In article <780ip6$11h8@odds.stat.purdue.edu>,
Herman Rubin <hrubin@odds.stat.purdue.edu> wrote:
)In article <780824$ie4$1@relay1.dsccc.com>,
)Mike McCarty <jmccarty@sun1307.spd.dsccc.com> wrote:
)>In article <77t9ot$15ha@b.stat.purdue.edu>,
)>Herman Rubin <hrubin@b.stat.purdue.edu> wrote:

[snip]

Much of what you wrote here struck me as unsubstantiated opinion. None
of it appeared to be worth attempting to argue over.

)>This last statement makes no sense to me at all. There are no
)>"idiot-proof languages". Some languages have more checking built into
)>them, but this in no way should affect the quality of the generated
)>code.
)
)I suggest that this checking be made strictly voluntary.  I know
)what I want the computer to do, and how to use machine instructions
)to do it, IF I can get a list of the machine instructions.  The
)language has no such capability.
)
)Checking is extremely expensive.  It is now largely a matter of
)hardware; user hardware-style interrupts are needed.

The checking I had in mind costs *nothing* during the execution of the
program.

Mike
--
----
char *p="char *p=%c%s%c;main(){printf(p,34,p,34);}";main(){printf(p,34,p,34);}
This message made from 100% recycled bits.
I don't speak for Alcatel      <- They make me say that.

Subject: Re: Random numbers for C: Improvements.
Date: 16 Jan 1999 17:56:05 -0500
From: zaykin@statgen.ncsu.edu (Dmitri Zaykin)
Message-ID: <7k8ym6d4q.fsf@poole.statgen.ncsu.edu>
References: <369F6FCA.74C7C041@stat.fsu.edu>
Newsgroups: sci.stat.math,sci.math,sci.math.num-analysis
Lines: 12

I see one more problem with the code.

The table is declared as "t[256]". Then expressions like t[++c] are
safe assuming that the unsigned char c "wraps around" and becomes zero
after 255.

However, expressions like c+237 evaluate to int and have to be cast
to unsigned char explicitly, e.g. "t[(unsigned char)(c+237)]" should
be used instead of t[c+237] (the latter accesses beyond the array
bounds for c > 18).

Dmitri

Subject: Re: Random numbers for C (and assembly)
Date: Tue, 19 Jan 1999 11:10:17 GMT
From: qscgz@my-dejanews.com
Message-ID: <781p6l$lrv$1@nnrp1.dejanews.com>
References: <369F6FCA.74C7C041@stat.fsu.edu>
Newsgroups: sci.stat.math,sci.math,sci.math.numer-analysis
Lines: 58

George Marsaglia wrote about these PRNGs:

>MWC:((z=36969*(z&65535)+(z>>16))<<16)+((w=18000*(w&65535)+(w>>16))&65535)
>SHR3:(jsr=(jsr=(jsr=jsr^(jsr<<17))^(jsr>>13))^(jsr<<5))
>CONG:(jcong=69069*jcong+1234567)
>LFIB4:(t[c]=t[c]+t[c+58]+t[c+119]+t[c+178],t[++c])
>SWB:(t[c+237]=(x=t[c+15])-(y=t[c]+(x<y)),t[++c])

DIEHARD-tests fail on :
MWC (z only) :  none
SHR3 : brank,opso,oqso,dna,count
LFIB4: none
CONG (16 highbits) : opso,oqso,dna
SWB: bday,brank,count

In x86 assembly (using: a for eax, c for ecx, _ for sbb, # for adc):

bytes required  |  MWC(8):ax=36969  ax*mwc[4]  ax+mwc[2]  dx#0  mwc[4]=ax
mwc[2]=dx  ax=18000  ax*[mwc]  ax+mwc[6]  dx#0	[mwc]=ax  mwc[6]=dx
SHR3(4):a=[shr3]  c=a  a<<17  a^c  c=a  a>>13  a^c  c=a  a<<5  a^c
[shr3]=a  CONG(4):a=69069  a*[cong]  a+1234567  [cong]=a
LFIB4(1032):c=lfib4[4]	a=lfib4[4c+8]  cl+58  a+lfib4[4c+8]  cl+61
a+lfib4[4c+8]  cl+60  a+lfib4[4c+8]  cl+77  lfib4[4c+8]=a  cl+1  lfib4[4]=c
[lfib4]=a  SWB(1032):c=swb[4]  cl+15  a=swb[4c+8]  cl-14  a_swb[4c+8]  cl+236
swb[4c+8]=a  cl-236  swb[4]=c	[swb]=a

I estimate:
29,8,12,12,10  cycles on a P5   for  MWC,SHR3,CONG,LFIB4,SWB
13,7,7,8,6     cycles on a P6 (P-II)

optimized with some calculations in parallel:
23,6,11,8,5    cycles on a P5
7,4,3,7,4      cycles on a P6

(these are only untested estimates)

for compound generators add the cycles of the parts.

( on a P6, the LFIB4 and SWB code should be changed (no cl),
to avoid partial register stalls (slow). )

What is the fastest PRNG that passes all tests?
I think that 1 or 2 bytes per cycle are possible.
So you could call a small assembly routine to fill an
array with ~500 random bytes in 1 microsecond, and then
use it in C, if you prefer C.

qscgz@aol.com

Subject: Random numbers for C: The END?
Date: Wed, 20 Jan 1999 10:55:14 -0500
From: George Marsaglia <geo@stat.fsu.edu>
Message-ID: <36A5FC62.17C9CC33@stat.fsu.edu>
Newsgroups: sci.stat.math,sci.math
Lines: 301

My offer of RNG's for C was an invitation to dance;
I did not expect the Tarantella.  I hope this post will
stop the music, or at least slow it to a stately dance
for language chauvinists and software police.

In response to a number of requests for good RNG's in
C, and mindful of the desirability of having a variety
of methods readily available, I offered several. They
were implemented as in-line functions using the #define
feature of C.

Numerous responses have led to improvements; the result
is the listing below, with comments describing the
generators.

I thank all the experts who contributed suggestions, either
directly to me or as part of the numerous threads.

It seems necessary to use a (circular) table in order
to get extremely long periods for some RNG's. Each new
number is some combination of the previous r numbers, kept
in the circular table.  The circular table has to keep
at least the last r, but possibly more than r, numbers.

For speed, an  8-bit index seems best for accessing
members of the table---at least for Fortran, where an
8-bit integer is readily  available via integer*1, and
arithmetic on the index is automatically mod 256
(least-absolute-residue).

Having little experience with C, I got out my little
(but BIG) Kernighan and Ritchie book to see if there
were an 8-bit integer type. I found none, but I did
find char and unsigned char: one byte. Furthermore, K&R
said arithmetic on characters was ok. That, and a study
of the #define examples, led me to propose #define's
for in-line generators LFIB4 and SWB, with monster
periods. But it turned out that char arithmetic jumps
"out of character", other than for simple cases such as
c++ or c+=1.   So, for safety, the index arithmetic
below is kept in character by the UC definition.

Another improvement on the original version  takes
advantage of the comma operator, which, to my chagrin,
I had not seen in K&R.  It is there, but only with an
example of (expression,expression).  From the advice of
contributors, I found that the comma operator allows
(expression,...,expression,expression) with the
last expression determining the value.   That makes it
much easier to create in-line functions via #define
(see SHR3, LFIB4, SWB and FIB below).

The improved #define's are listed below, with a
function to initialize the table and a main program
that calls each of the in-line functions one million
times and then  compares the result to what I got with
a DOS version of gcc.   That main program can serve
as a test to see if your system produces the same
results as mine.
_________________________________________
|If you run the program below, your output|
| should be  seven lines, each a 0 (zero).|
-----------------------------------------

Some readers may not be interested in the philosophical
aspects of computer languages, but want to know: what is
the use of this stuff?
Here are simple examples of the use of the in-line
functions:  Include the #define's in your program, with
the accompanying static variable declarations, and a
procedure, such as the example, for initializing
the static variable (seeds) and the table.

Then any one of those in-line functions, inserted
in a C expression, will provide a random 32-bit
integer, or a random float if UNI or VNI is used.
For example, KISS&255; would provide a random byte,
while 5.+2.*UNI; would provide a random real (float)
from 5 to 7.  Or 1+MWC%10; would provide the
proverbial "take a number from 1 to 10" (with
virtually, though not exactly, equal probabilities).
More generally, something such as 1+KISS%n; would
provide a practical uniform random choice from 1 to n,
if n is not too big.

A key point is: a wide variety of very fast, high-
quality, easy-to-use RNG's are available by means of
the nine in-line functions below, used individually or
in combination.

The comments after the main test program describe the
generators. These descriptions are much as in the first
post, for those who missed them. Some of the
generators (KISS, MWC, LFIB4) seem to pass all tests of
randomness, particularly the DIEHARD battery of tests,
and combining virtually any two or more of them should
provide fast, reliable, long period generators. (CONG
or FIB alone and CONG+FIB are suspect, but quite useful
in combinations.)

Serious users of random numbers may want to
run their simulations with several different
generators, to see if they get consistent results.
These #define's may make it easy to do.
Bonne chance,
George Marsaglia

The C code follows---------------------------------:

#include <stdio.h>
#define znew   (z=36969*(z&65535)+(z>>16))
#define wnew   (w=18000*(w&65535)+(w>>16))
#define MWC    ((znew<<16)+wnew )
#define SHR3  (jsr^=(jsr<<17), jsr^=(jsr>>13), jsr^=(jsr<<5))
#define CONG  (jcong=69069*jcong+1234567)
#define FIB   ((b=a+b),(a=b-a))
#define KISS  ((MWC^CONG)+SHR3)
#define LFIB4 (c++,t[c]=t[c]+t[UC(c+58)]+t[UC(c+119)]+t[UC(c+178)])
#define SWB   (c++,bro=(x<y),t[c]=(x=t[UC(c+34)])-(y=t[UC(c+19)]+bro))
#define UNI   (KISS*2.328306e-10)
#define VNI   ((long) KISS)*4.656613e-10
#define UC    (unsigned char)  /*a cast operation*/
typedef unsigned long UL;

/*  Global static variables: */
static UL z=362436069, w=521288629, jsr=123456789, jcong=380116160;
static UL a=224466889, b=7584631, t[256];
/* Use random seeds to reset z,w,jsr,jcong,a,b, and the table t[256]*/

static UL x=0,y=0,bro; static unsigned char c=0;

/* Example procedure to set the table, using KISS: */
void settable(UL i1,UL i2,UL i3,UL i4,UL i5, UL i6)
{ int i; z=i1; w=i2; jsr=i3; jcong=i4; a=i5; b=i6;
for(i=0;i<256;i=i+1)  t[i]=KISS;
}

/* This is a test main program.  It should compile and print 7  0's. */
int main(void){
int i; UL k;
settable(12345,65435,34221,12345,9983651,95746118);

for(i=1;i<1000001;i++){k=LFIB4;} printf("%u\n", k-1064612766U);
for(i=1;i<1000001;i++){k=SWB  ;} printf("%u\n", k- 627749721U);
for(i=1;i<1000001;i++){k=KISS ;} printf("%u\n", k-1372460312U);
for(i=1;i<1000001;i++){k=CONG ;} printf("%u\n", k-1529210297U);
for(i=1;i<1000001;i++){k=SHR3 ;} printf("%u\n", k-2642725982U);
for(i=1;i<1000001;i++){k=MWC  ;} printf("%u\n", k- 904977562U);
for(i=1;i<1000001;i++){k=FIB  ;} printf("%u\n", k-3519793928U);
 return 0;
}
/*-----------------------------------------------------
Write your own calling program and try one or more of
the above, singly or in combination, when you run a
simulation. You may want to change the simple 1-letter
names, to avoid conflict with your own choices.        */

/* All that follows is comment, mostly from the initial
post. You may want to remove it */

/* Any one of KISS, MWC, FIB, LFIB4, SWB, SHR3, or CONG
can be used in an expression to provide a random 32-bit
integer.

The KISS generator, (Keep It Simple Stupid), is
designed to combine the two multiply-with-carry
generators in MWC with the 3-shift register SHR3 and
the congruential generator CONG, using addition and
exclusive-or.  Period about 2^123.
It is one of my favorite generators.

The  MWC generator concatenates two 16-bit multiply-
with-carry generators, x(n)=36969x(n-1)+carry,
y(n)=18000y(n-1)+carry  mod 2^16, has period about
2^60 and seems to pass all tests of randomness. A
favorite stand-alone generator---faster than KISS,
which contains it.

FIB is the classical Fibonacci sequence
x(n)=x(n-1)+x(n-2), but taken modulo 2^32.
Its period is 3*2^31 if one of its two seeds is odd
and not 1 mod 8. It has little worth as an RNG by
itself, but provides a simple and fast component for
use in combination generators.

SHR3 is a 3-shift-register generator with period
2^32-1. It uses y(n)=y(n-1)(I+L^17)(I+R^13)(I+L^5),
with the y's viewed as binary vectors, L the 32x32
binary matrix that shifts a vector left 1, and R its
transpose.  SHR3 seems to pass all except those
related to the binary rank test, since 32 successive
values, as binary vectors, must be linearly
independent, while 32 successive truly random 32-bit
integers, viewed as binary vectors, will be linearly
independent only about 29% of the time.

CONG is a congruential generator with the widely used 69069
multiplier: x(n)=69069x(n-1)+1234567.  It has period
2^32. The leading half of its 32 bits seem to pass
tests, but bits in the last half are too regular.

LFIB4 is an extension of what I have previously
defined as a lagged Fibonacci generator:
x(n)=x(n-r) op x(n-s), with the x's in a finite
set over which there is a binary operation op, such
as +,- on integers mod 2^32, * on odd such integers,
exclusive-or(xor) on binary vectors. Except for
those using multiplication, lagged Fibonacci
generators fail various tests of randomness, unless
the lags are very long. (See SWB below).
To see if more than two lags would serve to overcome
the problems of 2-lag generators using +,- or xor, I
have developed the 4-lag generator LFIB4, using
addition mod 2^32. Its period is 2^31*(2^256-1), about 2^287,
and it seems to pass all tests---in particular,
those of the kind for which 2-lag generators using
+,-,xor seem to fail.  For even more confidence in
its suitability,  LFIB4 can be combined with KISS,
with a resulting period of about 2^410: just use
(KISS+LFIB4) in any C expression.

SWB is a subtract-with-borrow generator that I
developed to give a simple method for producing
extremely long periods:
x(n)=x(n-222)-x(n-237)- borrow mod 2^32.
The 'borrow' is 0, or set to 1 if computing x(n-1)
caused overflow in 32-bit integer arithmetic. This
generator has a very long period, 2^7098(2^480-1),
about 2^7578.   It seems to pass all tests of
randomness, except for the Birthday Spacings test,
which it fails badly, as do all lagged Fibonacci
generators using +,- or xor. I would suggest
combining SWB with KISS, MWC, SHR3, or CONG.
KISS+SWB has period >2^7700 and is highly
recommended.
Subtract-with-borrow has the same local behaviour
as lagged Fibonacci using +,-,xor---the borrow
merely provides a much longer period.
SWB fails the birthday spacings test, as do all
lagged Fibonacci and other generators that merely
combine two previous values by means of +,- or xor.
Those failures are for a particular case: m=512
birthdays in a year of n=2^24 days. There are
choices of m and n for which lags >1000 will also
fail the test.  A reasonable precaution is to always
combine a 2-lag Fibonacci or SWB generator with
another kind of generator, unless the generator uses
*, for which a very satisfactory sequence of odd
32-bit integers results.

The classical Fibonacci sequence mod 2^32 from FIB
fails several tests.  It is not suitable for use by
itself, but is quite suitable for combining with
other generators.

The last half of the bits of CONG are too regular,
and it fails tests for which those bits play a
significant role. CONG+FIB will also have too much
regularity in trailing bits, as each does. But keep
in mind that it is a rare application for which
the trailing bits play a significant role.  CONG
is one of the most widely used generators of the
last 30 years, as it was the system generator for
VAX and was incorporated in several popular
software packages, all seemingly without complaint.

Finally, because many simulations call for uniform
random variables in 0<x<1 or -1<x<1, I use #define
statements that permit inclusion of such variates
directly in expressions:  using UNI will provide a
uniform random real (float) in (0,1), while VNI will
provide one in (-1,1).

All of these: MWC, SHR3, CONG, KISS, LFIB4, SWB, FIB,
UNI and VNI, permit direct insertion of the desired
random quantity into an expression, avoiding the
time and space costs of a function call. I call
these in-line-define functions.  To use them, static
variables z,w,jsr,jcong,a and b should be assigned
seed values other than their initial values.  If
LFIB4 or SWB are used, the static table t[256] must
be initialized.

A note on timing:  It is difficult to provide exact
time costs for inclusion of one of these in-line-
define functions in an expression.  Times may differ
widely for different compilers, as the C operations
may be deeply nested and tricky. I suggest these
rough comparisons, based on averaging ten runs of a
routine that is essentially a long loop:
for(i=1;i<10000000;i++) L=KISS; then with KISS
replaced with SHR3, CONG,... or KISS+SWB, etc. The
times on my home PC, a Pentium 300MHz, in nanoseconds:
FIB 49;LFIB4 77;SWB 80;CONG 80;SHR3 84;MWC 93;KISS 157;
VNI 417;UNI 450;
*/


Terry Ritter, his current address, and his top page.

Last updated: 1999-02-18