T. Valko said:
Does that mean there are over 10^13 permutations of RAND()?
[....]
I just tested 100 million iterations (in Excel 2007)
trying to get a result of exactly 0.
The short answers are....
I believe we can say that the "period" of the RNG is "over 10^13", as long
as no algorithm parameter in KB 828795 (IX, IY or IZ) is zero initially.
That is, the entire sequence does not repeat itself until "over 10^13"
RAND
calls.
And....
I do not believe the current Excel 2003 RAND algorithm would result in
exactly zero, barring defects, if
http://support.microsoft.com/kb/828795
is
correct.
The long answers follow, in reverse order.
I just tested 100 million iterations (in Excel 2007)
trying to get a result of exactly 0.
I do not believe the current Excel 2003 RAND algorithm would result in
exactly zero, barring defects, if
http://support.microsoft.com/kb/828795
is
correct.
It could take as many as about 2.78E+13 iterations to know for sure, based
on the algorithm and parameters published in KB 828795. I don't have the
patience to wait as long as 60 days on my computer ;-).
On second thought, I believe it requires at most about 1.10E+10 iterations
to try all __relevant__ possibilities. I think that would take "only" a
couple hours on my computer. Moreover, I can reduce that number
significantly. So it takes about 45 min for about 4.13E+09 itertions.
(See
results below.)
Mathematically, the formula cannot result in exactly zero unless all 3
parameters (IX, IY and IZ) are zero. That can happen only if the seeding
(initialization) algorithm sets all 3 parameters to zero at the outset.
(See the MOD explanation below.)
I doubt that would be permitted because that would require reseeding the
RNG
arbitrarily after the first RAND call in order to avoid returning zero ad
nauseum.
On the other hand, it might be possible for the formula to result in
exactly
zero due to the artifacts of floating-point arithmetic. IX/30269 +
IY/30307
+ IZ/30323 would have to result in exactly 1 or 2 when represented in
64-bit
floating-point.
But after trying all relevant combinations, I do not believe that is the
case. The closest sum is about 2.00000000000004, which, modulo 1, is
about
0.0000000000000359712259978551. FYI, the sum whose result, modulo 1, is
closest to 1 is about 0.999999999999964.
Note that all of the above is based on an implementation of the algorithm
described in KB 828795. I did not actually use Excel's RAND in my
analysis.
It is possible that Excel's RAND algorithm might make a special effort to
force infinitesimal decimal fractions to exactly zero before computing the
RAND result modulo 1, effectively changing 2.00000000000004 to exactly 2,
in
the spirit of the heuristics described under the title "Example When a
Value
Reaches Zero" in
http://support.microsoft.com/kb/78113 .
I doubt it; but anything is possible with Excel :-(. Still, I don't think
that particular heuristic applies here because Excel has no problem
subtracting 2 from 2.00000000000004 (as computed by the KB formula, the
result of which is infinitesimally different from the constant), resulting
in about 0.0000000000000359712259978551.
Nevertheless, it is also possible that the actual RAND algorithm differs
from KB 828795 in some way, be it the algorithm parameter values or the
algorithm itself. (I am trying to vet KB 828795 now.)
In any case, I do not know if all of the KB details apply to Excel 2007,
notwithstanding the statement in KB 828795 that they do. If you get the
impression that I do not trust KB articles, you're right ;-). I have seen
many mistakes, especially in KB articles that seek to "clarify"
computational behavior.
Also note that Excel 2010 uses a completely different algorithm for RAND
(Mersenne twister).
Does that mean there are over 10^13 permutations of RAND()?
I believe we can say that the "period" of the RNG is "over 10^13", as long
as no algorithm parameter in KB 828795 (IX, IY or IZ) is zero initially.
That is, the entire sequence does not repeat itself until "over 10^13"
RAND
calls.
The MOD formula for each parameter results in the maximum sequence,
excluding zero, if the initial parameter value is non-zero.
Given the algorithm parameters in the KB article (30269, 30307 and 30323),
there are about 2.78E+13 possible sums. I believe the period of the RNG
covers all possible sums, excluding those which can be derived only if one
or more parameters are zero (about 2.75E+09). (See below.)
Since the denominator of each term is a prime number, I believe the sums
are
unique mathematically.
However, I cannot say that all of the sums modulo 1 result in unique
floating-point values. I suspect they do. But theoretically, it is
possible that the result of different arithmetic expressions might have
the
same representation in 64-bit floating-point.
Thus, it is possible (but I doubt it) that the same floating-point value
might appear more than once in a sequence of "over 10^13" results. But
that
does not signal a restart of the entire RNG sequence.
On the other hand, if the seeding algorithm permitted one or more
parameters
to be zero at the outset, that would significantly reduce the period of
the
RNG, since the parameter(s) would remain zero unless the RNG is reseeded
arbitrarily.
For that reason, I suspect the seeding algorithm would not permit even one
of the parameters to be zero at the outset.
One final note....
Although RAND should not return a value exactly equal to 1, and
mathematically the algorithm cannot, it is possible that Excel might
__display__ some RAND results as 1.0...0 to 14 decimal places due to its
display limit of 15 significant digits. Likewise, it is possible that
some
infinitesimal RAND results might be __displayed__ as 0.0...0E+00 to 14
decimal places, even though they are not exactly zero.
But I doubt it.
First, it is possible that Excel's RAND algorithm might avoid that by
making
a special effort to ignore those results internally, reducing the period
of
the RNG by only a few.
Second, based on my empirical results above, I do not believe that RAND
produces results that close to 0 or 1. So I suspect this a non-issue.
----- original message -----