NOTE

Most of the tests in DIEHARD return a p-value, which should be
uniform on [0,1) if the input file contains truly independent
random bits. Those p-values are obtained by p=1-F(X), where F is
the assumed distribution of the sample random variable X---often
normal. But that assumed F is often just an asymptotic
approximation, for which the fit will be worst in the tails. Thus
you should not be surprised with occasional p-values near 0 or 1,
such as .0012 or .9983. When a bit stream really FAILS BIG, you
will get p`s of 0 or 1 to six or more places. By all means, do
not, as a Statistician might, think that a p < .025 or p > .975
means that the RNG has "failed the test at the .05 level". Such
p`s happen among the hundreds that DIEHARD produces, even with
good RNGs. So keep in mind that "p happens".

Enter the name of the file to be tested.
This must be a form="unformatted",access="direct" binary
file of about 10-12 million bytes. Enter file name:
deadbeef

HERE ARE YOUR CHOICES:
     1  Birthday Spacings
     2  Overlapping Permutations
     3  Ranks of 31x31 and 32x32 matrices
     4  Ranks of 6x8 Matrices
     5  Monkey Tests on 20-bit Words
     6  Monkey Tests OPSO,OQSO,DNA
     7  Count the 1`s in a Stream of Bytes
     8  Count the 1`s in Specific Bytes
     9  Parking Lot Test
    10  Minimum Distance Test
    11  Random Spheres Test
    12  The Squeeze Test
    13  Overlapping Sums Test
    14  Runs Test
    15  The Craps Test
    16  All of the above

To choose any particular tests, enter corresponding numbers.
Enter 16 for all tests. If you want to perform all but a few
tests, enter corresponding numbers preceded by "-" sign.
Tests are executed in the order they are entered.

Enter your choices.
16

|-------------------------------------------------------------|
| This is the BIRTHDAY SPACINGS TEST |
|Choose m birthdays in a "year" of n days. List the spacings |
|between the birthdays. Let j be the number of values that |
|occur more than once in that list, then j is asymptotically |
|Poisson distributed with mean m^3/(4n).|
|Experience shows n |
|must be quite large, say n>=2^18, for comparing the results |
|to the Poisson distribution with that mean. This test uses |
|n=2^24 and m=2^10, so that the underlying distribution for j |
|is taken to be Poisson with lambda=2^30/(2^26)=16. A sample |
|of 500 j''s is taken, and a chi-square goodness of fit test |
|provides a p value. The first test uses bits 1-24 (counting |
|from the left) from integers in the specified file. Then the|
|file is closed and reopened, then bits 2-25 of the same inte-|
|gers are used to provide birthdays, and so on to bits 9-32. |
|Each set of bits provides a p-value, and the nine p-values |
|provide a sample for a KSTEST. |
|------------------------------------------------------------ |

RESULTS OF BIRTHDAY SPACINGS TEST FOR deadbeef
(no_bdays=1024, no_days/yr=2^24, lambda=16.00, sample size=500)

    Bits used      mean      chisqr     p-value
     1 to 24      15.56     30.8622    0.020757
     2 to 25      15.87     27.2043    0.055138
     3 to 26      15.93     11.8014    0.812009
     4 to 27      15.91     22.7202    0.158585
     5 to 28      15.81     12.4247    0.773765
     6 to 29      15.86     11.8160    0.811149
     7 to 30      15.46     20.2866    0.259832
     8 to 31      15.56     33.2175    0.010577
     9 to 32      15.83     20.3182    0.258280

    degree of freedoms is: 17
---------------------------------------------------------------
    p-value for KStest on those 9 p-values: 0.055895

|-------------------------------------------------------------|
| THE OVERLAPPING 5-PERMUTATION TEST |
|This is the OPERM5 test. It looks at a sequence of one mill-|
|ion 32-bit random integers. Each set of five consecutive |
|integers can be in one of 120 states, for the 5! possible or-|
|derings of five numbers. Thus the 5th, 6th, 7th,...numbers |
|each provide a state. As many thousands of state transitions |
|are observed, cumulative counts are made of the number of |
|occurrences of each state.|
|Then the quadratic form in the |
|weak inverse of the 120x120 covariance matrix yields a test |
|equivalent to the likelihood ratio test that the 120 cell |
|counts came from the specified (asymptotically) normal dis- |
|tribution with the specified 120x120 covariance matrix (with |
|rank 99). This version uses 1,000,000 integers, twice. |
|-------------------------------------------------------------|

OPERM5 test for file
(For samples of 1,000,000 consecutive 5-tuples)

    sample 1
    chisquare=317.583661 with df=99; p-value= 0.000000
_______________________________________________________________
    sample 2
    chisquare=235.954157 with df=99; p-value= 0.000000
_______________________________________________________________

|-------------------------------------------------------------|
|This is the BINARY RANK TEST for 31x31 matrices. The leftmost|
|31 bits of 31 random integers from the test sequence are used|
|to form a 31x31 binary matrix over the field {0,1}. The rank |
|is determined. That rank can be from 0 to 31, but ranks< 28 |
|are rare, and their counts are pooled with those for rank 28.|
|Ranks are found for 40,000 such random matrices and a chisqu-|
|are test is performed on counts for ranks 31,30,29 and <=28. |
|-------------------------------------------------------------|

Rank test for binary matrices (31x31) from deadbeef

    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=28         201       211.4        0.513    0.513
    r=29         5128      5134.0        0.007    0.520
    r=30        23083     23103.0        0.017    0.538
    r=31        11588     11551.5        0.115    0.653

    chi-square = 0.653 with df = 3; p-value = 0.884
--------------------------------------------------------------

|-------------------------------------------------------------|
|This is the BINARY RANK TEST for 32x32 matrices. A random 32x|
|32 binary matrix is formed, each row a 32-bit random integer.|
|The rank is determined. That rank can be from 0 to 32, ranks |
|less than 29 are rare, and their counts are pooled with those|
|for rank 29.|
|Ranks are found for 40,000 such random matrices|
|and a chisquare test is performed on counts for ranks 32,31,|
|30 and <=29. |
|-------------------------------------------------------------|

Rank test for binary matrices (32x32) from deadbeef

    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=29         223       211.4        0.634    0.634
    r=30         5053      5134.0        1.278    1.913
    r=31        23134     23103.0        0.041    1.954
    r=32        11590     11551.5        0.128    2.082

    chi-square = 2.082 with df = 3; p-value = 0.555
--------------------------------------------------------------

|-------------------------------------------------------------|
|This is the BINARY RANK TEST for 6x8 matrices. From each of |
|six random 32-bit integers from the generator under test, a |
|specified byte is chosen, and the resulting six bytes form a |
|6x8 binary matrix whose rank is determined. That rank can be|
|from 0 to 6, but ranks 0,1,2,3 are rare; their counts are |
|pooled with those for rank 4. Ranks are found for 100,000 |
|random matrices, and a chi-square test is performed on |
|counts for ranks 6,5 and (0,...,4) (pooled together). |
|-------------------------------------------------------------|

Rank test for binary matrices (6x8) from deadbeef

    bits 1 to 8
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          922       944.3        0.527    0.527
    r=5         21645     21743.9        0.450    0.976
    r=6         77433     77311.8        0.190    1.166
    chi-square = 1.166 with df = 2; p-value = 0.558
--------------------------------------------------------------
    bits 2 to 9
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          912       944.3        1.105    1.105
    r=5         21656     21743.9        0.355    1.460
    r=6         77432     77311.8        0.187    1.647
    chi-square = 1.647 with df = 2; p-value = 0.439
--------------------------------------------------------------
    bits 3 to 10
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          910       944.3        1.246    1.246
    r=5         21764     21743.9        0.019    1.264
    r=6         77326     77311.8        0.003    1.267
    chi-square = 1.267 with df = 2; p-value = 0.531
--------------------------------------------------------------
    bits 4 to 11
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          925       944.3        0.394    0.394
    r=5         21676     21743.9        0.212    0.606
    r=6         77399     77311.8        0.098    0.705
    chi-square = 0.705 with df = 2; p-value = 0.703
--------------------------------------------------------------
    bits 5 to 12
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          972       944.3        0.813    0.813
    r=5         21661     21743.9        0.316    1.129
    r=6         77367     77311.8        0.039    1.168
    chi-square = 1.168 with df = 2; p-value = 0.558
--------------------------------------------------------------
    bits 6 to 13
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          984       944.3        1.669    1.669
    r=5         21768     21743.9        0.027    1.696
    r=6         77248     77311.8        0.053    1.748
    chi-square = 1.748 with df = 2; p-value = 0.417
--------------------------------------------------------------
    bits 7 to 14
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          966       944.3        0.499    0.499
    r=5         21717     21743.9        0.033    0.532
    r=6         77317     77311.8        0.000    0.532
    chi-square = 0.532 with df = 2; p-value = 0.766
--------------------------------------------------------------
    bits 8 to 15
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4         1009       944.3        4.433    4.433
    r=5         21641     21743.9        0.487    4.920
    r=6         77350     77311.8        0.019    4.939
    chi-square = 4.939 with df = 2; p-value = 0.085
--------------------------------------------------------------
    bits 9 to 16
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          990       944.3        2.212    2.212
    r=5         21453     21743.9        3.892    6.103
    r=6         77557     77311.8        0.778    6.881
    chi-square = 6.881 with df = 2; p-value = 0.032
--------------------------------------------------------------
    bits 10 to 17
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          987       944.3        1.931    1.931
    r=5         21530     21743.9        2.104    4.035
    r=6         77483     77311.8        0.379    4.414
    chi-square = 4.414 with df = 2; p-value = 0.110
--------------------------------------------------------------
    bits 11 to 18
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          970       944.3        0.699    0.699
    r=5         21662     21743.9        0.308    1.008
    r=6         77368     77311.8        0.041    1.049
    chi-square = 1.049 with df = 2; p-value = 0.592
--------------------------------------------------------------
    bits 12 to 19
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          969       944.3        0.646    0.646
    r=5         21753     21743.9        0.004    0.650
    r=6         77278     77311.8        0.015    0.665
    chi-square = 0.665 with df = 2; p-value = 0.717
--------------------------------------------------------------
    bits 13 to 20
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          933       944.3        0.135    0.135
    r=5         21736     21743.9        0.003    0.138
    r=6         77331     77311.8        0.005    0.143
    chi-square = 0.143 with df = 2; p-value = 0.931
--------------------------------------------------------------
    bits 14 to 21
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          919       944.3        0.678    0.678
    r=5         21754     21743.9        0.005    0.683
    r=6         77327     77311.8        0.003    0.686
    chi-square = 0.686 with df = 2; p-value = 0.710
--------------------------------------------------------------
    bits 15 to 22
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          870       944.3        5.846    5.846
    r=5         21733     21743.9        0.005    5.852
    r=6         77397     77311.8        0.094    5.945
    chi-square = 5.945 with df = 2; p-value = 0.051
--------------------------------------------------------------
    bits 16 to 23
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          941       944.3        0.012    0.012
    r=5         21405     21743.9        5.282    5.294
    r=6         77654     77311.8        1.515    6.808
    chi-square = 6.808 with df = 2; p-value = 0.033
--------------------------------------------------------------
    bits 17 to 24
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          910       944.3        1.246    1.246
    r=5         21562     21743.9        1.522    2.768
    r=6         77528     77311.8        0.605    3.372
    chi-square = 3.372 with df = 2; p-value = 0.185
--------------------------------------------------------------
    bits 18 to 25
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          923       944.3        0.480    0.480
    r=5         21611     21743.9        0.812    1.293
    r=6         77466     77311.8        0.308    1.600
    chi-square = 1.600 with df = 2; p-value = 0.449
--------------------------------------------------------------
    bits 19 to 26
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          960       944.3        0.261    0.261
    r=5         21593     21743.9        1.047    1.308
    r=6         77447     77311.8        0.236    1.545
    chi-square = 1.545 with df = 2; p-value = 0.462
--------------------------------------------------------------
    bits 20 to 27
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          928       944.3        0.281    0.281
    r=5         21800     21743.9        0.145    0.426
    r=6         77272     77311.8        0.020    0.447
    chi-square = 0.447 with df = 2; p-value = 0.800
--------------------------------------------------------------
    bits 21 to 28
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          867       944.3        6.328    6.328
    r=5         21804     21743.9        0.166    6.494
    r=6         77329     77311.8        0.004    6.498
    chi-square = 6.498 with df = 2; p-value = 0.039
--------------------------------------------------------------
    bits 22 to 29
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          937       944.3        0.056    0.056
    r=5         21679     21743.9        0.194    0.250
    r=6         77384     77311.8        0.067    0.318
    chi-square = 0.318 with df = 2; p-value = 0.853
--------------------------------------------------------------
    bits 23 to 30
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          942       944.3        0.006    0.006
    r=5         21560     21743.9        1.555    1.561
    r=6         77498     77311.8        0.448    2.009
    chi-square = 2.009 with df = 2; p-value = 0.366
--------------------------------------------------------------
    bits 24 to 31
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          986       944.3        1.841    1.841
    r=5         21660     21743.9        0.324    2.165
    r=6         77354     77311.8        0.023    2.188
    chi-square = 2.188 with df = 2; p-value = 0.335
--------------------------------------------------------------
    bits 25 to 32
    RANK     OBSERVED    EXPECTED    (O-E)^2/E      SUM
    r<=4          959       944.3        0.229    0.229
    r=5         21630     21743.9        0.597    0.825
    r=6         77411     77311.8        0.127    0.953
    chi-square = 0.953 with df = 2; p-value = 0.621
--------------------------------------------------------------

TEST SUMMARY, 25 tests on 100,000 random 6x8 matrices
These should be 25 uniform [0,1] random variates:

    0.558092    0.438883    0.530711    0.702982    0.557657
    0.417192    0.766326    0.084634    0.032046    0.110023
    0.591915    0.717250    0.931061    0.709806    0.051163
    0.033235    0.185243    0.449262    0.461929    0.799879
    0.038819    0.853180    0.366156    0.334836    0.621028

The KS test for those 25 supposed UNI's yields
KS p-value = 0.415190

|-------------------------------------------------------------|
| THE BITSTREAM TEST |
|The file under test is viewed as a stream of bits. Call them |
|b1,b2,... . Consider an alphabet with two "letters", 0 and 1|
|and think of the stream of bits as a succession of 20-letter |
|"words", overlapping. Thus the first word is b1b2...b20, the|
|second is b2b3...b21, and so on. The bitstream test counts |
|the number of missing 20-letter (20-bit) words in a string of|
|2^21 overlapping 20-letter words. There are 2^20 possible 20|
|letter words. For a truly random string of 2^21+19 bits, the|
|number of missing words j should be (very close to) normally |
|distributed with mean 141,909 and sigma 428. Thus |
| (j-141909)/428 should be a standard normal variate (z score)|
|that leads to a uniform [0,1) p value. The test is repeated |
|twenty times. |
|-------------------------------------------------------------|

THE OVERLAPPING 20-TUPLES BITSTREAM TEST for deadbeef
(20 bits/word, 2097152 words 20 bitstreams. No. missing words
should average 141909.33 with sigma=428.00.)
----------------------------------------------------------------
BITSTREAM test results for deadbeef.

    Bitstream    No. missing words    z-score     p-value
         1            141678           -0.54     0.705571
         2            142071            0.38     0.352814
         3            141972            0.15     0.441793
         4            142387            1.12     0.132200
         5            141533           -0.88     0.810374
         6            142334            0.99     0.160545
         7            141875           -0.08     0.531965
         8            142555            1.51     0.065704
         9            142402            1.15     0.124846
        10            141913            0.01     0.496579
        11            141371           -1.26     0.895764
        12            141884           -0.06     0.523596
        13            141688           -0.52     0.697466
        14            142087            0.42     0.339028
        15            142206            0.69     0.244106
        16            142219            0.72     0.234678
        17            141250           -1.54     0.938280
        18            142351            1.03     0.151050
        19            141559           -0.82     0.793472
        20            141611           -0.70     0.757109
----------------------------------------------------------------

|-------------------------------------------------------------|
| OPSO means Overlapping-Pairs-Sparse-Occupancy |
|The OPSO test considers 2-letter words from an alphabet of |
|1024 letters. Each letter is determined by a specified ten |
|bits from a 32-bit integer in the sequence to be tested. OPSO|
|generates 2^21 (overlapping) 2-letter words (from 2^21+1 |
|"keystrokes") and counts the number of missing words---that |
|is 2-letter words which do not appear in the entire sequence.|
|That count should be very close to normally distributed with |
|mean 141,909, sigma 290. Thus (missingwrds-141909)/290 should|
|be a standard normal variable. The OPSO test takes 32 bits at|
|a time from the test file and uses a designated set of ten |
|consecutive bits. It then restarts the file for the next de- |
|signated 10 bits, and so on. |
|------------------------------------------------------------ |

OPSO test for file deadbeef

    Bits used    No. missing words    z-score     p-value
    23 to 32          142028           0.4092    0.341194
    22 to 31          141671          -0.8218    0.794412
    21 to 30          141750          -0.5494    0.708639
    20 to 29          142037           0.4402    0.329881
    19 to 28          141879          -0.1046    0.541648
    18 to 27          141659          -0.8632    0.805988
    17 to 26          142012           0.3540    0.361656
    16 to 25          141690          -0.7563    0.775268
    15 to 24          142192           0.9747    0.164849
    14 to 23          142239           1.1368    0.127812
    13 to 22          141904          -0.0184    0.507332
    12 to 21          141712          -0.6804    0.751890
    11 to 20          141989           0.2747    0.391764
    10 to 19          141906          -0.0115    0.504581
     9 to 18          141737          -0.5942    0.723825
     8 to 17          142440           1.8299    0.033633
     7 to 16          142031           0.4196    0.337406
     6 to 15          142208           1.0299    0.151529
     5 to 14          141761          -0.5115    0.695493
     4 to 13          141806          -0.3563    0.639196
     3 to 12          142043           0.4609    0.322424
     2 to 11          141771          -0.4770    0.683319
     1 to 10          142195           0.9851    0.162295
-----------------------------------------------------------------

|------------------------------------------------------------ |
| OQSO means Overlapping-Quadruples-Sparse-Occupancy |
| The test OQSO is similar, except that it considers 4-letter|
|words from an alphabet of 32 letters, each letter determined |
|by a designated string of 5 consecutive bits from the test |
|file, elements of which are assumed 32-bit random integers. |
|The mean number of missing words in a sequence of 2^21 four- |
|letter words, (2^21+3 "keystrokes"), is again 141909, with |
|sigma = 295. The mean is based on theory; sigma comes from |
|extensive simulation. |
|------------------------------------------------------------ |

OQSO test for file deadbeef

    Bits used    No. missing words    z-score     p-value
    28 to 32          141660          -0.8452    0.800997
    27 to 31          142122           0.7209    0.235481
    26 to 30          141513          -1.3435    0.910444
    25 to 29          141979           0.2362    0.406651
    24 to 28          141927           0.0599    0.476118
    23 to 27          141436          -1.6045    0.945699
    22 to 26          141836          -0.2486    0.598156
    21 to 25          141856          -0.1808    0.571730
    20 to 24          141581          -1.1130    0.867142
    19 to 23          141419          -1.6621    0.951757
    18 to 22          142101           0.6497    0.257934
    17 to 21          142122           0.7209    0.235481
    16 to 20          141973           0.2158    0.414560
    15 to 19          141904          -0.0181    0.507208
    14 to 18          141449          -1.5604    0.940672
    13 to 17          141999           0.3040    0.380577
    12 to 16          142001           0.3107    0.377997
    11 to 15          141867          -0.1435    0.557049
    10 to 14          141979           0.2362    0.406651
     9 to 13          141799          -0.3740    0.645798
     8 to 12          141862          -0.1604    0.563733
     7 to 11          141907          -0.0079    0.503151
     6 to 10          141595          -1.0655    0.856681
     5 to  9          141444          -1.5774    0.942647
     4 to  8          142123           0.7243    0.234439
     3 to  7          142174           0.8972    0.184810
     2 to  6          141986           0.2599    0.397471
     1 to  5          141900          -0.0316    0.512615
-----------------------------------------------------------------

|------------------------------------------------------------ |
| The DNA test considers an alphabet of 4 letters: C,G,A,T,|
|determined by two designated bits in the sequence of random |
|integers being tested. It considers 10-letter words, so that|
|as in OPSO and OQSO, there are 2^20 possible words, and the |
|mean number of missing words from a string of 2^21 (over- |
|lapping) 10-letter words (2^21+9 "keystrokes") is 141909. |
|The standard deviation sigma=339 was determined as for OQSO |
|by simulation. (Sigma for OPSO, 290, is the true value (to |
|three places), not determined by simulation. |
|------------------------------------------------------------ |

DNA test for file deadbeef

    Bits used    No. missing words    z-score     p-value
    31 to 32          142832           2.7217    0.003247
    30 to 31          142117           0.6126    0.270072
    29 to 30          141929           0.0580    0.476865
    28 to 29          142331           1.2439    0.106775
    27 to 28          142125           0.6362    0.262325
    26 to 27          142082           0.5094    0.305253
    25 to 26          142023           0.3353    0.368696
    24 to 25          141888          -0.0629    0.525085
    23 to 24          141830          -0.2340    0.592512
    22 to 23          141827          -0.2429    0.595944
    21 to 22          141890          -0.0570    0.522736
    20 to 21          142170           0.7689    0.220965
    19 to 20          141749          -0.4729    0.681875
    18 to 19          142535           1.8456    0.032473
    17 to 18          142459           1.6214    0.052461
    16 to 17          141574          -0.9892    0.838711
    15 to 16          141877          -0.0954    0.537989
    14 to 15          142895           2.9076    0.001821
    13 to 14          142197           0.8486    0.198056
    12 to 13          142029           0.3530    0.362041
    11 to 12          141971           0.1819    0.427824
    10 to 11          141776          -0.3933    0.652952
     9 to 10          141919           0.0285    0.488622
     8 to  9          141724          -0.5467    0.707706
     7 to  8          141801          -0.3196    0.625348
     6 to  7          142153           0.7188    0.236135
     5 to  6          141760          -0.4405    0.670213
     4 to  5          142540           1.8604    0.031416
     3 to  4          142470           1.6539    0.049075
     2 to  3          141573          -0.9921    0.839431
     1 to  2          141878          -0.0924    0.536817
-----------------------------------------------------------------

|-------------------------------------------------------------|
| This is the COUNT-THE-1''s TEST on a stream of bytes. |
|Consider the file under test as a stream of bytes (four per |
|32 bit integer). Each byte can contain from 0 to 8 1''s, |
|with probabilities 1,8,28,56,70,56,28,8,1 over 256. Now let |
|the stream of bytes provide a string of overlapping 5-letter|
|words, each "letter" taking values A,B,C,D,E. The letters are|
|determined by the number of 1''s in a byte: 0,1,or 2 yield A,|
|3 yields B, 4 yields C, 5 yields D and 6,7 or 8 yield E. Thus|
|we have a monkey at a typewriter hitting five keys with vari-|
|ous probabilities (37,56,70,56,37 over 256). There are 5^5 |
|possible 5-letter words, and from a string of 256,000 (over- |
|lapping) 5-letter words, counts are made on the frequencies |
|for each word.|
|The quadratic form in the weak inverse of |
|the covariance matrix of the cell counts provides a chisquare|
|test: Q5-Q4, the difference of the naive Pearson sums of |
|(OBS-EXP)^2/EXP on counts for 5- and 4-letter cell counts. |
|-------------------------------------------------------------|

Test result for the byte stream from deadbeef
(Degrees of freedom: 5^5-5^4=2500; sample size: 2560000)

    chisquare    z-score     p-value
      2555.05      0.779    0.218124

|-------------------------------------------------------------|
| This is the COUNT-THE-1''s TEST for specific bytes. |
|Consider the file under test as a stream of 32-bit integers. |
|From each integer, a specific byte is chosen, say the left- |
|most: bits 1 to 8. Each byte can contain from 0 to 8 1''s, |
|with probabilities 1,8,28,56,70,56,28,8,1 over 256. Now let |
|the specified bytes from successive integers provide a string|
|of (overlapping) 5-letter words, each "letter" taking values |
|A,B,C,D,E. The letters are determined by the number of 1''s,|
|in that byte: 0,1,or 2 ---> A, 3 ---> B, 4 ---> C, 5 ---> D, |
|and 6,7 or 8 ---> E. Thus we have a monkey at a typewriter |
|hitting five keys with various probabilities: 37,56,70, |
|56,37 over 256. There are 5^5 possible 5-letter words, and |
|from a string of 256,000 (overlapping) 5-letter words, counts|
|are made on the frequencies for each word. The quadratic form|
|in the weak inverse of the covariance matrix of the cell |
|counts provides a chisquare test: Q5-Q4, the difference of |
|the naive Pearson sums of (OBS-EXP)^2/EXP on counts for 5- |
|and 4-letter cell counts. |
|-------------------------------------------------------------|

Test results for specific bytes from deadbeef
(Degrees of freedom: 5^5-5^4=2500; sample size: 256000)

    bits used    chisquare    z-score     p-value
     1 to  8       5871.05     47.674    0.000000
     2 to  9       4379.78     26.584    0.000000
     3 to 10       3747.06     17.636    0.000000
     4 to 11       4041.33     21.798    0.000000
     5 to 12       3002.01      7.100    0.000000
     6 to 13       5932.47     48.543    0.000000
     7 to 14       3250.60     10.615    0.000000
     8 to 15       3240.68     10.475    0.000000
     9 to 16       3836.15     18.896    0.000000
    10 to 17       3441.34     13.313    0.000000
    11 to 18       4099.70     22.623    0.000000
    12 to 19       5522.19     42.740    0.000000
    13 to 20       3351.42     12.041    0.000000
    14 to 21       4016.20     21.442    0.000000
    15 to 22       2709.50      2.963    0.001524
    16 to 23       2733.61      3.304    0.000477
    17 to 24       4416.43     27.102    0.000000
    18 to 25       4717.93     31.366    0.000000
    19 to 26       5440.76     41.589    0.000000
    20 to 27       3415.55     12.948    0.000000
    21 to 28       4385.30     26.662    0.000000
    22 to 29       4483.32     28.048    0.000000
    23 to 30       5334.94     40.092    0.000000
    24 to 31       4478.94     27.986    0.000000
    25 to 32       4364.83     26.373    0.000000

|-------------------------------------------------------------|
| THIS IS A PARKING LOT TEST |
|In a square of side 100, randomly "park" a car---a circle of |
|radius 1. Then try to park a 2nd, a 3rd, and so on, each |
|time parking "by ear". That is, if an attempt to park a car |
|causes a crash with one already parked, try again at a new |
|random location. (To avoid path problems, consider parking |
|helicopters rather than cars.) Each attempt leads to either|
|a crash or a success, the latter followed by an increment to |
|the list of cars already parked. If we plot n: the number of |
|attempts, versus k: the number successfully parked, we get a |
|curve that should be similar to those provided by a perfect |
|random number generator.|
|Theory for the behavior of such a |
|random curve seems beyond reach, and as graphics displays are|
|not available for this battery of tests, a simple characteriz|
|ation of the random experiment is used: k, the number of cars|
|successfully parked after n=12,000 attempts. Simulation shows|
|that k should average 3523 with sigma 21.9 and is very close |
|to normally distributed. Thus (k-3523)/21.9 should be a st- |
|andard normal variable, which, converted to a uniform varia- |
|ble, provides input to a KSTEST based on a sample of 10. |
|-------------------------------------------------------------|

CDPARK: result of 10 tests on file deadbeef
(Of 12000 tries, the average no. of successes should be
3523.0 with sigma=21.9)

    No. successes    z-score     p-value
        3518         -0.2283    0.590298
        3546          1.0502    0.146807
        3515         -0.3653    0.642555
        3511         -0.5479    0.708135
        3500         -1.0502    0.853193
        3526          0.1370    0.445521
        3492         -1.4155    0.921543
        3553          1.3699    0.085365
        3538          0.6849    0.246694
        3533          0.4566    0.323972

Square side=100, avg. no. parked=3523.20 sample std.=18.67
p-value of the KSTEST for those 10 p-values: 0.005995

|-------------------------------------------------------------|
| THE MINIMUM DISTANCE TEST |
|It does this 100 times: choose n=8000 random points in a |
|square of side 10000. Find d, the minimum distance between |
|the (n^2-n)/2 pairs of points. If the points are truly inde-|
|pendent uniform, then d^2, the square of the minimum distance|
|should be (very close to) exponentially distributed with mean|
|.995 . Thus 1-exp(-d^2/.995) should be uniform on [0,1) and |
|a KSTEST on the resulting 100 values serves as a test of uni-|
|formity for random points in the square. Test numbers=0 mod 5|
|are printed but the KSTEST is based on the full set of 100 |
|random choices of 8000 points in the 10000x10000 square. |
|-------------------------------------------------------------|

This is the MINIMUM DISTANCE test for file deadbeef

    Sample no.      d^2       mean    equiv uni
         5        0.9566    1.2302    0.617655
        10        0.6540    1.2402    0.481734
        15        2.7219    1.1871    0.935142
        20        0.7560    1.2219    0.532248
        25        0.1244    1.0744    0.117545
        30        2.4374    1.0841    0.913674
        35        2.3509    1.1149    0.905836
        40        0.3765    1.1346    0.315008
        45        0.7634    1.1999    0.535713
        50        1.1154    1.2772    0.674034
        55        0.1839    1.2020    0.168753
        60        1.8542    1.2679    0.844881
        65        0.5708    1.3064    0.436544
        70        2.3454    1.2894    0.905318
        75        1.1240    1.2613    0.676868
        80        0.0246    1.2316    0.024421
        85        0.4125    1.2759    0.339358
        90        1.9690    1.2920    0.861783
        95        0.0615    1.2703    0.059954
       100        1.0669    1.2769    0.657779
--------------------------------------------------------------
Result of KS test on 100 transformed mindist^2's: p-value=0.013821

|-------------------------------------------------------------|
| THE 3DSPHERES TEST |
|Choose 4000 random points in a cube of edge 1000. At each |
|point, center a sphere large enough to reach the next closest|
|point. Then the volume of the smallest such sphere is (very |
|close to) exponentially distributed with mean 120pi/3. Thus |
|the radius cubed is exponential with mean 30. (The mean is |
|obtained by extensive simulation). The 3DSPHERES test gener-|
|ates 4000 such spheres 20 times. Each min radius cubed leads|
|to a uniform variable by means of 1-exp(-r^3/30.), then a |
| KSTEST is done on the 20 p-values. |
|-------------------------------------------------------------|

The 3DSPHERES test for file deadbeef

    sample no      r^3      equiv. uni.
         1        5.454      0.166234
         2       36.169      0.700495
         3        0.698      0.022985
         4       41.599      0.750084
         5       17.967      0.450585
         6        5.306      0.162116
         7        4.149      0.129161
         8        8.988      0.258885
         9       13.307      0.358265
        10        4.185      0.130205
        11        3.350      0.105653
        12       43.349      0.764246
        13       72.178      0.909817
        14       27.779      0.603853
        15       74.617      0.916862
        16       34.814      0.686665
        17       17.123      0.434915
        18        4.870      0.149847
        19        1.552      0.050420
        20       11.824      0.325744
--------------------------------------------------------------
p-value for KS test on those 20 p-values: 0.194372

|-------------------------------------------------------------|
| This is the SQUEEZE test |
| Random integers are floated to get uniforms on [0,1). Start-|
| ing with k=2^31-1=2147483647, the test finds j, the number |
| of iterations necessary to reduce k to 1, using the reduction|
| k=ceiling(k*U), with U provided by floating integers from |
| the file being tested. Such j''s are found 100,000 times, |
| then counts for the number of times j was <=6,7,...,47,>=48 |
| are used to provide a chi-square test for cell frequencies. |
|-------------------------------------------------------------|

RESULTS OF SQUEEZE TEST FOR deadbeef

Table of standardized frequency counts (obs-exp)/sqrt(exp)
for j=(1,..,6), 7,...,47,(48,...)

     4.8     2.2     5.8     5.5     6.0     7.2
     4.4     4.8     0.7     0.7    -3.1    -2.1
     1.6    -1.8    -1.2    -3.5    -2.3    -0.4
    -0.1    -1.0     1.3     0.4     1.4     1.4
     0.5     0.7     1.6     2.4     0.0    -0.0
     1.1    -0.6     0.5     0.0     0.5     1.7
    -0.7    -0.7    -1.2     0.4     1.6     0.0
    -0.1

Chi-square with 42 degrees of freedom:287.640654
z-score=26.801593, p-value=0.000000
_____________________________________________________________

|-------------------------------------------------------------|
| The OVERLAPPING SUMS test |
|Integers are floated to get a sequence U(1),U(2),... of uni- |
|form [0,1) variables. Then overlapping sums, |
| S(1)=U(1)+...+U(100), S(2)=U(2)+...+U(101),... are formed. |
|The S''s are virtually normal with a certain covariance mat- |
|rix.|
|A linear transformation of the S''s converts them to a |
|sequence of independent standard normals, which are converted|
|to uniform variables for a KSTEST. |
|-------------------------------------------------------------|

Results of the OSUM test for deadbeef

    Test no     p-value
       1       0.011778
       2       0.010571
       3       0.126712
       4       0.120523
       5       0.104981
       6       0.233243
       7       0.805659
       8       0.036827
       9       0.574851
      10       0.409788
_____________________________________________________________
p-value for 10 kstests on 100 kstests:0.002441

|-------------------------------------------------------------|
| This is the RUNS test. It counts runs up, and runs down,|
|in a sequence of uniform [0,1) variables, obtained by float- |
|ing the 32-bit integers in the specified file. This example |
|shows how runs are counted: .123,.357,.789,.425,.224,.416,.95|
|contains an up-run of length 3, a down-run of length 2 and an|
|up-run of (at least) 2, depending on the next values. The |
|covariance matrices for the runs-up and runs-down are well |
|known, leading to chisquare tests for quadratic forms in the |
|weak inverses of the covariance matrices. Runs are counted |
|for sequences of length 10,000. This is done ten times. Then|
|the whole procedure is repeated. |
|-------------------------------------------------------------|

The RUNS test for file deadbeef
(Up and down runs in a sequence of 10000 numbers)

    Set 1
        runs up;   ks test for 10 p's: 0.315552
        runs down; ks test for 10 p's: 0.078976
    Set 2
        runs up;   ks test for 10 p's: 0.551048
        runs down; ks test for 10 p's: 0.129876

|-------------------------------------------------------------|
|This is the CRAPS TEST. It plays 200,000 games of craps, |
|counts the number of wins and the number of throws necessary |
|to end each game. The number of wins should be (very close |
|to) a normal with mean 200000p and variance 200000p(1-p), and|
|p=244/495.|
|Throws necessary to complete the game can vary |
|from 1 to infinity, but counts for all>21 are lumped with 21.|
|A chi-square test is made on the no.-of-throws cell counts. |
|Each 32-bit integer from the test file provides the value for|
|the throw of a die, by floating to [0,1), multiplying by 6 |
|and taking 1 plus the integer part of the result. |
|-------------------------------------------------------------|

RESULTS OF CRAPS TEST FOR deadbeef

No. of wins: Observed     Expected
                98564     98585.858586
             z-score=-0.098, pvalue=0.53894

Analysis of Throws-per-Game:

    Throws    Observed    Expected    Chisq    Sum of (O-E)^2/E
       1        66646     66666.7     0.006         0.006
       2        37741     37654.3     0.200         0.206
       3        26721     26954.7     2.027         2.233
       4        19437     19313.5     0.790         3.023
       5        13916     13851.4     0.301         3.324
       6         9979      9943.5     0.126         3.450
       7         7235      7145.0     1.133         4.583
       8         5186      5139.1     0.429         5.012
       9         3606      3699.9     2.381         7.393
      10         2584      2666.3     2.540         9.933
      11         1963      1923.3     0.818        10.752
      12         1327      1388.7     2.745        13.497
      13          971      1003.7     1.066        14.563
      14          731       726.1     0.033        14.595
      15          544       525.8     0.627        15.223
      16          389       381.2     0.162        15.385
      17          267       276.5     0.329        15.714
      18          213       200.8     0.737        16.451
      19          154       146.0     0.440        16.891
      20           94       106.2     1.405        18.296
      21          296       287.1     0.275        18.571

Chisq= 18.57 for 20 degrees of freedom, p= 0.54985

SUMMARY of craptest on deadbeef
    p-value for no. of wins: 0.538940
    p-value for throws/game: 0.549847
_____________________________________________________________
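
The win probability p=244/495 and the Expected column of the throws-per-game
table can be reproduced from the rules of craps. The Python sketch below is
not part of DIEHARD: the function names are hypothetical, and it assumes the
standard pass-line rules (win on 7 or 11 on the first throw, lose on 2, 3 or
12, otherwise keep throwing until the point or a 7 appears), with games of
more than 21 throws lumped into the last cell as described above.

```python
from fractions import Fraction
from math import sqrt

# Number of ways to roll each total with two fair dice (out of 36).
WAYS = {total: 6 - abs(total - 7) for total in range(2, 13)}
POINTS = (4, 5, 6, 8, 9, 10)


def win_probability():
    """Exact probability of winning one game of craps."""
    p = Fraction(WAYS[7] + WAYS[11], 36)  # immediate win on the first throw
    for point in POINTS:
        # Establish the point, then roll it again before rolling a 7.
        p += Fraction(WAYS[point], 36) * Fraction(WAYS[point], WAYS[point] + WAYS[7])
    return p


def throws_distribution(max_throws=21):
    """Probability that a game lasts n throws; the last cell lumps n >= max_throws."""
    dist = {1: Fraction(sum(WAYS[t] for t in (2, 3, 7, 11, 12)), 36)}
    for n in range(2, max_throws):
        total = Fraction(0)
        for point in POINTS:
            stop = Fraction(WAYS[point] + WAYS[7], 36)  # a throw that ends the game
            total += Fraction(WAYS[point], 36) * (1 - stop) ** (n - 2) * stop
        dist[n] = total
    dist[max_throws] = 1 - sum(dist.values())
    return dist


def wins_z_score(wins, games=200_000):
    """Normal-approximation z-score for the observed number of wins."""
    p = float(win_probability())
    return (wins - games * p) / sqrt(games * p * (1 - p))


if __name__ == "__main__":
    print("p(win) =", win_probability())
    for n, prob in throws_distribution().items():
        print(f"{n:2d} {float(prob) * 200_000:10.1f}")
    print("z-score for 98564 wins:", round(wins_z_score(98564), 3))
```

Multiplying each probability by 200,000 games gives the Expected column (for
example 66666.7 for games that end on the first throw), and the observed
98564 wins give a z-score of about -0.098, matching the report.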