nullprogram.com/blog/2020/11/17/
Pixelmusement produces videos about MS-DOS games and software. Each video ends with a short, randomly-selected listing of financial backers. In ADG Filler #57, Kris revealed the selection process, and it absolutely fits the channel’s core theme: a QBasic program. His program relies on QBasic’s built-in pseudo random number generator (PRNG). Even accounting for the platform’s limitations, the PRNG is much poorer quality than it could be. Let’s discuss these weaknesses and figure out how to make the selection more fair.
Kris’s program seeds the PRNG with the system clock (RANDOMIZE TIMER
, a QBasic idiom), populates an array with the backers represented as integers (indices), continuously shuffles the list until the user presses a key, then finally prints out a random selection from the array. Here’s a simplified version of the program (note: QBasic comments start with apostrophe '
):
CONST ntickets = 203 ' input parameter
CONST nresults = 12
RANDOMIZE TIMER
DIM tickets(0 TO ntickets - 1) AS LONG
FOR i = 0 TO ntickets - 1
tickets(i) = i
NEXT
CLS
PRINT "Press any key to stop shuffling..."
DO
i = INT(RND * ntickets)
j = INT(RND * ntickets)
SWAP tickets(i), tickets(j)
LOOP WHILE INKEY$ = ""
FOR i = 0 to nresults - 1
PRINT tickets(i)
NEXT
This should be readable even if you don’t know QBasic. Note: In the real program, backers at higher tiers get multiple tickets in order to weight the results. This is accounted for in the final loop such that nobody appears more than once. It’s mostly irrelevant to the discussion here, so I’ve omitted it.
The final result is ultimately a function of just three inputs:
- The system clock (
TIMER
) - The total number of tickets
- The number of loop iterations until a key press
The second item has the nice property that by becoming a backer you influence the result.
QBasic RND
QBasic’s PRNG is this 24-bit Linear Congruential Generator (LCG):
uint32_t
rnd24(uint32_t *s)
{
*s = (*s*0xfd43fd + 0xc39ec3) & 0xffffff;
return *s;
}
The result is the entire 24-bit state. RND
divides this by 2^24 and returns it as a single precision float so that the caller receives a value between 0 and 1 (exclusive).
Needless to say, this is a very poor PRNG. The LCG constants are reasonable, but the choice to limit the state to 24 bits is strange. According to the QBasic 16-bit assembly (note: the LCG constants listed here are wrong), the implementation is a full 32-bit multiply using 16-bit limbs, and it allocates and writes a full 32 bits when storing the state. As expected for the 8086, there was nothing gained by using only the lower 24 bits.
To illustrate how poor it is, here’s a randogram for this PRNG, which shows obvious structure:
Admittedly this far overtaxes the PRNG. With a 24-bit state, it’s only good for 4,096 (2^12) outputs, after which it no longer follows the birthday paradox: No outputs are repeated even though we should start seeing some. However, as I’ll soon show, this doesn’t actually matter.
Instead of discarding the high 8 bits — the highest quality output bits — QBasic’s designers should have discarded the low 8 bits for the output, turning it into a truncated 32-bit LCG:
uint32_t
rnd32(uint32_t *s)
{
*s = *s*0xfd43fd + 0xc39ec3;
return *s >> 8;
}
This LCG would have the same performance, but significantly better quality. Here’s the randogram for this PRNG, and it is also heavily overtaxed (more than 65,536, 2^16 outputs).
It’s a solid upgrade, completely for free!
QBasic RANDOMIZE
That’s not the end of our troubles. The RANDOMIZE
statement accepts a double precision (i.e. 64-bit) seed. The high 16 bits of its IEEE 754 binary representation are XORed with the next highest 16 bits. The high 16 bits of the PRNG state is set to this result. The lowest 8 bits are preserved.
To make this clearer, here’s a C implementation, verified against QBasic 7.1:
uint32_t s;
void
randomize(double seed)
{
uint64_t x;
memcpy(&x ,&seed, 8);
s = (x>>24 ^ x>>40) & 0xffff00 | (s & 0xff);
}
In other words, RANDOMIZE
only sets the PRNG to one of 65,536 possible states.
As the final piece, here’s how RND
is implemented, also verified against QBasic 7.1:
float
rnd(float arg)
{
if (arg < 0) {
memcpy(&s, &arg, 4);
s = (s & 0xffffff) + (s >> 24);
}
if (arg != 0.0f) {
s = (s*0xfd43fd + 0xc39ec3) & 0xffffff;
}
return s / (float)0x1000000;
}
System clock seed
The TIMER
function returns the single precision number of seconds since midnight with ~55ms precision (i.e. the 18.2Hz timer interrupt counter). This is strictly time of day, and the current date is not part of the result, unlike, say, the unix epoch.
This means there are only 1,572,480 distinct values returned by TIMER
. That’s small even before considering that these map onto only 65,536 possible seeds with RANDOMIZE
— all of which are fortunately realizable via TIMER
.
Of the three inputs to random selection, this first one is looking pretty bad.
Loop iterations
Kris’s idea of continuously mixing the array until he presses a key makes up for much of the QBasic PRNG weaknesses. He lets it run for over 200,000 array swaps — traversing over 2% of the PRNG’s period — and the array itself acts like an extended PRNG state, supplementing the 24-bit RND
state.
Since iterations fly by quickly, the exact number of iterations becomes another source of entropy. The results will be quite different if it runs 214,600 iterations versus 273,500 iterations.
Possible improvement: Only exit the loop when a certain key is pressed. If any other key is pressed then that input and the TIMER
are mixed into the PRNG state. Mashing the keyboard during the loop introduces more entropy.
Replacing the PRNG
Since the built-in PRNG is so poor, we could improve the situation by implementing a new one in QBasic itself. The challenge is that QBasic has no unsigned integers, not even unsigned integer operators (i.e. Java and JavaScript’s >>>
), and signed overflow is a run-time error. We can’t even re-implement QBasic’s own LCG without doing long multiplication in software, since the intermediate result overflows its 32-bit LONG
.
Popular choices in these constraints are Park–Miller generator (as we saw in Bash) or a lagged Fibonacci generator (as used by Emacs, which was for a long time constrained to 29-bit integers).
However, I have a better idea: a PRNG based on RC4. Specifically, my own design called Sponge4, a sponge construction built atop RC4. In short: Mixing in more input is just a matter of running the key schedule again. Implementing this PRNG requires just two simple operations: addition over 2^8, and array swap. QBasic has a SWAP
statement, so it’s a natural fit!
Sponge4 (RC4) has much higher quality output than the 24-bit LCG, and I can mix in more sources of entropy.
Learning QBasic
Until this past weekend, I had not touched QBasic for about 23 years and had to learn it essentially from scratch. Though within a couple of hours I probably already understood it better than I ever had. That’s in large part because I’m far more experienced, but also probably because QBasic tutorials are universally awful. Not surprisingly they’re written for beginners, but they also seem to be all written by beginners, too. I soon got the impression that QBasic community has usually been another case of the blind leading the blind.
There’s little direct information for experienced programmers, and even the official documentation tends to be thin in important places. I wanted documentation that started with the core language semantics:
-
The basic types are INTEGER (int16), LONG (int32), SINGLE (float32), DOUBLE (float64), and two flavors of STRING, fixed-width and variable-width. There’s also a 10,000x fixed-point CURRENCY (int64).
-
Variables are SINGLE by default and do not need to be declared ahead of time. Arrays have 11 elements by default.
-
Variables, constants, and functions may have a suffix if their type is not SINGLE: INTEGER
%
, LONG&
, SINGLE!
, DOUBLE#
, STRING$
, and CURRENCY@
. For functions, this is the return type. -
Each variable type has its own namespace, i.e.
i%
is distinct fromi&
. Arrays are also their own namespace, i.e.i%
is distinct fromi%(0)
is distinct fromi&(0)
. -
Variables may be declared explicitly with
DIM
. Declaring a variable withDIM
allows the suffix to be omitted. It also locks that name out of the other type namespaces, i.e.DIM i AS LONG
makes any use ofi%
invalid in that scope. Though arrays and non-arrays can still have the same name even withDIM
declarations. -
Numeric operations with mixed types implicitly promote like C.
-
Functions and subroutines have a single, common namespace regardless of function suffix. As a result, the suffix can (usually) be omitted at function call sites. Built-in functions are special in this case.
-
Despite initial appearances, QBasic is statically-typed.
-
The default is pass-by-reference. Use
BYVAL
to pass by value. -
In array declarations, the parameter is not the size but the largest index. Multidimensional arrays are supported. Arrays need not be indexed starting at zero (e.g.
(x TO y)
), though this is the default. -
Strings are not arrays, but their own special thing with special accessor statements and functions.
-
Scopes are module, subroutine, and function. “Global” variables must be declared with
SHARED
. -
Users can define custom structures with
TYPE
. Functions cannot return user-defined types and instead rely on pass-by-reference. -
A crude kind of dynamic allocation is supported with
REDIM
to resize$DYNAMIC
arrays at run-time.ERASE
frees allocations.
These are the semantics I wanted to know getting started. Throw in some illustrative examples, and then it’s a tutorial for experienced developers. (Future article perhaps?) Anyway, that’s enough to follow along below.
Implementing Sponge4
Like RC4, I need a 256-element byte array, and two 1-byte indices, i
and j
. Sponge4 also keeps a third 1-byte counter, k
, to count input.
TYPE sponge4
i AS INTEGER
j AS INTEGER
k AS INTEGER
s(0 TO 255) AS INTEGER
END TYPE
QBasic doesn’t have a “byte” type. A fixed-size 256-byte string would normally be a good match here, but since they’re not arrays, strings are not compatible with SWAP
and are not indexed efficiently. So instead I accept some wasted space and use 16-bit integers for everything.
There are four “methods” for this structure. Three are subroutines since they don’t return a value, but mutate the sponge. The last, squeeze
, returns the next byte as an INTEGER (%
).
DECLARE SUB init (r AS sponge4)
DECLARE SUB absorb (r AS sponge4, b AS INTEGER)
DECLARE SUB absorbstop (r AS sponge4)
DECLARE FUNCTION squeeze% (r AS sponge4)
Initialization follows RC4:
SUB init (r AS sponge4)
r.i = 0
r.j = 0
r.k = 0
FOR i% = 0 TO 255
r.s(i%) = i%
NEXT
END SUB
Absorbing a byte means running the RC4 key schedule one step. Absorbing a “stop” symbol, for separating inputs, transforms the state in a way that absorbing a byte cannot.
SUB absorb (r AS sponge4, b AS INTEGER)
r.j = (r.j + r.s(r.i) + b) MOD 256
SWAP r.s(r.i), r.s(r.j)
r.i = (r.i + 1) MOD 256
r.k = (r.k + 1) MOD 256
END SUB
SUB absorbstop (r AS sponge4)
r.j = (r.j + 1) MOD 256
END SUB
Squeezing a byte may involve mixing the state first, then it runs the RC4 generator normally.
FUNCTION squeeze% (r AS sponge4)
IF r.k > 0 THEN
absorbstop r
DO WHILE r.k > 0
absorb r, r.k
LOOP
END IF
r.j = (r.j + r.i) MOD 256
r.i = (r.i + 1) MOD 256
SWAP r.s(r.i), r.s(r.j)
squeeze% = r.s((r.s(r.i) + r.s(r.j)) MOD 256)
END FUNCTION
That’s the entire generator in QBasic! A couple more helper functions will be useful, though. One absorbs entire strings, and the second emits 24-bit results.
SUB absorbstr (r AS sponge4, s AS STRING)
FOR i% = 1 TO LEN(s)
absorb r, ASC(MID$(s, i%))
NEXT
END SUB
FUNCTION squeeze24& (r AS sponge4)
b0& = squeeze%(r)
b1& = squeeze%(r)
b2& = squeeze%(r)
squeeze24& = b2& * &H10000 + b1& * &H100 + b0&
END FUNCTION
QBasic doesn’t have bit-shift operations, so we must make due with multiplication. The &H
is hexadecimal notation.
Putting the sponge to use
One of the problems with the original program is that only the time of day was a seed. Even were it mixed better, if we run the program at exactly the same instant on two different days, we get the same seed. The DATE$
function returns the current date, which we can absorb into the sponge to make the whole date part of the input.
DIM sponge AS sponge4
init sponge
absorbstr sponge, DATE$
absorbstr sponge, MKS$(TIMER)
absorbstr sponge, MKI$(ntickets)
I follow this up with the timer. It’s converted to a string with MKS$
, which returns the little-endian, single precision binary representation as a 4-byte string. MKI$
does the same for INTEGER, as a 2-byte string.
One of the problems with the original program was bias: Multiplying RND
by a constant, then truncating the result to an integer is not uniform in most cases. Some numbers are selected slightly more often than others because 2^24 inputs cannot map uniformly onto, say, 10 outputs. With all the shuffling in the original it probably doesn’t make a practical difference, but I’d like to avoid it.
In my program I account for it by generating another number if it happens to fall into that extra “tail” part of the input distribution (very unlikely for small ntickets
). The squeezen
function uniformly generates a number in 0 to N (exclusive).
FUNCTION squeezen% (r AS sponge4, n AS INTEGER)
DO
x& = squeeze24&(r) - &H1000000 MOD n
LOOP WHILE x& < 0
squeezen% = x& MOD n
END FUNCTION
Finally a Fisher–Yates shuffle, then print the first N elements:
FOR i% = ntickets - 1 TO 1 STEP -1
j% = squeezen%(sponge, i% + 1)
SWAP tickets(i%), tickets(j%)
NEXT
FOR i% = 1 TO nresults
PRINT tickets(i%)
NEXT
Though if you really love Kris’s loop idea:
PRINT "Press Esc to finish, any other key for entropy..."
DO
c% = c% + 1
LOCATE 2, 1
PRINT "cycles ="; c%; "; keys ="; k%
FOR i% = ntickets - 1 TO 1 STEP -1
j% = squeezen%(sponge, i% + 1)
SWAP tickets(i%), tickets(j%)
NEXT
k$ = INKEY$
IF k$ = CHR$(27) THEN
EXIT DO
ELSEIF k$ <> "" THEN
k% = k% + 1
absorbstr sponge, k$
END IF
absorbstr sponge, MKS$(TIMER)
LOOP
If you want to try it out for yourself in, say, DOSBox, here’s the full source:
https://gist.github.com/skeeto/17cb335d9652b2fc6f11485393362543
from Hacker News https://ift.tt/32TothL
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.