Random Number Generation

 
Author:  Follow: TwitterFacebook
Job Title:Sarcastic Architect
Hobbies:Thinking Aloud, Arguing with Managers, Annoying HRs,
Calling a Spade a Spade, Keeping Tongue in Cheek
 
 


Pages: 1 2

Obtaining random bit stream: On PRNGs and hardware RNGs

Ok, we’ve more or less defined “what is good RNG”, so we can start discussing “how to implement these good RNGs”.

In general, there are two very different types of RNG: the first one is PRNG (pseudo-RNG), and the second one is “true” RNG.

PRNGs

Assertive hare:Of course, if somebody can learn internal state, they can immediately get all the bits coming from PRNG, so at least some parts of the internal state need to be kept secret.PRNG is essentially an algorithm, which keeps some internal state, and generates numbers which look “as if” they’re completely random, though the numbers itself are derived from the internal state in a perfectly deterministic manner. Of course, if somebody can learn internal state, they can immediately get all the bits coming from PRNG, so at least some parts of the internal state need to be kept secret.

Linear Congruential Generator and Mersenne Twister: Reversibility is a Potentially Mortal Sin

Mersenne Twister The Mersenne Twister is a pseudorandom number generator... It is by far the most widely used general-purpose PRNG.— Wikipedia —There are quite a few different PRNGs out there. The simplest one – linear congruential generator (which can be written as a one-liner, see [Wikipedia.LCG]) – fell out of favour quite a while ago, as the one exhibiting certain statistical patterns. There is a replacement for it (and a very popular one in non-crypto part of the academy) – it is Mersenne Twister.

I won’t say anything bad about Mersenne Twister as long as it is used for Monte-Carlo simulations and any other similar applications, but I certainly do not really like it for games; in particular, as a non-crypto generator, it it may easily be reversible. And as soon as your RNG is reversible, one may reconstruct the state of your RNG from its outputs, with potentially devastating-for-your-game results. Even if you’re using your RNG “just” to calculate a chance of critical hit, Mersenne Twister is potentially dangerous 🙁 , and if you’re using it to determine AI actions in an area which is supposed to be covered by fog-of-war at least for some players – you may be easily granting a Game-Critical Advantage to somebody who managed to reverse your RNG (see more discussion on implications of AI and fog-of-war below). In short –

Especially as long as there are much better and easily available alternatives – DON’T use Mersenne Twister for games (see a potential exception below though)

Yes, there are situations where you might be able to get away with Mersenne Twister, but well – analysis whether it can be abused or not, is rarely worth the trouble of discussing it. It is just simpler to use something like “Poor Man’s Crypto PRNG” discussed below.

Exception: Potential Case for non-crypto-secure RNGs

One potential use case (I’ve neither seen nor heard of such things in real world, but well – I certainly didn’t see everything) for non-crypto-PRNGs MIGHT theoretically arise in games if you’re running something like MMORTS with hundreds of thousands of dice rolls per network tick. In such cases, you MAY indeed need something faster than AES-CTR (see “Poor Man’s Crypto PRNG” below). If you ever find yourself in such a situation, I suggest to play the following game:

  • Run AES-CTR once per network tick (or frame)
  • Take its output to seed Mersenne Twister (or some other non-crypto PRNG, more on them below)
  • Use Mersenne Twister throughout the network tick/frame
  • Throw ALL the state of Mersenne Twister away at the end of the network tick/frame
  • While you’re at it – make sure that the number of different states provided by your AES-CTR (for AES256 with a randomly initialized counter, it is 2^384) is MUCH more than the number of different states you hope to generate from your Mersenne Twister. In other words, trying to get some 100-card deck perfectly shuffled from an MT19997, which has been initialised from a single AES-CTR, is probably a pretty bad idea :-(. If you run into problems here – you can run several AES-CTRs to ensure that all the different states you need, can be indeed generated while using this trickery.

This way, impact of predictability of Mersenne Twister will be minimised (as its potentially predictable state is NOT carried across network ticks, and any inter-tick interaction, while still being a PRNG, is protected by crypto-level AES-CTR). On the other hand, even if we’re staying within the same tick, if our game involves some fog-of-war (and/or walls), it MAY be possible to get some information out of it; for example, it MIGHT be possible to reconstruct the state of PRNG, and to reconstruct actions of AI in areas covered by fog-of-war 🙁 . In such a case, the best thing would be to revert back to crypto PRNG. And if using of the crypto PRNG is not possible for performance reasons – it MIGHT be a good idea to use several different non-crypto-PRNGs for different characters of AI. For example, if we split the whole map into 100×100 zones, with a separate non-crypto-PRNG-per-tick-per-zone, then these effects will be much smaller, and if size of zone being MUCH smaller than area of the map visible anyway – the effects will be pretty much negligible. Alternatively, we MAY need to take a look at not-so-obviously-reversible non-crypto PRNGs (see below).

Other non-crypto-secure RNGs

IF (and I insist it is still a Big If) you need a non-crypto PRNG to make things faster (and once again – I DO insist that you SHOULD NOT use it as a single PRNG, but rather reinitialise it completely on each “network tick”/frame from a crypto-secure PRNG), you have more than one alternative.

Besides Mersenne Twister, there are quite a few other non-crypto-safe RNGs of different popularity and quality. Out of these, I’d suggest to keep away from linear congruential stuff, and to consider one of the following:

  • [xorshift] family of algorithms, especially xorshift+. Has good statistical properties, BUT is trivially reversible (just from one single output(!)), so if “fog-of-war”-like attacks described above are of any concern, I strongly suggest to avoid it.
  • aforementioned Mersenne Twister. Has reasonably good statistical properties. Reversible; restoring state of popular MT19937 requires 624 outputs.
  • [PCG] Has good statistical properties. In addition, while I am NOT buying the argument by its creator that it qualifies as a middle ground between non-crypto and crypto PRNGs in terms of reversibility (at least until it is subjected to security scrutiny by a Big Portion of crypto community), PCG DOES qualify as a good non-crypto PRNG, AND it IS more difficult to reverse than both xorshift and Mersenne Twister. If I am ever forced to use non-crypto PRNG, PCG is the one I am likely to pick.

Blum-Blum-Shub

Blum Blum Shub Blum Blum Shub (B.B.S.) is a pseudorandom number generator proposed in 1986 by Lenore Blum, Manuel Blum and Michael Shub that is derived from Michael O. Rabin's oblivious transfer mapping.— Wikipedia —Coming from a very-different-than-Mersenne-Twister (but still very academical) field, there is a Blum-Blum-Shub PRNG. It is well-known as one of those algorithms which have a proof attached; note though that the proof itself hinges on the unproven intractability of a so-called quadratic residuosity problem. Of course, most likely this problem is indeed intractable. On the other hand, most likely AES128 is not breakable during the time while our game is being played. As a result, I don’t consider the Blum-Blum-Shub to be any better than any of the crypto algorithms which are currently considered secure (even though there is usually no nice theorem attached to the latter).

In other words – I do NOT think that the “proof” behind the Blum-Blum-Shub qualifies as a sufficient reason to use it. Yes, it would be fine from reversibility perspective, but not more than that. And taking into account how slow Blum-Blum-Shub is – we can usually throw it out just because of poor performance.

“Poor Man’s” Crypto PRNG – AES-CTR

As a result, I generally suggest to use the following construct for a PRNG (often referred to as AES-CTR; for formal/more complete/more secure description, look, for example, for CTR_DRBG in [SP800-90A]):1

  • State of our PRNG consists of two fields: key (such as an AES128 key) and counter (such as a 128-bit counter)
  • PRNG is seeded with the key (needs to be of Crypto-Random Quality (!), see below); at the same point counter is initialized with either a different(!) random value, or a fixed one
  • to get next 16 bytes of random bit stream, we simply encrypt the counter with AES128 using the key as the crypto key (if somebody asks you about the mode – in ECB (sic!) mode).

This description is pretty much the same as saying that you’re simply encrypting a stream of zero bytes (i.e. plaintext for encryption ALWAYS consisting of zero bytes) using your block cipher in CTR mode (that’s why the name of AES-CTR). To achieve effect of initial counter being random, you may use random CTR’s IV/nonce.

BTW, of course, you can use your favorite block cipher instread of AES128 too. Alternatively, you may use your favourite stream cipher (such as Chacha20) directly (without CTR). NB: On recent x64 CPUs by Intel AES is generally faster than Chacha20 (at least in part, due to AES-NI instructions).

Hare thumb up:On modern x86 CPUs, single core can generate 150M+ random bytes/second this way (and this is a Damn Lot).First, let’s note that this construct is pretty fast. On modern x86 CPUs, single core can generate 150M+ random bytes/second this way (and this is a Damn Lot for most of the games; however, if you happen to need more – you MAY need a non-crypto PRNG, see discussion in “Exception: Potential Case for non-crypto-secure RNGs” section above).

Now let’s discuss the security (very sketchy, and only at intuitive level; much more formal reasoning does exist, but is way too far beyond the scope of this book). The construct is secure (at least as secure as AES128 in CTR mode provided that known-plaintext attacks are allowed) as essentially it is a bit stream which is used to XOR our data when we run AES128 in a CTR mode. As a result, any ability to predict it given the history or any chance of statistical skew, would make original AES-in-CTR-mode non-secure. So, as long as we consider AES-CTR encryption to be secure – so we should consider AES-CTR RNG.

Good stuff aside, let’s discuss limitations of the “Poor Man’s PRNG”. For crypto-purposes, such an AES-CTR-based thing can generate up to 2^64 random blocks before it requires re-seeding (mechanics of this limitation is rather complicated, but is related to birthday paradox). With 150M+ random bytes/second I’ve mentioned above, it means that we could run our PRNG continuously on one single core for 2^64 blocks * 16 bytes/block / 150Mbytes/second / 86400 seconds/day / 365 days/year ~= 62394 years. Good enough for a vast majority of the games out there even without reseeding.

One potential issue with such a “Poor Man’s PRNG” is that it’s state is relatively limited. In particular, if trying to use it to deal decks of 52 cards, and initializing counter with a predefined value, we will be able to get only 2^128 different combinations,2 while we need at least 52! ~= 2^226. Therefore, if using such a “Poor Man’s PRNG” for card shuffling purposes, we need at least to upgrade it from AES128 to AES256, and preferably also to seed 128-bit counter too (making the whole space of reachable states 2^384, which provides a not-so-bad reserve over 2^226 we need). NB: these calculations are given just to illustrate the problem; in practice, for such RNG-critical  tasks I do NOT recommend “Poor Man’s Crypto”, see “Recommendations for Bit Stream” section below.

Assertive hare:Still, I’d rather not use “Poor Man’s” Crypto for Really Crypto-Sensitive Stuff, such as key generation. Also, we still need something to seed it too.Still, I’d rather not use “Poor Man’s” Crypto for Really Crypto-Sensitive Stuff, such as key generation. Also, we still need something to seed it too. Which leads us to the much more murky field of “true” RNGs.


1 note that as discussed below, any PRNG, including this one, needs to be properly seeded, so PRNG itself is not enough
2 actually, a little bit more than that, but this “little bit” is not likely to be worth more than ~30-40 bits in practice, as space of counter values is going to be limited

 

Determinism of PRNGs

It should be noted that with PRNGs, their output, while appearing completely random, is in fact perfectly deterministic (and yes, it stands for ALL PRNGs, however secure they are). This observation has some interesting implications:

  • you CAN use PRNG for reproducible testing (as long as you seed it with the same value)
    • as a result, you DON’T need to write PRNG’s inputs and outputs into inputs-log when implementing deterministic Reactors (those discussed in Chapter V).

“True” RNGs

Assertive hare:Even if we have the best-in-the-world PRNG, its output is still only as good as its seed.Even if we have the best-in-the-world PRNG, its output is still only as good as its seed. For example, even if the guys from Planet Poker would use the-best-possible-PRNG, but would still seed it with time-with-millisecond-precision, they would still be easily breakable (to the extent that cards of the other players at the table can be predicted(!)).

To seed a PRNG, we DO need some “true” entropy. For this, we need some “true” RNG. Or from a different perspective: if our system as a whole is perfectly deterministic, we will have absolutely-nothing to get randomicity from.

“Physics-based” Hardware RNGs

One family of “true” RNGs is based on physical effects which are essentially random (such as Zener noise, thermal noise, and so on).

Unfortunately, good physics-based RNGs are difficult to find, and quite a bit of the stuff you can Google, is garbage 🙁 . Here goes a very short guide for selecting your hardware RNG (that is, if you need it, which is NOT always the case):

  • Check whether you’re ok with physics-based Hardware RNG’s advertised bandwidth and price.
  • Check whether the vendor claims that they run NIST, DieHard, and/or TestU01 tests (BTW, two are better than one, and three are better than two). And while we’re at it – any kind of FIPS-140 tests (unless they’re run in real-time to detect sudden failures) is pretty much useless for any kind of serious RNG testing.
  • Get their unit and run AT LEAST TWO of {NIST,Diehard,TestU01} tests yourself, in your lab.
    • There is a chance that you’ll need to throw the RNG away at this point 🙁
  • Keep running the tests on random samples of their bit stream while you’re in production

Hare with an idea:the idea of throwing a single photon and detecting where it got reflected (which is a process guaranteed to be truly random by quantum mechanics), is IMHO fascinatingOne physics-based RNG which I personally had Very Good experience with, is the one manufactured by ID Quantique (and while we’re at it – no, I am not affiliated with them in any manner). Besides from their Quantis true RNGs working well in production and passing all the tests we were able to throw at it (including both Die-Hard and NIST, though we didn’t try TestU01 at that time), the whole idea of throwing a single photon and detecting where it got reflected (which is a process guaranteed to be truly random by quantum mechanics), is IMHO fascinating. Moreover, due to independence of such single-photon experiments, such construct allows to avoid spectrum-related interdependencies (which are MUCH more difficult to deal with in other true RNGs).

Ongoing re-seeding, Fortuna and /dev/urandom

“Physics-based” RNG MAY look as “just the ticket” (and, if done properly, they are for some kinds of games), but in practice for many applications (many games included), they’re NOT strictly required.

Hare pointing out:in addition to the usual PRNG functionality, these are specially designed for on-going re-seedingThere is an interesting family of PRNGs out there, represented, in particular, by Yarrow and later [Wikipedia.Fortuna]. While these are technically PRNGs (or more accurately – CSPRNGs as in “Crypto-Secure Pseudo-Random Generators”), in addition to the usual PRNG functionality, these are specially designed for on-going re-seeding. In other words – they DO accept all the entropy you can throw at them, and change their state accordingly.3 Moreover, if such a PRNG-with-ongoing-reseeding is integrated into an operating system, operating system will have a LOT of entropy to throw at it: information from interrupt handlers, disk I/O, and user input – everything can be used as a source of randomness… This will make such a PRNG non-predictable pretty soon even if its state has leaked once by some black magic(!). Oh, and as soon as the machine has run for long enough (which is usually in “< 1second” range) – such a PRNG-with-ongoing-reseeding-from-OS-events has enough state to be indistinguishable from True RNG.

However, as usual, there are certain considerations:

  • Out of all such PRNGs-with-ongoing-re-seeding, I strongly prefer Fortuna by Bruce Schneier
    • Yes, it is the one used by FreeBSD for it’s /dev/urandom
  • /dev/urandom in Linux uses a simpler Crypto-Secure PRNG than Fortuna, but it is criticized for weaknesses [DodisEtAl][Bernstein]
    • OTOH, unless your stuff is Really Critical, it is usually fine to use /dev/urandom
  • CryptGenRandom() by MS has also been criticised [Dorrendorf]
    • Same as with Linux and /dev/urandom, IMHO it is fine except for most-RNG-critical applications
  • If using /dev/urandom etc. directly, they MAY be rather slow (performance being system-dependent)
    • In this case, using their output as an input for “Poor Man’s PRNG” described above, usually helps
  • If using /dev/urandom etc. directly while trying to use deterministic goodies from Reactors (as discussed in Chapter V), you DO need to record their data as inputs for your Reactor, which may additionally hurt performance further under certain scenarios

3 Yes, at least for these PRNGs “re-seeding” does NOT mean “throw away all the state and start anew”, but rather means “mix existing state with a new state”

 

Recommendations for Bit Stream

Here goes my personal list of recommendations for bit stream RNGs:

  • For not-so-RNG-critical games:
    • DO use /dev/urandom to seed your PRNGs, to generate long-term keys etc. Under Windows, DO use CryptGenRandom()
      • If you run into performance issues – DO use “Poor Man’s PRNG” (seeded from /dev/urandom); this should satisfy all your RNG needs with a very healthy reserve
        • If you run into performance issues even with “Poor Man’s PRNG” – I can safely bet that you’re using TONS of random bits per network tick (or per frame). Then, use “Poor Man’s PRNG” to initialise simple non-crypto PRNG (such as PCG) once per network tick/frame, and then use non-crypto PRNG throughout the tick, without carrying any of non-crypto PRNG state to the next tick. This will give you another boost in RNG performance, while minimizing any potential impact of the reversibility of non-crypto PRNG.
          • OTOH, if there is “fog-of-war” involved, this type of trickery MAY allow for nasty attacks which MAY partially lift “fog-of-war”. To avoid them, the best is to use a crypto-PRNG, but other than that – splitting your map in smaller independent zones, or picking a non-crypto-PRNG which is more difficult to reverse, MAY help a bit.
    • Judging hare:DO run your Marsaglia and NIST tests on your bit stream, and Chi-Square analysis at your game event level before deployment, and once in a whileDO run at least two of {Marsaglia,NIST,TestU01} tests on your bit stream, and Chi-Square analysis at your game event level before deployment, and once in a while
  • For Really-RNG-Critical games (think lotteries, casinos, decisions defined as “chosen randomly” in stock exchanges, etc.):
    • DO use multiple sources of entropy.
      • This includes BOTH /dev/urandom AND at least one independent Hardware RNG
        • using more than one independent Hardware RNG is recommended
      • If your Hardware RNG is NOT used by /dev/urandom, use simple XOR (byte-for-byte) between output from /dev/urandom and output from your Hardware RNG(s); I prefer this configuration to the second one below.
      • If your HW RNG is already used by /dev/urandom, DO NOT mix output from your Hardware RNG with output from your /dev/urandom again; mixing things twice may easily be devastating for the quality and security of your bit stream(!)
      • DO consider running your own crypto-safe PRNG (such as Fortuna), using a completely independent seed, and XOR-ing its output with outputs of all the other streams mentioned above. Doing it this way would prevent several RNG disasters (including, for example, Debian-related one). The key question here is “where to obtain that completely independent seed”. I would try to access several sites-providing-supposedly-true-random-data (with random.org, HotBits, and qrng.anu.edu.au coming to mind immediately), mix them (either via XOR-ing, or via hashing), and use the result to initialise this your own crypto-safe PRNG (NB: you may either to do it on each restart, or may do it offline and initialise Fortuna’s seed file, just using it on subsequent restarts). In addition, you MAY also do ongoing re-seeding of your own PRNG such as Fortuna with your own Client-Side data (such as resulting from user inputs, but NOT coming from the server itself, as it may violate independence requirements). It should be noted that I tend to consider such your own PRNG as a kind of “safety net”, which is XOR-ed with the others all the time (and as long as it is independent and simply XOR-ed, it cannot possibly get things worse), but which start being important only when/if everything else fails (such as in a doomsday scenario of all HW generators failing and /dev/urandom being improperly initialised such as it has happened with Debian).
    • DO run simple tests (such as monobit/poker/runs/long-runs which were crossed out from FIPS140-2, see [FIPS140-2]) over each of your HW generator outputs in real-time while they are running. Hardware RNGs DO fail accidentally, and knowing about it as soon as possible, IS important.
    • DO hire a company (ideally – one of those certifying casino RNG stuff) to perform analysis of your RNG. DO it again after ANY change you make to your RNG.
    • DO run Marsaglia and NIST and TestU01 on samples from your bitstream, and Chi-Square analysis at your game event level, on daily basis. Libraries you’re relying on, DO change (and not always to the good 🙁 ), hardware generators DO fail, and so on… And running with a faulty RNG in a Really-RNG-Critical environment is tantamount to a Mortgage-Crisis-Size disaster 🙁 🙁 .

From Bit Stream to Real World

Thinking hare:In addition to the perfect bit stream, we also need a way to convert it to our precious randomised game eventsOk, we’ve got our perfectly random bitstream. However, as we can see from the mistakes made by Planet Poker devs, it is certainly not the end of the story. In addition to the perfect bit stream, we also need a way to convert it to our precious randomised game events (answering questions ranging from “what is the next card in the deck?” to “whether this hit should be a critical hit?”).

As discussing ALL the random things you may want to do, would take too long, I’ll take a closer look only at two most common primitives (which are still subject to Big Fat Bugs 🙁 ).

Bit Stream to Random Number in Range

One very common operation requested from RNG, is “give me a random integer from 0 to N” (to be specific, let’s assume it is ‘N inclusive’). The very common way to handle it, is just to get next 32 bits from the bit stream, and to use something like

//WARNING: BIAS AHEAD!
void my_random( uint32_t randBits, int to ) {
  //converts 32 random bits into a number
  //  from 0 to ‘to’ inclusive
  return randBits % (to+1);
}

Unfortunately, even for small values of to, there is a small but potentially nasty problem with this code: if to is not 2^N-1, returned value is NOT exactly evenly distributed. This happens just because 2^32 cannot be evenly divided into number-of-buckets which is not 2^N 🙁 . Sure, for small values of to the bias will be very small, but for Really RNG-Critical applications, this DOES qualify as an unacceptable bias 🙁 .

To deal with it (that is, for Really RNG-Critical stuff), I usually prefer to use the following bullet-proof algorithm:

  • Take a number-of-bits N (which is enough-to-represent-to-parameter-plus-one), from bit stream
    • For example, if to is 51, we need to take N=6 bits
  • Convert these bits into a number X (it will be from 0 to 2^N-1)
  • If X is less than or equal to to – we’ve got our random value
  • If X is more than to – we throw away all those N bits of random data, and go back to square one

Arguing hare:given that bugs in this kind of code can be Very Difficult to find - I STRONGLY prefer to Keep It Simple and just throw bits awaySure, it is not the most optimal way (and it DOES waste quite a few random bits unnecessarily – usually around 25%, but if your to is like 2^N, it may be as high as 50%); however, it has exactly zero bias, AND it is simple enough so that it is rather difficult to make a mistake while implementing it. If you’re really concerned about those wasted bits of randomicity (which you usually shouldn’t, as generation is extremely fast for AES-CTR-like stuff) – it IS possible to reuse them (see, for example, [CheckMyWorking]), but the whole thing will become MUCH more complicated (and error-prone too); given that bugs in this kind of code can be Very Difficult to find – I STRONGLY prefer to Keep It Simple and just throw bits away.

Shuffling

Another algorithm which is easy to mis-implement, is shuffling. As we’ve seen above, it IS quite easy to make a mistake in shuffling. And there ARE ways to implement shuffling properly (see, for example, [Wikipedia.Fisher-Yates.Shuffle], look for Durstenfeld variation there).

Wtf hare:developers with absolutely no clue about the subject area are allowed to 'optimise' Major Standard LibrariesIt is fairly simple, but it is still not too difficult to make a mistake (at least Planet Poker devs managed to do it). On the other hand, from my experience in development, I’ve learned to distrust those third-party implementations which are written by NON-experts in the field. This, in particular, applies to std:: libraries when it comes to math-related stuff4 In one of the versions of an std:: library (shipped to millions of developers and compiled into Who Knows how many programs), I’ve seen an implementation of std::hash() which was “optimised” to split the long string into 16 or so parts of the equal length, and to calculate “hash” only from bytes around these split points (effectively ignoring most of the string when calculating such “hash”). Yes, this “optimisation” is long gone now, but it was out there for 2 years, and illustrates that developers with absolutely no clue about the subject area are allowed to “optimise” Major Standard Libraries 🙁 🙁 . In turn, this raises a Pretty Bad Question: how can I be 100% sure that such an enterprising-and-optimizing-developer won’t “optimise” shuffle in some pretty-bad-but-not-immediately-visible way? Reviewing 3rd-party code after each update is even worse than writing it ourselves 🙁 .

As a result, here we’re facing a strange and unpleasant dilemma between writing shuffle ourselves and relying on existing stuff such as std::random_shuffle(). With this regard, my personal position (given certain things which I’ve seen in certain std:: implementations), goes as follows (and yes, I know that once again I will be beaten Really Hard for saying it):

  • For the not-so-RNG-critical stuff, it is ok to use third-party library by non-experts such as std::
  • However, for really-RNG-critical games, I’d rather take it from the library which specialises in random stuff (I don’t know such a library though), or implement it myself (and BOTH review AND test it thoroughly). I am in particular afraid of scenarios when “newer better” version of the third-party library gets broken but it goes unnoticed for a while because the results are not obviously broken, but rather skewed instead. I certainly do NOT want to fall victim of such an “optimization” for a Really Critical part of the system.

4 and BTW, an aforementioned bug in java.SecureRandom is another example of this kind of problems

 

Chi-Square Testing

Last but not least about the algorithms which convert bit streams into game-levels events (such as obtaining random number from the bit stream, or shuffling):

Make sure to run Chi-Square Testing (as described above) on your game-level events

While these tests are NOT guaranteed to catch ALL the bugs, they will catch at least most egregious biases (including some of the bugs made by Planet Poker). Without this, I won’t be comfortable to release any kind of the RNG for use even in the not-so-RNG-critical environments.

Dealing with Perception Problems

Ok, you’ve spent a lot of time and made your perfect RNG, which passes all the tests, code reviews, and third-party audits. Let’s even assume that it is indeed perfect. However, it is not the end of the RNG story; to the contrary, in many RNG-critical cases it may be just a beginning.

Hare with hopeless face:they will invent all kinds of the conspiracy theories why you intentionally biasing your RNG to make moneyAs noted above, players will complain about biased RNG even if your RNG is perfect. Even worse, they will invent all kinds of the conspiracy theories why you intentionally biasing your RNG to make money (see, for example, [Blizzard.ConspiracyTheory] using a poetic wording of “the ‘invisible’ hand messing around with their decks”). As noted above, this is largely related to the phenomenon of remembering-only-bad-stuff 🙁 . But we cannot change this, so what we can do to deal with those disgruntled players and conspiracy theories?

Once upon a time, I’ve seen a company which took these things very seriously. And they were very interested in countering those conspiracy theories. They even considered hiring a mathematician to disprove (statistically) the prevalent conspiracy theory. However, they quickly realised that while disproving the theory would take half a year, inventing a new one would take just another 10 minutes, so it would be a never-winning situation.

Then, they’ve done an IMHO brilliant thing: they’ve started saying to their disgruntled players:

Hey, we can give you all the history of all the random events for your account, and you can analyse it to your heart’s content. If you find a proof of something bad going on here – you’re Very Welcome to come back to us. (BTW, we’ve already handed out histories to <insert-number-here> players, and nobody has come back yet).

The brilliance of this solution is that it shifts “burden of proof” to the player. Company doesn’t need to do prove that it is innocent (which is pretty much hopeless even if you ARE innocent, see above), and it is now responsibility of the player to prove that something fishy IS going on.

However, keep in mind that to play such games, your RNG DOES need to be perfect

Otherwise, the damage can be Really Bad 🙁 .

Yet Another Real-World Horror Story

And to wrap this Chapter up, yet another real-world story to illustrate how elusive those randomicity issues are.

Once upon a time, there was another poker software company. And, as it is customary in these circles, they have had RNG.

Hare thumb down:On one not-so-bright day, there came a report that their tournaments have had exactly the same deck dealt on most of the first tables in the tournament.On one not-so-bright day, there came a report that their tournaments have had exactly the same deck dealt on most of the first tables in the tournament. Most likely, they were using the seed initialized by time at the moment when the table was created; as the tables within the tournament are created at almost the same time – it meant that the first deck was (almost) the same.

That’s not the end of the story, however. After the flaw became a public knowledge, it has caused quite a stir among players, and as a result, their CEO has made a public statement along the lines of “from now on, we will take special measures to guarantee that starting hands on different tables of the tournament are never the same”.

So much for randomicity (NB: any “guarantees” on the hands “never” being the same introduce bias, as all the hands are supposed to be perfectly independent; moreover, this bias can be potentially used to get an unfair advantage – and degree of this advantage depends on how they implemented these “guarantees”).

I rest my case.

[[TODO: Separating Game RNG from Transport-Security RNG]]

[[To Be Continued…

Tired hare:This concludes beta Chapter 19 from the upcoming book “Development and Deployment of Multiplayer Online Games (from social games to MMOFPS, with social games in between)”. Stay tuned for beta Chapter 15, where we’ll start discussing basic security as it applies to games.]]

Don't like this post? Comment↯ below. You do?! Please share: ...on LinkedIn...on Reddit...on Twitter...on Facebook

[+]References

Acknowledgement

Cartoons by Sergey GordeevIRL from Gordeev Animation Graphics, Prague.

Join our mailing list:

Comments

  1. Grégory OBANOS says

    Hi, very interesting article !
    By the way, footnote anchors on first page are broken because note are on second page.

  2. says

    This is FANTASTIC! This is what my friend and I said for YEARS!

    I would like to add some personal experiences in checking random number generation.

    I am a “math dunce” to some degree. But I LOVE playing with numbers, especially random generation. I have found that in addition to checking the end result, if you dissect your random generation into pieces – literally dissect your code, as if you are pulling it apart to view each “organ” – it helps to find small random problems before they become large random problems.

    As a result, I always check each stage for random acceptability before checking the end result. This has helped me greatly in tracking down small cracks before they become huge!

    I am sharing this article to everyone I know. It feels good to have someone legitimize what you’ve been saying for years.

  3. ALearningDeveloper says

    Hello,

    Great article, much appreciated with the efforts put into it and the whole series, but I have a question about the “Poor Man’s” Crypto PRNG. Below are the steps from my understanding:

    1. Seed the PRNG with a random cryptographically-secure key
    2. Initialize the counter with a different random (cryptographically secure?) value, or a fixed one
    3. Encrypt the counter with AES-ECB using the key in 1
    4. Increment the counter (using whatever function that gives you a sequence) and repeat

    If I understand correctly, there is no encryption (random number generation?) done with AES-CTR. If that is the case, the name AES-CTR sounds misleading. I would call it AES-ECB with a counter.

    or…

    I got it wrong, CTR mode is actually used, and the ECB mode in the third bullet is a typo. This is what would happen:
    a. Seed the PRNG with a random cryptographically-secure key
    b. Initialize the keystream block with the key in (a) and a counter value which can be different random (cryptographically secure?) or fixed
    c. Encrypt the keystream block with AES-CTR using the key in (a) and XOR the output with only the counter
    d. Increment the counter and repeat

    But in either case (or maybe a third case or more), I am puzzled and I could use more explanation. Thank you!

    • "No Bugs" Hare says

      Actually, you’ve got it right (and yes, ECB is NOT a typo). However, the very same thing is a description of AES running in a CTR mode, while encrypting plaintext consisting of zero bytes of (potentially) unlimited length. If you take a look at the pics of ECB and CTR in https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation, it should become rather apparent.

      I’ve tried to make it more clear now, THANKS!

Leave a Reply

Your email address will not be published. Required fields are marked *