54

I was trying to solve a hobby problem that required generating a million random numbers. But I quickly realized, it is becoming difficult to make them unique. I picked up Algorithm Design Manual to read about random number generation.

It has the following paragraph that I am fully not able to understand.

Unfortunately, generating random numbers looks a lot easier than it really is. Indeed, it is fundamentally impossible to produce truly random numbers on any deterministic device. Von Neumann [Neu63] said it best: “Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.” The best we can hope for are pseudo-random numbers, a stream of numbers that appear as if they were generated randomly.

Why is it impossible to produce truly random numbers in any deterministic device? What does this sentence mean?

Arseni Mourzenko
  • 134,780
  • 31
  • 343
  • 513
Vinoth Kumar C M
  • 15,455
  • 23
  • 57
  • 86
  • 88
    Are you _really_ asking why you cannot produce truly _random_ number on a _deterministic_ device? Doesn't the question already include the answer? – herby Dec 09 '11 at 17:02
  • 37
    If all the numbers you generate have to be unique, they aren't really random. It's entirely possible that a true random number generator will give the same result ten times in a row. – TMN Dec 09 '11 at 17:22
  • 29
    There is a flaw in looking for *random* numbers which are *unique*. If you are con-straining the numbers to be *unique*, then they are not *random* as random demands the possibility of [repetition](http://www.youtube.com/watch?v=T4SVVKuOr0c) no matter how improbable. – Mark Booth Dec 09 '11 at 17:29
  • 1
    For projects where malicious users are not a concern, you can often just useuse of random.org's [pregenerated files](http://www.random.org/files/) instead. Not appropriate if malicious users are concerned, since the files are public. – Brian Dec 09 '11 at 17:47
  • 14
    Outside the computer, is any random number ever truly random? Throw a die, it's physics with just a lot of vectors. – MPelletier Dec 09 '11 at 17:49
  • 9
    @MPelletier: Not quite. Quantum mechanics might (once scientists have figured more of it out) imply the existence of true randomness, depending on your definition of randomness. – Brian Dec 09 '11 at 17:53
  • 1
    @Brian So what we need is a chaos API to fetch numbers from that! – MPelletier Dec 09 '11 at 17:54
  • 1
    The answer to this question is simple. If you decide 0 or 1 then your selection is not random. Since all code is broken into 0 or 1 then your outcome of any code that generates a "random" number is not actually random. – Ramhound Dec 09 '11 at 18:24
  • 3
    @Ramhound: ...what? By that logic, *any* random generator which outputs in binary is not random! – BlueRaja - Danny Pflughoeft Dec 09 '11 at 18:53
  • 3
    @MPelletier: There are [already devices](http://en.wikipedia.org/wiki/Hardware_random_number_generator#Physical_phenomena_with_quantum-random_properties) which exploit quantum mechanical properties for generating random numbers. The NSA uses them, for instance, for generating keys (which are then physically shipped to other locations) for [one-time pads](http://en.wikipedia.org/wiki/One-time_pad). – BlueRaja - Danny Pflughoeft Dec 09 '11 at 18:56
  • @Ramhound: that's like saying flipping a coin isn't random because it can only be heads or tails. There is more to computer programs than just 1s and 0s just as there is more to a person than a bunch of atoms. Structure is important. – whatsisname Dec 09 '11 at 19:36
  • This is an existential question. If there is a deterministic "first cause" then nothing is random. If there isn't, everything is. :) – Daniel Dec 09 '11 at 19:48
  • 1
    Shameless plug: I've built a TRNG and included the code & circuit design http://arronchapman.com/blog/2011/11/9/netduino-powered-true-random-number-generator-trng.html – Unkwntech Dec 09 '11 at 20:47
  • 1
    Lava lamp + webcam... Random numbers or strings while keeping your desk cool. But seriously, I'm using it for my login scripts as a source for random numbers/strings and hash salting – HTDutchy Dec 09 '11 at 21:11
  • 5
    No, it's really not that hard - http://xkcd.com/221/ – Coder Dec 09 '11 at 21:21
  • If you need REAL random numbers in your app, check out http://www.random.org they have a cool way of getting around the problem of a deterministic machine generating random numbers. – JohnFx Dec 09 '11 at 21:51
  • 1
    Nice one :http://dilbert.com/dyn/str_strip/000000000/00000000/0000000/000000/00000/2000/300/2318/2318.strip.gif – Imran Omar Bukhsh Dec 13 '11 at 17:54
  • 1
    Related to what @TMN says, it's not only *possible* but *statistically guaranteed* that any true random number generator will *eventually* produce an arbitrarily-long sequence of repeating numbers. It's just a question of how long you're willing to wait. **Random does not equal unique.** – Aaronaught Oct 27 '13 at 16:18
  • http://xkcd.com/221/ – eckes Feb 19 '15 at 18:26
  • Due to the coding structure used in computing (bits and bytes) is is impossible to ever have a true random number produced. I learned this back in 1981 when being trained as a systems programmer. –  Feb 19 '15 at 15:51
  • "Deterministic": By running the same code from the same start state you always get the same end state. "Random": By running the same code from the same start state you usually get a different end state. See the contradiction? – user253751 Feb 20 '15 at 07:11
  • 1
    @MPelletier: If no number is truly random, there remains a variable or two that we humans cannot accurately predict. Therefore, a dice roll is random as far as we know. Chaos theory might be an interesting subject for you - the basic idea is that we can predict exactly what a system will do if, and only if, we can accurately predict/measure all the variables. – moonman239 Sep 15 '15 at 01:29
  • @TMN Generating *unique* random numbers is trivial if you can produce random numbers: `n = p + rand(1,s)`, where `n` is the new number, `p` is the previously generated number (probably defaulting to 0 to start), and `s` is the maximum step between numbers. Admittedly, depending on `n`'s type, there may be a risk of overflow, but that can be avoided by using a type that can handle the numbers needed. – 8bittree Jun 23 '16 at 15:10
  • Relevant: https://keepass.info/screenshots/getrand_big.png – S.D. Sep 13 '18 at 09:32
  • > _randomness is the apparent lack of pattern or predictability in events._ by definition, the **complete** lack of predictability isn't random. extensive manifesto ahead: https://cregox.net/random – cregox Nov 12 '20 at 20:54

9 Answers9

67

One should look for a cryptographically secure pseudo-random number generator. Most PRNG are linear congruence generators (so next number is a linear function of previous number), so if you plot next number vs previous number you'll get a chart of parallel lines. A CSPRNG will not do that. The trade-off is that they're slow.

I group random number generators into 3 categories:

  1. Good enough for homework.
  2. Good enough to bet your company on.
  3. Good enough to bet your country on.

Why is it impossible to produce truly random numbers in any deterministic device ?

A deterministic device will always produce the same output when given the same starting conditions and inputs - that is what it means to be deterministic. "Truly random number" is more of a philosophical viewpoint, as what does it mean to be random is the crux of the philosophical navel gazing (folks aren't even certain if atomic decay is random or follows some pattern we just can't figure out yet). A cryptographically secure random number generator is going to take some external source of entropy to make the device non-deterministic.

Tangurena
  • 13,294
  • 4
  • 37
  • 65
  • 3
    That is why it is impossible to get a truly random number. Even if the *sequence* never repeats, which is not guaranteed for *random* numbers, Another run of the program with the same inputs *will* produce the same results. So, someone else can reproduce your *random* numbers at a later time, which means it wasn't really random at all. – Spencer Rathbun Dec 09 '11 at 17:12
  • 1
    @Spencer: Well, there is [special hardware](http://en.wikipedia.org/wiki/Hardware_random_number_generator) that can generate truly random numbers. They usually generate pretty slowly, though, so in real-world scenarios they are usually only used to seed a cryptographically-secure PRNG. This is what Tangurena meant in his last sentence. – BlueRaja - Danny Pflughoeft Dec 09 '11 at 19:06
  • @BlueRaja-DannyPflughoeft Yes, but that's because you can't get the hardware to reproduce the same inputs. The *program using the hardware* will still create the same values from the same inputs, so if someone knows what the inputs were for a run, they can reproduce the values. It is similar to the concepts of [limits](http://en.wikipedia.org/wiki/Limit_of_a_function) from calculus. We can approximate randomness well enough, but you can't actually get there. – Spencer Rathbun Dec 09 '11 at 19:21
  • 1
    I don't agree that a truly random number is philosophical. A number can't be random, but a sequence of numbers can be. The way my prof who taught me this phrased it was "is 2 random?" It's a nonsensical question. So for a sequence of numbers to be random, there is a definition that is accepted in information theory. The definition of a random sequence is that it is the shortest definition of itself. E.g. you can't compress the data in it at all. – Darth Egregious Dec 09 '11 at 21:43
  • 2
    @user973810 The problem with that definition from information theory is that you cannot exhibit an actual instance of a random sequence. We can prove, for any reasonable definition language, that almost every infinite sequence (in a technical sense) is random, because it cannot be described in the language at all. What is more useful is the concept of a random sequence generator: not one that produces a random sequence, but one that produces a sequence randomly. – Gilles 'SO- stop being evil' Dec 09 '11 at 22:07
  • 14
    Slight nitpick: some folks, namely nuclear and particle physicists, are pretty certain that processes like atomic decay _are_ truly random. – David Z Dec 10 '11 at 02:03
  • 12
    @David: We can go a bit further than that even. The various experiments on Bell's inequality shows that certain quantum processes are *definitively unpredictable*. They may be random in some philosophical sense or they may depend on non-local hidden variables, but either case prevents reliable prediction. – dmckee --- ex-moderator kitten Dec 10 '11 at 18:50
  • @DavidZaslavsky, Einstein was disturbed enough by the unpredictability/randomness of quantum mechanics to say "God does not play dice with the universe" on more than one occasion. – Tangurena Dec 12 '11 at 01:25
  • 7
    @dmckee: yeah, I just figured it would be easier to stay away from trying to explain the connection between Bell's inequality and wavefunction collapse in comments on prog.SE. People can always come to [our site](http://s.tk/physics) if they're curious ;-) Tangurena: true, Einstein did say that, but that just meant he really really wanted the universe to be deterministic. It's not, though. Experiments done after Einstein's death showed that pretty conclusively (barring non-local hidden variables, a.k.a. _weirdness_). Just because he's Einstein doesn't mean he was right about everything. – David Z Dec 12 '11 at 01:45
  • just *getting* a random number is easy however. This url gives you one (as random as they come) between 1 and 100 : https://www.random.org/integers/?num=1&min=1&max=100&col=1&base=10&format=plain&rnd=new – Peter Jun 10 '15 at 20:28
23

True randomness implies nondeterminism. If it's deterministic, it can be accurately predicted (this is what determinism means); if it can be predicted, it is not random.

The best thing you can get from a deterministic pseudo-random number generator is a stream of numbers that has a very long cycle (non-repeating is impossible unless your RNG device has unlimited storage) which, for the length of the cycle, produces a stream numbers that meets all the other properties of a random sequence (a uniform distribution of values being the most interesting one).

To solve this issue, many modern UNIXes and Unix-likes have kernel RNG's that use physical noise sources to generate true randomness.

Another common approach is to take the current time as the seed for a deterministic RNG (srand(time(NULL)); in C); cryptographically speaking, this is worthless, since the current time is no secret, but for things like physical simulations or video games, it is good enough.

tdammers
  • 52,406
  • 14
  • 106
  • 154
  • Note that non-repeating is also impossible for any generator with bounded output value (limited number of bits). But of course the cycle length of a deterministic generator is most probably way shorter than the theoretical maximum which is all possible permutations. – 9000 Oct 27 '13 at 12:51
  • @9000: Of course this isn't true. Take an irrational number of use the digits (any base) as your "random" sequence. Boom! non-repeating sequence (by definition) and still bounded (to your base). – ThePopMachine Feb 19 '15 at 16:27
  • @ThePopMachine: you can generate a non-repeating sequence of bits of any length, equivalent to a non-repeating sequence of numbers of unbounded length. You cannot generate a non-repeating sequence of integer numbers of limited magnitude (e.g. 32-bit); once you have generated all permutations of 32-bit values, a sequence must repeat. You are right; we're just talking about different things. – 9000 Feb 19 '15 at 17:50
  • @9000: No weaseling. You made a blanket statement which is false. If you're really just trying to there aren't more than n^k different sequences of length k for n different values, and therefore it must repeat, then this is pretty obvious and not interesting. – ThePopMachine Feb 19 '15 at 18:40
  • 2
    @ThePopMachine: I'd appreciate if you toned down a bit. To quote, « non-repeating is also impossible for any generator with bounded output value (limited number of bits)». What you explicitly talk about is an unbounded number of bits, as a sequence of [binary] digits of an irrational number. Your statement, while true, is unrelated to the problem. – 9000 Feb 19 '15 at 21:00
  • @9000: Sorry, what you're saying still doesn't make sense. It is true that certain classes of PRNG must repeat because they rely on a limited internal state. What I'm pointing out, is that you can build other classes of generators that produce sequences that have statistically random appearance, don't repeat and are deterministic. You said "any generator". This is why what you're saying isn't true, whether you like my tone or not. How you are somehow applying the notion of my sequence not being bounded also doesn't fly. ...4795467007050347999588867695... is this bounded? – ThePopMachine Feb 20 '15 at 15:34
  • @ThePopMachine: I specifically speak about generators whose _external_ state is limited. Consider a generator that is based on a truly random process like radioactive decay. Its output, though, is a sequence of 32-bit integers. Once it has generated all possible distinct sequences of 32-bit integers, all 2^2^32 of them, it _has_ to start repeating, even if the underlying bit-generating random process never repeats. (The latter results in a non-repeating pattern of sometimes-repeating sequences.) – 9000 Feb 21 '15 at 03:21
  • @9000: There are a finite number of different sequences of a particular length. For 32-bit integers, N of them, it is 2^32^N (BTW even your number here isn't right). So what??? Yes, if you define for finite set of values and some finite length, there are only a finite number of those sequence before one has to recur. What does this have to do with how you generate them?? Nothing. You're just stating the [Pigeonhole principle](http://en.wikipedia.org/wiki/Pigeonhole_principle) – ThePopMachine Feb 21 '15 at 18:36
10

The second chapter of the book Discrete-Event Simulation: A First Course by Lawrence Leemis gives a fantastic introduction to random number generators (or more accurately, psuedo-random number generators).

An excerpt from his book explains it well in my opinion:

Historically three types of random number generators have been advocated for computational applications: (a) 1950's-style table look-up generators like, for example, the RAND corporation table of a million random digits; (b) hardware generators like, for example, thermal "white noise" devices; and (c) algorithmic (software) generators. Of these three types, only algorithmic generators have achieved widespread acceptance. The reason for this is that only algorithmic generators have the potential to satisfy all of the following generally well-accepted random number generation criteria. A generator should be:

  • random - able to produce output that passes all reasonable statistical tests of randomness;
  • controllable - able to reproduce its output, if desired;
  • portable - able to produce the same output on a wide variety of computer systems;
  • efficient - fast, with minimal computer resource requirements;
  • documented - theoretically analyzed and extensively tested.

So while it could be possible to use a white-noise generator to get "better" random numbers, they have not gained acceptance because they do not follow most of the criteria above.

I would recommend that you get your hands on a copy of that book (or on something similar). Understanding exactly how PRNG's work will definitely assist you in your efforts.

riwalk
  • 7,660
  • 3
  • 29
  • 32
9

Because You need to write code to generates the random numbers and Code is NOT random. (It's deterministic)

So you wind up starting with a "Seed value(s)" that is picked at "Random" (usually the current time stamp) then use it in an algorithm to start generating numbers. But the entire set of is based off the original Seed value!

So if you run your code again with the exact same Seed value(s), you will get the EXACT same SET of numbers! How can any reasonably person call that random? But it sure does LOOK random.


Regarding making them unique, After generating a number simply check if you already have that number, if you do, throw it away and generate a new one.

Morons
  • 14,674
  • 4
  • 37
  • 73
6

I have a very simple definition of Pseudo Random:

Too many unknown variables to predict.

I also have a simple definition of True Random:

Infinite unknown variables.

The problem with a computer is that it always knows ALL variables. The random number is simply a mathematical function of some seed value.
The best we can do is to give the computer a pseudo-random seed value, which is usually based off a variable that we can't predict (such as exact time).

Even though a computer is absolutely unable to create a random number, it is good at introducing too many variables to predict!

Scott Rippey
  • 321
  • 1
  • 4
  • 1
    Well "time" is a bad example of something that can't be predicted. Mouse movement, microphone input, etc. on the other hand are input that aren't predictable. – HoLyVieR Dec 10 '11 at 03:17
5

Since you are generating random numbers, you should expect the generated values to be non-unique. This is a property of randomness - you can't say a sequence of truly random (or even pseudo-random) numbers is unique, because that requirement would allow the final value in the range to be predicted, as well as changing the probability of all the unchosen numbers each time a new one is selected.

James McLeod
  • 7,613
  • 4
  • 21
  • 34
4

Generating truly random numbers in software is indeed not possible as others have pointed out, however it is possible with hardware to build a device which can generate truly random numbers*. There are quite a few examples of this on the internet, and there are a variety of methods used, from reading the time between ticks on Geiger counter to sampling the white noise (mostly background radiation from the universe) of an untuned receiver. I myself have built a few using a few of the methods available.

*Any good physics geek will point out that given the way the universe operates none of these are hyper-technicaly truely random but there is no reasonable way to predict the results so for the sake of this discussion they are sufficiant.

Unkwntech
  • 141
  • 6
  • 5
    As a part-time physics geek, generators based on quantum events are (as far as we've been able to tell) truly random. People who dislike randomness have been trying to take the randomness out of quantum mechanics since it started, and all it's done is pile up more evidence that it's truly random. – David Thornley Dec 09 '11 at 21:43
  • @DavidThornley, ...until someone figures out the formula. – CaffGeek Dec 09 '11 at 21:49
  • 1
    @Chad: There is no formula in the usual sense; that was ruled out by the EPR experiments. It's certainly conceivable that it's all deterministic, but not in any easily understandable way. – David Thornley Dec 09 '11 at 22:28
  • @DavidThornley, I knew that was the wrong word to use. I think we know what I was trying to say. Nearly whenever someone says something is impossible, someone else eventually proves them wrong. It's human nature. – CaffGeek Dec 09 '11 at 22:38
  • 2
    That's like saying eventually someone is going to make a machine that can solve the halting problem because someone said it was impossible. It's not a matter of finding the equation, it actually is random according to all the experiments that have been run and that math that backs it up. – Alex Dec 10 '11 at 03:28
3

There is no way you can produce a random number without a special hardware. In my freshman year, a couple of classmates and I proposed a random number generator that has basically a AM receiver and tuned to 4 different channels, get the input into a A to D converter and add them all (modulo your max number). Since the combination of analog input from any arbitrary number of stations is random and we could produce a large number of random numbers from the A2D convertor we proposed this could be a good generator. Of course, even this is not truly random in a philosophical sense, though for most practical purposes this could work.

2

Determinism is essentially a function. Remember from Algebra that a function is a correspondence between a domain and range such that each member of the domain corresponds to exactly one member of the range.

So if f(x) = z, f(x) != y unless y is z. That is a function. Imagine JavaScript:

function Add(A, B) {
      return A + B;
}

var addedNumber = Add(2,3);//returns 5
addedNumber = Add(2,3);//still 5

No matter how many times you call Add(2,3) it will always return 5. In other words, Add() is a deterministic function.

External factors can make Add behave in a non-deterministic fashion. For example, if you introduce multithreading into the equation. Human input also causes non-determinism.

Now, this is where things get interesting.

“Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.”

Note Von Neumann states, "arithmetical methods of producing [...]". This is not talking about human input, concurrency, sample wind speeds read from a precise instrument or other non-algorithmic ways of producing random input to a deterministic function.

This simply states a function or system of functions is not going to suddenly become non-deterministic. In other words, Add(2,3) will not somehow return 6 or anything other than 5 given the same inputs. That is impossible.

The quoting author takes it a step further.

The best we can hope for are pseudo-random numbers, a stream of numbers that appear as if they were generated randomly.

The context is previously defined to be "on any deterministic device". I could end the argument here. But, what if we change up the context by introducing a new element to the system? A non-deterministic element added as input makes the system a non-deterministic system. Although, by removing the non-deterministic element we are reduced back to a deterministic system. If we can somehow trace or otherwise reproduce the inputs we can reproduce a result. But this entire paragraph is tangetenial to what the author is saying. Remember the context.

One could argue over the meaning of non-determinism. Once again, tangetenial. Remember the context.

So he is correct. On any deterministic device it is impossible for a deterministic system to produce a true random result.

P.Brian.Mackey
  • 11,123
  • 8
  • 48
  • 87