Written by Colin+ in core 2, logarithms, probability.

“If you had an infinite number of monkeys, there’d be no room for typewriters.”

— Jason Arnopp

Yes, an infinite number of monkeys would eventually — in fact, before very long at all — write Shakespeare. The problem, then, is finding which of the monkey-poo-smeared manuscripts is actually the whole of Shakespeare, rather than (say) a handful of Sonnets and Act 4 Scene V of Measure for Measure.

But I’m not going to solve that problem. I’m going to solve the more obvious probability question of ‘how many monkeys would you need to have a better than 50-50 chance of getting Shakespeare first time out?”

I’m going to estimate that Shakespeare’s 40-odd plays each have an average of 3,000 lines, each of which has an average of 80 characters in, to give me a nice round number of around 10,000,000 characters — good enough for government work[2].

Each monkey, in this thought experiment, has a typewriter with 27 keys (A-Z and a space bar) which he or she hits at random — so the probability of hitting the right key, by assumption, is $1/27$.

For a monkey to type the whole of Shakespeare, he/she would have to hit the correct keys 10,000,000 in a row, making the probability of typing all of the Bard’s plays in a oner $\left(\frac{1}{27}\right)^{10,000,000}$, which is an enormously small number. Your calculator will call it zero. I will call it $3^{-30,000,000}$, so its base-10 logarithm is about $-30,000,000 \times 0.5$. That makes the probability about 1 in $10^{15,000,000}$ — about the same as winning the lottery 2.1 million times in a row.

It’s unlikely.

So, how many monkeys would you need to make sure you had at least a 50-50 chance of getting Shakespeare? This is one of those “one-minus-the-probability-of-it-not-happening” questions. Let’s call the probability of a given monkey producing Shakespeare $p$. The probability of the first monkey not going all Shakey on us is $(1-p)$. The probability of neither of the first two coming up with the goods is $(1-p)^2$… and so on.

That means we’ll need to solve the inequality $(1-p)^n < P$, for the $p$ we already found and $P$ — the probability of not getting our Shakespeare after $n$ monkeys — $P = 0.5$. (We can adjust $P$ to be 99% sure, or any other number we like — so I’ll leave it as $P$ for now.) That’s a simple logs question, then: $n \log(1-p) < \log(P)$ It gets even simpler if you use natural logs rather than base 10 — it’s ok, base 10, you’re still my friend, just this is $\ln$’s big talent. There’s a nice approximation that says $\ln(1-p) \approx -p -\frac{1}{2}p^2 - O(p^3)$ — but since $p$ is already minuscule, we can get away with just the first term. So, since $\ln(1-p)$ is negative, we need to turn the crocodile around when we divide it over: $N > \frac{\ln{P}}{\ln(1-p)}$

We can look up $\ln(0.5) \approx 0.7$ — so our number of monkeys needs to be $\frac{0.7}{10^{-15,000,000}}$, which means we need something on the order of $10^{15,000,000}$ — one with 15 million zeros — monkeys before we have a 50-50 chance of producing Shakespeare first time out.

Of course, that’s a lot of monkeys, but it’s less than an infinite number of monkeys. Whatever $P$ you pick[3], you can always find a number of monkeys $n$ that will give you a better than $(1-P)$ chance of producing Shakespeare — which means an infinite number of monkeys is certain to produce all of the plays first time out.

No, I don’t know where you can find an infinite number of monkeys.

[1]￼ It's OK - I'm going to get the monkeys rooms in a Hilbert hotel

[2] If you pays peanuts...

[3] Between 0 and 1, pedantic-pants