“How is it,” asked my friend Jo, “that five of my friends are celebrating birthdays today?”

This is, although it might not seem it, closely related to reading the minds of Alex Bellos’s parents. By which I mean: randomness doesn’t always look random.

From here on in, I’m going to make the sensible simplifying assumption that if I pick one of Jo’s friends at random and a day out of the calendar at random, the chance of that day being the friend’s birthday is 1/365.

In fact, it’s not particularly unusual for someone with about 300 Facebook friends to find at least five of them share a birthday - it’s not quite 50-50, but it’s not far off. How do I know? I ran a simulation.
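For the curious, a simulation of this kind can be sketched in a few lines of Python. This is an illustration only, not the code that was actually run: the function names, the trial count and the seed are all made up for the example, and it leans on the same 1/365 assumption as above.

```python
import random
from collections import Counter


def has_shared_birthday(n_friends, group_size, rng, days=365):
    """Return True if at least `group_size` of `n_friends` share a birthday.

    Assumes every day of a 365-day year is equally likely (no leap years).
    """
    birthdays = Counter(rng.randrange(days) for _ in range(n_friends))
    return max(birthdays.values()) >= group_size


def estimate_probability(n_friends=300, group_size=5, trials=20_000, seed=1):
    """Estimate the chance of a `group_size`-way shared birthday by brute force."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    hits = sum(
        has_shared_birthday(n_friends, group_size, rng) for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    print(f"Estimated probability: {estimate_probability():.3f}")
```

More trials give a steadier estimate; a few thousand is enough to see that the answer sits somewhat below one half.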

But that’s cheating!

Of course it is. But I have a cold, so I’m allowed. And once I had an idea of the answer, I went back and did it properly, making yet another simplifying assumption: that the number of people with a given birthday follows (at least approximately) a Poisson distribution.

It’s a bit fiddly to explain why this is a) a reasonable assumption and b) a bit statistically dubious, but I asked the mathematical ninja about it; he rolled his eyes but said ‘go on, then,’ so it’s probably ok. ((Proof by authority, one of the strongest proof methods there is.))

What is a Poisson distribution?

A Poisson distribution describes the probabilities of relatively rare events that happen independently with a fixed probability ((score three for the birthdays!)) over a continuous timeframe ((ok, so three out of four ain’t bad)). If you know the number of events you expect to happen over a given time (call it $\lambda$), you can work out the probability of any number of events happening over that time using this rather ugly formula:

\[P(X = n) = e^{-\lambda} \frac{\lambda^n}{n!}\]

It’s ok, I’ll do the sums so you don’t have to.

The first thing to do is to estimate the value of $\lambda$ - I did some spying and spotted that Jo has 290 friends, so the expected number of birthdays on any given day is $\frac{290}{365} \simeq 0.795$. You can then use the formula to find out the probability of any given day having no birthdays (45% or so), one birthday (36%), two (14%) or three (4%). The chance of four birthdays is about three-quarters of a percent, and five birthdays… well, it’s tiny! One in 838, apparently. The probability of five or more birthdays, though, is a little larger: about 1 in 730, 0.14%.
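If you would rather check the sums than take them on trust, here is a small Python sketch that plugs $\lambda = \frac{290}{365}$ into the Poisson formula. The helper name `poisson_pmf` is just a label chosen for this example.

```python
import math


def poisson_pmf(n, lam):
    """P(X = n) for a Poisson distribution with mean `lam`."""
    return math.exp(-lam) * lam**n / math.factorial(n)


lam = 290 / 365  # expected number of birthdays on any given day

# Probabilities of exactly 0, 1, 2, 3, 4 and 5 birthdays on one day
probs = [poisson_pmf(n, lam) for n in range(6)]

# Five or more is everything that is left after zero to four
p_five_or_more = 1 - sum(probs[:5])

if __name__ == "__main__":
    for n, p in enumerate(probs):
        print(f"P(X = {n}) = {p:.5f}")
    print(f"P(X >= 5) = {p_five_or_more:.5f} (about 1 in {1 / p_five_or_more:.0f})")
```

Working out "five or more" as one minus the sum of the smaller cases avoids having to add up an infinite tail.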

That means it shouldn’t ever happen, doesn’t it?

Well, it’s a rare event, right enough. However, you have to take into account that there are 365 possible days for a confluence of birthdays to happen on - which makes it a lot more likely.

It’s the old birthday paradox rearing its head again: if you repeat an experiment enough times, your chances of avoiding something improbable get consistently smaller. The probability of January 1st having four or fewer birthdays is about $0.9986$. The probability of all 365 days of the year having four or fewer birthdays is around ((This assumes independence, which isn’t strictly true - but it’s good enough for government work)) $0.9986^{365}\simeq 0.6063$ (using the unrounded daily figure) - you have about a three in five chance of avoiding a five-way celebration. Or, if you prefer, if you have 290 friends, there’s about a 40% chance that five (or more) of them share a birthday.
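The step from one day to a whole year can be sketched the same way. The Python below assumes, as the footnote does, that the 365 days behave independently; the variable names are invented for the example.

```python
import math

lam = 290 / 365  # expected birthdays per day among 290 friends

# Probability that one particular day has four or fewer birthdays (Poisson model)
p_day_quiet = sum(
    math.exp(-lam) * lam**n / math.factorial(n) for n in range(5)
)

# Treating days as independent (an approximation), every day must be quiet
p_year_quiet = p_day_quiet**365

# So the chance of at least one five-way (or bigger) shared birthday is:
p_five_way = 1 - p_year_quiet

if __name__ == "__main__":
    print(f"One day quiet:   {p_day_quiet:.4f}")
    print(f"Whole year quiet: {p_year_quiet:.4f}")
    print(f"Five-way clash:   {p_five_way:.4f}")
```

Keeping the daily probability unrounded matters a little here: raising a number very close to 1 to the 365th power magnifies any rounding in the fourth decimal place.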

Edited 2014-05-05 to add a link.