A student asked:

What’s the link between the Poisson formula and the binomial?

… and I started to cry a little bit.

Infinitely many trials

You use the Poisson distribution when you have events happening at a constant rate over a continuous time-frame - as opposed to the binomial, which involves $n$ distinct trials, each succeeding with the same probability $p$. They are, however, related: in fact, if you fix the expected number of successes and let $n$ get large, the binomial looks more and more like a Poisson (the distinct trials get closer and closer together and, in the limit, become continuous).
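If you'd like to see that convergence numerically before wading into the algebra, here's a quick Python sketch (the rate $\lambda = 3$ and count $r = 2$ are just arbitrary picks):

```python
import math

# Sketch: the binomial pmf with p = lam/n approaches the Poisson pmf
# as n grows, for a fixed rate lam and success count r.
def binomial_pmf(n, p, r):
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(lam, r):
    return math.exp(-lam) * lam**r / math.factorial(r)

lam, r = 3.0, 2
for n in (10, 100, 10_000):
    print(n, binomial_pmf(n, lam / n, r), poisson_pmf(lam, r))
```

By $n = 10{,}000$ the two probabilities agree to three or four decimal places.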

So, formulas

The binomial probability of getting $r$ successes when you expect $\lambda$ of them is:

$^nC_r p^r (1-p)^{n-r}$, where $p := \frac{\lambda}{n}$. I plan to show that, as $n \rightarrow \infty$, that’s the same thing as the Poisson formula, $ e^{-\lambda} \frac{\lambda^r}{r!}$.

I’ve got to tell you, it’s not very nice - even when you take natural logs: we’re starting with $\ln\left( ^nC_r \right) + r \ln(p) + (n-r) \ln(1-p)$ and want to get to $-\lambda + r \ln(\lambda)- \ln(r!)$.

OK - well, $\ln(p) =\ln( \lambda) - \ln(n)$, and $\ln(1-p) = \ln(n-\lambda) - \ln (n)$. Let’s sub those in:

$LHS = \ln \left( ^n C_r \right) + r\ln (\lambda) - r\ln(n) + (n-r) \ln(n-\lambda) - (n-r)\ln(n)$

$ = \ln \left( ^n C_r \right) + r \ln(\lambda) + (n-r) \ln(n-\lambda) - n \ln (n)$, as the $r\ln(n)$ terms vanish.

The good news…

… is that we have the $r \ln(\lambda)$ term we need. The bad news is, we have a lot more besides.

Time to bust out Stirling

Stirling’s approximation is a neat way to estimate factorials for large numbers: $\ln(z!) \simeq \frac 12 \ln(2\pi) + \left(z + \frac 12\right) \ln(z) - z$. It leads to a rather messy expression for $\ln\left(^nC_r\right) \simeq \left(n+\frac12\right) \ln (n) - \left(n -r + \frac 12\right) \ln(n-r) - r - \ln (r!)$.
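If you don't quite trust Stirling, here's a quick Python check against the exact value - `math.lgamma(z + 1)` gives $\ln(z!)$ exactly (up to floating point) for whole numbers $z$:

```python
import math

# Sketch: compare Stirling's approximation for ln(z!) against the
# exact value, math.lgamma(z + 1). The error shrinks like 1/(12z).
def stirling_ln_factorial(z):
    return 0.5 * math.log(2 * math.pi) + (z + 0.5) * math.log(z) - z

for z in (10, 100, 1000):
    exact = math.lgamma(z + 1)
    print(z, exact, stirling_ln_factorial(z), exact - stirling_ln_factorial(z))
```

Even at $z = 10$ the absolute error is under $0.01$, which is why it's safe to lean on it as $n \rightarrow \infty$.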

You see why I cried a bit? Notice, too, that $r$ isn’t necessarily very big, so I haven’t Stirlinged it. (I’ve also got one eye on the final answer: I want an $r!$ in there!)

Limiting assumptions

I can make it nicer, though - I know, because it’s going to infinity, that $n$ is much bigger than $\frac 12$. $n - r$ is also much bigger than $\frac12$, so I’m going to kick out some halves.

$\ln\left(^nC_r\right) \simeq n\ln (n) - (n -r ) \ln(n-r) - r - \ln(r!)$.

Back to what we had:

$LHS = \ln \left( ^n C_r \right) + r \ln(\lambda) + (n-r) \ln(n-\lambda) - n \ln (n)$

$ = n\ln (n) - (n -r ) \ln(n-r) - r - \ln(r!) + r \ln(\lambda) + (n-r)\ln(n-\lambda) - n\ln(n)$ - aha! the $n\ln(n)$s vanish.

$ = - (n -r ) \ln(n-r) - r - \ln(r!) + r \ln(\lambda) + (n-r)\ln(n-\lambda) $ - now let’s combine the terms with $(n-r)$ at the front:

$ = (n-r) \left[ \ln(n-\lambda) - \ln(n-r) \right] - r - \ln(r!) + r\ln(\lambda)$. All we need now is to show that the first two terms converge towards $-\lambda$ and we have the RHS.

Another approximation

For smallish $z$, $\ln(1 + z) \simeq z$. Unfortunately, we don’t have $\ln(1 + z)$, we have $\ln(n + \text{stuff})$. However, we can factor out the $n$ (and get the nice bonus that the $\ln(n)$s we produce vanish, too). The first term is equal to:

$(n-r) \left[ \ln\left(1-\frac{\lambda}{n}\right) - \ln\left(1-\frac{r}{n}\right) \right]$

For large $n$, that tends to:

$(n-r) \left[ - \frac{\lambda}{n} + \frac{r}{n} \right] = \frac{(\lambda-r)(r-n)}{n}$
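Here's a quick numerical sanity check of that limit (again with the arbitrary picks $\lambda = 3$ and $r = 2$):

```python
import math

# Sketch: (n-r)[ln(1 - lam/n) - ln(1 - r/n)] should approach
# (lam - r)(r - n)/n as n grows, for fixed lam and r.
lam, r = 3.0, 2
for n in (100, 10_000, 1_000_000):
    exact = (n - r) * (math.log(1 - lam / n) - math.log(1 - r / n))
    approx = (lam - r) * (r - n) / n
    print(n, exact, approx)
```

Both columns head towards $r - \lambda = -1$, which is exactly what the next step needs.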

Back to where we were:

$LHS = (n-r) \left[ \ln(n-\lambda) - \ln(n-r) \right] - r - \ln(r!) + r\ln(\lambda)$

$ = \frac{(\lambda-r)(r-n)}{n} - r - \ln(r!) + r\ln(\lambda)$

Now, I’m going to move the rogue $-r$ into the fraction (as $-\frac{rn}{n}$) and expand the bracket:

$ = \frac{(\lambda r - n \lambda - r^2 + rn) - rn}{n} - \ln(r!) + r\ln(\lambda)$

Oh, goody, the $rn$s vanish:

$ = \frac{\lambda r - n \lambda - r^2}{n} - \ln(r!) + r\ln(\lambda)$

$ = \frac{\lambda r}{n} - \lambda - \frac{r^2}{n} - \ln(r!) + r\ln(\lambda)$

As $n \rightarrow \infty$, the two fractional terms go to zero, leaving us with:

$= - \lambda - \ln(r!) + r\ln(\lambda) = RHS$, and a well-deserved $\blacksquare$.
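And, for the sceptical, one last Python check that the whole log-pmf really does converge (arbitrary $\lambda = 3$, $r = 2$ again):

```python
import math

# Sketch: the log binomial pmf with p = lam/n, i.e.
# ln(nCr) + r ln(p) + (n-r) ln(1-p), approaches the log Poisson pmf,
# -lam + r ln(lam) - ln(r!), as n grows.
lam, r = 3.0, 2
rhs = -lam + r * math.log(lam) - math.log(math.factorial(r))
for n in (100, 10_000, 1_000_000):
    p = lam / n
    lhs = math.log(math.comb(n, r)) + r * math.log(p) + (n - r) * math.log(1 - p)
    print(n, lhs, rhs)
```

The gap between $LHS$ and $RHS$ shrinks roughly like $\frac 1n$, just as the derivation suggests.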