A student asks:

Why are there so many equations for the variance?

In S1, depending on the board you’re working with, you might need to know three equations for variance. For listed data, it’s:

$\Var(X) = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$

For grouped data, it’s:

$\Var(X) = \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2$

And for a discrete random variable, it’s:

$\Var(X) = \sum px^2 - \left(\sum px\right)^2$

Wow. That’s an awful lot of equations.

Until you realise there’s just one formula

If you play ‘spot the difference’ with the three equations, you might notice a few similarities – for example, they’re always the difference between two things. The first term is usually something squared inside a fraction, and the second is usually a fraction squared. That’s not a coincidence. You might even notice that the second term in each equation is the square of the mean of the variable. That’s not a coincidence, either.

In fact, all of the variance formulas come from one single master formula:

(Variance) = (mean of the squares) - (square of the mean).
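Where does that come from? If you call the mean of $X$ $\mu$, and write $E(\cdot)$ for “the mean of”, the definition of variance is the mean of the squared distances from the mean. A line or two of algebra turns that into the master formula:

$\Var(X) = E\left[(X - \mu)^2\right] = E\left[X^2\right] - 2\mu E\left[X\right] + \mu^2 = E\left[X^2\right] - \mu^2$

– the last step uses $E[X] = \mu$, and what’s left is exactly the mean of the squares minus the square of the mean.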

Let’s look at them one by one

The first one is probably the easiest:

$\Var(X) = \frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2$

The first term: you add up all of the squares of the numbers, and divide by how many things are in the list ($n$). That’s the mean of the squares of the numbers. The second term is “add up all of the numbers, divide by how many there are, and square the result” – that’s simply squaring the mean.
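To see it with actual numbers (mine, not any board’s): for the list 1, 2, 3, 4, the mean of the squares is $\frac{1+4+9+16}{4} = 7.5$ and the mean is $2.5$, so the variance is $7.5 - 2.5^2 = 1.25$. Averaging the squared distances from the mean directly – $\frac{1.5^2 + 0.5^2 + 0.5^2 + 1.5^2}{4}$ – gives the same $1.25$, as it should.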

The second isn’t much harder:

$\Var(X) = \frac{\sum fx^2}{\sum f} - \left(\frac{\sum fx}{\sum f}\right)^2$

If you start with the second term and accept that $\frac{\sum fx}{\sum f}$ is the mean of $x$ – which it is, you’ve been doing that since GCSE – then you pretty much have to accept that $\frac{\sum fx^2}{\sum f}$ is the mean of the squares of $x$.
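As a sanity check: if every frequency is 1 – a plain list – then $\sum f = n$, $\sum fx = \sum x$ and $\sum fx^2 = \sum x^2$, and the grouped formula collapses into the first one.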

Lastly, the probability-based one:

$\Var(X) = \sum px^2 - \left(\sum px\right)^2$

This is really just the same as the previous one, only with $p_i = \frac{f_i}{\sum f_i}$ – that is to say, the probability of any outcome is its frequency divided by the total of the frequencies. Once you see that, the template falls into place: it’s, again, the mean of the squares minus the square of the mean.
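For example, for the score on a fair die, every $p$ is $\frac{1}{6}$, so $\sum px = \frac{1+2+3+4+5+6}{6} = \frac{7}{2}$ and $\sum px^2 = \frac{1+4+9+16+25+36}{6} = \frac{91}{6}$, giving:

$\Var(X) = \frac{91}{6} - \left(\frac{7}{2}\right)^2 = \frac{182 - 147}{12} = \frac{35}{12}$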

Knowing that even helps you in later modules: for a probability density function $f(x)$, the mean of $x$ works out to be $\frac{\int xf(x) \,dx}{\int f(x) \,dx}$ between appropriate limits. From that, you can jump straight to the variance formula for a continuous random variable:

$\Var(X) = \frac{\int x^2f(x) \,dx}{\int f(x) \,dx} - \left( \frac{\int xf(x) \,dx}{\int f(x) \,dx}\right)^2$.
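Checking it on the simplest continuous example going: if $X$ is uniform on $[0,1]$, then $f(x) = 1$, so $\int_0^1 f(x)\,dx = 1$, $\int_0^1 xf(x)\,dx = \frac{1}{2}$ and $\int_0^1 x^2f(x)\,dx = \frac{1}{3}$, giving $\Var(X) = \frac{1}{3} - \left(\frac{1}{2}\right)^2 = \frac{1}{12}$, as it should be. (And because a genuine pdf has $\int f(x)\,dx = 1$, the denominators quietly vanish in practice.)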

Nice ((For not-very-nice values of nice))!