If I had £35 every time a student said “I don’t get linear interpolation,” I’d have pretty much the same business model as I do right now.

Everyone knows it’s something to do with finding medians and quartiles, and something to do with the class width and… stuff. Some can even start writing down a fraction.

However, there’s another way that involves less remembering and more… doing maths.

Why do you think it’s called linear?

You remember at GCSE, if you wanted to find the median from a cumulative frequency graph, you’d draw a line across from the index of the median and follow it down to the axis. That’s exactly what you do with linear interpolation except:

  • a) You don’t need to draw the graph, and
  • b) The graph is made up of straight lines rather than a smooth curve.

After that, the question is, what’s the straight line?

Well, if you were to draw the graph, it ought to go through two points: the upper end of the class just below the one containing the median, and the upper end of the class containing the median, each plotted against its cumulative frequency. For example, if we had the table:

| Height | Frequency | Cumulative frequency |
|---|---|---|
| $0 \le x \lt 10$ | 5 | 5 |
| $10 \le x \lt 20$ | 8 | 13 |
| $20 \le x \lt 30$ | 19 | 32 |
| $30 \le x \lt 40$ | 4 | 36 |

Since there are 36 whatevers, the median will be the height of the 18th thing. That’s in the third group, where the cumulative frequency climbs from 13 to 32, so the line begins at (20, 13) and ends at (30, 32). All we need to do is find the gradient of the line through those two points, and then the value of $x$ that gives us $y=18$.

So, the gradient is $\frac{32-13}{30-20} = 1.9$, which makes the equation of the line $y - 13 = 1.9(x-20)$, picking the lower point. It works just as well with the other.

Now, we want where $y=18$, so substitute that in to get $18 - 13 = 1.9(x-20)$, which is $5 = 1.9(x-20)$, and you can say $x-20 = \frac{5}{1.9}$ - or $x \approx 22.63$. That’s your interpolated value for the median!
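If you'd rather let a computer do the legwork, the same steps can be sketched in Python. This is only an illustration: the function name `interpolate_quantile` and its arguments are made up for this example, and it assumes the classes join up with no gaps, so each class ends where the next begins.

```python
from itertools import accumulate


def interpolate_quantile(boundaries, frequencies, fraction=0.5):
    """Estimate a quantile from a grouped frequency table.

    boundaries  -- class edges, one more than the number of classes
    frequencies -- the frequency of each class, in order
    fraction    -- 0.5 for the median, 0.25 for the lower quartile, etc.
    """
    # Cumulative frequency at each class edge, starting from 0.
    cumulative = [0] + list(accumulate(frequencies))
    target = fraction * cumulative[-1]

    for i in range(len(frequencies)):
        lower_cf, upper_cf = cumulative[i], cumulative[i + 1]
        # Find the class whose straight-line segment passes the target.
        if lower_cf <= target <= upper_cf and upper_cf > lower_cf:
            gradient = (upper_cf - lower_cf) / (boundaries[i + 1] - boundaries[i])
            # Rearranged from: target - lower_cf = gradient * (x - lower edge)
            return boundaries[i] + (target - lower_cf) / gradient

    raise ValueError("no class contains the target value")


# The table from the example: edges 0, 10, 20, 30, 40 and the four frequencies.
median = interpolate_quantile([0, 10, 20, 30, 40], [5, 8, 19, 4])
print(round(median, 2))  # 22.63
```

Changing `fraction` to 0.25 or 0.75 should give the quartiles by exactly the same straight-line argument.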

One thing to watch out for

I’d be remiss in my duties if I didn’t tell you to watch out for sneaky frequency tables that don’t quite join up - where the measurements are rounded to the nearest whole number or similar. In that case, the lower end is half a unit lower than you might expect, and the upper end half a unit higher. For example, if you had classes of 7-8, 9-11 and 12-16 (all to the nearest whole number), the middle class is really 8.5 to 11.5.

This bit of sneakiness affects any method you use for finding the median - you’d do well to watch out for it!
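As a rough sketch of that adjustment, here's a small Python helper that stretches each class by half a unit at either end. The name `true_boundaries` and the `unit` parameter are invented for illustration; it assumes every measurement was rounded to the nearest `unit`.

```python
def true_boundaries(class_limits, unit=1):
    """Turn rounded class limits such as (9, 11) into real boundaries.

    Assumes the data were rounded to the nearest `unit`, so each class
    really extends half a unit below and half a unit above its limits.
    """
    half = unit / 2
    return [(low - half, high + half) for low, high in class_limits]


classes = [(7, 8), (9, 11), (12, 16)]
print(true_boundaries(classes))  # [(6.5, 8.5), (8.5, 11.5), (11.5, 16.5)]
```

Once the classes join up like this, the adjusted ends are the ones to use as the $x$-coordinates of your straight line.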

  • Edited 2014-02-17 to fix a typo