Written by Colin+ in probability.

A nice prompt from @shahlock, some time ago:

> Math Prompt #apstats #mtbos
>
> Two players A, B. A is 4-0 against B. How would you estimate probability A wins next match? Assume independence
>
> M Shah (@shahlock), November 27, 2016

Stand back, everyone: I’m going to apply Bayes’s Theorem.

Let’s assume that, before we knew anything about the teams, we could have believed equally well in every possible value for $p_A$, the probability of team A winning. If someone had said “Team A has a 40% chance!” or “… a 90% chance” or “… no chance at all”, we’d have had no reason to believe any one of them over any other. So let’s roll with a *uniform prior* – each value of $p_A$ is equally likely. The probability density for any value of $p_A$, between 0 and 1, is 1 (giving us an area of 1, as required).

Our best guess of $p_A$ would be the median or the mean of the distribution – here, both turn out to be 0.5, which is reasonable.

Now, given that team A is on a bit of a roll here, we’d be within our rights to suspect that $p_A$ is rather less likely to be 10% than 90% – it’s time to update our beliefs!

So, suppose $p_A$ was, for example, 0.5. What’s the likelihood of A winning the first four games under that scenario? That’s easy: it’s $0.5^4$, or 0.0625. Had $p_A$ been 0.8, we’d get a 4-0 result about 41% of the time ($0.8^4 = 0.4096$).
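Those spot checks are easy to reproduce. Here is a minimal Python sketch; the function name `likelihood_of_sweep` is my own label, and it leans on the prompt's independence assumption.

```python
def likelihood_of_sweep(p_a, games=4):
    """Chance that A wins `games` independent matches in a row,
    if A's true win probability is p_a."""
    return p_a ** games

# A few candidate values of p_A, including the two from the text
for p_a in (0.1, 0.5, 0.8, 0.9):
    print(f"p_A = {p_a:.1f}: P(4-0) = {likelihood_of_sweep(p_a):.4f}")
```

The 4-0 record is far better explained by a large $p_A$ than by a small one, which is the whole point of the update.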

But there are infinitely many possible $p_A$s, and we don’t want to do that for all of them. Or rather, we do – but all at once.

The likelihood of a 4-0 result, given $p_A$, is simply $p_A^4$. Because the prior is flat, that curve also tells us how plausible each value of $p_A$ is now that we've seen the result.

(Beware, this is not a probability distribution, because its integral isn’t 1; however, dividing by the total area would give us a probability distribution.) And the total area? That’s $\int_0^1 x^4 \dx$, which is $\frac{1}{5}$.

So, the probability density works out to be $P(p_A = x) = 5x^4$. That’s clearly much larger for large values of $p_A$ than it is for small ones, which is good.
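If you'd rather not trust the integral, the normalisation can be checked numerically. This is only a sketch: the helper `integrate` is a plain midpoint rule I've written for illustration, not anything from a library.

```python
def integrate(f, a=0.0, b=1.0, steps=10_000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

def likelihood(x):
    # Chance of a 4-0 record if p_A = x
    return x ** 4

# Total area under the likelihood curve; should be close to 1/5
area = integrate(likelihood)

def posterior(x):
    # Uniform prior, so the posterior is the likelihood rescaled to area 1
    return likelihood(x) / area

print(f"area      = {area:.6f}")
print(f"rescaled  = {integrate(posterior):.6f}")
print(f"at x=0.9  : {posterior(0.9):.4f}  (5 * 0.9^4 = {5 * 0.9 ** 4:.4f})")
```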

The mean of the distribution is $\int_0^1 x P(p_A =x) \dx$, or $\int_0^1 5x^5 \dx$. That turns out to be $\frac{5}{6}\approx 0.833$, which doesn’t seem implausible.

As for the median, that’s the number $m$ such that $\int_0^m P(p_A=x)\dx = \frac{1}{2}$. In this case, that’s $\int_0^m 5x^4 \dx = \frac{1}{2}$, so $m^5 =\frac{1}{2}$ and $m = 0.5^{0.2} \approx 0.871$.
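Both summary figures can be double-checked in a few lines. In this sketch the median is found by bisection on the cumulative distribution $m^5$ rather than by the closed form, so the two routes can be compared; `bisect_median` is a hypothetical helper name.

```python
def posterior_cdf(m):
    """Integral of 5x^4 from 0 to m, i.e. P(p_A <= m) under the posterior."""
    return m ** 5

def bisect_median(cdf, lo=0.0, hi=1.0, tol=1e-12):
    """Find m with cdf(m) = 1/2 by repeatedly halving the interval [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

mean = 5 / 6                      # integral of 5x^5 from 0 to 1
median = bisect_median(posterior_cdf)

print(f"mean   = {mean:.3f}")
print(f"median = {median:.3f}  (closed form: {0.5 ** 0.2:.3f})")
```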

The mode, of course, is 1: the single most likely value of $p_A$ is the one where A wins every time.

So which of the three is the best estimate? It’s hard to say. I tend to lean towards the median in cases like this, but only because I like medians. I would say that the mode is a poor indicator (after all, if team A were 1-0 up, the posterior mode would also be 1 – which is definitely an over-claim).

The mean and the median are fairly close together here (and would, as more information came in, get closer and closer), so an answer of “somewhere in the mid-80s” is likely as solid an answer as I’d be happy to give.
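To see the mean and median drawing together, here is a sketch that generalises the calculation to an $n$-0 record. It assumes the same uniform prior and that A simply keeps winning, which gives a density of $(n+1)x^n$, a mean of $\frac{n+1}{n+2}$ and a median of $0.5^{1/(n+1)}$; the function name is mine.

```python
def sweep_summary(n):
    """Posterior mean and median of p_A after an n-0 record,
    starting from a uniform prior (posterior density (n + 1) * x**n)."""
    mean = (n + 1) / (n + 2)
    median = 0.5 ** (1 / (n + 1))
    return mean, median

for n in (4, 10, 100, 1000):
    mean, median = sweep_summary(n)
    print(f"{n:>4}-0: mean = {mean:.4f}, median = {median:.4f}, "
          f"gap = {median - mean:.4f}")
```

With a longer winning streak both figures creep towards 1 and the gap between them shrinks, so the choice between them matters less and less.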

## Barney

Really liked this blog, even though it’s taken a while for me to get round to reading it! Would it be fair to say the median is preferable to the mean due to the extreme skewness of the $kx^4$ distribution?

## Colin

That would be fair, I think.