"He's not Judge Judy and executioner!"
- Danny Butterman, Hot Fuzz
There's a reason statistical experiments are sometimes called trials: they take up a lot of time and are pretty much the epitome of suffering.
What's that? Oh, sorry. No, apparently, it's because they're very similar to a court case.
So, let's think about what happens in court. You've got two sides: the prosecution (who want to prove beyond reasonable doubt that the bad guy - THAT GUY THERE! HIM! IN THE BAD SUIT! - did whatever dastardly and illegal deed he's on trial for) and the defence (who want to get their respectable client off the hook for these outrageous allegations, no matter how bad his suit is).
There's a jury made up of twelve of the defendant's peers who have to decide whether the prosecution made its case; and, of course, there's a judge, whose job it is to make sure everything is fair, and come up with a conclusion at the end.
That 'reasonable doubt' thing is important, so I want to flag it up again. The jury can only say 'he did it!' if they're completely sure he did. (Even so, they still make mistakes, but they're only human - and easily swayed by clever lawyers). If they're not sure, they have to give the defendant the benefit of the doubt and find him not guilty.
In a statistical trial, you have analogies to all of these parties - and even to the 'reasonable doubt' rule, although it's more explicit in statistics.
The part of the defence is played by something called the null hypothesis. This is the default, fall-back position: nothing has changed, there's nothing to see here, not guilty m'lud. You need quite a lot of evidence to overturn the null hypothesis - just like you need to be quite sure of the evidence before you decide someone is guilty.
How about if you're convinced the null hypothesis is wrong? In that case, you'd reject it in favour of the alternative hypothesis - which is the thing the prosecution wants to prove: that the bad guy did it.
Now, the nice thing about statistics is that you can put numbers on things. In particular, you can say exactly how much doubt counts as reasonable doubt (this number is the significance level, written as α (alpha)). Normally for stats, 5% is reasonable enough - you'd need to think there was less than a 1-in-20 probability that all of the evidence you've accumulated was down to luck alone before you'd convict the bad guy. That means you're willing to accept a 5% chance of a miscarriage of justice - which is quite high for a court of law; you might prefer 1% or 0.1% if you want to be more certain, but 5% is the usual choice.
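To see that 1-in-20 threshold in action, here's a minimal sketch (the coin-flipping scenario is a made-up illustration, not from the post): we ask how likely the evidence would be if nothing dodgy were going on, and convict only if that probability falls below α.

```python
from math import comb

# Hypothetical case: a coin lands heads 9 times out of 10 flips.
# Under the null hypothesis (a fair coin), how likely is evidence
# at least this extreme by luck alone?
n, heads = 10, 9

def prob(k):
    # Probability of exactly k heads in n fair flips: C(n, k) / 2^n
    return comb(n, k) / 2 ** n

# Two-sided p-value: results at least as extreme in either direction
# (9 or 10 heads, or - by symmetry - 0 or 1 heads).
p_value = 2 * (prob(9) + prob(10))

alpha = 0.05  # the 'reasonable doubt' threshold, fixed in advance
print(f"p-value: {p_value:.4f}")
print("Verdict:", "guilty (reject the null)" if p_value < alpha
      else "not guilty (benefit of the doubt)")
```

Here the chance of luck alone producing a result this lopsided is about 2%, which is under the 5% bar - so the coin gets convicted.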
As for the jury, they're the people who decide how convincing the evidence is. In this trial-by-statistics, the jury is represented by a statistical test - it decides whether a particular statistic (a number summarising the data) falls in a critical region. Usually, if your test statistic is more extreme than a critical value you picked in advance, you must convict.
And you, as the judge, have to keep everything fair. It's up to you to set the parameters of the trial: decide which tests are appropriate, decide what the critical region for the statistic should be - and come up with a conclusion at the end of it. And all the while, you want to make sure your decision is appeal-proof so that other statisticians don't come back and say 'hang on, something's not right here!' and kick you off the bench.
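Putting the whole courtroom together - null hypothesis, significance level, test statistic and critical region - might look like this minimal sketch. The flour-bag scenario, the numbers, and the choice of a two-sided z-test are all made-up illustrations, not part of the original analogy:

```python
from statistics import NormalDist, mean
from math import sqrt

# Hypothetical trial: a machine is supposed to fill bags with 500 g of
# flour (the null hypothesis). We weigh a sample and ask whether the
# evidence is strong enough to convict the machine of mis-filling.
weights = [497.1, 498.4, 499.2, 496.8, 498.9, 497.5, 498.0, 497.7]
mu_0 = 500.0   # null hypothesis: the true mean really is 500 g
sigma = 2.0    # assume the machine's spread (std dev) is known

alpha = 0.05   # reasonable doubt - the judge sets this before the trial

# The test statistic: how many standard errors the sample mean
# sits away from the value the null hypothesis claims.
n = len(weights)
z = (mean(weights) - mu_0) / (sigma / sqrt(n))

# The critical region for a two-sided z-test at this alpha:
# anything beyond roughly ±1.96 standard errors.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

print(f"z = {z:.2f}, critical value = ±{z_crit:.2f}")
print("guilty: reject the null" if abs(z) > z_crit
      else "not guilty: benefit of the doubt")
```

The statistic lands outside the critical region here, so the verdict goes against the machine; with less extreme data the same code would let it walk free.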
So, statistics is just like a court trial. Here's a quick summary table:
|Court trial|Statistical trial|
|---|---|
|Not guilty verdict|Null hypothesis|
|Guilty verdict|Alternative hypothesis|
|Reasonable doubt|Significance level (usually α = 0.05)|
I rest my case. Leave any objections or considered legal opinions in the comments below.