Sleazy P. Martini is up to his old tricks again, throwing dice on the wharf and winning lots of bets. You (an observer with statistical knowledge) suspect that Sleazy P. is pulling some shenanigans. That is, you suspect the die he is using is loaded.
Question: How do you prove this?
Sleazy P. Martini Goes On Trial
Partial Answer: You collect data.
You start writing down the result of each roll of Sleazy P.'s die, and record the following data:
5 1 6 5 5 6 4 5 4 3 4 3 5 6 2 3 2 4 6 2 6 6 2 6 6 2 2 2 4 2 6 5 5 2 2 4 6 6 3 6 1 4 3 3 3 4 6 4 3 2 6 6 6 6 3 4 3 5 1 6 6 5 2 5 2 1 3 1 6 3 4 3 5 6 3 6 6 5 6 4 3 4 1 3 6 4 2 6 5 3 2 6 2 5 5 5 5 5 1 4
The sample mean of the above data is $\overline{x}=4.02.$
Is this evidence that Sleazy P. has loaded the die?
The Sampling Distribution
If Sleazy P. is innocent, the sampling distribution of the sample mean $\overline{x}$ over $100$ rolls should be very close to $$N(3.5,0.171).$$ Assuming that Sleazy P. is innocent, what is the probability of seeing a sample mean as far from $3.5$ as the one we recorded?
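To see where $N(3.5,0.171)$ comes from, here is a minimal Python sketch. It assumes a fair six-sided die and $n=100$ rolls; the variable names are ours, chosen for illustration.

```python
import math

faces = [1, 2, 3, 4, 5, 6]
n = 100  # number of rolls recorded above

# Mean and standard deviation of a single roll of a fair die
mu = sum(faces) / len(faces)
sigma = math.sqrt(sum((f - mu) ** 2 for f in faces) / len(faces))

# Standard deviation of the sample mean over n rolls: sigma / sqrt(n)
standard_error = sigma / math.sqrt(n)

print(mu, round(sigma, 2), round(standard_error, 3))
```

The single-roll standard deviation rounds to $1.71$, so the standard deviation of the sample mean is about $1.71/\sqrt{100}=0.171.$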
Busted! Sleazy P. Goes to Jail
Assuming that Sleazy P. is innocent, the chance of observing a sample mean as far from $3.5$ as $\bar{x}=4.02$ over $100$ rolls is $0.0024.$
This is very strong evidence that he is not innocent and that he actually loaded the die.
Hypothesis Testing
We just performed a hypothesis test.
The null hypothesis is that Sleazy P. is innocent and that the true mean of his die is $3.5.$ This is denoted as $$H_0: \mu=3.5.$$ The alternative hypothesis is that Sleazy P. is guilty, and really did load the die, altering the fair-die mean of $3.5.$ We denote this as $$H_a: \mu \neq 3.5$$
Hypothesis Testing
We saw that, assuming Sleazy P. is innocent, the probability of recording a sample mean at least as far from $3.5$ as ours was $0.0024.$
We computed a test statistic: $$z=\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}=\frac{4.02-3.5}{1.71/\sqrt{100}}=3.04$$ The probability of observing a test statistic AS EXTREME OR MORE EXTREME than $3.04$ is $0.0024.$
The value $0.0024$ is called the $p\mbox{-value}$ of the test.
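The $z$-statistic and two-sided $p\mbox{-value}$ can be checked with a short Python sketch. The helper `normal_cdf` is our own name, not part of these notes; it evaluates the standard normal CDF through the standard library's `math.erfc`.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z < z), computed from the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

x_bar = 4.02   # sample mean of the 100 recorded rolls
mu0 = 3.5      # mean of a fair die under H0
sigma = 1.71   # standard deviation of a single fair roll
n = 100

z = (x_bar - mu0) / (sigma / math.sqrt(n))

# Two-sided test: count both tails beyond |z|
p_value = 2 * normal_cdf(-abs(z))

print(round(z, 2), round(p_value, 4))
```

Both tails are counted because $H_a: \mu \neq 3.5$ treats a mean that is too low as just as suspicious as one that is too high.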
Vocab
We say that the result of a hypothesis test is statistically significant if the $p\mbox{-value}$ falls below a certain threshold.
Common thresholds are $\alpha=0.05$ and $\alpha=0.01.$
In this case, we say that a test is significant at the level of $\alpha$ when $p\mbox{-value} \lt \alpha.$
In the case of statistical significance ($p\mbox{-value} \lt \alpha$), we reject the null hypothesis.
Another way to say it:
When the $p\mbox{-value}$'s low, $H_0$ has got to go!
OR...
If the $p\mbox{-value}$'s low, reject the $H_0$!
The Way Mr. Holt Likes to Think About It
The $p\mbox{-value}$ measures how believable the null hypothesis $H_0$ is in light of the data: the lower the $p\mbox{-value},$ the less we believe $H_0.$
In Sleazy P's case, at the level of significance $\alpha=0.01$, there is statistically significant evidence that he loaded the die since $$p\mbox{-value}=0.0024<0.01.$$ Note: The result is significant at the $0.05$ level too.
Consumer Advocacy: Sleazy P. Gets out of Jail
After serving his sentence, Sleazy P. Martini is now rehabilitated and ready to enter society as a soft drink manufacturer.
Sleazy P. has created a new brand of cola called Sleazy P.'s Easy Peazy.
Now, a $12 \mbox{ fl oz}$ can of soda should contain $355 \mbox{ mL}$ of product, but you (the statistically savvy citizen) notice that, on average, the cans seem to contain less.
Hmmmmm.... Is Sleazy P. up to his old tricks AGAIN!?
Big Question: How do we find out?
Big Answer: Gather data and perform a hypothesis test.
Because $355 \mbox{ mL}$ is printed on the can, we know the true mean $\mu$ should be a little bigger than $355$ to prevent underfilling.
After a little research, we learn from various soft drink manufacturers that, in fact, the amount should vary according to a normal distribution with mean $\mu = 355.2 \mbox{ mL}$ and standard deviation $\sigma = 0.5 \mbox{ mL}.$
Next, we begin collecting data...
The Data
Taking a simple random sample from around the country, we procured $40$ cans of Sleazy P.'s Easy Peazy.
Here is the data.
355.1 354.7 354.4 355.1 355.3 354.5 354.1 354.4 355.2 355.1 353.9 354.3 355.3 355.1 354.9 354.9 355.5 356.0 354.8 356.2 354.5 354.9 355.8 354.8 354.1 354.8 355.1 354.4 355.2 354.0 354.0 355.4 354.7 354.9 355.1 355.6 355.7 355.1 354.9 355.6
The Hypothesis Test
$$H_0: \mu=355.2$$ $$H_a: \mu<355.2$$
From our data we get $\overline{x}=354.935.$
Using $\sigma=0.5$ we compute the $z$-statistic: $$z=\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}=\frac{354.935-355.2}{0.5/\sqrt{40}}=-3.35$$ For this test statistic, $p\mbox{-value}=0.0004.$
Interpretation
Given that the null hypothesis ($H_0: \mu=355.2$) is true, the chance of seeing a test statistic of $z=-3.35$ OR LESS is $0.0004.$ That is, $p\mbox{-value}=0.0004.$
At the $\alpha=0.01$ level of significance, we reject the null hypothesis.
Conclusion: Reject $H_0.$ That is, there is statistically significant evidence that Sleazy P. is underfilling. Thus, we can still count on Sleazy P. to be a shady wheeler-dealer.
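The cola test can be reproduced from the raw data with a short Python sketch. The names `volumes` and `normal_cdf` are ours; $\sigma=0.5$ is taken as known, as in the notes.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z < z), computed from the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

# The 40 measured can volumes (mL) listed above
volumes = [
    355.1, 354.7, 354.4, 355.1, 355.3, 354.5, 354.1, 354.4, 355.2, 355.1,
    353.9, 354.3, 355.3, 355.1, 354.9, 354.9, 355.5, 356.0, 354.8, 356.2,
    354.5, 354.9, 355.8, 354.8, 354.1, 354.8, 355.1, 354.4, 355.2, 354.0,
    354.0, 355.4, 354.7, 354.9, 355.1, 355.6, 355.7, 355.1, 354.9, 355.6,
]

mu0 = 355.2    # mean fill under H0
sigma = 0.5    # assumed known population standard deviation
n = len(volumes)
x_bar = sum(volumes) / n

z = (x_bar - mu0) / (sigma / math.sqrt(n))

# Lower-tailed test (H_a: mu < 355.2): only small values of z count as extreme
p_value = normal_cdf(z)

print(round(x_bar, 3), round(z, 2), round(p_value, 4))
```

Only the lower tail is used because the alternative hypothesis is one-sided: we suspect underfilling, not overfilling.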
The General Hypothesis Test
When you perform a hypothesis test, you need to:
Step 1: State your hypotheses: $H_0$ and $H_a.$
Step 2: Compute the test statistic (in this case, the $z$-statistic).
Step 3: Determine your $p\mbox{-value}.$
Step 4: State your conclusion (keep or reject $H_0$). If your $p\mbox{-value}$ falls below $\alpha,$ reject $H_0.$ Otherwise, keep $H_0.$ Also, summarize the conclusion using the language of the problem situation.
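The four steps can be sketched as a single Python function. The function name `z_test`, its `alternative` argument, and the returned strings are hypothetical choices made for this illustration, and the sketch assumes the population standard deviation $\sigma$ is known.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z < z), computed from the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def z_test(x_bar, mu0, sigma, n, alternative="two-sided", alpha=0.05):
    """Run a z test of H0: mu = mu0 and return (z, p_value, decision).

    Step 1 is expressed by mu0 and `alternative`:
    "less" (H_a: mu < mu0), "greater" (H_a: mu > mu0),
    or "two-sided" (H_a: mu != mu0).
    """
    # Step 2: compute the test statistic
    z = (x_bar - mu0) / (sigma / math.sqrt(n))

    # Step 3: determine the p-value from the tail(s) named by H_a
    if alternative == "less":
        p_value = normal_cdf(z)
    elif alternative == "greater":
        p_value = normal_cdf(-z)
    elif alternative == "two-sided":
        p_value = 2 * normal_cdf(-abs(z))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")

    # Step 4: state the conclusion
    decision = "reject H0" if p_value < alpha else "keep H0"
    return z, p_value, decision

# Usage: the cola test, lower-tailed at alpha = 0.01
print(z_test(354.935, 355.2, 0.5, 40, alternative="less", alpha=0.01))
```

The function only returns the statistical decision; summarizing it in the language of the problem is still up to you.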
Example: The following are a random sample of $n=30$ IQ scores of seventh-grade students from a school district in Portland:
128, 96, 114, 100, 105, 114, 111, 132, 112, 91, 119, 98, 86, 74, 103, 103, 72, 107, 118, 93, 104, 114, 111, 130, 89, 112, 102, 112, 120, 108
Assume that the IQ scores in this population have a normal distribution with standard deviation $\sigma=15.$ A previous estimate of the mean IQ $\mu$ of seventh graders from this school district is $105.$ We suspect that the true value may actually be higher. To test our suspicion, we carry out a test of significance on the data we collected above.
At the $\alpha=0.05$ level of significance, what is the conclusion?
Step #1: We suspect that the true mean may actually be higher than $105.$
The null hypothesis says that there is "no difference: the population mean is $105,$" whereas the alternative hypothesis
says that "there is a difference: the population mean is higher than $105.$"
These hypotheses are expressed as
$$
\begin{array}{c}
H_0: \mu=105\\
H_a: \mu \gt 105
\end{array}
$$
Step #2: From our data set, we have that $\bar{x}=105.933$ and $n=30.$
Our null hypothesis assumes that $\mu=105$ and $\sigma=15.$ From these we
may now compute our test statistic, in this case, the $z$-statistic:
$$
z=\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}=\frac{105.933-105}{15/\sqrt{30}}=0.34
$$
to two decimal places.
Step #3: The probability of observing a $z$ test statistic of $0.34$ or higher is $$ P(z \gt 0.34)=1-P(z \lt 0.34)=1-0.6331=0.3669 $$ from Table A. Therefore, $p\mbox{-value}=0.3669.$
Step #4: Since our $p\mbox{-value}= 0.3669 \gt 0.05=\alpha,$ we keep the null hypothesis $H_0$. In plain language, the data set we have does not give significant evidence that the mean IQ score of seventh graders from this school district is higher than $105.$
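The IQ example can be checked from the raw scores with the Python sketch below. The names `scores` and `normal_cdf` are ours. The sketch does not round $z$ to $0.34$ before computing the tail probability, so its $p\mbox{-value}$ differs slightly in the last decimal places from the Table A value of $0.3669.$

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z < z), computed from the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

# The 30 IQ scores listed above
scores = [
    128, 96, 114, 100, 105, 114, 111, 132, 112, 91,
    119, 98, 86, 74, 103, 103, 72, 107, 118, 93,
    104, 114, 111, 130, 89, 112, 102, 112, 120, 108,
]

mu0 = 105     # mean under H0
sigma = 15    # assumed known population standard deviation
alpha = 0.05
n = len(scores)
x_bar = sum(scores) / n

z = (x_bar - mu0) / (sigma / math.sqrt(n))

# Upper-tailed test (H_a: mu > 105): P(Z > z)
p_value = normal_cdf(-z)

print(round(x_bar, 3), round(z, 2), round(p_value, 3))
print("reject H0" if p_value < alpha else "keep H0")
```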
Sleazy P.'s Die (Revisited):
We are now going to roll Sleazy P's crooked die and perform the following hypothesis test
after each roll at significance level $\alpha=0.01.$
$$\begin{array}{c} H_0: \mu=3.5 \\ H_a: \mu \neq 3.5 \end{array}$$
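As a rough sketch of what testing after each roll looks like, the Python code below recomputes the two-sided $p\mbox{-value}$ every time a new roll arrives. The rolls are simulated rather than taken from Sleazy P.'s actual die: the loading in `WEIGHTS` (a six is three times as likely as any other face), the seed, and the number of rolls are all hypothetical choices made for illustration.

```python
import math
import random

def normal_cdf(z):
    """Standard normal CDF, P(Z < z), computed from the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

MU0 = 3.5      # fair-die mean under H0
SIGMA = 1.71   # fair-die standard deviation
ALPHA = 0.01

# Hypothetical loading, for illustration only
FACES = [1, 2, 3, 4, 5, 6]
WEIGHTS = [1, 1, 1, 1, 1, 3]

def running_p_values(rolls):
    """Return the two-sided p-value of the z test after each successive roll."""
    p_values = []
    total = 0
    for n, roll in enumerate(rolls, start=1):
        total += roll
        x_bar = total / n
        z = (x_bar - MU0) / (SIGMA / math.sqrt(n))
        p_values.append(2 * normal_cdf(-abs(z)))
    return p_values

rng = random.Random(0)  # fixed seed so the simulation is repeatable
rolls = rng.choices(FACES, weights=WEIGHTS, k=200)
p_values = running_p_values(rolls)

# First roll (1-based) at which the p-value drops below alpha, if any
first_rejection = next(
    (n for n, p in enumerate(p_values, start=1) if p < ALPHA), None
)
print("First roll at which H0 is rejected:", first_rejection)
```

Watching `p_values` shrink as rolls accumulate shows how a larger sample makes the same loading easier to detect.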