Question: What are some common statistics you hear about on the news, radio, advertisements, etc.?
At this point we know $2$ measures of center: the mean (average) and the median.
Today we add another to that list: the mode.
Measures of Center
The (Arithmetic) Mean: Add up all the data values and divide by the number of data values.
The Median: The middle value of the sorted data: $50\%$ of the data lies below it, $50\%$ lies above.
The Mode: The data value that occurs most often.
Some Notation
The population mean we denote with the Greek letter $\mu$ (pronounced "mew," the sound newborn kittens make).
A sample mean is denoted as $\bar{x}$.
We may formally write an average (adding up $n$ data points and then dividing by $n$) as $$\bar{x}=\frac{x_1+x_2+x_3+\cdots+x_n}{n}=\frac{\displaystyle \sum x_j}{n}.$$ The symbol "$\displaystyle \Sigma$" simply means "add 'em all up!"
Example: Find the mean, median, and mode of the data set: $$3, 3, 4, 5, 7.$$
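One way to check an answer to this example is with Python's standard `statistics` module. This is only a sketch for verification; the variable names are our own and are not part of the course materials.

```python
from statistics import mean, median, mode

data = [3, 3, 4, 5, 7]

data_mean = mean(data)      # add up all values, divide by how many there are: 22 / 5
data_median = median(data)  # middle value once the data is sorted
data_mode = mode(data)      # value that occurs most often

print(data_mean, data_median, data_mode)
```

Here the sum of the values is $22$ and there are $5$ of them, so the mean is $22/5=4.4$; the middle value is $4$, and $3$ is the only value that repeats.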
Example: Consider a data set of the travel times in minutes for $15$ workers in North Carolina, chosen at random by the Census Bureau and summarized in the stem plot below (stems are tens of minutes, leaves are single minutes). $$ \begin{array}{|r|l|} \hline 0 & 5\\ \hline 1 & 000025\\ \hline 2 & 005\\ \hline 3 & 00\\ \hline 4 & 00\\ \hline 5 & \\ \hline 6 & 0\\ \hline \end{array} $$ What are the mean, median, and mode of this data set?
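If you would rather let a computer do the arithmetic, the stem plot can be expanded back into the $15$ individual travel times. The sketch below assumes each stem is a tens digit and each leaf a ones digit; the names `stem_plot` and `times` are our own.

```python
from statistics import mean, median, multimode

# Each row is (stem, leaves); assumption: stem = tens digit, leaf = ones digit.
stem_plot = [(0, "5"), (1, "000025"), (2, "005"), (3, "00"),
             (4, "00"), (5, ""), (6, "0")]

# Rebuild the raw data: stem 1 with leaf 2 becomes 12 minutes, and so on.
times = [10 * stem + int(leaf) for stem, leaves in stem_plot for leaf in leaves]

travel_mean = mean(times)
travel_median = median(times)
travel_modes = multimode(times)  # list of all most-frequent values

print(len(times), travel_mean, travel_median, travel_modes)
```

Using `multimode` rather than `mode` is a small design choice: it returns every value tied for most frequent, so a data set with more than one mode is not silently reduced to a single answer.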
Technology Reminder
Don't forget you may (and should) use your TI-84 Calculator.
And, of course, there's always Holt.Blue.
Calculating the Mean from Grouped Data (i.e., Frequency Table)
Example: The figure below is a histogram of the self-reported number of daily servings of fruit eaten for a sample of $74$ students. Calculate the mean from this grouped data.
Since we know the frequency of each individual value, we can calculate the mean exactly.
$$ \begin{array}{|l|l|}\hline \mbox{Servings of Fruit} & \mbox{Frequency} \\ \hline \mbox{0} & 15 \\ \hline \mbox{1} & 11 \\ \hline \mbox{2} & 15 \\ \hline \mbox{3} & 11 \\ \hline \mbox{4} & 8 \\ \hline \mbox{5} & 5 \\ \hline \mbox{6} & 3 \\ \hline \mbox{7} & 3 \\ \hline \mbox{8} & 3 \\ \hline \end{array} $$
$$ \begin{array}{rcl} \bar{x}&=&\displaystyle \frac{\mbox{Total Servings of Fruit}}{\mbox{Number of Students}}\\ &=&\displaystyle \frac{0 \cdot 15+1 \cdot 11+2 \cdot 15+3 \cdot 11+4 \cdot 8+5 \cdot 5+6 \cdot 3+7 \cdot 3+8 \cdot 3}{15+11+15+11+8+5+3+3+3}\\ &=&\displaystyle \frac{194}{74}\\ &\approx& 2.62 \end{array} $$
For Those Who Like Formulas (You Don't Have to Memorize This) $$ \begin{array}{|c|c|}\hline \mbox{Data Point} & \mbox{Frequency} \\ \hline d_1 & f_1 \\ \hline d_2 & f_2 \\ \hline d_3 & f_3 \\ \hline \vdots & \vdots \\ \hline d_n & f_n \\ \hline \end{array} $$ $$ \bar{x}=\displaystyle \frac{\displaystyle \sum d_j \cdot f_j}{\displaystyle \sum f_j}=\displaystyle \frac{d_1 \cdot f_1+d_2 \cdot f_2+\cdots+d_n \cdot f_n}{f_1+f_2+\cdots+f_n} $$
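The frequency-table formula translates almost directly into code. The following is a minimal sketch using the fruit-servings table from this example; the function name `mean_from_frequencies` is hypothetical, chosen only for illustration.

```python
def mean_from_frequencies(values, frequencies):
    """Mean of data given as distinct values d_j with frequencies f_j."""
    total = sum(d * f for d, f in zip(values, frequencies))  # sum of d_j * f_j
    count = sum(frequencies)                                 # sum of f_j
    return total / count

servings = [0, 1, 2, 3, 4, 5, 6, 7, 8]
frequency = [15, 11, 15, 11, 8, 5, 3, 3, 3]

fruit_mean = mean_from_frequencies(servings, frequency)
print(round(fruit_mean, 2))
```

The numerator works out to $194$ total servings and the denominator to $74$ students, matching the hand calculation.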
Calculating the Mean from Grouped Data
Note: If given a histogram or frequency polygon, we can also calculate the mean.
[Figure: plot of frequency (vertical axis) against number of servings (horizontal axis).]
Estimating Mean from Grouped Data
Example: Below is a frequency table of a random sample of the miles per gallon ratings of $30$ cars. Estimate the mean from this grouped data.
Since we DON'T know the frequency of each individual value, we estimate the mean by using the midpoint of each interval instead of actual data values. $$ \begin{array}{|l|l|}\hline\mbox{Mileage} & \mbox{Frequency} \\ \hline \mbox{15 to 20} & 4 \\ \hline \mbox{20 to 25} & 3 \\ \hline \mbox{25 to 30} & 6 \\ \hline \mbox{30 to 35} & 6 \\ \hline \mbox{35 to 40} & 7 \\ \hline \mbox{40 to 45} & 3 \\ \hline \mbox{45 to 50} & 1 \\ \hline\end{array} $$
$$ \begin{array}{rcl} \bar{x}&=&\displaystyle \frac{\mbox{Sum of Mileages}}{\mbox{Number of Mileages}}\\ &\approx& \displaystyle\frac{17.5 \cdot 4+22.5 \cdot 3+27.5 \cdot 6+32.5 \cdot 6+37.5 \cdot 7+42.5 \cdot 3+47.5 \cdot 1}{4+3+6+6+7+3+1}\\ &=&\displaystyle \frac{935}{30}\\ &\approx& 31.167 \end{array} $$
For Those Who Like Formulas (You Don't Have to Memorize This) $$ \begin{array}{|c|c|}\hline \mbox{Interval} & \mbox{Frequency} \\ \hline x_0 \mbox{ to } x_1 & f_1 \\ \hline x_1 \mbox{ to } x_2 & f_2 \\ \hline x_2 \mbox{ to } x_3 & f_3 \\ \hline \vdots & \vdots \\ \hline x_{n-1} \mbox{ to } x_n & f_n \\ \hline \end{array} $$ $$ \bar{x} \approx \displaystyle \frac{\displaystyle \sum m_j \cdot f_j}{\displaystyle \sum f_j}=\displaystyle \frac{m_1 \cdot f_1+m_2 \cdot f_2+\cdots+m_n \cdot f_n}{f_1+f_2+\cdots+f_n} $$ Here $m_j=\dfrac{x_{j-1}+x_j}{2}$ is the midpoint of the $j$th interval.
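The midpoint estimate can be sketched in a few lines of Python using the mileage table from this example. The variable names are our own, and the result is an estimate only, since each interval's midpoint stands in for the unknown actual values.

```python
# Mileage intervals (lower, upper) and how many cars fall in each.
intervals = [(15, 20), (20, 25), (25, 30), (30, 35), (35, 40), (40, 45), (45, 50)]
frequencies = [4, 3, 6, 6, 7, 3, 1]

# Midpoint of each interval stands in for the data values inside it.
midpoints = [(low + high) / 2 for low, high in intervals]

weighted_sum = sum(m * f for m, f in zip(midpoints, frequencies))  # sum of m_j * f_j
estimated_mean = weighted_sum / sum(frequencies)

print(weighted_sum, round(estimated_mean, 3))
```

The weighted sum is $935$ over $30$ cars, which agrees with the estimate of about $31.167$ miles per gallon found by hand.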
Estimating Mean from Grouped Data
Note: If given a histogram or frequency polygon, we can also estimate the mean, median, and mode.
[Figure: plot of frequency (vertical axis) against miles per gallon (horizontal axis).]
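These notes do not spell out how to estimate the median and mode from grouped data. One common convention, assumed here, is to report the interval containing the middle of the data (the median class) and the interval with the highest frequency (the modal class). A rough sketch for the mileage table, with names of our own choosing:

```python
from itertools import accumulate

intervals = [(15, 20), (20, 25), (25, 30), (30, 35), (35, 40), (40, 45), (45, 50)]
frequencies = [4, 3, 6, 6, 7, 3, 1]
n = sum(frequencies)

# Modal class: the interval with the largest frequency (first one, if tied).
modal_class = intervals[frequencies.index(max(frequencies))]

# Median class: the first interval whose running total reaches half the data.
cumulative = list(accumulate(frequencies))
median_class = next(iv for iv, c in zip(intervals, cumulative) if c >= n / 2)

print(modal_class, median_class)
```

For this table the running totals are $4, 7, 13, 19, \ldots$, so the middle of the $30$ values falls in the $30$ to $35$ interval, while the tallest bar is the $35$ to $40$ interval.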