Statistics and Sampling

Statistics help us make sense of data! Learn to analyze populations using measures of center (mean, median) and measures of spread (range, IQR, MAD) to make comparisons and draw conclusions about groups.

Do this: Read the concept below, then try the quiz or activity.

Concept

Statistical measures reveal patterns and differences between data sets!

MEASURES OF CENTER: Tell you the "typical" or "average" value

1. MEAN (Average): Sum of all values ÷ number of values

Example: Test scores {70, 85, 90, 95}
Mean = (70 + 85 + 90 + 95) ÷ 4 = 340 ÷ 4 = 85

2. MEDIAN (Middle Value): The middle number when data is ordered

For an ODD number of values: the middle one
Example: {3, 5, 7, 9, 11} → Median = 7

For an EVEN number of values: the average of the two middle ones
Example: {4, 6, 8, 10} → Median = (6 + 8) ÷ 2 = 7

3. MODE: Most frequent value(s)

Example: {2, 3, 3, 5, 5, 5, 7} → Mode = 5 (appears 3 times)
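
The three measures of center above can be checked with a few lines of Python. This is only a sketch using Python's built-in `statistics` module; the variable names are made up for illustration and the data sets are copied from the examples above.

```python
from statistics import mean, median, multimode

# Data sets copied from the examples above
test_scores = [70, 85, 90, 95]
odd_set = [3, 5, 7, 9, 11]
even_set = [4, 6, 8, 10]
mode_set = [2, 3, 3, 5, 5, 5, 7]

mean_score = mean(test_scores)   # sum of values ÷ number of values
median_odd = median(odd_set)     # middle value of an odd-sized set
median_even = median(even_set)   # average of the two middle values
modes = multimode(mode_set)      # list of the most frequent value(s)

print("Mean:", mean_score)
print("Median (odd count):", median_odd)
print("Median (even count):", median_even)
print("Mode(s):", modes)
```

`multimode` returns a list because a data set can have more than one mode.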

MEASURES OF SPREAD: Tell you how spread out or varied the data is

1. RANGE: Maximum - Minimum

Example: {10, 15, 20, 35, 40}
Range = 40 - 10 = 30

2. INTERQUARTILE RANGE (IQR): IQR = Q3 - Q1 (the spread of the middle 50% of the data)

Finding Quartiles:
- Q1 (First Quartile): Median of the lower half
- Q2 (Second Quartile): Median of the entire set
- Q3 (Third Quartile): Median of the upper half

When the set has an odd number of values, leave the median (Q2) out of both halves.

Example: {1, 3, 5, 7, 9, 11, 13, 15, 17}
- Q2 (median) = 9
- Lower half: {1, 3, 5, 7} → Q1 = (3 + 5) ÷ 2 = 4
- Upper half: {11, 13, 15, 17} → Q3 = (13 + 15) ÷ 2 = 14
- IQR = 14 - 4 = 10
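
The quartile steps above can be sketched in Python as follows. The function name `quartiles` is made up for this illustration, and it assumes the "median of each half" method used in this lesson (with the median left out of both halves when the count is odd). Calculators, spreadsheets, and other tools may use slightly different quartile rules and give slightly different answers.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method from this lesson.

    Assumes the data set has at least 2 values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]  # values before the middle position
    # For an odd count, skip the median itself; for an even count, take the top half
    upper = data[half + 1:] if n % 2 == 1 else data[half:]
    return median(lower), median(data), median(upper)

# Data set from the example above
q1, q2, q3 = quartiles([1, 3, 5, 7, 9, 11, 13, 15, 17])
iqr = q3 - q1
print("Q1:", q1, "Q2:", q2, "Q3:", q3, "IQR:", iqr)
```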

3. MEAN ABSOLUTE DEVIATION (MAD): Average distance from the mean

Steps to calculate MAD:
1. Find the mean
2. Find the distance of each value from the mean (absolute value)
3. Find the average of those distances

Example: {2, 4, 6, 8, 10}
- Mean = 6
- Distances: |2-6|=4, |4-6|=2, |6-6|=0, |8-6|=2, |10-6|=4
- MAD = (4+2+0+2+4) ÷ 5 = 12 ÷ 5 = 2.4
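
The three MAD steps translate almost line for line into Python. This is a minimal sketch; the function name `mean_absolute_deviation` is made up for illustration and is not a built-in.

```python
from statistics import mean

def mean_absolute_deviation(values):
    """Average distance of each value from the mean."""
    center = mean(values)                          # Step 1: find the mean
    distances = [abs(x - center) for x in values]  # Step 2: distance of each value from the mean
    return mean(distances)                         # Step 3: average of those distances

# Data set from the example above
mad = mean_absolute_deviation([2, 4, 6, 8, 10])
print("MAD:", mad)
```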

COMPARING TWO POPULATIONS:

Population A Test Scores: {75, 80, 82, 85, 88, 90, 92}
Population B Test Scores: {60, 70, 85, 90, 95, 98, 100}

Population A:
- Mean ≈ 84.6
- Median = 85
- Range = 92 - 75 = 17
- More consistent scores (smaller range)

Population B:
- Mean ≈ 85.4
- Median = 90
- Range = 100 - 60 = 40
- More varied scores (larger range)

Conclusions:
- Means are similar (84.6 vs 85.4, within 1 point), though Population B's median is 5 points higher (90 vs 85)
- Population B has more spread (range 40 vs 17)
- Population A is more consistent
- Population B has more high scorers BUT also lower low scores
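
To double-check a comparison like this one, the same three statistics can be computed for both populations side by side. The sketch below uses the two score lists from this section; the helper name `summarize` is made up for illustration.

```python
from statistics import mean, median

# Test scores from the comparison above
population_a = [75, 80, 82, 85, 88, 90, 92]
population_b = [60, 70, 85, 90, 95, 98, 100]

def summarize(scores):
    """Return the mean (rounded to 1 decimal place), median, and range."""
    return {
        "mean": round(mean(scores), 1),
        "median": median(scores),
        "range": max(scores) - min(scores),
    }

summary_a = summarize(population_a)
summary_b = summarize(population_b)
print("Population A:", summary_a)
print("Population B:", summary_b)
```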

BOX PLOTS (Box-and-Whisker Plots): A visual representation of the five-number summary:
- Minimum
- Q1 (1st quartile)
- Median (Q2)
- Q3 (3rd quartile)
- Maximum

Reading Box Plots:
- Box = middle 50% of data (IQR)
- Line inside box = median
- Whiskers = extend to min and max
- Longer box or whiskers = more spread
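
Before drawing a box plot, it helps to list the five numbers it is built from. Here is a small sketch that does so, assuming the median-of-halves quartile method from this lesson. The function name `five_number_summary` is made up for illustration, and the data set is the one from the IQR example.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a data set with at least 2 values."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Leave the median out of the upper half when the count is odd
    upper = data[half + 1:] if n % 2 == 1 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]

summary = five_number_summary([1, 3, 5, 7, 9, 11, 13, 15, 17])
print("Min, Q1, Median, Q3, Max:", summary)
```

The box would be drawn from Q1 to Q3 with a line at the median, and the whiskers would reach out to the minimum and maximum.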

OUTLIERS: Extreme values that are much higher or lower than the rest

Outlier Rule: A value is an outlier if it is:
- Below Q1 - 1.5 × IQR, or
- Above Q3 + 1.5 × IQR
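
The outlier rule can be sketched in Python like this. The function name `find_outliers` and the sample data are made up for illustration, and the quartiles are found with the median-of-halves method from this lesson.

```python
from statistics import median

def find_outliers(values):
    """Return the values that fall outside the 1.5 × IQR limits."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    q1 = median(data[:half])
    q3 = median(data[half + 1:] if n % 2 == 1 else data[half:])
    iqr = q3 - q1
    low_limit = q1 - 1.5 * iqr    # anything below this is an outlier
    high_limit = q3 + 1.5 * iqr   # anything above this is an outlier
    return [x for x in data if x < low_limit or x > high_limit]

# Made-up data: Q1 = 12, Q3 = 18, IQR = 6, so the limits are 3 and 27
outliers = find_outliers([10, 12, 13, 15, 16, 18, 45])
print("Outliers:", outliers)
```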

Effect of Outliers:
- Mean is AFFECTED by outliers (gets pulled toward them)
- Median is NOT affected much by outliers
- Range is AFFECTED by outliers
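
A quick experiment shows these effects. The sketch below uses made-up numbers: it computes the mean, median, and range of a small data set, then adds one extreme value and computes them again.

```python
from statistics import mean, median

typical = [20, 22, 24, 26, 28]   # made-up values with no outlier
with_outlier = typical + [90]    # the same values plus one extreme value

mean_shift = mean(with_outlier) - mean(typical)
median_shift = median(with_outlier) - median(typical)
range_before = max(typical) - min(typical)
range_after = max(with_outlier) - min(with_outlier)

print("Mean changed by:", mean_shift)
print("Median changed by:", median_shift)
print("Range went from", range_before, "to", range_after)
```

Here the mean jumps from 24 to 35 and the range from 8 to 70, while the median only moves from 24 to 25.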

SAMPLING:

Population: The entire group you want to study
Sample: The subset of the population you actually measure

Types of Samples:

Random Sample: Every member of the population has an equal chance of being chosen
- Best for representing the population
- Minimizes bias

Biased Sample: Some members are more likely to be chosen than others
- Does NOT represent the population fairly
- Example: Surveying only your friends about a school issue

Representative Sample: Reflects the characteristics of the population
- Similar mean, spread, and distribution

MAKING INFERENCES:

From a good sample, you can estimate population characteristics:
- Sample mean ≈ Population mean
- Sample spread ≈ Population spread

Example: A random sample of 100 students has a mean height of 64 inches.
Inference: The school population's mean height ≈ 64 inches.
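
To see why a random sample supports an inference like this, here is a small simulation sketch. Everything in it is an assumption made for illustration: the population is 1,000 made-up heights generated by the computer, not real student data, and the seed value is arbitrary.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch gives the same result on every run

# Hypothetical population: 1,000 made-up student heights (inches) centered near 64
population = [rng.gauss(64, 3) for _ in range(1000)]

# Random sample: every student has an equal chance of being chosen
sample = rng.sample(population, 100)

population_mean = mean(population)
sample_mean = mean(sample)
print(f"Population mean: {population_mean:.1f} inches")
print(f"Sample mean:     {sample_mean:.1f} inches")
```

The two means should come out close to each other, though usually not exactly equal, which is why the inference uses ≈ instead of =.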

REAL-WORLD APPLICATIONS:
- Comparing test scores between classes
- Analyzing sports team performance
- Quality control in manufacturing
- Medical studies (treatment vs control groups)
- Election polling
- Market research
- Climate data comparison

Try it

Analyze and compare data sets using statistics!

CALCULATE MEASURES OF CENTER AND RANGE: For data set: {12, 15, 18, 20, 22, 25}

1. Find the mean.
2. Find the median.
3. Find the range.

CALCULATE IQR: For data set: {3, 5, 7, 9, 11, 13, 15, 17, 19}

4. Find Q1, Q2 (median), and Q3.
5. Find the IQR.

CALCULATE MAD: For data set: {10, 12, 14, 16, 18}

6. Find the mean.
7. Find the MAD (Mean Absolute Deviation).

COMPARING TWO POPULATIONS:

Class A heights (inches): {60, 62, 64, 65, 66, 68, 70}
Class B heights (inches): {58, 60, 65, 70, 72, 75, 78}

8. Find the mean for each class.
9. Find the median for each class.
10. Find the range for each class.
11. Which class has more variability? Explain.
12. Which class is generally taller? Use statistics to support your answer.

WORD PROBLEMS:

13. Store A's wait times (minutes): {5, 6, 7, 8, 9}
    Store B's wait times (minutes): {2, 5, 8, 11, 14}
    a) Find mean and range for each store.
    b) Which store has more consistent wait times?
    c) Which store would you choose? Why?
14. Two basketball players' points per game:
    Player 1: {18, 20, 22, 24, 26}
    Player 2: {10, 15, 25, 30, 35}
    a) Find mean for each player.
    b) Find MAD for each player.
    c) Which player is more consistent? Explain.

OUTLIERS:

15. Data set: {12, 14, 15, 16, 18, 40}
    a) Find the mean with and without 40.
    b) Find the median with and without 40.
    c) Which measure (mean or median) is more affected by the outlier?

BOX PLOTS:

16. Create a box plot for: {2, 4, 6, 8, 10, 12, 14, 16, 18}
    (Find min, Q1, median, Q3, max)

SAMPLING:

17. Identify if each sample is random or biased:
    a) Surveying every 10th student entering school
    b) Surveying only students in the cafeteria during lunch
    c) Drawing names from a hat containing all students
    d) Surveying only your friends

18. A random sample of 50 students has a mean test score of 78. What would you estimate the school's mean score to be?

CHALLENGE:

19. Two data sets have the same mean but different MADs. What does this tell you about the sets?

20. Create two different data sets (5 numbers each) that:
    - Have the same median (50)
    - Have different ranges
    Show your work!