Unit 2: Descriptive Statistics
Students analyze contextual situations, focusing on single variable data and bivariate data, and are introduced to the concept of using data to make predictions and judgments about a situation.
In Unit 2, students continue to analyze contextual situations, focusing first on single-variable data and then on bivariate data. This is the first unit where students are introduced to the concept of using data to make predictions and judgments about a situation. Univariate data is described through shape, center, and spread, with mathematical calculations used to support reasoning. Students begin to make judgments about whether data is consistent (analysis of spread) and whether the mean or the median better represents a situation (center). Bivariate data is analyzed for whether the variables are related (correlation) and whether a linear model is the best function to fit a set of data (analysis of residuals), and students develop a linear model that can be used to predict future events. In Unit 2, students are introduced to the modeling cycle and complete one project on univariate data analysis and another on bivariate data analysis.
Unit 2 begins with analyzing and describing univariate data. Students expand on their knowledge of shape, center, and spread from 6th and 7th grade to further interpret and calculate measures of spread, learning about variance and standard deviation. Students build on previous understandings of measures of center and different graphical representations to formalize their knowledge of which measures of center, shape, and spread are used in conjunction with one another, and how these help to inform the “big picture” of the data set they represent. The topic culminates in a three-day project.
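As a concrete illustration of the measures of center and spread named above, the following minimal Python sketch computes them for a small data set. The scores are invented for illustration, the `quartiles` helper is hypothetical, and it uses the median-of-halves convention, which is only one of several quartile conventions. The variance and standard deviation here are the population versions (dividing by n); a course may instead use the sample versions (dividing by n - 1).

```python
import statistics

# Hypothetical data set (invented for illustration): quiz scores for one class.
scores = [2, 4, 4, 4, 5, 5, 7, 9]


def quartiles(values):
    """Return (Q1, Q3) using the median-of-halves convention.

    Assumes at least two values. For an odd count, the overall median
    is excluded from both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the median position
    upper = ordered[-half:]   # values above the median position
    return statistics.median(lower), statistics.median(upper)


# Measures of center
mean = statistics.mean(scores)
median = statistics.median(scores)

# Measures of spread
q1, q3 = quartiles(scores)
iqr = q3 - q1
variance = statistics.pvariance(scores)  # mean of squared deviations from the mean
std_dev = statistics.pstdev(scores)      # square root of the population variance

print(f"mean={mean}, median={median}")
print(f"IQR={iqr}, variance={variance}, standard deviation={std_dev}")
```

In this example the mean (5) is greater than the median (4.5), which is consistent with the single high score of 9 pulling the mean upward.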
In Unit 2, students dive deeper into bivariate data, identifying categorical and numerical data and choosing representations that match the data presented. Two-way tables are used to represent categorical data. Students calculate relative and conditional frequencies in two-way tables and expand on their understanding of the tool from 8th grade. Scatterplots are explored heavily in this unit, and students use what they know about association from 8th grade to connect to correlation in Algebra 1. Students base their understanding of regression on their previous learning about line of best fit. Students also learn to assess the validity of the model they have used (be it linear or another function) by analyzing residuals. This topic culminates in a three-day project, for which a loose framework is provided.
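The two-way table calculations described above can be sketched in a few lines of Python. The survey counts and category names below are invented for illustration and do not come from the unit materials.

```python
# Hypothetical survey counts (invented for illustration):
# rows are grade levels, columns are whether the student plays a sport.
table = {
    "9th grade": {"sport": 30, "no sport": 20},
    "10th grade": {"sport": 15, "no sport": 35},
}

# Grand total of all responses in the table.
total = sum(sum(row.values()) for row in table.values())

# Joint relative frequency: each cell divided by the grand total.
joint = {
    grade: {answer: count / total for answer, count in row.items()}
    for grade, row in table.items()
}

# Marginal relative frequency of each row: row total divided by the grand total.
row_marginal = {grade: sum(row.values()) / total for grade, row in table.items()}

# Conditional relative frequency given the row: each cell divided by its row total.
conditional_on_row = {
    grade: {answer: count / sum(row.values()) for answer, count in row.items()}
    for grade, row in table.items()
}

for grade, row in conditional_on_row.items():
    print(grade, {answer: round(freq, 2) for answer, freq in row.items()})
```

Comparing the conditional relative frequencies across rows (here, the share of each grade that plays a sport) is one way to look for a possible association between the two categorical variables.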
As Algebra 1 progresses, students will identify the shapes of data sets in terms of the functions they study, and they will continue to apply ideas about modeling data with those functions. Students will explore S-ID.7 more heavily as they progress through the units of Algebra 1.
Pacing: 24 instructional days (22 lessons, 1 flex day, 1 assessment day)
The following assessments accompany Unit 2.
Use the resources below to assess student mastery of the unit content and to plan actions for future units.
Post-Unit Assessment Answer Key
Suggestions for how to prepare to teach this unit
Internalization of Standards via the Unit Assessment:
Internalization of Trajectory of Unit:
Unit-Specific Intellectual Prep:
The central mathematical concepts that students will come to understand in this unit
Terms and notation that students learn or use in the unit
measures of center (mean, median)
The materials, representations, and tools teachers and students will need for this unit
Topic A: Descriptive Statistics in Univariate Data
Describe statistics. Represent data in frequency graphs and identify the center of a data set.
Describe center and spread. Represent data in a box plot (box-and-whisker plot) and calculate the center and spread.
Represent data in a histogram and calculate the center. Identify when the median and mean are not the same value.
Describe the shape of the data in box plots and histograms. Choose an appropriate measure of center (or an appropriate shape) based on the shape and the relationship between the mean and the median.
Calculate and interpret the spread (variance) of a data set.
Calculate the standard deviation and compare two symmetrical distributions based on the mean and standard deviation.
Interpret the standard deviation and interquartile range.
Calculate population percentages using the standard deviation.
Given summary statistics, describe the best measures of center and spread. Describe reasoning.
Develop and answer statistical questions through data analysis of existing data using appropriate statistical measures and displays. (Part 1/3)
Develop and answer statistical questions through data analysis of existing data using appropriate statistical measures and displays. (Part 2/3)
Develop and answer statistical questions through data analysis of existing data using appropriate statistical measures and displays. (Part 3/3)
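The Topic A objective on calculating population percentages using the standard deviation can be illustrated with a short Python sketch. It assumes the data are reasonably modeled by a normal distribution, which is not appropriate for every data set, and the mean of 70 and standard deviation of 10 are invented for illustration.

```python
from statistics import NormalDist

# Hypothetical summary statistics (invented for illustration):
# test scores with mean 70 and standard deviation 10, assumed roughly normal.
model = NormalDist(mu=70, sigma=10)

# Share of the population within one and two standard deviations of the mean.
within_one_sd = model.cdf(80) - model.cdf(60)
within_two_sd = model.cdf(90) - model.cdf(50)

# Share of the population scoring above 85 (1.5 standard deviations above the mean).
above_85 = 1 - model.cdf(85)

print(f"within 1 SD: {within_one_sd:.1%}")
print(f"within 2 SD: {within_two_sd:.1%}")
print(f"above 85:    {above_85:.1%}")
```

The first two results come out close to the familiar 68% and 95% figures, which students can also estimate from a table or a calculator.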
Topic B: Descriptive Statistics in Bivariate Data
Define categorical and numerical data. Create two-way tables to organize bivariate categorical data.
Describe relative and conditional relative frequencies in two-way tables.
Create scatterplots and identify function shapes in scatterplots.
Calculate, with technology, the correlation coefficient for a data set. Explain why correlation does not determine causation.
Determine the function of best fit and create a linear equation from least squares regression using technology.
Use residuals to assess the strength of the model for a data set.
Describe the relationship between two quantitative variables in a contextual situation represented in a scatterplot using the correlation coefficient, least squares regression, and residuals as evidence.
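The correlation, least squares regression, and residual steps in Topic B can be sketched by hand in Python. The unit has students do these calculations with technology; the code below only spells out the arithmetic behind such a tool. The data points are invented for illustration.

```python
from math import sqrt

# Hypothetical bivariate data (invented for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of products and squares of deviations from the means.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

# Least squares regression line: y = intercept + slope * x.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Correlation coefficient r for the linear fit.
r = sxy / sqrt(sxx * syy)

# Residual = observed value minus predicted value.
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

print(f"y = {intercept:.2f} + {slope:.2f}x, r = {r:.3f}")
print("residuals:", [round(res, 2) for res in residuals])
```

For a least squares line the residuals sum to zero (up to rounding), so the sum alone says little about fit. What matters when assessing the model is whether the residuals, plotted against x, scatter randomly or show a pattern such as a curve.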
The content standards covered in this unit
— Use units as a way to understand problems and to guide the solution of multi-step problems; choose and interpret units consistently in formulas; choose and interpret the scale and the origin in graphs and data displays.
— Represent data with plots on the real number line (dot plots, histograms, and box plots).
— Use statistics appropriate to the shape of the data distribution to compare center (median, mean) and spread (interquartile range, standard deviation) of two or more different data sets.
— Interpret differences in shape, center, and spread in the context of the data sets, accounting for possible effects of extreme data points (outliers).
— Use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages. Recognize that there are data sets for which such a procedure is not appropriate. Use calculators, spreadsheets, and tables to estimate areas under the normal curve.
— Summarize categorical data for two categories in two-way frequency tables. Interpret relative frequencies in the context of the data (including joint, marginal, and conditional relative frequencies). Recognize possible associations and trends in the data.
— Represent data on two quantitative variables on a scatter plot, and describe how the variables are related.
— Fit a function to the data; use functions fitted to data to solve problems in the context of the data. Use given functions or choose a function suggested by the context. Emphasize linear, quadratic, and exponential models.
— Informally assess the fit of a function by plotting and analyzing residuals.
— Fit a linear function for a scatter plot that suggests a linear association.
— Interpret the slope (rate of change) and the intercept (constant term) of a linear model in the context of the data.
— Compute (using technology) and interpret the correlation coefficient of a linear fit.
— Distinguish between correlation and causation.
— Understand statistics as a process for making inferences about population parameters based on a random sample from that population.
Standards covered in previous units or grades that are important background for the current unit
— Construct a function to model a linear relationship between two quantities. Determine the rate of change and initial value of the function from a description of a relationship or from two (x, y) values, including reading these from a table or from a graph. Interpret the rate of change and initial value of a linear function in terms of the situation it models, and in terms of its graph or a table of values.
— Display numerical data in plots on a number line, including dot plots, histograms, and box plots.
— Summarize numerical data sets in relation to their context, such as by:
— Understand that statistics can be used to gain information about a population by examining a sample of the population; generalizations about a population from a sample are valid only if the sample is representative of that population. Understand that random sampling tends to produce representative samples and support valid inferences.
— Use data from a random sample to draw inferences about a population with an unknown characteristic of interest. Generate multiple samples (or simulated samples) of the same size to gauge the variation in estimates or predictions.
For example, estimate the mean word length in a book by randomly sampling words from the book; predict the winner of a school election based on randomly sampled survey data. Gauge how far off the estimate or prediction might be.
— Informally assess the degree of visual overlap of two numerical data distributions with similar variabilities, measuring the difference between the centers by expressing it as a multiple of a measure of variability.
For example, the mean height of players on the basketball team is 10 cm greater than the mean height of players on the soccer team, about twice the variability (mean absolute deviation) on either team; on a dot plot, the separation between the two distributions of heights is noticeable.
— Use measures of center and measures of variability for numerical data from random samples to draw informal comparative inferences about two populations.
For example, decide whether the words in a chapter of a seventh-grade science book are generally longer than the words in a chapter of a fourth-grade science book.
— Construct and interpret scatter plots for bivariate measurement data to investigate patterns of association between two quantities. Describe patterns such as clustering, outliers, positive or negative association, linear association, and nonlinear association.
— Know that straight lines are widely used to model relationships between two quantitative variables. For scatter plots that suggest a linear association, informally fit a straight line, and informally assess the model fit by judging the closeness of the data points to the line.
— Use the equation of a linear model to solve problems in the context of bivariate measurement data, interpreting the slope and intercept.
For example, in a linear model for a biology experiment, interpret a slope of 1.5 cm/hr as meaning that an additional hour of sunlight each day is associated with an additional 1.5 cm in mature plant height.
— Understand that patterns of association can also be seen in bivariate categorical data by displaying frequencies and relative frequencies in a two-way table. Construct and interpret a two-way table summarizing data on two categorical variables collected from the same subjects. Use relative frequencies calculated for rows or columns to describe possible association between the two variables.
For example, collect data from students in your class on whether or not they have a curfew on school nights and whether or not they have assigned chores at home. Is there evidence that those who have a curfew also tend to have chores?
Standards in future grades or units that connect to the content in this unit
— Describe events as subsets of a sample space (the set of outcomes) using characteristics (or categories) of the outcomes, or as unions, intersections, or complements of other events ("or," "and," "not").
— Understand that two events A and B are independent if the probability of A and B occurring together is the product of their probabilities, and use this characterization to determine if they are independent.
— Understand the conditional probability of A given B as P(A and B)/P(B), and interpret independence of A and B as saying that the conditional probability of A given B is the same as the probability of A, and the conditional probability of B given A is the same as the probability of B.
— Construct and interpret two-way frequency tables of data when two categories are associated with each object being classified. Use the two-way table as a sample space to decide if events are independent and to approximate conditional probabilities.
For example, collect data from a random sample of students in your school on their favorite subject among math, science, and English. Estimate the probability that a randomly selected student from your school will favor science given that the student is in tenth grade. Do the same for other subjects and compare the results.
— Recognize and explain the concepts of conditional probability and independence in everyday language and everyday situations.
For example, compare the chance of having lung cancer if you are a smoker with the chance of being a smoker if you have lung cancer.
— Find the conditional probability of A given B as the fraction of B's outcomes that also belong to A, and interpret the answer in terms of the model.
— Apply the Addition Rule, P(A or B) = P(A) + P(B) - P(A and B), and interpret the answer in terms of the model.
— Apply the general Multiplication Rule in a uniform probability model, P(A and B) = P(A)P(B|A) = P(B)P(A|B), and interpret the answer in terms of the model.
— Use permutations and combinations to compute probabilities of compound events and solve problems.
— Decide if a specified model is consistent with results from a given data-generating process, e.g., using simulation.
For example, a model says a spinning coin falls heads up with probability 0.5. Would a result of 5 tails in a row cause you to question the model?
— Recognize the purposes of and differences among sample surveys, experiments, and observational studies; explain how randomization relates to each.
— Use data from a sample survey to estimate a population mean or proportion; develop a margin of error through the use of simulation models for random sampling.
— Use data from a randomized experiment to compare two treatments; use simulations to decide if differences between parameters are significant.
— Evaluate reports based on data.
— Make sense of problems and persevere in solving them.
— Reason abstractly and quantitatively.
— Construct viable arguments and critique the reasoning of others.
— Model with mathematics.
— Use appropriate tools strategically.
— Attend to precision.
— Look for and make use of structure.
— Look for and express regularity in repeated reasoning.