Prepare for the AP Statistics exam with this practice test and answers. This guide covers exploring data, sampling, probability, statistical inference, and data analysis.
Q: 5 number summary
Answer: The minimum value, lower quartile (Q1), median, upper quartile (Q3), and maximum value of a data set. Together these five numbers help describe the center, spread, and shape of the data and are used to make box plots.
Q: z score
Answer: The number of standard deviations a value lies above or below the mean (positive above, negative below).
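The definition above amounts to a one-line calculation. The sketch below assumes the mean and standard deviation are already known; the function name and the test-score numbers are hypothetical, chosen only for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd


# Hypothetical test scores with mean 70 and standard deviation 10.
z_above = z_score(85, 70, 10)  # 1.5 standard deviations above the mean
z_below = z_score(60, 70, 10)  # 1.0 standard deviation below the mean
```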
Q: standard deviation
Answer: A measure of spread. Specifically, the typical distance of the data points from the mean.
Q: population
Answer: The entire collection of individuals or items from which samples can be drawn; it is what the sample in an experiment or study usually represents.
Q: categorical data
Answer: Data that can be placed into categories, using labels or names to identify categories of like items. For example, “gender” is a categorical variable and its categories are “male” and “female”. If you asked people in which month they were born or what their favorite class is, they would answer with names, which would be categorical data. However, if you asked them how many siblings they have, they would answer with numbers, not categories.
Q: bar graph
Answer: A graph in which the lengths of horizontal or vertical bars are used to represent and compare data in categories.
Q: sample
Answer: A relatively small part of a population, chosen so as to be representative of the whole. Example: a survey in Star City used to represent the entire state of Arkansas.
Q: random
Answer: Determined by chance. In random assignment, participants are assigned to experimental and control groups by chance, which minimizes preexisting differences between the groups. Example: pulling names or numbers out of a hat.
Q: bias
Answer: Any systematic failure of a sampling method to represent its population; anything that systematically distorts the accuracy of the sample.
Q: Undercoverage
Answer: A sampling scheme that gives a part of the population less representation in the sample than it has in the population. It occurs when some groups in the population are left out of the process of choosing the sample.
Q: nonresponse
Answer: Bias introduced to a sample when a large fraction of those sampled fail to respond.
Q: voluntary response bias
Answer: Bias introduced to a sample when individuals can choose on their own whether to participate in the sample.
Q: statistic
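Answer: A number computed from sample data that describes the sample, such as a sample mean. (In the plural, statistics is the application of mathematics to describing and analyzing data.)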
Q: independent
Answer: (statistics) a variable whose values are unaffected by changes in the values of other variables
Q: histogram
Answer: A graphical representation of a frequency distribution using vertical bars; the bars touch each other because the intervals they represent are adjacent on a continuous number scale.
Q: box plot
Answer: A display that shows the distribution of values in a data set separated into four groups, each containing about one quarter of the data. A box plot is constructed from the five-number summary of the data.
Q: scatterplot
Answer: A graphed cluster of dots, each of which represents the values of two variables. The slope of the points suggests the direction of the relationship between the two variables. The amount of scatter suggests the strength of the correlation (little scatter indicates high correlation).
Q: correlation
Answer: A measure of the extent to which two factors vary together, and thus of how well either factor predicts the other. The correlation coefficient is the mathematical expression of the relationship, ranging from -1 to +1
Q: skewness
Answer: The extent to which cases are clustered more at one or the other end of the distribution of a quantitative variable rather than in a symmetric pattern around its center
Q: variance
Answer: A common measure of spread about the mean as center; it equals the standard deviation squared.
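The relationship between variance and standard deviation can be checked by hand. This is a minimal sketch on a made-up data set; it assumes the sample version of the formula (dividing by n - 1), which is an assumption not stated in the cards above.

```python
import math

# Hypothetical data set used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n                              # 5.0
squared_devs = [(x - mean) ** 2 for x in data]    # squared distances from the mean

# Sample variance divides by n - 1 (assumed convention).
sample_variance = sum(squared_devs) / (n - 1)

# Standard deviation is the square root of the variance.
sample_sd = math.sqrt(sample_variance)
```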
Q: statistical significance
Answer: A statement of how unlikely an obtained result would be if chance alone were at work. A result is statistically significant when the probability of getting a result at least that extreme by chance alone is very low.
Q: empirical rule
Answer: The rule gives the approximate percentage of observations within 1 standard deviation (68%), 2 standard deviations (95%), and 3 standard deviations (99.7%) of the mean when the histogram is well approximated by a normal curve.
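The three percentages can be checked against an exact normal curve. The sketch below uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later); the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


# Percentages for 1, 2, and 3 standard deviations; these come out close to
# the 68%, 95%, and 99.7% quoted in the rule.
percent_within = {k: round(100 * share_within(k), 1) for k in (1, 2, 3)}
```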
Q: lurking variable
Answer: A variable that has an important effect on the relationship among the variables in a study but is not one of the explanatory variables studied
Q: probability
Answer: A number with a value from 0 to 1 that describes the likelihood that an event will occur. For example, if a bag contains a red marble, a white marble, and a blue marble, then the probability of selecting a red marble at random is 1/3.
Q: descriptive statistics
Answer: Mathematical procedures for organizing collections of data, such as determining the mean, the median, the range, the variance, and the correlation coefficient and using them to describe data
Q: mean
Answer: A measure of center in a set of numerical data, computed by adding the values in a list and then dividing by the number of values in the list.
Q: median
Answer: A measure of center in a set of numerical data. The median of a list of values is the value appearing at the center of a sorted version of the list – or the mean of the two central values if the list contains an even number of values.
Q: mode
Answer: Measure of central tendency that uses most frequently occurring score.
Q: range
Answer: Distance between highest and lowest scores in a set of data.
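The four summary measures above (mean, median, mode, range) can be computed in a few lines. This sketch uses Python's standard `statistics` module on a hypothetical list of scores.

```python
import statistics

# Hypothetical scores used only for illustration.
scores = [3, 5, 5, 6, 8, 9]

mean_score = statistics.mean(scores)      # sum of values divided by how many there are
median_score = statistics.median(scores)  # even count, so the mean of the two central values
mode_score = statistics.mode(scores)      # most frequently occurring score
score_range = max(scores) - min(scores)   # distance between highest and lowest scores
```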
Q: data
Answer: Facts and statistics collected together for reference or analysis
Q: Q1
Answer: A location measure such that one fourth (25%) of the data is smaller than it. Found by dividing the ordered data set in half (excluding the middle observation if n is odd) and finding the median of the lower half of the data.
Q: Q3
Answer: A location measure such that three fourths (75%) of the sorted data is smaller than it, just as the median marks 50%. Found by taking the median of the upper half of the ordered data.
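The quartile method described for Q1 (split the ordered data in half, excluding the middle observation when n is odd, then take the median of each half) can be sketched as follows. The function name and data are hypothetical, and other software may use slightly different quartile conventions.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = data[:half]
    # Skip the middle observation when n is odd.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


summary = five_number_summary([7, 1, 3, 9, 5, 11, 13])  # (1, 3, 7, 11, 13)
iqr = summary[3] - summary[1]                            # Q3 - Q1 = 8
```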
Q: minimum
Answer: The smallest value in a data set.
Q: outlier
Answer: A value much greater or much less than the others in a data set
Q: simple random sample
Answer: A sample of size n selected from the population in such a way that each possible sample of size n has an equal chance of being selected. As a consequence, every element in the population or sampling frame has an equal probability of being chosen.
Q: stratified random sample
Answer: A sampling design in which the population is divided into homogeneous subgroups (strata) and a simple random sample is then drawn from each stratum.
Q: systematic sample
Answer: A sample drawn by selecting individuals systematically from a sampling frame. When there is no relationship between the order of the sampling frame and the variables of interest, a systematic sample can be representative.
Q: cluster sample
Answer: A sample obtained by selecting all individuals within a randomly selected collection or group (cluster) of individuals.
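The four sampling designs above can be compared side by side. This is a rough sketch using Python's standard `random` module; the population of ID numbers, the two strata, the step of 10, and the clusters of 10 are all hypothetical choices made for illustration.

```python
import random

rng = random.Random(42)           # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical ID numbers 1..100

# Simple random sample: every group of 10 IDs is equally likely.
srs = rng.sample(population, 10)

# Stratified random sample: an SRS of 5 from each stratum.
strata = {"low": population[:50], "high": population[50:]}
stratified = [unit for group in strata.values() for unit in rng.sample(group, 5)]

# Systematic sample: random starting point, then every 10th individual.
start = rng.randrange(10)
systematic = population[start::10]

# Cluster sample: randomly choose 2 whole clusters and take everyone in them.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
cluster_sample = [unit for cluster in rng.sample(clusters, 2) for unit in cluster]
```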
Q: Interpolation
Answer: The estimation of an unknown number between known numbers. Example: in a BBQ sauce business, I’ve catered for 25 people and for 50 people, and I’ve been asked to cater for 35. I feel comfortable estimating for 35 based on my prior experience because it is between 25 and 50 people.
Q: theoretical probability
Answer: A probability obtained by analyzing a situation. If all of the outcomes are equally likely, you can find the theoretical probability of an event by listing all of the possible outcomes and then finding the ratio of the number of outcomes producing the desired event to the total number of outcomes. For example, there are 36 possible equally likely outcomes (number pairs) when two fair number cubes are rolled. Of these, six have a sum of 7, so the probability of rolling a sum of 7 is 6/36, or 1/6.
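The two-cube example can be verified by listing every outcome. A small sketch, with variable names chosen only for illustration:

```python
from fractions import Fraction
from itertools import product

# All equally likely (first cube, second cube) pairs.
outcomes = list(product(range(1, 7), repeat=2))

# Outcomes that produce the desired event: a sum of 7.
favorable = [pair for pair in outcomes if sum(pair) == 7]

# Ratio of favorable outcomes to total outcomes.
p_sum_7 = Fraction(len(favorable), len(outcomes))  # 6/36 reduces to 1/6
```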
Q: block design
Answer: The subjects in an experiment are first divided into groups (called ‘blocks’) based on some common characteristic (such as gender) that is hypothesised to have an effect on the response. Randomization of treatments then happens within each block (each block is like its own mini-experiment).
Q: blinding
Answer: The practice of concealing group assignment from study subjects, investigators, and/or those who assess subject outcomes, typically in the context of a randomized controlled trial. For example, study subjects may receive capsules with identical appearance and taste; however, the treatment group receives the active drug, whereas the control group receives the placebo.
Q: double blind
Answer: An experiment in which neither the subjects nor the people who work with them know which treatment each subject is receiving.
Q: placebo
Answer: A fake treatment, such as a chemically inert substance, that can produce real medical benefits because the patient believes it will help.
Q: least squares regression line
Answer: the line with the smallest sum of squared residuals
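The line that minimizes the sum of squared residuals has a closed-form slope and intercept. The sketch below uses the standard formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x); the function name and the five data points are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = least_squares_line(xs, ys)  # approximately 0.6 and 2.2
```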
Q: matched pairs
Answer: A design technique that involves matching each participant in the experimental group with a specific, similar participant in the control group in order to eliminate the possibility that a third variable (and not the independent variable) caused changes in the dependent variable.
Q: conditional probability
Answer: probability given that something else has already occurred
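One common way to compute this is P(A given B) = P(A and B) / P(B). The sketch below applies that formula to a single fair die; the events and helper names are hypothetical, chosen for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair die


def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


def conditional(a, b):
    """P(A given B) = P(A and B) / P(B)."""
    if prob(b) == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return prob(a & b) / prob(b)


greater_than_4 = {5, 6}
even = {2, 4, 6}
p_big_given_even = conditional(greater_than_4, even)  # only 6 qualifies, so 1/3
```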
Q: sample space
Answer: Set of all possible outcomes of an experiment
Q: confounded variable
Answer: A variable whose effect on the response variable cannot be separated from the effect of the explanatory variable on the response variable. (Note: Usually confounded variables are lurking variables but only a few lurking variables are also confounded.)
Q: marginal frequency
Answer: A row total or column total in a two-way table, giving the count for one category of a single variable regardless of the other variable. The totals appear in the margins of the table.
Q: coefficient of determination
Answer: r-squared: the statistic determined by squaring the correlation coefficient. It represents the proportion of variance accounted for by that correlation.
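Squaring the correlation coefficient is a short calculation. This sketch computes r from the usual sums of squares and then squares it; the function name and data are hypothetical.

```python
import math


def correlation(xs, ys):
    """Correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
r_squared = r ** 2  # about 0.6: roughly 60% of the variation in y is accounted for
```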
Q: unimodal
Answer: Having one mode. This is a useful term for describing the shape of a histogram when it is generally mound-shaped; for example, a normal distribution has only one mode.
Q: bimodal
Answer: A distribution with two modes (two clear peaks). For counts by category, two categories have more cases than all the other categories.
Q: experiment
Answer: A kind of research, used to test a hypothesis, in which the researcher controls the conditions and directly manipulates the independent variable.
Q: extrapolation
Answer: Calculation of the value of a function outside the range of known values. Example: in a BBQ sauce business, I’ve catered for 35 people and for 50 people, and I’ve just been asked to cater for 500 people. I am nervous about making a prediction for 500 because it is far outside the range that I have actually experienced, so the prediction is much less reliable.
Q: IQR
Answer: A measure of variability found by dividing a data set into quartiles and taking the difference between the upper and lower quartiles (Q3 - Q1); it is the length of the box in a boxplot.
Q: Residual
Answer: Observed value minus predicted value, also written y - ŷ (y minus y-hat).
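Computing residuals is a direct subtraction once a prediction line is available. The sketch below assumes a hypothetical fitted line, y-hat = 2.2 + 0.6x, and a hypothetical set of observed points.

```python
def predict(x, intercept=2.2, slope=0.6):
    """Predicted value y-hat from an assumed (hypothetical) fitted line."""
    return intercept + slope * x


# Hypothetical observed (x, y) pairs.
observed = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]

# Residual = observed y minus predicted y-hat.
# Positive means the point lies above the line, negative means below.
residuals = [y - predict(x) for x, y in observed]
```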
Q: Convenience sample
Answer: A sample taken from a group that is convenient to reach rather than one that represents the population, which tends to give misleading results.
Q: simulation
Answer: A representation of a situation or problem with a similar but simpler model or a more easily manipulated model in order to determine experimental results.
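As an illustration of the idea, the sketch below simulates rolling two fair number cubes many times to estimate the probability of a sum of 7 experimentally. The function name, seed, and number of trials are assumptions made for this example.

```python
import random


def estimate_p_sum_7(trials, seed=0):
    """Estimate P(sum of two dice is 7) by simulating the rolls."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = sum(
        1 for _ in range(trials)
        if rng.randint(1, 6) + rng.randint(1, 6) == 7
    )
    return hits / trials


# With many trials the estimate should land near the theoretical value 1/6.
estimate = estimate_p_sum_7(100_000)
```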
Q: two way table
Answer: A table containing counts for two categorical variables, one as the row variable and one as the column variable. It has r rows and c columns.
Q: spread
Answer: The visible variation in a sample distribution
Q: center
Answer: The middle or typical value of a distribution, commonly measured by the mean or the median.
Q: shape
Answer: The overall pattern of a distribution, described by features such as symmetry or skewness and the number of modes (unimodal, bimodal, multimodal, or uniform).
Q: discrete random variable
Answer: A random variable that can take one of a finite number of distinct outcomes.
Q: standardized value
Answer: The z-score obtained from standardizing an x-value.
Q: mutually exclusive
Answer: Events that cannot occur at the same time.
Q: wording bias
Answer: Bias created in a sample when the wording of a survey question favors one response.
Q: causation
Answer: A cause-and-effect relationship in which one variable controls the changes in another variable.
Q: frequency table
Answer: A grouping of qualitative data into mutually exclusive classes showing the number of observations in each class; a chart showing the number of times a specific event happens.
Q: stem and leaf display
Answer: A multiple-column table depicting the individual digits of the scores. A score of 95 would have a stem of 9 and a leaf of 5; a score of 62 would have a stem of 6 and a leaf of 2. If a particular stem has more than one leaf, such as the scores 54, 58, and 51, the stem of 5 has three leaves, written in order as 148. It shows the range of values of the variable.
Q: multimodal
Answer: Describes a graph of quantitative data with more than two clear peaks; a distribution with more than two modes.
Q: uniform
Answer: Describes a histogram that does not appear to have any mode and in which all the bars are approximately the same height.
Q: symmetric
Answer: Describes a distribution whose two sides are approximately mirror images of each other about the center, as in a normal distribution.
Q: se
Answer: The standard deviation of the residuals.
Q: interpret r^2
Answer: The percent of variation in y that is explained by x.
Q: influential point
Answer: An observation that, when removed, would markedly change the LSRL, specifically its slope.
Q: census
Answer: A survey that does not use a sample but instead tests or surveys the entire population.
Q: multistage sample
Answer: A sampling design in which several sampling methods are combined.
Q: convenience sample
Answer: A sample chosen because it is convenient, which often fails to give a proper representation of the population. If you survey everyone on your soccer team who attends tonight’s practice, you are surveying a convenience sample.
Q: response bias
Answer: Anything in a survey design that influences responses falls under the heading of response bias. One typical response bias arises from the wording of questions, which may suggest a favored response. Voters, for example, are more likely to express support of “the president” than support of the particular person holding that office at the moment. Another example: a police officer asking teenagers about drug use.
Q: observational study
Answer: A study based on data in which no manipulation of factors has been employed; it observes characteristics of an existing population, usually through a survey.
Q: control group
Answer: In an experiment, the group that is not exposed to the treatment; contrasts with the experimental group and serves as a comparison for evaluating the effect of the treatment.
Q: placebo effect
Answer: Experimental results caused by expectations alone; any effect on behavior caused by the administration of an inert substance or condition, which is assumed to be an active agent.
Q: trial
Answer: Each repetition or observation of an experiment.