INVESTIGATING STATISTICAL CONCEPTS, APPLICATIONS, AND METHODS, Fourth Edition

NOTES FOR INSTRUCTORS

December, 2024

Chapter 1 Chapter 2 Chapter 3 Chapter 4 Chapter 5

Some features of the textbook you may want to highlight on the first day of class and discuss with students include:

· how to make the most use of include the glossary/index,

· links to applets and data files,

· study conclusion boxes, and

· errata page.

You should make sure students have access to the materials every day, either on their personal device (e.g., laptop or iPad) or printed out, as they will want to add notes as they go along. You may also want to insist on a calculator every day. You will also want to make sure they have detailed instructions for the technology tools you want them to use in class (e.g., R or Minitab or JMP), and ability to run javascript applets (which should run on all platforms including mobile devices). In particular, you will want to make a direct link to the ISCAM.RData workspace file available to R users or to the JMP journal file for JMP users. They can follow these links to directly open the software and have access to some prewritten functions that make use of R/JMP much simplified. In particular, the JMP journal file includes direct access to the probability distribution calculator, common one-sample and two-sample inference procedures, and common preference settings. (Students may need to download the JMP file to their computers first rather than clicking the link to open the file in the browser.) The R functions are periodically updated (e.g., now include variable width boxplots, slightly nicer graphics) so is worthwhile to use the link rather than download the file only. The R commands are purposely simple rather than requiring additional downloads if possible (and also not taking advantage of many new R features like the tidyverse). If you are using an R or Minitab or JMP-only version of the text, there will still be occasional mentions of the other package. You can use this as a reminder to students that there are other options, but generally ignore them. If using the pdf files, you should be able to copy and paste commands from the pdf into the technology (e.g., not including the > symbol in R and watching for straight vs. curly quotations).

Fall 2024: One newer applet feature – if you know the name of the data file, you can often type the name of the file in the data window and it will load it in rather than copying and pasting from a webpage. You can also copy and paste many urls into the data window box. Several applets now also include a File Upload button for a .txt or .csv file on your computer. In the Descriptive Statistics applet (more soon), you can also use a “data=” parameter in the url to load (clean) txt and csv files from other sites, e.g.,

https://www.rossmanchance.com/applets/2021/test/Dotplot.htm?data=http://www.rossmanchance.com/chance/stat301W24/data/MLB_FCI_22.txt

https://www.rossmanchance.com/applets/2021/test/Dotplot.htm?data=https://www.isi-stats.com/isi/data/prelim/NationalAnthemTimes.txt

Some applets also have links at the bottom for “how to” videos that we are continuing to update.

Please reach out if you have other questions or suggestions!

Investigations A and B

Although a major goal of this text is to get students to complete a statistical investigation as early in the course as possible, we find taking a few days at the beginning to overview some key ideas (namely distribution, variability, and probability) can be helpful. We offer two investigations, which can be done inside or outside of class and in varying levels of detail and in either order, to explore these ideas. One more abstract idea you may want to emphasize from these first two investigations is the concept of a random process. Many of the investigations in Chapter 1 are often better conceived of as a random process, and you may want to differentiate between a finite population and an on-going, potentially infinite process. You may also choose to have them do a more involved data analysis or other simulation models as follow-up before proceeding (e.g., see Chapter 0 Exercises).

New Winter 2025: We are trying out a new investigation A. Many of the learning goals are the same but we hope this new activity can lead to more class discussion. Some of the definitions and have been to Investigation 1.1 and you may want an early homework problem that reviews the formula/big ideas of a sample standard deviation. Keep in mind some, but not, students will have seen this formula previously. Having a discussion of SD of data early in the course should help when chapter soon talks about the SD of a null distribution. In particular, you will want to ask “standard deviation of what?” many times early in the course.

Investigation A: Modelling Hurricanes

Materials: You may want to find a recent news article that comments on increasing number and severity of tropical storms in recent years.

Timing: As a class discussion, this investigation should take 30-45 minutes. You should have time to review the syllabus and other course logistics, while still giving students a good amount of time to talk in small groups about the graphs involved. Asking for groups to share out answers at the end of class can reinforce the multitude and observations that can be made from this very simple data set. You will also want to remind students that this is just one variable and there are many other more complicated variables to consider as well.

Question (a) is an opportunity to review a simple calculation that will be useful later in the course (and is often misinterpreted). Questions (b)-(d) are good to ask students to talk about together before sharing out. Questions (e)-(g) can remind students of the main features of a distribution to focus on, without distracting calculation details. Question (i) is a new emphasis on getting students to think about what we mean by a statistical model and how we can evaluate a model. Questions (k) and (l) do not have single correct answers, encourage students to practice defending their arguments from the data/graphs. The final questions present some different graphs, again with different pros and cons for students to consider.

The intention is for the following discussion to be an out of class reading assignment, and an opportunity for students who haven’t taken a previous statistics course to fill in a few of the early definitions. You may want to spend more time here depending on your students’ background (e.g., mean as balance point vs. median, using technology to find standard deviation), but we encourage you to not spend too much time, these ideas will return quite often later in the course. In particular, many students may struggle with the idea of standard deviation. Caution them not to focus on the calculation, but as long as they can look at graphs and decide which has more variability in terms of “width” of the distribution, they have an appropriate understanding for this stage of the course. You could begin introducing students to the main technology (e.g., statistical software) you will use in the course, or wait and demo the technology at the end or start the next class period with that. You can also discuss how/whether you plan to use the practice problems at the end of each investigation.

For practice, have them look at good/bad definitions of variables (e.g., those with hearing loss).

Investigation B: Random Babies

Materials: Index cards (4 per student or pair of students). Calculators. Random Babies applet.

An alternative “lab version” of the Monty Hall problem.

Timing: This can be used as an initial or “filler” exploration that focuses on the relative frequency interpretation of probability and using simulation to explore chance models. One key distinction you might want to emphasize for students is the estimated probability from simulation vs. the exact probability found analytically, which is not always so straight forward. There is also a tremendous amount of flexibility on how deeply you want to go into probability rules, expected value, and variance. Many students don’t have a good understanding of what is meant by “simulation” so you may want to highlight that as you will return to the concept throughout the course. (You may even want to follow-up with additional simulation exercises, see Ch. 0 homework.) If you do the first part of this investigation in class, you probably want students bringing calculators as well. For the hands-on simulation, bring four index cards for each person or pair of students. You may want to watch/highlight use of the terms repetition vs. trial.

For question (n), we encourage you to show students a visual representation as well, mapping each outcome in the sample space to a value of the random variable. Then in discussion of the E(X) formula, can show a weighted average and how the weights (relative frequencies) converge to the theoretical probabilities. There is also brief discussion around V(X) and we encourage you to emphasize why variability is an important consideration as well as the distinction between variability in data and variability in a random process. (May want to consider this riddle: why did the median hiker still get mauled by the bear? She made sure she hiked the median speed to be sure to be in the middle of the group, but failed to consider variability and often found herself hiking alone…)

The first practice problem changes the context to a more traditional “hat problem” from probability courses and the second practice problem extends the scenario to 8 babies. (Note: If you use the applet for other than four babies, you will not see the animation.) One interesting observation is that the expected value is the same each time (and the comparison of P(X = 0) and P(X = 1) alternate).

Chapter 0 HW Exercises

Exercises 1-7 follow up Investigation A and Exercises 8-11 follow up Investigation B. There are some interesting contexts which you can return to later in the course. You might always want to give them some “messier” data sets to work with.

2022 FCI for baseball (PP 2.1A)

rowers2020.xls

rowers2024.xlsx

CHAPTER 1: ANALYZING ONE CATEGORICAL VARIABLE

Section 1: Analyzing a Process Probability

This section will cover all the big ideas of a full statistical investigation: considering the research question, describing data, stating hypotheses, tests of significance, confidence intervals, power, and scope of conclusions. (Ideas of generalizability are raised but not focused on until later but you can bring in lots of statistical issues, like ethics, data snooping, types of error.) Once you finish this section, you may want to emphasize to students that the remaining material will cycle through these same ideas with different data scenarios (e.g., quantitative data, comparing groups). In fact, you may want to reassure students that you are starting with some of the tougher ideas first to give them time to practice and reinforce the ideas. More recently, I have postponed the material on power until after theory-based approaches.

Investigation 1.1: Friend or Foe?

Materials: You may want to bring some extra coins to class and/or tell students to in advance. The One Proportion Inference applet is javascript and should run on all platforms. The applet initially assumes the coin tossing context for = 0.5 but then changes to spinners and “probability of success” more generally once you change the value of the input probability. We encourage you to show some videos detailing the experiment, which can be required viewing prior to class.

Nature article Supplementary Information page

Overview video from Mind in the Making - The Essential Life Skills Every Child Needs, by Ellen Galinsky

UBC Center for Infant Cognition Lab website

Yale Social Evaluation page (needs QuickTime plug in)

Three specific videos: Triangle helps, Square hinders, 10-month old chooses

Fall 2020: Here is a “lab” version of this investigation.

Timing: The entire activity takes about 80 minutes. A good place to split up the activity is after the tactile simulation (part (o)). Technology instructions for tallying the outcomes and construct bar graphs are included, these can also be moved outside of class. You will want to bookmark this page for instructions on getting the data into the software.

We find this investigation a very good entry to the big ideas of the course. We encourage you to build up and discuss the design of the study (e.g., show the videos) and highlight the overall reasoning process (proof by contradiction, evidence vs. proof, ruling out "random chance" as a plausible explanation, possible vs. plausible). Try not to let students get too bogged down in the study details (e.g., the researchers did rotate the colors, shapes, left/right side presentation, we are "modelling" each infant having the same probability, these infants may or may not be representative of all infants, etc.).

The Winter 2025 version now starts with basic definitions of observation unit, variable, sample, population, and research question. You will want to give students lots of practice with these terms, which will be used throughout the course, outside of class. You may also want to discussion between population and random process here, modelling the infants as identical observations from an ongoing process. Encourage students to take the time to review how the study was conducted before jumping into any analyses.

In question (f), you can emphasize that examining a graph of the observed data is our usual first step. Because the graph is so simple to construct, you could focus on the idea of creating an “active title” and communication of results.

In question (i), it works well to get students to think for themselves about possible explanations for the high number picking the helper toy. They can usually come up with: there is something to the theory or it was a huge coincidence. Now we have to decide between these explanations, but the random chance explanation is one that we can investigate. Maybe ask students how many infants would have needed to pick the helper toy for them to find the result convincing evidence against random chance. Depending on your class size, you may ask them to do more or fewer repetitions in question (m). You can also ask them to predict what they will see in the dotplot before it’s displayed.

When you switch to the applet, emphasize that the computer is simply repeating what they did. Also discuss use of the symbol . Try to give students time to work through the applet and begin to draw their own conclusions. It will be important to emphasize to students why you are looking at the results for = .5, and how that will help address the research question. Students can usually get the idea that small p-values provide evidence against the “random chance alone” or “fluke” or “coincidence” explanation, and then warn them you will refine that interpretation (and expect more detailed interpretations on their part) as the course progresses. (But may want to highlight for them what the random component is in their probability calculation.) As you wrap up the investigation you will want to focus on the reasoning behind the conclusion and precise phrasing of the conclusions. In particular, you may want to emphasize the distinction between interpreting the p-value and evaluating the p-value. The strategy outlined in the Summary gives a nice framework to the process (statistic, simulation, strength of evidence). You will especially want to repeatedly emphasize how the p-value is calculated assuming the null hypothesis is true. You may want to emphasize the “tail area” aspect of the p-value as a way to standardize the measurement of what’s unusual across different types of distributions (e.g., if we had a large sample size than any one outcome would have a low probability). You may want to poll a few students to report the p-value they estimated from the simulation to see that they vary slightly. This may motivate some students to want to calculate an “exact” answer that everyone should agree on, and this is certainly possible using a bit of probability theory.

Make sure to ask students lots of questions on the reasoning process. This is also a good point for a “loaded dice” type of activity where you can illustrate the reasoning process (“You wouldn’t roll only 7 and 11 with a pair of fair dice”) they naturally use. We suggest emphasizing that p-values are just one way of measuring how extreme an observation is in a distribution and try to improve the motivation for “or more extreme” in the calculation of the p-value.

Technology: The Data Files and Applets page includes a link to “raw data” and you might want to get their feet wet reading the data in from a webpage or a URL (see Technology Detour).

· With Minitab, you can copy and paste from the webpage into the data window (but watch that an extra missing value is not added to the end of the column). You may want to introduce students to the “commands” window (Desktop version) in case they want to copy and paste earlier instructions rather than using the menus/save the commands for documentation.

· With JMP, you can copy and paste from the webpage into the data window (but watch that you include the column headings).

· With R, you can copy and paste the data onto the clipboard and use read.table. With data files, you use read.table (tab delimited) or read.csv (for comma delimited). With R Studio, you can use Import Data > From Text (rear). With R Markdown, you can use:

```{r}

Infants <- read.delim("http://www.rossmanchance.com/iscam3/data/InfantData.txt")

load(url("http://www.rossmanchance.com/iscam3/ISCAM.RData"))

```

to load in (tab delimited) data and the ISCAM workspace. You can tell students how to “attach” their data rather than referencing (e.g., data = ). We consider attaching ok for a first course that typically uses one dataset at a time. To convert the * values to NaN, you can use as.numeric(as.character(x)). Also make sure students see the question mark features with R and with the iscam R functions and/or the auto-fill feature in RStudio.

For the simulations, the text generally defaults to 1,000 trials, but you should feel free to mix that up a bit as well or occasionally do more just to make sure the answer is “stable.”

Investigation 1.2: Can Wolves Understand Human Cues

Materials: The journal article has some nice/helpful pictures of the experimental set-up. You can also easily access the data for the other animals.

Timing: This investigation can be completed together in 20-30 minutes, especially if you give them additional background reading on calculating binomial probabilities.

The main focus is on calculating binomial probabilities, so you have some flexibility in how much detail you want to cover. We like this context (repeated measurements on one wolf) to be a good application of a binomial process, which introducing some basic terminology to follow up the reasoning from Investigation 1.1. This investigation doesn’t introduce any symbols yet but does try to help students make a detailed interpretation of the p-value. This is also a good place to practice the structure of a “simulation model” which can choose to return to several times in the course. We find emphasizing the distinction between “real data” and “hypothetical data” to be useful. This investigation also allows you to draw the distinction between the simulated p-value and the exact p-value. Having 2-3 methods for finding the p-value (exact vs. simulation vs. normal-model) will be a theme throughout the book. Another issue that you may want to emphasize but may take getting used to is always defining the parameter as a process probability (rather than a population proportion). You can also have good discussions on what p-values consistent sufficiently strong evidence.

Practice Problem 1.2A repeats the analysis for communicate cues rather than the original behavioral cues. Practice Problem 1.2B is a great follow-up using data you collect (in advance) on the students. link to the pictures can be found here. We have tried randomizing the order of the faces/choices and have not found that to affect the results (but you may want to incorporate this into your data collection as well). A google docs type survey for collecting student responses can work well here. Please email me if you are interested in an online lab version of the matching names to faces study.

Technology: In Investigation 1.2, the technology instructions focus on calculating binomial probabilities. In the Investigation 1.4, they will see how to incorporate this into the more “complete” binomial test commands.

Investigation 1.3: Are You Clairvoyant?

Materials: You can use the One Proportion Inference applet and/or statistical software to calculate probabilities from the binomial distribution and/or a formal binomial test.

Timing: 45 minutes, especially if you have them self-test their clairvoyance before coming to class.

The main goals of this investigation are giving students a bit more practice with binomial calculations, as well as emphasizing using as the symbol for an unknown process probability and introducing standardization as an alternative measure of extremeness. Students can play with an online version of the clairvoyance test (predicting what’s going to happen) before hand and/or you can do a quick demonstration or have them test each other’s ESP with actual cards in class. We do think it’s useful for students to analyze their “own” data once in a while, though here we suspect most will not find a significant difference, and the result may not even be in the conjectured direction (good time to talk about p-values > 0.50). The investigation presents E(X) and SD(X) but you could point them to this page that explains a bit more about where those expressions come from. By the end, we would like students to have in mind “less than 5% of the time” or “more than 2 standard deviations away” as indicators or unusual observations. (Although z-scores won’t be used in more detail until they look at the normal approximation to the binomial.) You may need to remind students that variance is standard deviation squared.

Be sure to highlight the distinction between the unknown parameter and a conjectured value for the parameter, as well as the distinctions between parameter, statistic, and variable. Again, you may want to emphasize whether you are going to ask students to distinguish an interpretation of the p-value (what does it measure) from an evaluation (of the p-value, deciding whether it is small).

Technology note: Students will be learning several different ways to do these calculations the first few days. You may want to move some of these calculations outside of class and have them submit their numerical answers between classes, but we mostly encourage you to let the technology handle the tedious calculations from this point on.

· For the ISCAM workspace functions, initially they are told to use the lower.tail feature, mimicking the more generic R functions, but then they will switch to “above/below” type inputs. (Some instructors prefer going straight to iscambinomtest.) Be sure to emphasize some of the subtle issues in R, like punctuation, continuing on to another line, short-cuts in the inputs, capitalization matters, etc.). You could use binom.test in R but again we prefer the automatic picture.

· Many of the iscam R functions return key values for you to access individually, for example

> results=iscambinomtest(observed=14, n=16, hyp=.5, alt="greater")

will store the p-value and the endpoints of the confidence interval:

> results$pvalue

The confidence level inputs can also usually be either the decimal (.95) or the percentage (95) or a vector (c(.95, .99)). The return values are especially useful for R Markdown (e.g., embedding the result into text), e.g., This p-value of `r myresults$pvalue`…

Investigation 1.4: Heart Transplant Mortality

Timing: This investigation is mostly review and may only take 30-45 minutes.

This investigation provides additional practice (but with a hypothesized probability other than 0.5) and highlights the different analysis approaches. Try to get students comfortable with the overall reasoning process but that there are alternative reasonable approaches to obtaining a p-value. You may want to focus on Ho as the "by chance" hypothesis (ho hum) and Ha as the research conjecture (a ha!). Also emphasize the distinction between St. George's (underlying) death rate and the observed proportion of deaths and the 0.15 value. You may want to emphasize the statistical thinking underlying this investigation – e.g., how we really shouldn’t reuse the initial data in testing the theory that the data pointed to (see also question k). So ethnical issues, as well as the role of sample size, can be emphasized in discussing this activity. Students may struggle a bit with seeing these observations as a sample from some larger process with an underlying probability of “success.” Students may resist defining “death” as a “success” as is common in many epidemiological studies. The text also draws their attention to “failing to reject” null hypothesis versus “accepting the null hypothesis.”

The graphs of the two binomial distributions provide an opportunity to discuss how the mean, standard deviation, and shape of the binomial distribution change with the sample size. In (o), we are hoping students may notice the difference in the granularity of the distributions as a precursor to the normal approximation and the need for a continuity correction. Students may also suggest the difference in the scaling of the vertical axis which is a related idea and also ties into the motivation for “or more extreme” in calculating a p-value that can be compared across distributions/studies. You will also want to strongly emphasize that p-value calculations take sample size into account. So a small p-value should not be discredited by a small sample size.

Technology notes: The One Proportion Inference applet does have the option for showing the summary statistics (mean and standard deviation) of the simulated values, but we advise not to pay much attention to those details yet. You can also input only the probability of success, sample size, and value of interest, (press the Count button), and continue directly to the check the Exact Binomial box. You may want to use R or another technology that reports the binomial probabilities to more decimal places so they can see the exact equivalence in question (g). Students may wonder about the distinction between “calculating a binomial probability” and “carrying out a binomial test”; you may want to give them a diagram of the steps you want included whenever you ask them to carry out a test (of which finding the probability for the p-value is a middle step).

This investigation includes introducing them to formal “binomial tests” (but emphasize this is just doing in one step what they had done previously).

· In R, the materials make use of iscambinomtest which gives the output and graph together. You may also want to compare this output to R's standard binom.test. You could of course use pbinom (e.g., 1-pbinom(13, 16, .5, TRUE) or pbinom(14, 16, .5, FALSE)), but we prefer the iscam function which includes the graph and uses non-strict inequalities for both tails.

· In JMP, we recommend asking students to download the Distribution Calculator for use throughout the course. They can download it to their Desktop and simply double click on it in the future. You will also want to consider this option in a lab environment. You will need to be careful in finding non-strict greater than probabilities with discrete random variables. In JMP, if you only enter one probability that outcome is defined as “success.” Otherwise JMP assumes “success” to be the first category alphabetically unless you specify otherwise with Column Properties > Value Ordering.

· In Minitab, the Probability Distribution Plot feature is advantageous in ease of use and including a visual of the (shaded) probability distribution. You may want to ask students to double click on the axis label to change the text to something more descriptive.

Investigation 1.5: Butter-Side Down Again?

Materials: There are some videos of the episode online, including some free “trailers” but you may have to pass some ads. The “butter-side down again?” is a quote from one of the videos. You can also emphasize the processes that the Mythbusters went through before collecting their final data.

Timing: 30 minutes depending on how much time you want to spend on two-sided p-value calculations

The main goal is introducing a “not equal to” research question where the researchers have no prior suspicion or specific concern as to the direction of the alternative hypothesis. You will want to emphasize that this distinction needs to be made before the data are collected (and in fact, the Mythbusters did have the direction wrong, though do watch for the distinction between falling off a table and falling from the roof of a building). Students will have good initial intuition for “checking the other side,” but the idea of “smallest p-value” will be more difficult and you may want to emphasize recognizing a two-sided alternative hypothesis and then conveying that information to the software. It is probably not worth spending much time on the "method of small p-values" other than to warn students that different packages may use slightly different algorithms. In fact, in 2024, Minitab changed the two-sided p-value calculate to the “smallest tail probability.” In this case, the other side is determined by the value closest to the hypothesized expected value that produces a one-sided p-value less than or equal to the observed. This approach is now an option in the One Proportion applet (after checking Two-sided, you can change the pull-down menu to Tails rather than Individual – the investigation does now assume you will do this first). In many situations, this distinction may not impact the p-value.

Practice Problem 1.5 follows-up with their pilot test to develop the “unbiased” dropping mechanism in the first place. Here the Mythbusters did draw an erroneous statistical conclusion, thinking that 3 out of 10 was outside a statistical sample. You could encourage students to write their explanations in language that the hosts could understand. Or questions (c)-(e) explore some of the finer details in the p-value calculation that may not be of interest at this time.

Investigation 1.6: Kissing the Right Way

Materials: Tic Tac candy advertisement

Timing: 45 minutes, depending on how much time you give students to develop the interval themselves or crowd-source the information (e.g., assigning each pair of students on value to check out)

The transition here is to estimate the parameter vs. only testing one specific value. You may want to build up this “second inferential approach” more than the text does and how confidence intervals are really as, if not more, informative than tests of significance. Do discourage students from thinking of a CI as “new evidence,” but rather a different way of looking at the same evidence.

This investigation introduces students to the concept of interval estimation by asking them to construct a set of plausible values for the parameter based on inverting the (two-sided) test. In (f), we are hoping they will mention as an obvious first choice. Make sure they realize they are changing the hypothesized value of rather than now. Students often like to get into the “game” of seeing how close they can find the p-value to the cut-offs. The goal is also for students to see how that p-value changes as the hypothesized parameter moves away from the observed proportion (the R output provides a nice illustration of this but will need to be discussed with the students) and also the effect of the confidence level on the interval. You may want to emphasize that the “inversion” necessary for these binomial intervals is actually pretty complicated and ends up being conservative because of the discreteness (when you plug in 0.05, it will return a value where the probability is at least 0.05). This conservativeness is reduced with the tail probability approach (So they are using a method more like Blaker than Clopper-Pearson).

You may want to emphasize the distinction between “do a majority lean left” (which would be one-sided) and “how often do couples lean left.” You can also highlight properties of confidence intervals (e.g., widen with confidence level, shorten with larger sample size), but these ideas will definitely come up in the Section 2 as well.

Technology: The One Proportion Inference applet includes a slider for the process probability (and one for the sample size). You may still want students to do the first chart in (c) “manually” and then move to the slider in (d).

The R functions (binomtest, onepropz, twopropz, onesamplet) allow you to enter multiple confidence levels, e.g., iscambinomtest(11, 20, conf.level = c(.90, .95)) to display the intervals together in the output window.

Option: You could begin Ch. 2 through the normal distribution before Section 2.

Section 2: Normal Approximation for Sample Proportions

This section reviews the previous material but through the “traditional” computational model – the normal distribution. The audience for this course is typically capable of working with the normal model without a lot of background detail, but some background will be helpful when you get to critical values and if you cover the continuity correction ideas. (You may want to detour with some traditional normal probability model calculations for practice; example Canvas module) Try to emphasize the parallels with the previous methods (simulation, exact binomial), the correspondence between the binomial and normal results, and some tongue-in-cheek that these normal-based methods aren’t really necessary anymore with modern computing. Although students may appreciate the simplification in some of the calculations, in particular how to improve the calculation of confidence intervals.

Investigation 1.7: Reese’s Pieces

Timing: This activity is now optional and really only here if your students need additional practice thinking in terms of distributions of proportions.

Technology Note: The One Proportion applet also has an option for more M&M-like colors (see link at bottom of page).

The beginning of this activity is largely review, but students will appreciate the candies. The fun size bags of Reese's work well, or you can bring dixie cups and fill them up individually. (Do watch for peanut allergies.) We would continue to stress the distinction between the sample data for each student (e.g., the bar graphs) and the accumulated class data (the sampling distribution). You may want to adopt a catchy term for the latter distribution such as "what if" (what if the null had been true) or "null" distribution. (We use "could have been" to refer to an individual simulated sample in these early investigations but not throughout the course.)

The applet transitions to focusing on using the sample proportion as the statistic and some advantages (and correspondences) in doing so. In particular, the role of sample size and the commonality of the shape of the null distribution. You can also refer back to the end of Investigation 1.1 to motivate the formula for the theoretical standard deviation. With candies in hand, they may have the energy for more details of these derivations leading to the Key Result. We definitely recommend showing the correspondence between the simulation results and the theoretical results for the mean and SD. You may want to briefly discuss the “2SD rule” and this is a great time to connect that with the 5% cut-off they may be used to using. You may also want to emphasize that this is an “interval” for sample proportions if you know (to help contrast later with a confidence interval for ).

The Reese's applet has a feature that displays the normal distribution’s probability value as well. That visual will help emphasize the continued focus on how to interpret a p-value in terms of the percentage of samples with a result at least that extreme (under the hypothesized probability). You may also want to assign homework that has them look at how the SD formula changes (e.g., play with derivatives) with changes in n and (especially maximizing at = 0.5).

You may also want to detour to general practice with normal probability calculations for quantitative variables (example Canvas module), or wait until Chapter 2.

Investigation 1.8: Halloween Treats

Timing: 45-60 minutes, especially if they are familiar with the study context in advance (you might even want to assign parts (a)-(c) to be completed before class). You also need to decide how strongly you want to cover the continuity correction.

Technology: You may need to clarify the term “radio button” in using the applet.

This investigation provides practice in using the Normal Distribution for calculating probabilities of sample proportions, including in the context of a two-sided p-value and the (optional) continuity correction. Continue to focus on interpreting the calculations. You may need to spend a few minutes justifying the use of the normal model as an alternative (approximate) model and the z-score provides one advantage and confidence intervals will provide a strong second one (and later power calculations). Math majors will also appreciate seeing how the factors affecting the p-value show up in the z-score formula. Make sure they see the link to Ho and Ha (and ₀). You can also discuss why the z value is positive/negative and the short cuts they can take with a symmetric distribution. The term “test statistic” is new here, but you can link back to earlier discussion of standardizing.

You will have to caution them against "proving" the null and also that they are talking about the process in general vs. individual children. You may want to try to get them in the habit of telling you the inputs to the technology they are using.

It is pretty impressive how much the continuity correction helps and you will want to refer to pictures in explaining it. Make sure students understand when you add or subtract the correction term, but otherwise you may not want to spend much time on the correction. (The Theory-Based Inference applet has a check box for this correction, adjusting both the z-value and the p-value.) It does force them to think deeply about many of the earlier ideas in the course (e.g., continuous vs. discrete and shape, center, and spread, strict inequalities).

The practice problems focus on applying the normal approximation to other contexts and be sure to emphasis some cases (e.g., St. George’s hospital) where this may not be appropriate. Make sure students don’t miss the study conclusions box and the overall Summary of One Proportion z-test box.

Investigation 1.9: Kissing the Right Way (cont.)

Timing: 45-60 minutes

This investigation introduces the one proportion z-interval and you have some options as to how much detail you want to include (e.g., finding percentiles). Students may need more guidance in coming up with critical values. You will also want to discuss how this interval procedure relates to what they did before and to the general ± 2 SD form. (This will hopefully provide some motivation for the normal model.) They will need some guidance on question (d) and how we can think of 2 standard deviations as a maximum plausible distance. Students seem to do well with this “two standard deviation rule.” Try to get students to think in terms of the formula in (i)-(k) versus a specific calculated interval.

Especially if you skipped Inv 1.7, this could be a good time to remind them of the equivalence of using either X or as the statistic, motivating using moving forward.

Not clear how much time you want to spend on sample size determination but student intuition here can be rather poor. Also, if not covered before, they will need to realize that the SD formula is maximized when the proportion equals 0.5. In the end, you may want to be a bit open in what they choose to use for (e.g., 0.645 from the previous study or 0.5 to be safe, or both and contrast, or their own guess).

If you have extra time, you may want to introduce the Simulating Confidence Intervals applet here to help them interpret “statistical confidence.” It works best to generate 200 intervals at a time and then accumulate to 1000. Also have them think about what it means for the interval to be narrower - what that implies in response to the research question. It can be nice to show how the binomial intervals are a bit longer and not symmetric. You will also need to decide how much emphasis you want to put on the frequentist interpretation of confidence level and the idea of “reliability” of a method. Students will need lots of practice and feedback, even in recognizing when the question asks about interpretation of the interval vs. interpretation of the confidence level.

You may also want to make them think carefully about the difference between standard deviation and standard error at this point (Practice Problem 1.9B(b)). The rest of PP 1.9B provides some different interpretations for them to critique. This can be very effective for class discussion, highlighting what needs to be changed to convert to a correct interpretation.

Investigation 1.10: Hospital Mortality Rates (cont.)

Timing: This can be done in combination with other activities and may take 15-30 minutes (it can tie in nicely with a review of the Wald interval and Wald vs. binomial intervals and how to decide between procedures). You may have time at the end of a class period to collect their sample data for the first part of the next investigation (choosing 10 “representative” words).

Materials: You could also use student collected data here (e.g., left-handers, vegetarians) though be careful as to how you define the random process in that case.

Many statisticians (and JMP) now feel only the “plus four” method should be taught. We prefer to start with the traditional method for its intuition (estimate + margin of error) but do feel it’s important for students to see and understand the goals of this modern approach. You can also highlight to students how they are using a method that has gained in popularity in the last 20 years vs. some of their other math courses which only use methods that are centuries old. Make sure students don't get bogged down by the numerous methods but gain some insight into how we might decide one method is doing “better” than another or not.

Do make sure you use 95% confidence in the Simulating Confidence Intervals applet with the Adjusted Wald. (It’s also illuminating to Sort the intervals and see the intervals with zero length with = 0 for the Wald method.) Then let them practice on calculating the Adjusted Wald in (n), making sure students remember to increase n in the denominator as well as using the adjusted estimate. You may also want to assign a simple calculation practice for a practice problem to this investigation or focus on benefits/disadvantages of the procedures.

Investigation 1.11: Ganzfeld Experiments

We have replaced the old “batting averages” context with this one that make easier use of the normal approximation, though we suspect there is still a better connection to student intuition with the batting averages context. This investigation uses the Power Simulation applet and then at the end provides the ability to confirm the calculations with different technology tools (which may or may not allow you to toggle between binomial and normal probabilities). We did find the “baggage” of “observed level of significance” problematic with the binomial case, but do provide one question at the end where they should switch back to binomial and you may want to discuss the unfortunate consequences of the discreteness of that distribution. The emphasis here should be on the definitions and reminding students that they already know how to make these calculations, it just becomes a two-step process.

Lots of practice problems here depending on whether you want to emphasize the definitions, the relationships, the visuals, or the calculations. Our first midterm is often around this time, and we sometimes delay this investigation until after the exam, perhaps even asking related questions, applying what they have learned so far, that you can define later in terms of power.

Section 3: Sampling from a Finite Population

There is a transition here to focusing on samples from a finite population rather than samples from a process. It may be worth defining a “process” (like coin tossing, or measurements from a stream) for them to help them see the distinction. The goal is to help them quickly see that they will be able to apply the same analysis tools but now we will consider more closely whether or not we can generalize our conclusions to a larger population based on how the sample was selected.

Investigation 1.12: Sampling Words

Timing: About 30 minutes and can probably be combined with 1.13 in one class period.

Materials: We now have an integrated webpage to collect students’ results online into a google form, having them click on words on a webpage rather than circling them in the text. Our recommendation is to first use this link to create a URL that will be specific to your class (even section). That will give you a URL to copy and provide to your students. To use the webpage, students specify the number of words they will select (default is 10), and then once they have clicked on 10 words, they press the Submit button. Then they calculate the average length of their 10 words, enter their average, and press Submit Average. If the answer they input is incorrect, they will see the lengths and be asked to try again. Once they enter the correct mean for their sample, they should enter their name and instructor (the default will match the link you created so they shouldn’t change), and then submit their results. They will then be provided several links, including all the data collected so far (including potentially lots of “test data”), including a link that will import the data they have generated into the Descriptive Statistics applet, or just the results for the link you created which can then be copy and pasted (csv). Keep this page open, as there is also a web version of the sampling frame which they can use for their random selection of words. This page also allows different populations (e.g., song lyrics – let me know if you want me to add others!). In the google sheet, we also report the proportion of short (3 or fewer letters) words and the population (to make sure all students used the same population). This is still a work in progress, so please reach out if you have questions or issues.

In selecting the words, students may not like the vagueness of the directions and you can tell them that you want the words in the sample to give you information about the population. In fact, we claim that even if you tell them you are trying to estimate the proportion of short words, you will still see a bias in the sampling method. We recommend getting students up and out of their seats to share their results for at least one of the distributions. In R, if using the ISCAM workspace in a lab, do be cautious that everyone may have the same seed! (You can remove the seed first.). Earlier in the text we said we were most interested in the standard deviation of the simulated distribution; here you can contrast with focusing on the mean of the sampling distribution for judging bias.

A nice extension to this activity if you have time is to consider stratified sampling and have them see the sampling distribution with less sampling variability (see Chapter 1 Appendix for brief introduction to this idea). Note: If multiple columns are pasted into the Sampling Words (Sampling Quantitative Variable) applet, categorical variables can be selected as stratifying or clustering variables. The computer will stratify based on population proportion. The user needs to specify desired number of clusters. Bad choices can lead to poor results.

For practice problem 1.12A, consider whether or not you want students to write out the explanations for their answers. You might also again emphasize the distinction between the distribution of the sample and the distribution of the sample statistics.

Investigation 1.13: Sampling Words (cont.)

We have tried to streamline the discussion of finite population correction factor, wanting to raise students’ awareness of the issue but then also how we usually get to ignore it.

This investigation works well as a class discussion. You will want to again emphasize to students that this is one of the rare cases where we have the entire population and our goal is to investigate properties of the procedures so that we can better understand what they are (and are not) telling us in actual research studies. You will also want to empathize with students how counter intuitive it is that the population size is not relevant in most situations. You can also remind them of the parallels to sampling from a process.

Investigation 1.14: Teen Hearing Loss

Timing: 15 minutes

This investigation again provides practice with the new terms and gets students back to the inferential world. You may want to compare the binomial calculations with the “exact” hypergeometric calculations, and even show the insensitivity of this calculation to the population size if the population proportion is held constant.

Investigation 1.15: Counting Concussions

This is student project data where a finite population correction is necessary. You could just highlight that feature as well as the definition of non-sampling errors if you haven’t yet. You will also want to think about whether you want to cover the hypergeometric distribution here or wait for the more intuitive use with randomization tests.

Investigation 1.16: Literary Digest

This investigation was recently expanded to include optional discussion on potential methods that could have been used to at least partially correct for the bias in the LD process. Discussion (a)-(c) would be sufficient, but many students will appreciate the additional exploration as a demonstration (vs. depth, assessment). PP 1.6C is a nice “spot check your data” example!

Timing: (a)-(c) can be discussed as a class in 10 minutes as a class discussion; other questions can be assigned as homework or optional exploration.

This famous example is an opportunity to practice with the new terms. Can also embellish with lots of historical fun facts as well- for example:

GEORGE GALLUP, JR.: It had been around in academic settings and so forth, and Roper had started the Fortune Poll a few months earlier. And so, there were sporadic efforts to do it, but my dad brought it to the national scene through a syndicated newspaper service, actually. Well, the reaction to that first report was dismay among New Dealers, of course, and the attacks started immediately. How can you interview merely - I think in those days it was two or three thousand people - and project this to the entire country.

That began to change, actually, the next year, when Gallup, Roper and Crossley, Archibald Crossley was a pollster, and Elmo Roper was another pollster, these were the three scientific pollsters, if you will, operating in 1936, and their success in the election, and the failure of the Literary Digest really changed some minds. And people started to think, well, maybe polls are accurate after all.

From Wikipedia:

In 1936, his new organization achieved national recognition by correctly predicting, from the replies of only 5,000 respondents, the result of that year’s presidential election, in contradiction to the widely respected Literary Digest magazine whose much more extensive poll based on over two million returned questionnaires got the result wrong. Not only did he get the election right, he correctly predicted the results of the Literary Digest poll as well using a random sample smaller than theirs but chosen to match it. …

You can also touch on nonsampling errors with Gallup or the more recent New Hampshire primary between Clinton and Obama.

Twelve years later, his organization had its moment of greatest ignominy, when it predicted that Thomas Dewey would defeat Harry S. Truman in the 1948 election, by five to 15 percentage points. Gallup believed the error was mostly due to ending his polling three weeks before Election Day.

Here is an excerpt from Gallup.com on how they conduct a national poll that illustrates all of these points!

Results are based on telephone interviews with 1,014 national adults, aged 18 and older, conducted March 25, 2009. ….

Polls conducted entirely in one day, such as this one, are subject to additional error or bias not found in polls conducted over several days.

Interviews are conducted with respondents on land-line telephones (for respondents with a land-line telephone) and cellular phones (for respondents who are cell-phone only).

In addition to sampling error, question wording and practical difficulties in conducting surveys can introduce error or bias into the findings of public opinion polls.

You may want to end with follow-up discussion on how non-random samples may or may not still be representative depending on the variable(s) being measured.

Some versions include an additional practice problem (when was convenient space)

Practice Problem 1.16D

(a) Does your class constitute a random sample of students from your school? Explain why or why not.

(b) Suggest a variable for which your class should not be considered a representative sample of all students at your school. Explain why not.

(c) Suggest a variable for which it might be reasonable to consider your class to be representative of all students at your school. Justify your choice.

Investigation 1.17: Cat Households

Timing: About 15 minutes. Students should be able to complete 1.17 and 1.18 largely on their own in one class period. They also work well as homework or review exercises.

This is a quick investigation to draw students' attention to the difference between statistical and practice significance.

Investigation 1.18: Female Senators

Timing: About 15 minutes.

Another quick investigation with an important reminder. You will get a lot of students to say yes to (d).

End of Chapter

Definitely draw students’ attention to the examples at the end of the chapter. (We hope to soon post video summaries of the examples.) The goal of the examples is for them to attempt the problems first, and then have detailed explanations available. Note that non-sampling errors are formally introduced in Investigation 1.15. The chapter concludes with a Chapter Summary, a bullet list of key ideas, and a technology summary (the Quick Reference Guides have been reintroduced). You may want to help students bookmark the technology guide(s).