Chapter 1 Introduction

1.1 Goals of quantitative research (describe, predict, cause-and-effect)

Researchers and practitioners employ quantitative methods to answer important questions and make decisions. Regardless of the discipline or application, they typically have one of three goals (Cozby and Bates 2020):

  • describe

  • predict

  • cause-and-effect

Consider medical researchers investigating a newly discovered variant of the coronavirus. Initially, they might focus on describe studies. How prevalent is the variant in the population? How does the prevalence vary over space and time? What groups of individuals are most vulnerable to the variant? What else is associated with the variant? Together, these studies help researchers better understand the variant and identify important variables for refined predict studies.

Predict studies identify individuals most at risk for the variant or for transmitting it to others. Researchers select potential predictors as inputs to algorithms and statistical methods that transform them into estimates of each individual’s risk for the variant. These methods rely upon associations between the inputs and the variant. In addition, these methods estimate the error associated with their predictions. Researchers try to ensure error estimates obtained within their study are good estimates of the error when their models are applied to other populations. While predict studies help us identify who is at risk, they do not tell us the effect if we change something about individuals (quit smoking, lose weight, etc.) To answer these “what if?” questions, researchers use cause-and-effect studies.

Cause-and-effect studies try to estimate the effect of intervening or changing some aspect of an individual. These studies are tremendously important in public policy (where they are aptly called intervention studies) to justify legislation and regulations to reduce risk. Unlike prediction studies, which are highly algorithmic and only require researchers to identify appropriate inputs, cause-and-effect studies use subject matter knowledge of the causal structure of the inputs to select appropriate statistical models (Hernán, Hsu, and Healy 2019). For example, to assess the affect of quitting smoking on risk for the variant, we need to understand why some people quit smoking and others do not, as these reasons themselves may be responsible for some changes in risk between the two groups. Unlike prediction studies which rely upon what we see, cause-and-effect studies require us to ask what would happen if we do something (Pearl and Mackenzie 2018). Only human beings are capable of such counterfactual thinking (“what would have happened if”) - even our most sophisticated computer algorithms require human supervision to address cause-and-effect questions.

1.2 Validity

When designing a study, researchers carefully consider its internal and external validity.

  • internal validity - How well does the study establish a cause-and-effect relationship between a treatment and outcome?

  • external validity - How broadly can the study’s results be applied?

Often, there is a trade off between internal and external validity when designing a study. For example, to increase internal validity, researchers sometimes restrict the study to a subset of similar subjects to eliminate the effects of other variables. However, in doing so, they reduce the external validity of the study if the results do not apply to the other subjects. In an area of research, our knowledge comes from many studies with different levels of internal and external validity – rarely does a single study answer every question. Consider the recent development of Covid-19 vaccines. For example, initial Covid-19 vaccine trials focused on internal validity to understand its effectiveness against the virus and obtain approval for mass distribution. Polack et al. (2020) randomly assigned 43,548 persons 16 years of age or older to receive two doses of either placebo or the BNT162b2 vaccine (Pfizer). They observed eight cases of Covid-19 among participants who received BNT162b2 and 162 cases among those who received the placebo. This was a huge moment in the pandemic! Randomly assigning subjects to receive the vaccine increased the internal validity of the study. However, for ethical reasons, the subjects had to be relatively healthy and volunteer for the trials. These limitations reduced the external validity of the study because it is not clear whether the vaccine has the same effect in the general population. However, conducting more randomized trials on other populations has ethical implications of its own, as they would require withholding the vaccine from large segments of the population. In this case, an observational study is more appropriate.

Dagan et al. (2021) conducted an observational study after the mass vaccination campaign in Israel using the health records of 4.7 million patients enrolled in its largest integrated health care organization. The researchers matched vaccine recipients to controls on important variables: age, sex, sector, neighborhood of residence, etc.. In this larger sample, they found vaccine effectiveness results consistent with the randomized trial, thus providing evidence of the vaccine’s effectiveness in the general public. The authors specifically address the importance of observational studies:

“Although randomized clinical trials [experiments] are considered the “gold standard” for evaluating intervention effects, they have notable limitations of sample size and subgroup analysis, restrictive inclusion criteria, and a highly controlled setting that may not be replicated in a mass vaccine rollout."

Exercises

  1. For the following studies, identify the goal (describe, predict, cause-and-effect) of the study. Briefly explain your choice:
  • A researcher obtains school records to collect information on the extracurricular activities of students.

  • A researcher obtains school records to determine the impact of extracurricular activities on grades.

  • A researcher obtains school records on extracurricular activities to identify students who are at risk to graduate.

  1. Select an area of interest to you. Write a sentence describing a study in your area of interest for each of the three goals (describe, predict, cause-and-effect).

  2. Read the article “Dozens to be deliberately infected with coronavirus in UK ‘human challenge’ trials” in Nature News. Write a paragraph discussing issues of internal and external validity.