2.5d Correlation and Causation

Correlation

Most social scientific research tests for correlation, that is, whether a relationship exists between two or more variables. If variable A and variable B are correlated, a relationship exists between them, but that relationship alone does not show that variable A causes a change in variable B. For example, carrying an umbrella is correlated with a forecast for rain, yet carrying an umbrella does not cause the forecast.

Causation

In contrast, causation means that a change in one variable causes a change in another variable. For example, Figure 2.13 shows that as family income increases, SAT scores also increase (a positive correlation). Researchers have matched SAT scores with parents’ annual federal income tax records to identify this pattern (Chetty et al., 2023). However, does high family income cause high SAT scores? Data must meet four criteria to confirm one variable causes a change in another variable (Best, 2021):

  1. The variables must obey a time order. The cause must come before the effect. For example, high family income must occur before a high SAT score.
  2. A patterned variation is observable. A pattern must exist in the data. The pattern in Figure 2.13 shows an upward trajectory, which means that the higher the family income, the higher the SAT score on average.
  3. The relationship must make sense. There must be a reason variable A should cause a change in variable B. Higher-income parents can pay for private tutors, test prep courses, and retests to reach a desired score. Lower-income parents do not have extra income to pay for these advantages. The relationship between family income and SAT scores therefore makes sense.
  4. The relationship should be non-spurious. A spurious relationship is a false one: the two variables move together not because one acts on the other but because a third variable influences both. Causation can be shown only if no third variable can explain the relationship between the two variables of interest. Family income predicts SAT scores, but higher income alone does not cause higher test scores. More income means that a household can better direct resources toward activities that improve test scores. For example, Buchmann et al. (2010) found a positive relationship between social class, the use of SAT preparation programs, and SAT scores. As social class position increased, so did the use of SAT preparation programs and SAT scores. Further, more income also increases the likelihood of taking part in other educational activities outside school, which also improves SAT scores (Vincent & Maxwell, 2016).
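The fourth criterion can be illustrated with a small simulation. The sketch below is not drawn from any study cited here. It invents a third variable z that influences two otherwise unrelated variables, a and b, and all names and numbers are assumptions chosen for illustration. The two variables appear correlated until the influence of z is removed.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def residuals(ys, zs):
    """Part of ys left over after a straight-line fit on zs is subtracted."""
    n = len(ys)
    mz, my = sum(zs) / n, sum(ys) / n
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for y, z in zip(ys, zs)]

# Hypothetical data: z influences both a and b; a and b never affect each other.
rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
a = [zi + rng.gauss(0, 1) for zi in z]
b = [zi + rng.gauss(0, 1) for zi in z]

raw_r = pearson_r(a, b)                                      # a and b as observed
controlled_r = pearson_r(residuals(a, z), residuals(b, z))   # with z held constant

print(f"correlation of a and b: {raw_r:.2f}")
print(f"after controlling for z: {controlled_r:.2f}")
```

In this made-up data the first figure is clearly positive and the second is close to zero, which is the signature of a spurious relationship. Real data are rarely this clean, and researchers can only control for the third variables they have thought to measure.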

Figure 2.13

SAT Scores and Family Income for the High School Graduating Class of 2024

Lowest quintile ($0-$53,263): 891
2nd lowest quintile ($53,264-$69,092): 942
Middle quintile ($69,093-$86,073): 984
2nd highest quintile ($86,074-$113,340): 1039
Highest quintile ($113,341 or more): 1148
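The pattern in Figure 2.13 can be summarized with a correlation coefficient. The sketch below is an illustration only: it codes the income quintiles as ranks 1 through 5, which is an assumption made here and not part of the original data, and computes the Pearson correlation between rank and average SAT score.

```python
from math import sqrt

# Average SAT score by family-income quintile, copied from Figure 2.13.
# Coding the quintiles as ranks 1 (lowest) to 5 (highest) is an assumption
# made for this illustration.
quintile_rank = [1, 2, 3, 4, 5]
mean_sat = [891, 942, 984, 1039, 1148]

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r = pearson_r(quintile_rank, mean_sat)

# Point gain in average score from each quintile to the next one up.
gaps = [higher - lower for lower, higher in zip(mean_sat, mean_sat[1:])]

print(f"r = {r:.2f}")  # r = 0.98
print(gaps)            # [51, 42, 55, 109]
```

A value near 1 indicates a strong positive correlation, and the gaps show that the largest jump comes between the two highest quintiles. Because the calculation uses five group averages and not individual students, it is likely to overstate how closely any one student's score tracks family income, and it says nothing about causation.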

Correlation or Causation?

Sociological research does not usually prove causation, because doing so generally requires experimental research. Doing ethical experimental research in sociology is difficult and often not possible. For example, it is unethical to assign children randomly at birth to parents of different incomes to measure the effect of income on education or other outcomes. The lack of experimental research and the limited ability to show causation weaken sociology’s ability to make predictions. Sociologists can, however, show how variables like race or gender are correlated with other variables like income. Correlational research can still guide policy and decision-making. For example, because SAT scores are strongly correlated with family income, many universities have become test-optional or reduced the weight of test scores in their admission process.
