Draft:Parallel trends
The parallel trends assumption (also known as the common trends or parallel paths assumption) is a key identifying assumption in causal inference, especially in the context of the difference-in-differences (DiD) research design in econometrics. It posits that, in the absence of a treatment or intervention, the difference between the outcomes for a "treatment group" and a "control group" would remain constant over time; in other words, the two groups would have followed parallel outcome trajectories if no treatment had occurred.[1]

This assumption allows researchers to use changes in the untreated group as an estimate of the counterfactual changes that would have occurred in the treated group, thereby isolating the treatment’s effect.[2]
The parallel trends assumption is fundamental to the validity of DiD analyses and is widely used in program evaluation, policy analysis, and other observational studies to draw causal conclusions when randomized experiments are not feasible. However, it is inherently unobservable (as it concerns a counterfactual scenario) and has important limitations and critiques.
Historical development in difference-in-differences
The concept of parallel trends emerged alongside the development of the difference-in-differences (DiD) methodology in econometrics. Early applications of DiD reasoning can be traced back to 19th-century analyses: for instance, John Snow’s 1855 study on cholera has been cited as a proto-DiD example.[3] However, the formal econometric framework gained prominence in the late 20th century.
In the 1970s and 1980s, labor economists evaluating employment programs began to use DiD-like approaches. Orley Ashenfelter (1978) noted a systematic dip in earnings for participants just before they entered job training programs.[4] This phenomenon—later known as Ashenfelter’s dip—highlighted that participants and non-participants had different pre-treatment trends, foreshadowing the importance of a parallel trends assumption.
In 1989, James Heckman and V. Joseph Hotz emphasized the need for specification tests in nonexperimental evaluations, effectively suggesting that researchers examine pre-treatment outcome trends for treatment and control groups as a diagnostic check.[5] Their work was a response to earlier skepticism about nonrandomized evaluations, such as Robert LaLonde's 1986 critique, and helped crystallize the idea that one should justify any DiD strategy by arguing that, absent the intervention, the groups would have evolved similarly.
The parallel trends assumption became more widely recognized with the influential study of David Card and Alan Krueger (1994), who used a DiD design to analyze the employment effect of a minimum wage increase. The credibility of their findings—that the policy did not reduce employment—rested on the assumption that, in the absence of the minimum wage change, employment trends in New Jersey (the treatment state) would have been similar to those in Pennsylvania (the control state).
Throughout the 1990s and 2000s, applied researchers increasingly adopted DiD methods, often explicitly discussing the parallel trends assumption as a key condition for causal interpretation. By the time textbooks like Mostly Harmless Econometrics (Angrist & Pischke, 2008) were published, the parallel trends (or "common trends") assumption was enshrined as one of the core assumptions underlying DiD and was routinely taught as such. It continues to be a cornerstone of causal inference pedagogy and practice, while ongoing research refines and critiques its use.
Formal definition of the assumption
In a canonical two-period, two-group difference-in-differences (DiD) setup, let there be a treatment group (T) and a control group (C). Denote by Y_{g,t} the average outcome for group g ∈ {T, C} at time t ∈ {0, 1}, where t = 0 is a pre-treatment period and t = 1 is a post-treatment period (with the treatment applied to group T at t = 1). The parallel trends assumption formally states that the change in outcomes over time would have been the same for the treatment and control groups in the absence of treatment.[6]
In terms of potential outcomes, this can be written as:

E[Y_{T,1}(0) − Y_{T,0}(0)] = E[Y_{C,1}(0) − Y_{C,0}(0)],

where Y_{g,t}(0) denotes the outcome for group g at time t under no treatment. This states that the expected change from t = 0 to t = 1 for the treated group (if untreated) equals the actual change for the control group. Equivalently, the difference in outcomes between the groups would have remained constant from pre- to post-treatment in the absence of intervention.[7]
Under a typical DiD regression model specification, the assumption implies no interaction between group and time effects aside from the treatment itself. Consider the model:

y_{it} = α + β Treatment_i + γ Post_t + δ (Treatment_i × Post_t) + ε_{it}

Here, y_{it} is the outcome for individual i at time t, Treatment_i is a treatment-group indicator, Post_t indicates the post-treatment period, and δ, the coefficient on the interaction term, is the DiD estimate of the treatment effect. The assumption entails that, in the absence of treatment (δ = 0), the time trend is fully captured by γ Post_t, implying no group-specific time shocks. If this holds, δ estimates the causal effect of the treatment; otherwise, δ is biased by differential trends.
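As an illustration, the following minimal Python sketch (not drawn from the cited sources; parameter values and variable names are purely illustrative) simulates a two-period panel in which parallel trends holds by construction, then recovers δ as the coefficient on the group-by-time interaction:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000                                        # units observed in both periods
alpha, beta, gamma, delta = 10.0, 2.0, 1.5, 3.0  # illustrative parameters

group = rng.integers(0, 2, n)                   # 1 = treatment group, 0 = control
df = pd.DataFrame({
    "unit": np.repeat(np.arange(n), 2),
    "treatment": np.repeat(group, 2),
    "post": np.tile([0, 1], n),
})
# Outcome follows the model above; parallel trends is built in because
# both groups share the same common time effect gamma.
df["y"] = (alpha
           + beta * df["treatment"]
           + gamma * df["post"]
           + delta * df["treatment"] * df["post"]
           + rng.normal(0, 1, len(df)))

# delta is recovered as the coefficient on the interaction term.
fit = smf.ols("y ~ treatment * post", data=df).fit()
print(fit.params["treatment:post"])             # close to 3.0
```

Because the simulated control group experiences exactly the common time effect γ, the double difference isolates δ; with real data, this is precisely the step where the parallel trends assumption does the identifying work.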
Importantly, the assumption concerns the trajectory of untreated potential outcomes—not outcome levels. It allows groups to differ in baseline outcomes, as long as trends are similar. Graphically, the groups may differ in level but move in parallel. This distinction enables DiD to account for baseline differences by differencing them out, provided these differences are stable.[8]
Role and relevance in causal inference
The parallel trends assumption is central to the identification of causal effects in DiD and other quasi-experimental designs. In observational studies, where treatment is not randomly assigned, causal claims require strong assumptions. Parallel trends helps control for time-varying confounders that affect both groups equally, such as macroeconomic conditions or seasonality.
If the assumption holds, changes in the control group approximate the counterfactual evolution of the treated group. Subtracting control changes from treated changes removes common trends and isolates the treatment effect. Hence, the name difference-in-differences.
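As a purely illustrative numerical example (the figures are hypothetical): suppose the treated group’s mean outcome rises from 10 to 18 while the control group’s rises from 8 to 12. The DiD estimate is then

(18 − 10) − (12 − 8) = 8 − 4 = 4,

so the common trend of 4 is differenced out, and the remaining 4 is attributed to the treatment, provided parallel trends holds.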
In essence, the untreated group acts as a proxy for the treated group’s counterfactual trajectory.[9] This is why the assumption is sometimes paraphrased as: "the control group provides the correct counterfactual for the treatment group."[10]
The credibility of a DiD study hinges on the plausibility of parallel trends. Researchers enhance credibility by choosing comparable control groups or using matching methods to align pre-treatment trends. When the assumption is plausible, DiD is a powerful tool; when it fails, results may be misleading.
Assessing the plausibility of parallel trends
A critical challenge with the parallel trends assumption is that it cannot be directly tested, as it pertains to an unobservable counterfactual scenario: the trajectory the treated group would have taken without treatment. No statistical test can definitively confirm this assumption, since we never observe both “treatment” and “no treatment” outcomes for the same unit in the same period. Nevertheless, researchers employ several strategies to assess its plausibility and build a case for its validity in applied studies. These provide suggestive evidence, not proof.[11]
1. Pre-treatment trend analysis: If data include multiple time-series observations prior to the intervention, one can examine whether the treatment and control groups followed similar trends. A standard approach is to plot group averages over time and visually inspect whether trends were approximately parallel before treatment. Parallel movement before the intervention increases confidence that the groups would have evolved similarly without treatment.[12] Alternatively, researchers use event-study regressions (leads-and-lags models), interacting group indicators with time dummies; statistically insignificant pre-treatment interaction terms suggest no differential pre-trends, supporting the assumption (a minimal sketch of this check appears at the end of this section). For example, Melissa Kearney and Phillip Levine (2015), in their study of MTV’s 16 and Pregnant and teen birth rates, showed indistinguishable pre-treatment trends between comparison groups.[13]
2. Placebo or falsification tests: Researchers may run placebo DiD tests in time periods or with outcomes where no true treatment effect should exist. For instance, re-estimating the DiD with pre-treatment data only, pretending treatment occurred earlier, helps detect spurious “effects” caused by pre-existing group differences (see the placebo sketch at the end of this section). Detecting an effect in such placebo settings casts doubt on the assumption.[14] Another variant involves testing outcomes not plausibly affected by the treatment; any apparent treatment effect on these outcomes may signal a violation of parallel trends.
3. Matching and covariate adjustment: Researchers often use matching methods or covariate adjustment to improve similarity between groups. Matching on baseline covariates and pre-treatment outcomes increases the likelihood of parallel counterfactual trends (a matching sketch appears at the end of this section). Common strategies include propensity-score matching or restricting samples to similar units. Basu and Small (2020) show that combining matching with DiD can reduce bias when strict parallel trends are implausible. Including group-specific time trends in regressions can help adjust for systematic differences, though it changes the identifying assumption.
4. External information and theory: Substantive context may support parallel trends. If groups were similarly exposed to macroeconomic forces or lacked differential access to other interventions, one might argue for parallelism. For example, stable regional income gaps under shared national business cycles might justify the assumption. Conversely, divergent economic conditions undermine it. Researchers often complement statistical diagnostics with qualitative justification.
It is important to emphasize that passing a pre-trend check or placebo test does not guarantee validity.[15] Absence of evidence against parallel trends is reassuring, but statistical power may be insufficient to detect subtle violations. Minor pre-treatment differences do not always invalidate findings, especially if they can be theoretically or empirically addressed.[16] Ultimately, evaluating the assumption requires a blend of empirical testing and context-specific reasoning.
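To make strategy 1 concrete, the following minimal Python sketch (simulated data; the years, names, and parameter values are assumptions for illustration, not from the cited studies) runs a leads-and-lags event-study regression and prints the group-by-year interactions:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
years = np.arange(2000, 2010)
treat_year = 2005                     # hypothetical year treatment begins
n = 500                               # number of units

df = pd.DataFrame({
    "unit": np.repeat(np.arange(n), len(years)),
    "year": np.tile(years, n),
    "treated": np.repeat(rng.integers(0, 2, n), len(years)),
})
# Common linear trend for everyone; an effect of 2.0 only for treated
# units from treat_year onward, so parallel pre-trends hold by design.
df["y"] = (0.5 * (df["year"] - years[0])
           + 2.0 * df["treated"] * (df["year"] >= treat_year)
           + rng.normal(0, 1, len(df)))

# Interact the group indicator with year dummies, omitting the last
# pre-treatment year (2004) as the reference period.
fit = smf.ols("y ~ treated * C(year, Treatment(reference=2004))",
              data=df).fit()

# The 2000-2003 interactions are the pre-treatment "leads"; under
# parallel pre-trends they should be indistinguishable from zero.
for name in fit.params.index:
    if "treated:" in name:
        print(name, round(fit.params[name], 3))
```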
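In the spirit of strategy 2, and reusing the simulated df and treat_year from the sketch above, a placebo DiD restricts the sample to pre-treatment years and pretends treatment began in 2002 (an arbitrary placebo date):

```python
# Placebo DiD on pre-treatment years only: a sizeable, significant
# "effect" here would cast doubt on parallel trends.
pre = df[df["year"] < treat_year].copy()
pre["fake_post"] = (pre["year"] >= 2002).astype(int)

placebo = smf.ols("y ~ treated * fake_post", data=pre).fit()
print(placebo.params["treated:fake_post"],
      placebo.pvalues["treated:fake_post"])   # expect near zero, insignificant
```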
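Finally, a minimal sketch of the matching step in strategy 3, assuming scikit-learn is available; the covariates here are again simulated for illustration:

```python
# Nearest-neighbour matching of treated units to controls on a
# baseline covariate and a pre-treatment outcome.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(2)
m = 200
features = rng.normal(size=(m, 2))      # [baseline covariate, pre-outcome]
is_treated = rng.integers(0, 2, m).astype(bool)

controls = np.flatnonzero(~is_treated)
nn = NearestNeighbors(n_neighbors=1).fit(features[~is_treated])
_, idx = nn.kneighbors(features[is_treated])
matched = controls[idx.ravel()]
# The DiD would then be estimated on the treated units plus their
# matched controls, whose pre-treatment trajectories are more comparable.
```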
Evidence from applied research
Applied research provides examples both supporting and challenging the parallel trends assumption. In practice, most difference-in-differences (DiD) studies present evidence on pre-treatment trends as part of their credibility checks. For instance, Melissa Kearney and Phillip Levine (2015) found no significant difference in pre-intervention trends in teen pregnancy rates between areas with high and low MTV viewership, supporting the claim that divergence post-intervention was due to the program 16 and Pregnant.[17] Many other studies report null pre-trend tests or visually parallel pre-treatment graphs, which have become standard components in DiD empirical work.
There are also notable cases documenting violations. Orley Ashenfelter’s (1978) study of job training programs revealed that earnings of participants were already falling relative to non-participants before the program began, a phenomenon known as Ashenfelter’s dip.[18] This invalidates a naive DiD comparison, as the assumption of parallel pre-treatment trends is violated. Researchers now address such issues by using longer panels and flexible specifications (e.g., individual fixed effects and time trends).
A re-analysis of Kearney and Levine’s work by David Jaeger, Ted Joyce, and Robert Kaestner (2019) questioned the parallel trends assumption once longer-term trends and demographic differences (e.g., unemployment, racial composition) were controlled for. They found that, after these adjustments, the estimated effect of the show disappeared, challenging the original DiD interpretation.[19] Their critique, and Kearney and Levine’s response, illustrate how DiD results can hinge on the assumption’s validity.
More broadly, applied studies find that the assumption is more credible when treatment and control groups come from similar populations and settings. Using neighboring regions or matching units based on observable characteristics tends to improve credibility.[20] Simulation studies (e.g., Ryan et al. 2018) show that methods like matched DiD outperform standard DiD when differential trends are present.
The growing econometric literature now includes methods to test, adjust for, or relax parallel trends. Jonathan Roth (2022) and Kahn-Lang and Lang (2019) caution against over-relying on pre-trend tests, emphasizing that failing to reject a difference does not confirm equivalence. Synthetic control methods create weighted control groups that closely match pre-treatment trajectories, addressing some violations while relying on other assumptions. Newer DiD estimators for multiple time periods and staggered adoption (e.g., Callaway and Sant’Anna 2021) allow for varying trends and offer more flexibility.
Criticisms and limitations
The parallel trends assumption, while foundational, faces several critiques:
- Untestability and counterfactual nature: The assumption concerns an unobserved counterfactual and cannot be verified directly from data. As Kahn-Lang and Lang (2019) stress, failing to reject pre-trend differences does not prove equivalence in the post-treatment counterfactual.[21]
- Violations due to omitted variables: If one group is affected by an external shock or policy unrelated to the treatment, outcomes will diverge for reasons other than treatment. This is often called the Achilles' heel of DiD designs.
- Selection bias and changing composition: If group membership changes over time, or units select into treatment based on anticipated trends, the assumption may fail. Ashenfelter’s dip exemplifies this, and researchers must account for such selection dynamics.
- Heterogeneous trends and functional form issues: Whether trends are parallel can depend on the functional form of the outcome variable (e.g., levels vs. logs). Researchers must justify their modeling choices and consider transformations with care.[22]
- Recent methodological critiques: Roth (2022) and others highlight that common DiD estimators may yield biased estimates when effects vary over time or across groups. Two-way fixed effects DiD models can be especially problematic in staggered treatment designs. This has led to newer estimators with more transparent assumptions and diagnostic tools.[23]
These limitations reinforce the importance of careful design, sensitivity checks, and methodological transparency in DiD applications.
Selected references
- Ashenfelter, Orley (1978). “Estimating the Effect of Training Programs on Earnings.” The Review of Economics and Statistics, 60(1): 47–57. Origin of the term Ashenfelter’s dip, documenting pre-treatment earnings declines for trainees.[24]
- Card, David & Krueger, Alan B. (1994). “Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania.” American Economic Review, 84(4): 772–793. Classic application of DiD in labor economics.
- Heckman, James J. & Hotz, V. Joseph (1989). “Choosing Among Alternative Nonexperimental Methods for Estimating the Impact of Social Programs: The Case of Manpower Training.” Journal of the American Statistical Association, 84(408): 862–874. Early discussion of specification tests for program evaluation, including pre-trend checks.[25]
- Angrist, Joshua D. & Pischke, Jörn-Steffen (2008). Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press. Textbook treatment of DiD and the parallel trends assumption.
- Cunningham, Scott (2021). Causal Inference: The Mixtape. Yale University Press. Applied guide to causal inference methods including difference-in-differences and event studies.
- Greene, William H. (2018). Econometric Analysis. 8th ed. Pearson. Advanced graduate-level treatment of econometric models, including fixed effects and DiD estimators.
- Wooldridge, Jeffrey M. (2020). Introductory Econometrics: A Modern Approach. 7th ed. Cengage Learning. Widely used undergraduate/graduate textbook covering DiD within panel data methods.
- Bertrand, Marianne; Duflo, Esther; & Mullainathan, Sendhil (2004). “How Much Should We Trust Differences-in-Differences Estimates?” Quarterly Journal of Economics, 119(1): 249–275. Discusses pitfalls of DiD, including serial correlation and Ashenfelter’s dip.
- Kearney, Melissa S. & Levine, Phillip B. (2015). “Media Influences on Social Outcomes: The Impact of MTV’s 16 and Pregnant on Teen Childbearing.” American Economic Review, 105(12): 3597–3632. Applied DiD study testing parallel pre-trends in a natural experiment context.[26]
- Jaeger, David A.; Joyce, Ted; & Kaestner, Robert (2019). “A Cautionary Tale of Evaluating Identifying Assumptions: Did Reality TV Really Cause a Decline in Teenage Childbearing?” Journal of Business & Economic Statistics, 37(3): 443–454. Re-analysis questioning the parallel trends assumption in the Kearney & Levine study.[27]
- Kahn-Lang, Ariella & Lang, Kevin (2019). “The Promise and Pitfalls of Differences-in-Differences: Reflections on 16 and Pregnant and Other Applications.” Journal of Business & Economic Statistics, 37(3): 414–421. General discussion warning against misinterpretation of pre-trend tests.[28]
- Roth, Jonathan (2022). “Pretest with Caution: Event-Study Estimates after Testing for Parallel Trends.” American Economic Review: Insights, 4(3): 305–322. Methodological critique of testing for parallel trends and inference consequences.[29]
- Callaway, Brantly & Sant’Anna, Pedro H.C. (2021). “Difference-in-Differences with Multiple Time Periods.” Journal of Econometrics, 225(2): 200–230. Develops DiD estimators for staggered treatments; modifies the assumption for multiple cohorts.
- Basu, Pallavi & Small, Dylan (2020). “Constructing a More Closely Matched Control Group in a Difference-in-Differences Analysis: Its Effect on Bias.” Observational Studies, 6: 103–130. Shows matching can improve DiD validity by achieving more parallel trends.
References
- ^ Angrist, Joshua D., and Jörn-Steffen Pischke. (2008). Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press.
- ^ Bertrand, Marianne; Duflo, Esther; & Mullainathan, Sendhil (2004). “How Much Should We Trust Differences-in-Differences Estimates?” Quarterly Journal of Economics, 119(1): 249–275.
- ^ Cunningham, Scott. (2021). Causal Inference: The Mixtape. Yale University Press.
- ^ Ashenfelter, Orley (1978). “Estimating the Effect of Training Programs on Earnings.” The Review of Economics and Statistics, 60(1): 47–57.
- ^ Heckman, James J. & Hotz, V. Joseph (1989). “Choosing Among Alternative Nonexperimental Methods for Estimating the Impact of Social Programs: The Case of Manpower Training.” Journal of the American Statistical Association, 84(408): 862–874.
- ^ Greene, William H. (2018). Econometric Analysis, 8th ed. Pearson.
- ^ Greene, William H. (2018). Econometric Analysis, 8th ed. Pearson.
- ^ Revisiting the Difference-in-Differences Parallel Trends Assumption - World Bank
- ^ Revisiting the Difference-in-Differences Parallel Trends Assumption - World Bank
- ^ Chapter 18 - Difference-in-Differences | The Effect
- ^ Angrist, Joshua D., and Jörn-Steffen Pischke. (2008). Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press.
- ^ Cunningham, Scott. (2021). Causal Inference: The Mixtape. Yale University Press.
- ^ Revisiting the Difference-in-Differences Parallel Trends Assumption - World Bank
- ^ Wooldridge, Jeffrey M. (2020). Introductory Econometrics: A Modern Approach, 7th ed. Cengage Learning.
- ^ Angrist, Joshua D., and Jörn-Steffen Pischke. (2008). Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press.
- ^ Revisiting the Difference-in-Differences Parallel Trends Assumption - World Bank
- ^ Revisiting the Difference-in-Differences Parallel Trends Assumption - World Bank
- ^ Ashenfelter, Orley (1978). “Estimating the Effect of Training Programs on Earnings.” The Review of Economics and Statistics, 60(1): 47–57.
- ^ Revisiting the Difference-in-Differences Parallel Trends Assumption - World Bank
- ^ Revisiting the Difference-in-Differences Parallel Trends Assumption - World Bank
- ^ Revisiting the Difference-in-Differences Parallel Trends Assumption - World Bank
- ^ Revisiting the Difference-in-Differences Parallel Trends Assumption - World Bank
- ^ What's trending in difference-in-differences?
- ^ Ashenfelter, Orley (1978). “Estimating the Effect of Training Programs on Earnings.” The Review of Economics and Statistics, 60(1): 47–57.
- ^ Heckman, James J. & Hotz, V. Joseph (1989). “Choosing Among Alternative Nonexperimental Methods for Estimating the Impact of Social Programs: The Case of Manpower Training.” Journal of the American Statistical Association, 84(408): 862–874.
- ^ Revisiting the Difference-in-Differences Parallel Trends Assumption - World Bank
- ^ Revisiting the Difference-in-Differences Parallel Trends Assumption - World Bank
- ^ Revisiting the Difference-in-Differences Parallel Trends Assumption - World Bank
- ^ Revisiting the Difference-in-Differences Parallel Trends Assumption - World Bank