Difference in differences

Short description: Statistical technique to use observational data for causal analysis

Difference in differences (DID^[1] or DD^[2]) is a quasi-experimental statistical technique used in econometrics and quantitative research in the social sciences that attempts to mimic an experimental research design using observational study data, by studying the differential effect of a treatment on a "treatment group" versus a "control group" in a natural experiment.^[3] It calculates the effect of a treatment (i.e., an explanatory variable or an independent variable) on an outcome (i.e., a response variable or dependent variable) by comparing the average change over time in the outcome variable for the treatment group to the average change over time for the control group. Although it is intended to mitigate the effects of extraneous factors and selection bias, depending on how the treatment group is chosen, this method may still be subject to certain biases (e.g., mean regression, reverse causality and omitted variable bias).

In contrast to a time-series estimate of the treatment effect on subjects (which analyzes differences over time) or a cross-section estimate of the treatment effect (which measures the difference between treatment and control groups), the difference in differences uses panel data to measure the differences, between the treatment and control group, of the changes in the outcome variable that occur over time.

General definition

Difference in differences requires data measured from a treatment group and a control group at two or more different time periods, specifically at least one time period before "treatment" and at least one time period after "treatment". In the example pictured, the outcome in the treatment group is represented by the line P, and the outcome in the control group is represented by the line S. The outcome (dependent) variable in both groups is measured at time 1, before either group has received the treatment (i.e., the independent or explanatory variable), represented by the points P₁ and S₁. The treatment group then receives or experiences the treatment and both groups are again measured at time 2. Not all of the difference between the treatment and control groups at time 2 (that is, the difference between P₂ and S₂) can be explained as being an effect of the treatment, because the treatment group and control group did not start out at the same point at time 1. DID, therefore, calculates the "normal" difference in the outcome variable between the two groups (the difference that would still exist if neither group experienced the treatment), represented by the dotted line Q. (Notice that the slope from P₁ to Q is the same as the slope from S₁ to S₂.) The treatment effect is the difference between the observed outcome (P₂) and the "normal" outcome (the difference between P₂ and Q).

Formal definition

Consider the model

y_{i t} = γ_{s (i)} + λ_{t} + δ I (\dots) + ε_{i t}

where $y_{i t}$ is the dependent variable for individual $i$ and time $t$ , $s (i)$ is the group to which $i$ belongs (i.e. the treatment or the control group), and $I (\dots)$ is short-hand for the dummy variable equal to 1 when the event described in $(\dots)$ is true, and 0 otherwise. In the plot of time versus $Y$ by group, $γ_{s}$ is the vertical intercept for the graph for $s$ , and $λ_{t}$ is the time trend shared by both groups according to the parallel trend assumption (see Assumptions below). $δ$ is the treatment effect, and $ε_{i t}$ is the residual term.

Consider the average of the dependent variable and dummy indicators by group and time:

\begin{aligned} n_{s} & = number of individuals in group s \\ {\overline{y}}_{s t} & = \frac{1}{n_{s}} \sum_{i = 1}^{n} y_{i t} I (s (i) = s), \\ {\overline{γ}}_{s} & = \frac{1}{n_{s}} \sum_{i = 1}^{n} γ_{s (i)} I (s (i) = s) = γ_{s}, \\ {\overline{λ}}_{s t} & = \frac{1}{n_{s}} \sum_{i = 1}^{n} λ_{t} I (s (i) = s) = λ_{t}, \\ D_{s t} & = \frac{1}{n_{s}} \sum_{i = 1}^{n} I (s (i) = treatment, t in after period) I (s (i) = s) = I (s = treatment, t in after period), \\ {\overline{ε}}_{s t} & = \frac{1}{n_{s}} \sum_{i = 1}^{n} ε_{i t} I (s (i) = s), \end{aligned}

and suppose for simplicity that $s = 1, 2$ and $t = 1, 2$ . Note that $D_{s t}$ is not random; it just encodes how the groups and the periods are labeled. Then

\begin{aligned} ({\overline{y}}_{11} - {\overline{y}}_{12}) - ({\overline{y}}_{21} - {\overline{y}}_{22}) \\ = & [(γ_{1} + λ_{1} + δ D_{11} + {\overline{ε}}_{11}) - (γ_{1} + λ_{2} + δ D_{12} + {\overline{ε}}_{12})] \\ - [(γ_{2} + λ_{1} + δ D_{21} + {\overline{ε}}_{21}) - (γ_{2} + λ_{2} + δ D_{22} + {\overline{ε}}_{22})] \\ = & δ (D_{11} - D_{12}) + δ (D_{22} - D_{21}) + {\overline{ε}}_{11} - {\overline{ε}}_{12} + {\overline{ε}}_{22} - {\overline{ε}}_{21} . \end{aligned}

The strict exogeneity assumption then implies that

E [({\overline{y}}_{11} - {\overline{y}}_{12}) - ({\overline{y}}_{21} - {\overline{y}}_{22})] = δ (D_{11} - D_{12}) + δ (D_{22} - D_{21}) .

Without loss of generality, assume that $s = 2$ is the treatment group, and $t = 2$ is the after period, then $D_{22} = 1$ and $D_{11} = D_{12} = D_{21} = 0$ , giving the DID estimator

\hat{δ} = ({\overline{y}}_{11} - {\overline{y}}_{12}) - ({\overline{y}}_{21} - {\overline{y}}_{22}),

which can be interpreted as the treatment effect of the treatment indicated by $D_{s t}$ . Below it is shown how this estimator can be read as a coefficient in an ordinary least squares regression. The model described in this section is over-parametrized; to remedy that, one of the coefficients for the dummy variables can be set to 0, for example, we may set $γ_{1} = 0$ .

Assumptions

All the Gauss–Markov assumptions of the OLS model apply equally to DID, since DID is a special version of OLS. In addition, DID requires a parallel trend assumption. The parallel trend assumption says that $λ_{2} - λ_{1}$ are the same in both $s = 1$ and $s = 2$ . Given that the formal definition above accurately represents reality, this assumption automatically holds. However, a model with $λ_{s t} : λ_{22} - λ_{21} \neq λ_{12} - λ_{11}$ may well be more realistic. In order to increase the likelihood of the parallel trend assumption holding, a difference-in-differences approach is often combined with matching.^[4] This involves "matching" known "treatment" units with simulated counterfactual "control" units: characteristically equivalent units which did not receive treatment. By defining the outcome variable as a temporal difference (change in observed outcome between pre- and posttreatment periods), and matching multiple units in a large sample on the basis of similar pre-treatment histories, the resulting ATE (i.e. the ATT: average treatment effect for the treated) provides a robust difference-in-differences estimate of treatment effects. This serves two statistical purposes: firstly, conditional on pre-treatment covariates, the parallel trends assumption is likely to hold; and secondly, this approach reduces dependence on associated ignorability assumptions necessary for valid inference.

As illustrated in the figure, the treatment effect is the difference between the observed value of y and what the value of y would have been with parallel trends, had there been no treatment. However, a shortcoming of DID is when something other than the treatment changes in one group but not the other at the same time as the treatment, implying a violation of the parallel trend assumption.

To guarantee the accuracy of the DID estimate, the composition of individuals of the two groups is assumed to remain unchanged over time. When using a DID model, various issues that may compromise the results, such as autocorrelation^[2] and Ashenfelter dips, must be considered and dealt with.

Implementation

The DID method can be implemented according to the table below, where the lower right cell is the DID estimator.

$y_{s t}$	$s = 2$	$s = 1$	Difference
$t = 2$	$y_{22}$	$y_{12}$	$y_{12} - y_{22}$
$t = 1$	$y_{21}$	$y_{11}$	$y_{11} - y_{21}$
Change	$y_{21} - y_{22}$	$y_{11} - y_{12}$	$(y_{11} - y_{21}) - (y_{12} - y_{22})$

Running a regression analysis gives the same result. Consider the OLS model

y = β_{0} + β_{1} T + β_{2} S + β_{3} (T \cdot S) + ε

where $T$ is a dummy variable for the period, equal to $1$ when $t = 2$ , and $S$ is a dummy variable for group membership, equal to $1$ when $s = 2$ . The composite variable $(T \cdot S)$ is a dummy variable indicating when $S = T = 1$ . Although it is not shown rigorously here, this is a proper parametrization of the model formal definition, furthermore, it turns out that the group and period averages in that section relate to the model parameter estimates as follows

\begin{aligned} {\hat{β}}_{0} & = \hat{E} (y ∣ T = 0, S = 0) \\ {\hat{β}}_{1} & = \hat{E} (y ∣ T = 1, S = 0) - \hat{E} (y ∣ T = 0, S = 0) \\ {\hat{β}}_{2} & = \hat{E} (y ∣ T = 0, S = 1) - \hat{E} (y ∣ T = 0, S = 0) \\ {\hat{β}}_{3} & = [\hat{E} (y ∣ T = 1, S = 1) - \hat{E} (y ∣ T = 0, S = 1)] \\ - [\hat{E} (y ∣ T = 1, S = 0) - \hat{E} (y ∣ T = 0, S = 0)], \end{aligned}

where $\hat{E} (\dots ∣ \dots)$ stands for conditional averages computed on the sample, for example, $T = 1$ is the indicator for the after period, $S = 0$ is an indicator for the control group. Note that ${\hat{β}}_{1}$ is an estimate of the counterfactual rather than the impact of the control group. The control group is often used as a proxy for the counterfactual (see, Synthetic control method for a deeper understanding of this point). Thereby, ${\hat{β}}_{1}$ can be interpreted as the impact of both the control group and the intervention's (treatment's) counterfactual. Similarly, ${\hat{β}}_{2}$ , due to the parallel trend assumption, is also the same differential between the treatment and control group in $T = 1$ . The above descriptions should not be construed to imply the (average) effect of only the control group, for ${\hat{β}}_{1}$ , or only the difference of the treatment and control groups in the pre-period, for ${\hat{β}}_{2}$ . As in Card and Krueger, below, a first (time) difference of the outcome variable $(Δ Y_{i} = Y_{i, 1} - Y_{i, 0})$ eliminates the need for time-trend (i.e., ${\hat{β}}_{1}$ ) to form an unbiased estimate of ${\hat{β}}_{3}$ , implying that ${\hat{β}}_{1}$ is not actually conditional on the treatment or control group.^[5] Consistently, a difference among the treatment and control groups would eliminate the need for treatment differentials (i.e., ${\hat{β}}_{2}$ ) to form an unbiased estimate of ${\hat{β}}_{3}$ . This nuance is important to understand when the user believes (weak) violations of parallel pre-trend exist or in the case of violations of the appropriate counterfactual approximation assumptions given the existence of non-common shocks or confounding events. To see the relation between this notation and the previous section, consider as above only one observation per time period for each group, then

\begin{aligned} \hat{E} (y ∣ T = 1, S = 0) & = \hat{E} (y ∣ after period, control) \\ [3 p t] \\ = \frac{\hat{E} (y I (after period, control))}{\hat{P} (after period, control)} \\ [3 p t] \\ = \frac{\sum_{i = 1}^{n} y_{i, after} I (i in control)}{n_{control}} = {\overline{y}}_{control, after} \\ [3 p t] \\ = {\overline{y}}_{12} \end{aligned}

and so on for other values of $T$ and $S$ , which is equivalent to

{\hat{β}}_{3} = (y_{11} - y_{21}) - (y_{12} - y_{22}) .

But this is the expression for the treatment effect that was given in the formal definition and in the above table.

Variants of difference-in-difference frameworks include ones for staggered implementation of treatment as well as an estimator introduced for multiple time periods and other variations by Brantly Callaway and Pedro H.C. Sant'Anna.^[6]

Example

The Card and Krueger article on minimum wage in New Jersey, published in 1994,^[5] is considered one of the most famous DID studies; Card was later awarded the 2021 Nobel Memorial Prize in Economic Sciences in part for this and related work. Card and Krueger compared employment in the fast food sector in New Jersey and in Pennsylvania, in February 1992 and in November 1992, after New Jersey's minimum wage rose from $4.25 to $5.05 in April 1992. Observing a change in employment in New Jersey only, before and after the treatment, would fail to control for omitted variables such as weather and macroeconomic conditions of the region. By including Pennsylvania as a control in a difference-in-differences model, any bias caused by variables common to New Jersey and Pennsylvania is implicitly controlled for, even when these variables are unobserved. Assuming that New Jersey and Pennsylvania have parallel trends over time, Pennsylvania's change in employment can be interpreted as the change New Jersey would have experienced, had they not increased the minimum wage, and vice versa. The evidence suggested that the increased minimum wage did not induce a decrease in employment in New Jersey, contrary to what some economic theory would suggest. The table below shows Card & Krueger's estimates of the treatment effect on employment, measured as FTEs (or full-time equivalents). Card and Krueger estimate that the $0.80 minimum wage increase in New Jersey led to an average 2.75 FTE increase in employment per store.

	New Jersey	Pennsylvania	Difference
February	20.44	23.33	−2.89
November	21.03	21.17	−0.14
Change	0.59	−2.16	2.75

A software example application of this research is found on the Stata's command -diff- ^[7]

Applications

The difference-in-differences (DID) framework has been applied widely beyond labor economics and minimum wage studies. In public health, DID has been used to evaluate the effect of new medical guidelines or vaccination campaigns by comparing regions before and after policy implementation.^[8] In education, DID methods help measure the impact of reforms such as changes in school funding or class size. In environmental economics, they are used to assess regulations on pollution, energy consumption, or climate policy. These applications rely on the key assumption of parallel trends, but when carefully designed, they provide policymakers with robust causal estimates using observational data.

In economic history

Difference-in-differences has also been applied to the study of historical events, particularly in the field of economic history, where researchers rely on natural experiments to investigate long-run outcomes. By comparing regions or groups that were differentially exposed to shocks such as disease, institutional change, or wartime destruction, scholars have used the method to identify causal effects that cannot be observed directly.

In 2021, Elena Esposito used DID to examine how the arrival of malaria influenced the expansion of African slavery in the United States.^[9] She compared counties that were more ecologically suitable for malaria transmission with those that were less suitable, before and after the introduction of the disease in the late seventeenth century. Results showed that malaria-prone counties experienced a much greater increase in the share of enslaved Africans after the disease became endemic. In addition, enslaved individuals from parts of Africa with high malaria prevalence sold at higher prices in Louisiana slave markets, suggesting that buyers placed a premium on resistance to malaria. This application demonstrated how DID can be used to link environmental shocks with institutional development over the long run.

González, Marshall, and Naidu in 2017 used DID to analyze how the abolition of slavery in Maryland affected patterns of entrepreneurship.^[10] They combined census data with contemporary credit reports to compare business formation by slaveowners and non-slaveowners before and after the uncompensated abolition of slavery in 1864. They found that slaveowners were more likely to start businesses before emancipation, but this advantage disappeared once slavery was abolished.In this case, DID made it possible to treat emancipation as a sudden institutional change and to see how it affected business activity.

In 2022, James Feigenbaum, James Lee, and Filippo Mezzanotti used DID to measure the economic effects of General Sherman’s March during the American Civil War.^[11] Using county-level data from 1850 to 1920, they compared areas directly in the path of the march with nearby counties that were spared. Their findings showed large and immediate declines in farm values, agricultural investment, and manufacturing activity in the affected counties. While manufacturing output eventually recovered by the late nineteenth century, agricultural effects lasted for decades, with lower levels of improved farmland still evident in 1920. The study also showed that the lack of credit and the collapse of banks after the Civil War slowed down the recovery, especially in places that relied more on borrowing. Overall, the study used DID to demonstrate that conflict had lasting effects on the economy and local institutions.

In 2012, Richard Hornbeck used DID to study the long-term economic consequences of the American Dust Bowl of the 1930s.^[12] He compared counties that experienced severe soil erosion with nearby counties that were less affected, before and after the disaster. His findings show that heavily eroded counties suffered persistent declines in land values and agricultural revenues of 20 to 30 percent, with little recovery even 10 years later. Many residents migrated away as a result, and population decline became the primary adjustment. This work demonstrates how DID can be applied to environmental shocks in economic history, pointing out the long run effects of ecological disasters on regional development.

References

↑ Abadie, A. (2005). "Semiparametric difference-in-differences estimators". Review of Economic Studies 72 (1): 1–19. doi:10.1111/0034-6527.00321.
↑ ^2.0 ^2.1 Bertrand, M.; Duflo, E.; Mullainathan, S. (2004). "How Much Should We Trust Differences-in-Differences Estimates?". Quarterly Journal of Economics 119 (1): 249–275. doi:10.1162/003355304772839588. http://www.nber.org/papers/w8841.pdf.
↑ Angrist, J. D.; Pischke, J. S. (2008). Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press. pp. 227–243. ISBN 978-0-691-12034-8. https://books.google.com/books?id=ztXL21Xd8v8C&pg=PA227.
↑ Basu, Pallavi; Small, Dylan (2020). "Constructing a More Closely Matched Control Group in a Difference-in-Differences Analysis: Its Effect on History Interacting with Group Bias". Observational Studies 6: 103–130. doi:10.1353/obs.2020.0011. https://muse.jhu.edu/article/793352/pdf.
↑ ^5.0 ^5.1 Card, David; Krueger, Alan B. (1994). "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania". American Economic Review 84 (4): 772–793.
↑ Callaway, Brantly; Sant’Anna, Pedro H. C. (2021-12-01). "Difference-in-Differences with multiple time periods". Journal of Econometrics. Themed Issue: Treatment Effect 1 225 (2): 200–230. doi:10.1016/j.jeconom.2020.12.001. ISSN 0304-4076. https://www.sciencedirect.com/science/article/abs/pii/S0304407620303948.
↑ Villa, Juan M. (2016). "diff: Simplifying the estimation of difference-in-differences treatment effects". The Stata Journal 16 (1): 52–71. doi:10.1177/1536867X1601600108.
↑ Wing, C.; Simon, K.; Bellos, K. (2018). "Designing Difference in Difference Studies: Best Practices for Public Health Policy Research". Annual Review of Public Health 39: 453–469. doi:10.1146/annurev-publhealth-040617-013507.
↑ Esposito, Elena (2021). "Malaria and the African slave trade in the United States". Econometrica 89 (5): 2189–2222. doi:10.3982/ECTA17668.
↑ González, Pablo; Marshall, Guillermo; Naidu, Suresh (2017). "Start-up Nation? Slave Wealth and Entrepreneurship in Civil War Maryland". American Economic Journal: Applied Economics 9 (3): 70–104. doi:10.1257/app.20150497.
↑ Feigenbaum, James; Lee, James; Mezzanotti, Filippo (2022). "Capital Destruction and Economic Growth: The Effects of Sherman’s March, 1864–1865". American Economic Review 112 (8): 2577–2614. doi:10.1257/aer.20191694.
↑ Hornbeck, Richard (2012). "The Enduring Impact of the American Dust Bowl: Short and Long-run Adjustments to Environmental Catastrophe". American Economic Review 102 (4): 1477–1507. doi:10.1257/aer.102.4.1477.

External links

Difference in Difference Estimation, Healthcare Economist website

0.00

(0 votes)

Original source: https://en.wikipedia.org/wiki/Difference in differences. Read more

[1] Abadie, A. (2005). "Semiparametric difference-in-differences estimators". Review of Economic Studies 72 (1): 1–19. doi:10.1111/0034-6527.00321.

[Bertrand-2] 2.0 ^2.1 Bertrand, M.; Duflo, E.; Mullainathan, S. (2004). "How Much Should We Trust Differences-in-Differences Estimates?". Quarterly Journal of Economics 119 (1): 249–275. doi:10.1162/003355304772839588. http://www.nber.org/papers/w8841.pdf.

[3] Angrist, J. D.; Pischke, J. S. (2008). Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press. pp. 227–243. ISBN 978-0-691-12034-8. https://books.google.com/books?id=ztXL21Xd8v8C&pg=PA227.

[4] Basu, Pallavi; Small, Dylan (2020). "Constructing a More Closely Matched Control Group in a Difference-in-Differences Analysis: Its Effect on History Interacting with Group Bias". Observational Studies 6: 103–130. doi:10.1353/obs.2020.0011. https://muse.jhu.edu/article/793352/pdf.

[David-5] 5.0 ^5.1 Card, David; Krueger, Alan B. (1994). "Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania". American Economic Review 84 (4): 772–793.

[6] Callaway, Brantly; Sant’Anna, Pedro H. C. (2021-12-01). "Difference-in-Differences with multiple time periods". Journal of Econometrics. Themed Issue: Treatment Effect 1 225 (2): 200–230. doi:10.1016/j.jeconom.2020.12.001. ISSN 0304-4076. https://www.sciencedirect.com/science/article/abs/pii/S0304407620303948.

[StataDiff-7] Villa, Juan M. (2016). "diff: Simplifying the estimation of difference-in-differences treatment effects". The Stata Journal 16 (1): 52–71. doi:10.1177/1536867X1601600108.

[8] Wing, C.; Simon, K.; Bellos, K. (2018). "Designing Difference in Difference Studies: Best Practices for Public Health Policy Research". Annual Review of Public Health 39: 453–469. doi:10.1146/annurev-publhealth-040617-013507.

[9] Esposito, Elena (2021). "Malaria and the African slave trade in the United States". Econometrica 89 (5): 2189–2222. doi:10.3982/ECTA17668.

[10] González, Pablo; Marshall, Guillermo; Naidu, Suresh (2017). "Start-up Nation? Slave Wealth and Entrepreneurship in Civil War Maryland". American Economic Journal: Applied Economics 9 (3): 70–104. doi:10.1257/app.20150497.

[11] Feigenbaum, James; Lee, James; Mezzanotti, Filippo (2022). "Capital Destruction and Economic Growth: The Effects of Sherman’s March, 1864–1865". American Economic Review 112 (8): 2577–2614. doi:10.1257/aer.20191694.

[12] Hornbeck, Richard (2012). "The Enduring Impact of the American Dust Bowl: Short and Long-run Adjustments to Environmental Catastrophe". American Economic Review 102 (4): 1477–1507. doi:10.1257/aer.102.4.1477.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

Anonymous

Search

Difference in differences

Namespaces

More

Page actions

Contents

General definition

Formal definition

Assumptions

Implementation

Example

Applications

In economic history

See also

References

Further reading

External links

Navigation

Navigation

Resources

Help

googletranslator

Navigation

Wiki tools

Wiki tools

Anonymous

Search

Difference in differences

General definition

Formal definition

Assumptions

Implementation

Example

Applications

In economic history

See also

References

Further reading

External links

Navigation

Wiki tools

Page tools

Other projects

Categories