Estimation and Statistical Tests for Difference-in-Differences Models with Binary Response Data

Taeyong Park ([email protected])
Department of Political Science
PolMeth 2014 Poster Session

1. Motivation

- For the standard DID model with a continuous dependent variable, it is straightforward to estimate the DID treatment effect and conduct statistical tests. (But see Athey and Imbens (2006).)
- What if the dependent variable is limited, such as a binary or ordered categorical outcome?
- Puhani's 2012 Economics Letters paper derives a nonlinear DID estimator.
- One limitation is that Puhani (2012) does not discuss how to estimate the DID treatment effect in practice or how to perform statistical tests.
- The present paper explores three approaches to deriving the variance of Puhani's nonlinear DID estimator and applies them to the study of economic voting.

2. Review: Puhani's Linear DID

- T: time; G: group; X: covariates; $Y^1$ and $Y^0$: potential outcomes with and without treatment; Y: outcome.
- The DID treatment effect is the difference between the expected potential outcome under treatment and the expected counterfactual outcome under treatment:

$$\tau^{DID} = E[Y^1 \mid T = 1, G = 1, X] - E[Y^0 \mid T = 1, G = 1, X].$$

- Participation in treatment occurs only where T = 1 and G = 1. Therefore,

$$E[Y^0 \mid T, G, X] = \alpha T + \beta G + X\theta, \qquad E[Y^1 \mid T, G, X] = \alpha T + \beta G + \gamma TG + X\theta.$$

- As a result,

$$E[Y^0 \mid T = 1, G = 1, X] = \alpha + \beta + X\theta,$$
$$E[Y^1 \mid T = 1, G = 1, X] = \alpha + \beta + \gamma + X\theta,$$
$$\tau^{DID} = \gamma.$$

- The significance test of $\tau^{DID}$: t-test.

3. Review: Puhani's Nonlinear DID

- Problem: The treatment effect cannot be constant across the treated population in nonlinear DID models because the expectation of the outcome variable is bounded.
- A solution: Apply the DID assumption of a constant difference between groups across time to the unobserved latent linear index, such that

$$E[Y^0 \mid T = 1, G = 1, X] = \Phi(\alpha + \beta + X\theta),$$
$$E[Y^1 \mid T = 1, G = 1, X] = \Phi(\alpha + \beta + \gamma + X\theta),$$

where $\Phi(\cdot)$ denotes the cumulative distribution function of the standard normal distribution for the probit case.
- The treatment effect in this DID probit model is then

$$\tau^{DID} = E[Y^1 \mid T = 1, G = 1, X] - E[Y^0 \mid T = 1, G = 1, X] = \Phi(\alpha + \beta + \gamma + X\theta) - \Phi(\alpha + \beta + X\theta).$$

- Consider a logit DID model. Based on Puhani's nonlinear DID estimator,

$$\hat\tau^{DID} = \mathrm{logit}^{-1}(\hat\alpha + \hat\beta + \hat\gamma + X\hat\theta) - \mathrm{logit}^{-1}(\hat\alpha + \hat\beta + X\hat\theta).$$

4. Approximation-based Delta Method Approach

1) Apply the Delta method to calculate the asymptotic variance of $\hat\tau^{DID}$.

- The Delta method yields a $\hat\tau^{DID}$ that asymptotically follows a normal distribution:

$$\hat\tau^{DID} \sim N\left(\tau^{DID},\ \frac{\partial \tau^{DID}}{\partial b'}\,\Omega_b\,\frac{\partial \tau^{DID}}{\partial b}\right),$$

where b denotes the vector of all coefficients.
- The asymptotic variance of $\hat\tau^{DID}$ is estimated consistently by

$$\widehat{\mathrm{Var}}(\hat\tau^{DID}) = \frac{\partial \hat\tau^{DID}}{\partial b'}\,\hat\Omega_b\,\frac{\partial \hat\tau^{DID}}{\partial b},$$

where $\hat\Omega_b$ is a consistent covariance estimator of b.

2) $\mathrm{Var}(\hat\tau^{DID})$ can be computed analytically.

- $\hat\Omega_b$ comes from the variance-covariance matrix of the estimated coefficients.
- For $\partial \hat\tau^{DID} / \partial b'$ and $\partial \hat\tau^{DID} / \partial b$, take the partial derivative of $\hat\tau^{DID}$ with respect to each element of the coefficient vector. For example, the derivative of $\hat\tau^{DID}$ with respect to the coefficient $\hat\beta$ is

$$\frac{\partial \hat\tau^{DID}}{\partial \hat\beta} = \frac{\partial}{\partial \hat\beta}\Big\{ \exp(\bullet)[1 + \exp(\bullet)]^{-1} - \exp(\oplus)[1 + \exp(\oplus)]^{-1} \Big\}$$
$$= \exp(\bullet)\Big\{ [1 + \exp(\bullet)]^{-1} - \exp(\bullet)[1 + \exp(\bullet)]^{-2} \Big\} - \exp(\oplus)\Big\{ [1 + \exp(\oplus)]^{-1} - \exp(\oplus)[1 + \exp(\oplus)]^{-2} \Big\},$$

where $\exp(\bullet) = \exp(\hat\alpha + \hat\beta + \hat\gamma + X\hat\theta)$ and $\exp(\oplus) = \exp(\hat\alpha + \hat\beta + X\hat\theta)$.
- What about more complicated models with, for instance, high-dimensional data or complicated combinations of variables? Simulation-based approaches (bootstrapping and Bayesian) should be considered.

5. Bayesian Approach

1) Generate a sample from the posterior distribution of each of the parameters of interest.

- For a logit DID model, the MCMCpack function MCMClogit() can be used to generate a sample from the posterior distribution.

2) Generate a sample from the posterior distribution of $\tau^{DID}$ by plugging the parameter draws from step 1) into the equation

$$\hat\tau^{DID} = \mathrm{logit}^{-1}(\hat\alpha + \hat\beta + \hat\gamma + X\hat\theta) - \mathrm{logit}^{-1}(\hat\alpha + \hat\beta + X\hat\theta).$$

3) The posterior mean can serve as an estimate of $\tau^{DID}$, and the standard deviation of the posterior distribution of $\tau^{DID}$ is the standard error of $\hat\tau^{DID}$.

6. The Bootstrapping Approach

1) Treat the regressors $X_1, \ldots, X_k$ as random variables and resample each observation's response value together with its regressors.

- An observation that can be resampled is therefore $Z_i = [Y_i, X_{i1}, X_{i2}, \ldots, X_{ik}]$.
- That is, $Z = [Z_1, Z_2, \ldots, Z_n]'$ can be resampled, where n is the number of observations in the original data set, and $Z_1 = [Y_1, X_{11}, X_{12}, \ldots, X_{1k}], \ldots, Z_n = [Y_n, X_{n1}, X_{n2}, \ldots, X_{nk}]$.

2) Sample from Z with replacement R times. The resulting bootstrap samples $Z^{*}_1 = \{Z_{11}, Z_{12}, \ldots, Z_{1n}\}$, $Z^{*}_2 = \{Z_{21}, Z_{22}, \ldots, Z_{2n}\}$, ..., $Z^{*}_R = \{Z_{R1}, Z_{R2}, \ldots, Z_{Rn}\}$ produce R sets of coefficients $b_r = [\alpha_r, \beta_r, \gamma_r, \theta_r]'$, $r = 1, \ldots, R$.

3) Plug these bootstrapped coefficients into

$$\hat\tau^{DID} = \mathrm{logit}^{-1}(\hat\alpha + \hat\beta + \hat\gamma + X\hat\theta) - \mathrm{logit}^{-1}(\hat\alpha + \hat\beta + X\hat\theta)$$

to obtain R values of $\hat\tau^{DID}$.

- The mean of the R values of $\hat\tau^{DID}$ can serve as an estimate of $\tau^{DID}$, and their standard deviation is the standard error of $\hat\tau^{DID}$.

7. Simulation Studies

- Simulation studies examine the relative performance of the three approaches with regard to coverage of the true treatment effect.

Figure: Different Ranges of the Latent Values. (A logistic distribution and the latent values; vertical axis: Prob(Y=1); small, moderate, and large latent values marked.)

Table: Estimation Coverage

Simulation study             Latent values (quantiles)     Estimation coverage at the 95% level
                             25%      50%      75%         Approximation   Bootstrapping   Bayesian
1 (small latent values)     -5.464   -3.650   -1.950       325/1000        1000/1000       1000/1000
2 (moderate latent values)  -1.476   -0.052    1.365       997/1000        1000/1000       1000/1000
3 (large latent values)      1.685    3.532    5.361       197/1000        1000/1000       1000/1000

Figure: Distribution of Estimated Standard Errors. (One panel per simulation study: small, medium, and large latent values; horizontal axis: estimated standard errors; vertical axis: density; approximation, bootstrap, and Bayesian estimators shown, with the mean SE marked.)

8. Empirical Application

- The effect of personal experience of employment on egotropic/sociotropic evaluations: to examine the microfoundation of decision making in the economic voting process.
- The American Panel Study (TAPS), Nov 2011 and Nov 2012 data.
- Dependent variable: "getting better" = 1; "getting worse" = 0.
- Treatments: Unemployed → Employed / Employed → Unemployed.
- Control: No change.

Table: Estimated Treatment Effects on Egotropic Evaluations

                                Getting a job                     Losing a job
Group             Estimator     Mean    S.D.    Lower   Upper     Mean    S.D.    Lower   Upper
Lib. Democrat     τ̂Delta        0.319   0.126   0.072   0.567    -0.160   0.121  -0.397   0.078
                  τ̂Boot         0.258   0.140   0.035   0.560    -0.144   0.137  -0.469   0.061
                  τ̂Bayesian     0.304   0.113   0.106   0.511    -0.153   0.118  -0.403   0.085
Mod. Independent  τ̂Delta        0.241   0.104   0.037   0.446    -0.108   0.086  -0.276   0.061
                  τ̂Boot         0.204   0.133   0.018   0.510    -0.103   0.110  -0.393   0.042
                  τ̂Bayesian     0.227   0.095   0.073   0.433    -0.101   0.082  -0.303   0.065
Con. Republican   τ̂Delta        0.222   0.099   0.027   0.416    -0.097   0.079  -0.251   0.057
                  τ̂Boot         0.190   0.129   0.018   0.496    -0.092   0.099  -0.347   0.036
                  τ̂Bayesian     0.214   0.092   0.060   0.418    -0.095   0.078  -0.272   0.048

Table: Estimated Treatment Effects on Sociotropic Evaluations

                                Getting a job                     Losing a job
Group             Estimator     Mean    S.D.    Lower   Upper     Mean    S.D.    Lower   Upper
Lib. Democrat     τ̂Delta        0.199   0.154  -0.104   0.501     0.006   0.111  -0.211   0.222
                  τ̂Boot         0.302   0.253  -0.070   0.930     0.135   0.341  -0.461   0.830
                  τ̂Bayesian     0.190   0.155  -0.137   0.500    -0.006   0.152  -0.346   0.248
Mod. Independent  τ̂Delta        0.136   0.110  -0.079   0.351     0.003   0.068  -0.130   0.137
                  τ̂Boot         0.306   0.236  -0.076   0.884     0.105   0.300  -0.437   0.756
                  τ̂Bayesian     0.136   0.118  -0.089   0.347    -0.009   0.112  -0.260   0.153
Con. Republican   τ̂Delta        0.066   0.059  -0.050   0.182     0.001   0.030  -0.057   0.060
                  τ̂Boot         0.279   0.207  -0.078   0.736     0.062   0.223  -0.378   0.571
                  τ̂Bayesian     0.068   0.062  -0.033   0.211    -0.005   0.057  -0.160   0.083
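
The Delta method and bootstrapping recipes described in this poster can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the author's implementation: the data are simulated, the coefficient values in `b_true`, the covariate profile `x0`, and all function names are hypothetical, and a hand-written Newton-Raphson logit fit stands in for whatever estimation routine one would use in practice. The coefficient vector is ordered as b = [α, β, γ, θ], with the constant term carried inside X.

```python
import numpy as np

def inv_logit(u):
    """Logistic CDF, i.e. logit^{-1}(u)."""
    return 1.0 / (1.0 + np.exp(-u))

def fit_logit(D, y, iters=50, tol=1e-10):
    """Maximum-likelihood logit fit by Newton-Raphson.
    Returns the coefficient vector and its estimated covariance matrix."""
    b = np.zeros(D.shape[1])
    for _ in range(iters):
        p = inv_logit(D @ b)
        info = D.T @ (D * (p * (1.0 - p))[:, None])  # Fisher information
        step = np.linalg.solve(info, D.T @ (y - p))
        b = b + step
        if np.max(np.abs(step)) < tol:
            break
    p = inv_logit(D @ b)
    info = D.T @ (D * (p * (1.0 - p))[:, None])
    return b, np.linalg.inv(info)

def indices(b, x):
    """Latent indices at T = G = 1 with (u1) and without (u0) gamma."""
    u0 = b[0] + b[1] + x @ b[3:]
    return u0 + b[2], u0

def tau_did(b, x):
    """Logit DID effect: logit^{-1}(a+b+g+x'theta) - logit^{-1}(a+b+x'theta)."""
    u1, u0 = indices(b, x)
    return inv_logit(u1) - inv_logit(u0)

def tau_grad(b, x):
    """Analytic gradient of tau_did with respect to [alpha, beta, gamma, theta]."""
    u1, u0 = indices(b, x)
    d1 = inv_logit(u1) * (1.0 - inv_logit(u1))  # logistic density at u1
    d0 = inv_logit(u0) * (1.0 - inv_logit(u0))  # logistic density at u0
    return np.concatenate(([d1 - d0, d1 - d0, d1], (d1 - d0) * x))

def delta_se(b, cov, x):
    """Delta-method standard error: sqrt(grad' * Omega * grad)."""
    g = tau_grad(b, x)
    return float(np.sqrt(g @ cov @ g))

def bootstrap_tau(D, y, x, R, rng):
    """Resample rows (Y_i, X_i) with replacement R times and refit each time."""
    n = len(y)
    draws = np.empty(R)
    for r in range(R):
        idx = rng.integers(0, n, size=n)
        b_r, _ = fit_logit(D[idx], y[idx])
        draws[r] = tau_did(b_r, x)
    return draws

# Simulated data; every number below is an illustrative assumption.
rng = np.random.default_rng(0)
n = 4000
T = rng.integers(0, 2, size=n).astype(float)
G = rng.integers(0, 2, size=n).astype(float)
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # constant + one covariate
b_true = np.array([0.3, 0.2, 0.8, -0.5, 0.4])          # [alpha, beta, gamma, theta]
D = np.column_stack([T, G, T * G, X])
y = (rng.random(n) < inv_logit(D @ b_true)).astype(float)

x0 = np.array([1.0, 0.0])  # covariate profile at which the effect is evaluated
b_hat, cov_hat = fit_logit(D, y)
tau_hat = tau_did(b_hat, x0)
se_delta = delta_se(b_hat, cov_hat, x0)
boot = bootstrap_tau(D, y, x0, R=200, rng=rng)
se_boot = float(boot.std(ddof=1))
print(f"tau_hat={tau_hat:.3f}  SE(delta)={se_delta:.3f}  SE(boot)={se_boot:.3f}")
```

The gradient in `tau_grad` uses the identity exp(u)[1 + exp(u)]^{-1} − exp(u)^2[1 + exp(u)]^{-2} = Λ(u)[1 − Λ(u)], which is the same derivative written out for β̂ in the Delta method section. The Bayesian approach is not sketched here because it requires a posterior sampler; once posterior draws of the coefficients are available, each draw can be passed through `tau_did` in the same way as the bootstrapped coefficients.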