Testing for Rank Invariance or Similarity in Program Evaluation: The
Testing for Rank Invariance or Similarity in Program Evaluation: The Effect of Training on Earnings Revisited
Yingying Dong and Shu Shen
UC Irvine and UC Davis
Sept 2015 @ Chicago

Introduction

Motivation: QTE/LQTE Literature
In program evaluations, applied researchers care about treatment effect heterogeneity and often look at distributional/quantile effects of treatments.
In quantile treatment effect (QTE) models, rank invariance or rank similarity is required either
- for identification: e.g., the IVQR model of Chernozhukov and Hansen (2005, 2006, 2008), Chernozhukov, Imbens, and Newey (2007), Horowitz and Lee (2007); or
- for interpretation: e.g., the LQTE framework (Abadie, Angrist and Imbens, 2002). Also Frolich and Melly (2013), Firpo (2007), and Imbens and Newey (2009).
This paper
- studies the assumption of (unconditional) rank invariance and rank similarity;
- provides identification of the distribution of individuals' (unconditional) potential ranks conditional on covariates;
- proposes nonparametric tests that are applicable to both exogenous and endogenous treatments.

Motivation: Program Evaluation Applications
The Star Project: the effect of attending a small class (T) in grade K on student outcome (Y, grade K test score).
[Figure: The Star Project. Score distributions (probability against total score) for small class, regular class, and regular class with aid, together with the QTE.]

Motivation: Program Evaluation Applications
JTPA (Job Training Partnership Act): the effect of job training (T) on individual earnings (Y). Randomly assigned (Z) treatment with about 60% compliance rate.
[Figure: Potential earnings distributions among compliers, one panel for females and one for males. Each panel plots probability against earnings for treatment and control, together with the LQTE.]

Definition of Rank Invariance
- $Y_0$ and $Y_1$ are the potential outcomes under no treatment and under treatment, respectively.
- $U_t = F_t(Y_t) \sim U(0,1)$ is the rank of the potential outcome $Y_t$. $U_0$ and $U_1$ are unconditional and are never observed at the same time.
- Rank invariance is the condition that $U_0 = U_1$.
- Example: $Y_t = g_t(X, V)$, where $Y_t$ is test score, $X$ is observed characteristics such as gender and race, and $V$ is ability. If $(X, V): \Omega \to \mathcal{W}$, so that $U_t = F_t(g_t(X(\omega), V(\omega)))$, then rank invariance says that $U_0(\omega) = U_1(\omega)$ for all $\omega \in \Omega$.
- Let $q_t(\tau) = F_{Y_t}^{-1}(\tau)$ and $QTE(\tau) = q_1(\tau) - q_0(\tau)$. Rank invariance implies that $QTE(\tau)$ is the individual treatment effect for anyone who is at quantile $\tau$.
- Rank invariance is restrictive: it does not allow for random slippages in potential ranks (e.g., caused by luck).

Rank Similarity
Suppose $Y_t = g_t(X, V, S_t)$, where $X$ (gender, race) and $V$ (ability) determine the common rank level, and $S_t$ (luck) is a random shock responsible for the random slippages. $S_t$ is realized after a treatment is assigned.
Rank similarity is the condition that $U_0 \mid (X = x, V = v) \sim U_1 \mid (X = x, V = v)$ for all $(x, v) \in \mathcal{W}$.
If $(X, V): \Omega \to \mathcal{W}$, then rank similarity says that $U_0(\omega) \sim U_1(\omega)$ for all $\omega \in \Omega$.

Implications of Rank Similarity
Rank similarity implies that
Lemma 1
1. The distributions of observables and unobservables at the same rank are the same across treatment states.
That is, given rank similarity, $F_{X,V|U_0}(x, v|\tau) = F_{X,V|U_1}(x, v|\tau)$ for all $\tau \in (0,1)$, $(x, v) \in \mathcal{W}$.
2. For any individual, her average treatment effect is a weighted average of the unconditional QTEs, where the weights are the individual's probabilities of being at different quantiles. That is, given rank similarity,
\[
E[Y_1 - Y_0 \mid X = x, V = v] = \int_0^1 QTE(\tau)\, dF_{U|X,V}(\tau|x, v) \quad \text{for all } (x, v) \in \mathcal{W}.
\]
3. (Main Testable Implication) Treatment should not affect the distribution of ranks among observationally equivalent individuals. That is, given rank similarity, $F_{U_0|X}(\tau|x) = F_{U_1|X}(\tau|x)$ for all $\tau \in (0,1)$, $x \in \mathcal{X}$.

Identification

Identification: the Exogenous Treatment Case
If T is exogenous, identification of $F_{U_1|X}(\tau|x) - F_{U_0|X}(\tau|x)$ for $\tau \in (0,1)$ and $x \in \mathcal{X}$ is trivial:
\[
\begin{aligned}
F_{U_1|X}(\tau|x) - F_{U_0|X}(\tau|x)
&= E[1(U_1 \le \tau) \mid X = x] - E[1(U_0 \le \tau) \mid X = x] \\
&= E[1(Y_1 \le q_1(\tau)) \mid X = x] - E[1(Y_0 \le q_0(\tau)) \mid X = x] \\
&= E[1(Y \le q_1(\tau)) \mid X = x, T = 1] - E[1(Y \le q_0(\tau)) \mid X = x, T = 0],
\end{aligned}
\]
where the marginal quantiles $q_1(\tau)$ and $q_0(\tau)$ are directly identified from the sub-samples with $T = 1$ and $T = 0$, respectively.

Identification: the Endogenous Treatment Case
If T is endogenous, let $Z \in \{0, 1\}$ be an IV, and $T_z$ for $z = 0, 1$ be potential treatment status.
Interested in testing for rank similarity among compliers ($T_1 > T_0$): $F_{U_1|C,X}(\tau|x) = F_{U_0|C,X}(\tau|x)$ for all $\tau \in (0,1)$ and $x \in \mathcal{X}_C$, where $\mathcal{X}_C = \{x \in \mathcal{X} : \Pr[T_1 > T_0 \mid X = x] > 0\}$.
Assumption 1
Let $(Y_t, T_t, X, Z)$, $t = 0, 1$, be random variables mapped from the common probability space $(\Omega, \mathcal{F}, P)$. The following conditions hold jointly with probability one.
1. Independence: $(Y_0, Y_1, T_0, T_1) \perp Z \mid X$.
2. First stage: $E(T_1) \ne E(T_0)$.
3. Monotonicity: $\Pr(T_1 \ge T_0) = 1$.
4. Nontrivial assignment: $0 < \Pr(Z = 1 \mid X = x) < 1$ for all $x \in \mathcal{X}$.

Identification: the Endogenous Treatment Case
Theorem 1
Let $I(\tau) \equiv 1\{Y \le T q_{1|C}(\tau) + (1 - T) q_{0|C}(\tau)\}$. Given Assumption 1, for all $\tau \in (0,1)$, $x \in \mathcal{X}_C$, and $t = 0, 1$, $F_{U_t|C,X}(\tau|x)$ is identified and is given by
\[
F_{U_t|C,X}(\tau|x) = \frac{E[I(\tau) 1(T = t) \mid Z = 1, X = x] - E[I(\tau) 1(T = t) \mid Z = 0, X = x]}{E[1(T = t) \mid Z = 1, X = x] - E[1(T = t) \mid Z = 0, X = x]}. \tag{1}
\]
$F_{U_1|C,X}(\cdot|x) = F_{U_0|C,X}(\cdot|x)$ for $x \in \mathcal{X}_C$ if and only if, for all $\tau \in (0,1)$ and $x \in \mathcal{X}$,
\[
E[I(\tau) \mid Z = 1, X = x] = E[I(\tau) \mid Z = 0, X = x]. \tag{2}
\]
Note: $I(\tau)$ is a rank indicator.
- Notice the change from $x \in \mathcal{X}_C$ to $x \in \mathcal{X}$ in the theorem. This is because Equation (2) holds trivially for $\mathcal{X} \setminus \mathcal{X}_C$.
- Use the identification result of Equation (2) to test for $H_0: F_{U_1|C,X}(\cdot|x) = F_{U_0|C,X}(\cdot|x)$.
- Use the identification result of Equation (1) to estimate $F_{U_1|C,X}(\cdot|x) - F_{U_0|C,X}(\cdot|x)$.

Mean Test
Theorem 1 also gives identification of specific features of the potential rank distribution such as the mean.
Rank similarity implies $F_{U_1|C,X}(\tau|x) = F_{U_0|C,X}(\tau|x)$, which further implies $E[U_1 \mid C, X = x] = E[U_0 \mid C, X = x]$.
$E[U_1 \mid C, X = x] = E[U_0 \mid C, X = x]$ holds if and only if $E[U \mid Z = 1, X = x] = E[U \mid Z = 0, X = x]$, where
\[
U \equiv T U_1 + (1 - T) U_0 = \int_0^1 1\{T q_{1|C}(\tau) + (1 - T) q_{0|C}(\tau) < Y\}\, d\tau = 1 - \int_0^1 I(\tau)\, d\tau.
\]
$U$ is identified because $I(\tau)$ is identified.
$E[U_1 \mid C, X = x] - E[U_0 \mid C, X = x]$ represents the average rank change for each subpopulation.

Star Project: Small Classes vs. Regular Classes
[Figure: Rank distributions by race (white, nonwhite) and by gender (boy, girl), plotting probability against rank of total score. Top row: small class vs. regular class. Bottom row: regular class with aid vs. regular class.]

Empirical Example: JTPA
[Figure: Potential rank distributions by education (<HS, HS) and by employment last year (<13 weeks, >=13 weeks), treatment vs. control, plotting probability against rank. Top row: female. Bottom row: male.]

Null Hypothesis and Test Statistic
Let $\mathcal{X} = \{x_1, x_2, ..., x_J\}$ and $\Omega = \{\tau_1, \tau_2, ..., \tau_K\}$.
$H_0: m_j^0(\tau_k) = m_j^1(\tau_k)$ for $j = 1, ..., J - 1$ and $k = 1, ..., K$, where, for $z = 0, 1$,
\[
m_j^z(\tau_k) \equiv E\left[1\{Y \le T q_{1|C}(\tau_k) + (1 - T) q_{0|C}(\tau_k)\} \mid Z = z, X = x_j\right].
\]
The sample analogue is
\[
\hat m_j^z(\tau_k) = \frac{1}{n_j^z} \sum_{Z_i = z,\, X_i = x_j} 1\{Y_i \le T_i \hat q_{1|C}(\tau_k) + (1 - T_i) \hat q_{0|C}(\tau_k)\},
\]
with
\[
\left(\hat q_{0|C}(\tau_k), \hat q_{1|C}(\tau_k)\right) = \arg\min_{q_0, q_1} \frac{1}{n} \sum_{i=1}^n \rho_{\tau_k}\left(Y_i - q_0 (1 - T_i) - q_1 T_i\right) \hat\omega_i,
\]
where
\[
\hat\omega_i \equiv \left(\frac{Z_i}{\hat\pi(X_i)} - \frac{1 - Z_i}{1 - \hat\pi(X_i)}\right)(2 T_i - 1),
\]
$\hat\pi(x)$ is a consistent estimator of $\pi(x) = \Pr(Z = 1 \mid X = x)$, and $n_j^z = \sum_{i=1}^n 1(Z_i = z, X_i = x_j)$.
Wald-type test:
\[
W \equiv n\, (\hat m^1 - \hat m^0)' \hat V^{-1} (\hat m^1 - \hat m^0).
\]

Assumption for Asymptotic Properties
Assumption 3
1. i.i.d. data: the data $(Y_i, T_i, Z_i, X_i)$ for $i = 1, ..., n$ is a random sample of size $n$ from $(Y, T, Z, X)$.
2. For all $\tau \in \Omega = \{\tau_1, \tau_2, ..., \tau_K\}$, the random variables $Y_1$ and $Y_0$ are continuously distributed with positive density in a neighborhood of $q_{0|C}(\tau)$ and $q_{1|C}(\tau)$ in the subpopulation of compliers.
3. For all $j = 1, ..., J$, $\hat\pi(x_j)$ is consistent, or $\hat\pi(x_j) \to_p \pi(x_j)$.
4. Let $f_{Y|T,Z,X}$ be the conditional density of $Y$ given $T$, $Z$ and $X$. For all $t, z = 0, 1$, $j = 1, ..., J$ and $\tau \in \Omega$, $f_{Y|T,Z,X}(y|t, z, x_j)$ has bounded first derivative with respect to $y$ in a neighborhood of $q_{t|C}(\tau)$. Let $f_{Y|X}(y|x)$ be the conditional density of $Y$ given $X$. For all $\tau \in \Omega$ and $j = 1, ..., J$, $f_{Y|X}(\cdot|x_j)$ is positive and bounded in a neighborhood of $q_{t|C}(\tau)$.

Asymptotics
Let $\hat m^z$, $z = 0, 1$, be the $K(J - 1)$-dimensional vector whose $(K(j - 1) + k)$-th element is $\hat m_j^z(\tau_k)$.
Theorem 2
Given Assumptions 1 and 3,
\[
\sqrt{n}\left((\hat m^1 - \hat m^0) - (m^1 - m^0)\right) \Rightarrow N(0, V),
\]
where $V$ is the $K(J - 1) \times K(J - 1)$ asymptotic variance-covariance matrix. The $(K(j - 1) + k,\ K(j' - 1) + k')$-th element of $V$ is equal to
\[
E\left[\left(\phi_j^1(\tau_k) - \phi_j^0(\tau_k)\right)\left(\phi_{j'}^1(\tau_{k'}) - \phi_{j'}^0(\tau_{k'})\right)\right],
\]
with
\[
\begin{aligned}
\phi_j^z(\tau_k) \equiv \phi_j^z(\tau_k; Y, T, Z, X)
={}& \frac{I(\tau_k) - m_j^z(\tau_k)}{p_{Z,X}(z, x_j)}\, 1(Z = z, X = x_j) \\
&- \frac{f_{Y|T,Z,X}(q_{0|C}(\tau_k)|0, z, x_j)\,(1 - p_{T|Z,X}(z, x_j))}{P_c\, f_{0|C}(q_{0|C}(\tau_k))}\, \psi_0(Y, T, Z, X) \\
&- \frac{f_{Y|T,Z,X}(q_{1|C}(\tau_k)|1, z, x_j)\, p_{T|Z,X}(z, x_j)}{P_c\, f_{1|C}(q_{1|C}(\tau_k))}\, \psi_1(Y, T, Z, X),
\end{aligned}
\]
where $\psi_0(Y, T, Z, X)$ and $\psi_1(Y, T, Z, X)$ are defined in the proof of Theorem 7 in Frolich and Melly (2007), and restated in the proof of this theorem in the Appendix.

Asymptotic Properties of the Test
- Remember the Wald-type test statistic $W \equiv n\, (\hat m^1 - \hat m^0)' \hat V^{-1} (\hat m^1 - \hat m^0) \Rightarrow \chi^2(K(J - 1))$ under the null.
- Bootstrap $\hat V$.
- If $q_{0|C}(\tau_k)$ and $q_{1|C}(\tau_k)$ were known, $\phi_j^z(\tau_k; Y, T, Z, X)$ would reduce to $\frac{I(\tau_k) - m_j^z(\tau_k)}{p_{Z,X}(z, x_j)} 1(Z = z, X = x_j)$, and the $(K(j - 1) + k,\ K(j' - 1) + k')$-th element of $V$ is equal to $\sum_{z=0,1} \left[m_j^z(\tau_k \wedge \tau_{k'}) - m_j^z(\tau_k)\, m_j^z(\tau_{k'})\right]$ if $j = j'$, and 0 if $j \ne j'$.
- If J is very large, then the first stage estimation error may be ignored and one can construct $\hat V$ by the analytic formula. This is discussed in the extensions where $J \to \infty$ or X includes continuous variables.
- The critical value $c_\alpha$ is the $(1 - \alpha) \times 100$-th percentile of the $\chi^2(K(J - 1))$ distribution.
- The test is consistent for the null hypothesis $H_0$.
- Once again, the test does NOT test the unobservable part (e.g., V or ability) of the rank invariance assumption.

The Mean Rank Similarity Test
Let $\bar m_j^z = E[U \mid Z = z, X = x_j]$ for $z = 0, 1$.
$H_{0,mean}: \bar m_j^0 = \bar m_j^1$ for all $j = 1, ..., J - 1$.
Let $\{\tau_s\}_{s=1}^S$ be S random draws from $U(0, 1)$.
$\hat U_i \equiv T_i \hat U_{1i} + (1 - T_i) \hat U_{0i}$ for $i = 1, ..., n$ can be estimated by
\[
\hat U_i = \frac{1}{S} \sum_{s=1}^S 1\{T_i \hat q_{1|C}(\tau_s) + (1 - T_i) \hat q_{0|C}(\tau_s) \le Y_i\},
\]
and $\bar m_j^z$ can then be estimated by
\[
\ddot m_j^z = \frac{1}{n_j^z} \sum_{Z_i = z,\, X_i = x_j} \hat U_i.
\]

The Mean Rank Similarity Test
Corollary 3
Suppose Assumptions 1 and 3 hold for $\Omega = (0, 1)$. Under the null hypothesis where $\bar m^1 = \bar m^0$, when $S, n \to \infty$,
\[
\sqrt{n}\, (\ddot m^1 - \ddot m^0) \Rightarrow N(0, V_{mean}),
\]
where $V_{mean}$ is the $(J - 1) \times (J - 1)$ asymptotic variance-covariance matrix. The $(j, j')$-th element of $V_{mean}$ is
\[
E\left[\left(\int_0^1 \phi_j^1(\tau)\, d\tau - \int_0^1 \phi_j^0(\tau)\, d\tau\right)\left(\int_0^1 \phi_{j'}^1(\tau)\, d\tau - \int_0^1 \phi_{j'}^0(\tau)\, d\tau\right)\right],
\]
where
\[
\begin{aligned}
\int_0^1 \phi_j^z(\tau)\, d\tau
={}& -\frac{U - \bar m_j^z}{p_{Z,X}(z, x_j)}\, 1(Z = z, X = x_j) \\
&- \int_0^1 \frac{f_{Y|T,Z,X}(q_{0|C}(\tau)|0, z, x_j)}{f_{0|C}(q_{0|C}(\tau))}\, d\tau\ \frac{(1 - P_{T|Z,X}(z, x_j))\, \psi_0(Y, T, Z, X)}{P_c} \\
&- \int_0^1 \frac{f_{Y|T,Z,X}(q_{1|C}(\tau)|1, z, x_j)}{f_{1|C}(q_{1|C}(\tau))}\, d\tau\ \frac{P_{T|Z,X}(z, x_j)\, \psi_1(Y, T, Z, X)}{P_c}.
\end{aligned}
\]
A Wald-type test statistic is then
\[
W_{mean} \equiv n\, (\ddot m^1 - \ddot m^0)' \ddot V^{-1} (\ddot m^1 - \ddot m^0) \Rightarrow \chi^2(J - 1)
\]
as $N, J \to \infty$, $N/J \to \infty$.

Simulation and JTPA

Simulation: DGPs
DGP: $Y_0 = X + V + S_0$, $Y_1 = X + V + (1 - bXV) + S_1$, $Y = Y_1 T + Y_0 (1 - T)$, $\Pr(X = 0.4 j) = 1/5$ for $j = 1, ..., 5$, $V, S_0, S_1 \sim N(0, 1)$ and b = 0, 2.
Exogenous treatment: $\Pr(T = t) = \frac{1}{2}$, $t = 0, 1$.
Endogenous treatment: $\Pr(Z = z) = \frac{1}{2}$, $z = 0, 1$, and $T = 1(0.15 (Y_1 - Y_0) + Z - 0.5 > 0)$.
Rank similarity holds when $b = 0$ but not when $b \ne 0$.
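To make the exogenous-treatment DGP and the identification result for $F_{U_1|X}(\tau|x) - F_{U_0|X}(\tau|x)$ concrete, here is a minimal Python sketch. It is not the authors' code: the function names `simulate_exogenous` and `rank_cdf_gap` are hypothetical, NumPy is assumed, and the sketch only estimates the gap in conditional rank CDFs at one quantile without computing the Wald statistic or its bootstrapped variance.

```python
import numpy as np

def simulate_exogenous(n, b, rng):
    """Draw (Y, T, X) from the slide's DGP with an exogenous treatment."""
    x = 0.4 * rng.integers(1, 6, size=n)        # Pr(X = 0.4 j) = 1/5, j = 1, ..., 5
    v, s0, s1 = rng.standard_normal((3, n))     # V, S0, S1 ~ N(0, 1)
    y0 = x + v + s0
    y1 = x + v + (1.0 - b * x * v) + s1
    t = rng.integers(0, 2, size=n)              # Pr(T = 1) = 1/2, independent of the rest
    return np.where(t == 1, y1, y0), t, x

def rank_cdf_gap(y, t, x, tau):
    """Estimate F_{U1|X}(tau|x) - F_{U0|X}(tau|x) for every support point x."""
    q1 = np.quantile(y[t == 1], tau)            # marginal quantile from the T = 1 subsample
    q0 = np.quantile(y[t == 0], tau)            # marginal quantile from the T = 0 subsample
    below = y <= np.where(t == 1, q1, q0)       # rank indicator 1(Y <= q_T(tau))
    return {
        round(float(xj), 1): below[(x == xj) & (t == 1)].mean()
        - below[(x == xj) & (t == 0)].mean()
        for xj in np.unique(x)
    }

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for b in (0, 2):
        y, t, x = simulate_exogenous(200_000, b, rng)
        print(b, rank_cdf_gap(y, t, x, tau=0.5))
```

With a large sample the gaps should be close to zero in every covariate cell when b = 0 and visibly nonzero in some cells when b = 2, which is the pattern the distributional test is designed to detect.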
Illustration of DGPs: Exogenous Treatment
[Figure: Conditional distributions of potential ranks. Four panels (b = 0, 1, 2, 3), each plotting the conditional CDF against the potential rank for X = 0.4, 0.8, 1.2, 1.6, 2.0, separately for Y(1) and Y(0).]

Simulation Results: Exogenous Treatment
Rejection rates by sample size N:

                                        N = 500    1000    1500    2000    2500
b = 0
Test 1: Ω = {0.5}                        0.034    0.039   0.051   0.040   0.053
Test 2: Ω = {0.2, 0.3, 0.4}              0.013    0.013   0.025   0.021   0.023
Test 3: Ω = {0.5, 0.6, 0.7, 0.8}         0.014    0.014   0.023   0.023   0.018
Test 4: Ω = {0.2, 0.3, ..., 0.8}         0.006    0.010   0.013   0.013   0.013
Test 5: Mean Test                        0.051    0.044   0.048   0.041   0.067
b = 1
Test 1: Ω = {0.5}                        0.047    0.059   0.101   0.120   0.146
Test 2: Ω = {0.2, 0.3, 0.4}              0.018    0.038   0.065   0.099   0.152
Test 3: Ω = {0.5, 0.6, 0.7, 0.8}         0.022    0.050   0.134   0.148   0.260
Test 4: Ω = {0.2, 0.3, ..., 0.8}         0.009    0.044   0.102   0.144   0.283
Test 5: Mean Test                        0.063    0.092   0.144   0.176   0.251
b = 2
Test 1: Ω = {0.5}                        0.074    0.150   0.232   0.303   0.388
Test 2: Ω = {0.2, 0.3, 0.4}              0.269    0.776   0.968   0.994   1.000
Test 3: Ω = {0.5, 0.6, 0.7, 0.8}         0.151    0.581   0.857   0.962   0.991
Test 4: Ω = {0.2, 0.3, ..., 0.8}         0.287    0.910   0.996   1.000   1.000
Test 5: Mean Test                        0.103    0.213   0.278   0.424   0.500
b = 3
Test 1: Ω = {0.5}                        0.143    0.335   0.512   0.640   0.800
Test 2: Ω = {0.2, 0.3, 0.4}              0.817    0.999   1.000   1.000   1.000
Test 3: Ω = {0.5, 0.6, 0.7, 0.8}         0.306    0.880   0.992   1.000   1.000
Test 4: Ω = {0.2, 0.3, ..., 0.8}         0.836    0.999   1.000   1.000   1.000
Test 5: Mean Test                        0.340    0.659   0.853   0.941   0.971

[Figure: Rejection rate against b at sample size 1000 (left) and against sample size at b = 2 (right), for Distributional Tests 1 to 4 and the Mean Test.]

Simulation Results: Endogenous Treatment
Rejection rates by sample size N:

                                        N = 500    1000    1500    2000    2500
b = 0
Ω = {0.5}                                0.025    0.036   0.041   0.038   0.057
Ω = {0.2, 0.3, 0.4}                      0.012    0.012   0.018   0.017   0.025
Ω = {0.5, 0.6, 0.7, 0.8}                 0.006    0.013   0.016   0.022   0.015
Ω = {0.2, 0.3, ..., 0.8}                 0.002    0.010   0.006   0.010   0.008
Mean Test                                0.054    0.050   0.051   0.045   0.057
b = 1
Ω = {0.5}                                0.036    0.040   0.064   0.073   0.085
Ω = {0.2, 0.3, 0.4}                      0.013    0.022   0.040   0.065   0.107
Ω = {0.5, 0.6, 0.7, 0.8}                 0.006    0.016   0.034   0.047   0.074
Ω = {0.2, 0.3, ..., 0.8}                 0.003    0.015   0.029   0.057   0.094
Mean Test                                0.037    0.050   0.054   0.063   0.068
b = 2
Ω = {0.5}                                0.084    0.242   0.379   0.522   0.615
Ω = {0.2, 0.3, 0.4}                      0.170    0.589   0.870   0.965   0.993
Ω = {0.5, 0.6, 0.7, 0.8}                 0.021    0.150   0.340   0.600   0.764
Ω = {0.2, 0.3, ..., 0.8}                 0.053    0.431   0.823   0.960   0.993
Mean Test                                0.152    0.322   0.481   0.622   0.709
b = 3
Ω = {0.5}                                0.113    0.293   0.441   0.617   0.700
Ω = {0.2, 0.3, 0.4}                      0.284    0.783   0.975   1.000   1.000
Ω = {0.5, 0.6, 0.7, 0.8}                 0.020    0.198   0.450   0.704   0.865
Ω = {0.2, 0.3, ..., 0.8}                 0.093    0.634   0.949   1.000   1.000
Mean Test                                0.191    0.441   0.602   0.772   0.843

[Figure: Rejection rate against b at sample size 1000 (left) and against sample size at b = 2 (right), for Distributional Tests 1 to 4 and the Mean Test.]

Empirical Examples: Testing Results

Testing: the Exogenous Treatment Case
Table: Star Project: Test Results

                                   Total Test Score         Birthday (1st-31st)
Treatment type                     Test Stat   P-value      Test Stat   P-value
Small Class vs. Regular Class      232         0.033        21.23       0.439
Aid Class vs. Regular Class        266         0.001        13.25       0.905

Conclusions
- Both treatments (small class and regular class with aid) improve the rank distribution of the disadvantaged (boy, nonwhite).
- Assigning a teaching aid to the regular class systematically changes students' rank. Researchers may want to reconsider the practice of using both regular class with and without aid as the "control" group in analysis.

Empirical Example: JTPA
Y = 30 months' earnings following assignment
T = receiving training services
Z = random assignment indicator
X = black, Hispanic, HS or GED, married, worked at least 13 weeks the year before, AFDC receipt (for women only) and 5 age category dummies (Abadie, Angrist and Imbens, 2002).

JTPA: First-Stage Unconditional LQTE Estimation
Table: First-stage estimates of unconditional QTEs of training on trainee earnings

            Female                              Male
Quantile    Y0        QTE (s.e.)                Y0        QTE (s.e.)
0.15        195       291 (341.88)              1,462     249 (713.36)
0.20        723       714 (358.31)*             2,733     390 (723.01)
0.25        1,458     1,200 (372.08)***         4,434     489 (746.85)
0.30        2,463     1,380 (399.21)***         6,993     340 (891.74)
0.35        3,784     1,705 (497.01)***         8,836     594 (1,042.40)
0.40        5,271     1,974 (669.75)***         11,010    723 (1,104.63)
0.45        6,726     2,451 (766.25)***         13,104    1,069 (1,144.28)
0.50        8,685     2,436 (829.29)***         15,374    1,291 (1,234.59)
0.55        11,007    2,089 (877.56)**          17,357    2,239 (1,295.79)*
0.60        12,618    2,729 (886.96)***         20,409    2,118 (1,418.40)
0.65        14,682    2,943 (920.45)***         23,342    2,319 (1,557.00)
0.70        16,971    2,772 (1,027.14)***       27,169    1,780 (1,606.66)
0.75        20,252    2,106 (1,152.35)*         30,439    2,408 (1,641.47)
0.80        23,064    2,331 (1,149.71)**        34,620    2,800 (1,701.90)*
0.85        26,735    1,762 (1,179.91)          39,233    3,955 (1,886.98)**

Note: Standard errors are in parentheses; all
estimates control for covariates including dummies for black, Hispanic, high-school graduates (including GED holders), marital status, whether the applicant worked at least 12 weeks in the 12 months preceding random assignment, and AFDC receipt (for women only) as well as 5 age group dummies; * significant at the 10% level, ** significant at the 5% level, *** significant at the 1% level.

JTPA: Joint Test
Table: Rank similarity test jointly at all quantiles

              Female                                    Male
              I (1)     I (2)     II (1)    II (2)      I (1)     I (2)     II (1)    II (2)
Panel A: Dependent Var. Earnings
χ2            7,652.1   7,763.8   1,197.2   1,177.8     2,780.7   2,719.0   886.1     876.8
p-value       (0.000)   (0.000)   (0.000)   (0.000)     (0.000)   (0.000)   (0.000)   (0.000)
d.f.          1,544     1,544     723       723         1,218     1,218     570       570
Panel B: Falsification test (Dependent Var. Age)
χ2            478.8     471.9     252.0     259.9       209.3     203.5     124.7     123.0
p-value       (0.926)   (0.953)   (0.366)   (0.245)     (1.000)   (1.000)   (0.977)   (0.982)
d.f.          525       525       245       245         338       338       158       158

Note: Results are based on the Chi-squared test in Theorem 2; variance-covariance matrices are bootstrapped with 2,000 replications; p-values are in parentheses; Columns I report a joint test at 15 equally spaced quantiles from 0.15 to 0.85; Columns II report a joint test at 7 equally spaced quantiles from 0.20 to 0.80; (1) controls for covariates in the first-stage unconditional QTE estimation, while (2) does not; X values with fewer than 5 observations when either Z = 0 or Z = 1 are not used in the test to ensure the common support assumption.

JTPA: Test Individual Quantiles
Table: Rank similarity test at individual quantiles, χ2 statistic (p-value)

            Panel A: Dependent Var. Earnings      Panel B: Falsification test (Dependent Var. Age)
Quantile    Female            Male                Female            Male
0.15        134.4 (0.012)     103.8 (0.045)       43.9 (0.144)      19.4 (0.561)
0.20        143.0 (0.004)     113.3 (0.010)       37.9 (0.340)      22.1 (0.391)
0.25        126.2 (0.060)     107.8 (0.025)       26.0 (0.863)      13.9 (0.907)
0.30        131.9 (0.034)     104.7 (0.039)       26.9 (0.834)      15.0 (0.861)
0.35        147.2 (0.003)     95.8 (0.142)        22.1 (0.956)      17.9 (0.712)
0.40        118.3 (0.160)     88.6 (0.291)        31.1 (0.659)      23.2 (0.447)
0.45        107.5 (0.387)     110.7 (0.019)       32.1 (0.611)      22.4 (0.497)
0.50        110.9 (0.304)     113.6 (0.012)       32.3 (0.599)      19.2 (0.692)
0.55        112.6 (0.266)     110.9 (0.019)       30.8 (0.673)      19.6 (0.664)
0.60        112.1 (0.276)     112.3 (0.015)       32.7 (0.581)      22.3 (0.503)
0.65        121.7 (0.113)     105.0 (0.044)       29.4 (0.734)      18.4 (0.735)
0.70        108.0 (0.375)     106.1 (0.038)       36.7 (0.388)      24.0 (0.402)
0.75        130.4 (0.035)     109.7 (0.018)       45.4 (0.112)      16.5 (0.831)
0.80        118.4 (0.128)     116.5 (0.005)       47.7 (0.074)      17.1 (0.802)
0.85        92.3 (0.697)      118.7 (0.002)       44.7 (0.125)      18.7 (0.716)

Note: Results are based on the Chi-squared test in Theorem 2; variance-covariance matrices are bootstrapped with 2,000 replications; p-values are in parentheses; covariates are controlled for in the first-stage unconditional QTE estimation. X values with fewer than 5 observations when either Z = 1 or Z = 0 are not used in the test to ensure the common support assumption.

JTPA: Test the Mean Rank
Table: Rank similarity test for the mean rank only

              Female                              Male
              (1)              (2)                (1)              (2)
Panel A: Dependent Var. Earnings
χ2            123.1 (0.098)    123.1 (0.098)      115.2 (0.009)    115.2 (0.009)
d.f.          104              104                82               82
Panel B: Falsification test (Dependent Var. Age)
χ2            30.6 (0.683)     30.6 (0.683)       18.4 (0.736)     18.4 (0.736)
d.f.          35               35                 23               23

Note: Results are based on the Chi-squared test for the mean ranks only; variance-covariance matrices are bootstrapped with 2,000 replications; p-values are in parentheses; (1) controls for covariates in the first-stage unconditional QTE estimation, while (2) does not; X values with fewer than 5 observations when either Z = 1 or Z = 0 are not used in the test to ensure the common support assumption.

Conclusion:
- Training causes some individuals to systematically change their ranks in the earnings distribution.
- One should be cautious in equating the distributional impacts of training with the true effects on individual trainees.
- Results largely agree with Heckman, Smith and Clements (1997): "perfect positive dependence across potential outcome distributions ... not credible."

Extensions

Extension I: Covariates with Large Support
Assume that $J \to \infty$ as $n \to \infty$.
Assumption 4
1. i.i.d. data: the data $\{Y_i, T_i, Z_i, X_i\}$ for $i = 1, ..., n$ is a random sample of size $n$ of $(Y, T, Z, X)$.
2. For all $\tau \in \Omega = \{\tau_1, \tau_2, ..., \tau_K\}$, the random variables $Y_1$ and $Y_0$ are continuously distributed with positive density in a neighborhood of $q_{0|C}(\tau)$ and $q_{1|C}(\tau)$ in the subpopulation of compliers.
3. Let $n_j = \sum_{i=1}^n 1(X_i = x_j)$. $n_j \asymp n/J$ uniformly over $j$, i.e. there exist $0 < c \le C < \infty$ such that $c \frac{n}{J} \le n_j \le C \frac{n}{J}$ for all $j = 1, ..., J$.
4. $\hat\pi(x_j)$ is uniformly consistent, or $\sup_{j=1,...,J} |\hat\pi(x_j) - \pi(x_j)| \to_p 0$ as $n, J \to \infty$ and $n/J \to \infty$.
5. For all $t, z = 0, 1$, $j = 1, ..., J$ and $\tau \in \Omega$, $f_{Y|T,Z,X}(\cdot|t, z, x_j)$ is bounded in a neighborhood of $q_{t|C}(\tau)$. For all $\tau \in \Omega$ and $j = 1, ..., J$, $f_{Y|X}(\cdot|x_j)$ is positive and bounded in a neighborhood of $q_{t|C}(\tau)$.

Extension I: Covariates with Large Support
Let $\hat m_j^z = (\hat m_j^z(\tau_1), ..., \hat m_j^z(\tau_K))'$ and $m_j^z = (m_j^z(\tau_1), ..., m_j^z(\tau_K))'$ be $K \times 1$ vectors.
Corollary 4
Given Assumptions 1 and 4, we have
\[
\sqrt{\frac{n_j^1 n_j^0}{n_j^1 + n_j^0}} \left((\hat m_j^1 - \hat m_j^0) - (m_j^1 - m_j^0)\right) \Rightarrow Z_j \sim N(0, V_j),
\]
where $Z_j$ for $j = 1, ..., J$ follow independent multivariate normal distributions; the $(k, k')$-th element of the $K \times K$ variance-covariance matrix $V_j$ is
\[
V_{j;k,k'} = \pi(x_j)\, m_j^1(\tau_k \wedge \tau_{k'}) \left(1 - m_j^1(\tau_{k'})\right) + (1 - \pi(x_j))\, m_j^0(\tau_k \wedge \tau_{k'}) \left(1 - m_j^0(\tau_{k'})\right).
\]

Extension I: Covariates with Large Support
For each $j = 1, ..., J$, define the Wald-type statistic
\[
w_j = \frac{n_j^1 n_j^0}{n_j^1 + n_j^0} (\hat m_j^1 - \hat m_j^0)' \hat V_j^{-1} (\hat m_j^1 - \hat m_j^0),
\]
where $\hat V_j$ is a consistent estimator of $V_j$. The $(k, k')$-th element of $\hat V_j$ is
\[
\hat V_{j;k,k'} = \frac{n_j^0}{n_j^0 + n_j^1}\, \hat m_j^1(\tau_k \wedge \tau_{k'}) \left(1 - \hat m_j^1(\tau_{k'})\right) + \frac{n_j^1}{n_j^0 + n_j^1}\, \hat m_j^0(\tau_k \wedge \tau_{k'}) \left(1 - \hat m_j^0(\tau_{k'})\right).
\]
The test statistic is then
\[
W_{largeJ} = \frac{\sum_{j=1}^{J-1} w_j - K(J - 1)}{\sqrt{2K(J - 1)}} \Rightarrow N(0, 1).
\]
The one-sided decision rule of the test is to "reject the null hypothesis $H_0$ if $W_{largeJ} > c_\alpha$".

Extension II: Continuous Covariates
Let $m_k^z(x) = E[I(\tau_k) \mid Z = z, X = x]$ for $z = 0, 1$.
Interested in testing $H_0: m_k^1(x) = m_k^0(x)$ for all $x \in \mathcal{X}$ and $k = 1, ..., K$.
Apply Chernozhukov, Lee and Rosen (2013) and form the Kolmogorov-Smirnov type test statistic
\[
KS = \sup_{k,x} \frac{\hat m_k^1(x) - \hat m_k^0(x)}{s_k(x)},
\]
where $\hat m_k^z(x)$ is the local linear estimator and $s_k(x)$ is the standard error of $\hat m_k^1(x) - \hat m_k^0(x)$.
Construct the critical value $c_\alpha$ by multiplier bootstrap. Let $\hat m_k^*(x)$ be a multiplier process such that
\[
\hat m_k^*(x) = \frac{\sum_{Z_i = 1} \eta_i\, \hat\epsilon_{k,i}\, K_{h_1}(X_i - x)}{\sum_{Z_i = 1} K_{h_1}(X_i - x)} - \frac{\sum_{Z_i = 0} \eta_i\, \hat\epsilon_{k,i}\, K_{h_0}(X_i - x)}{\sum_{Z_i = 0} K_{h_0}(X_i - x)},
\]
where $\{\eta_i\}_{i=1}^N$ is simulated from i.i.d. $N(0, 1)$ and independent of the data, and
\[
\hat\epsilon_{k,i} = 1\{Y_i \le \hat q_{1|C}(\tau_k) T_i + \hat q_{0|C}(\tau_k)(1 - T_i)\} - \hat m_k^1(x_i) Z_i - \hat m_k^0(x_i)(1 - Z_i).
\]
$c_\alpha$ is the $(1 - \alpha) \times 100\%$ percentile of the simulated process $\sup_{k,x} \frac{\hat m_k^*(x)}{s_k(x)}$.
Reject the null if $KS > c_\alpha$.
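As a rough illustration of the large-J statistic $W_{largeJ}$ from Extension I, the following Python sketch computes the cell-level Wald statistics $w_j$ and their standardized sum. It is a sketch under stated assumptions, not the authors' implementation: the function name `large_j_statistic` and its argument layout are hypothetical, the rank indicators $I(\tau_k)$ are taken as given (the first-stage quantile estimation is not shown), and $\hat V_j$ is built from the within-group sample covariance of the indicators, which for nested indicators equals $\hat m(\tau_k \wedge \tau_{k'}) - \hat m(\tau_k)\hat m(\tau_{k'})$.

```python
import numpy as np

def large_j_statistic(ind, z, cell):
    """Standardized sum of cell-level Wald statistics (hypothetical helper).

    ind  : (n, K) array of 0/1 rank indicators I(tau_k), assumed precomputed
    z    : (n,) binary instrument (or treatment, in the exogenous case)
    cell : (n,) labels of the discrete covariate cells x_j
    """
    ind = np.asarray(ind, dtype=float)
    labels = np.unique(cell)[:-1]               # cells j = 1, ..., J-1, as in the sum
    k = ind.shape[1]
    total = 0.0
    for lab in labels:
        a = ind[(cell == lab) & (z == 1)]       # indicators with Z = 1 in cell j
        c = ind[(cell == lab) & (z == 0)]       # indicators with Z = 0 in cell j
        n1, n0 = len(a), len(c)
        diff = a.mean(axis=0) - c.mean(axis=0)  # m_hat_j^1 - m_hat_j^0
        # Assumption of this sketch: plug-in covariance of the indicators.
        v1 = np.atleast_2d(np.cov(a, rowvar=False, bias=True))
        v0 = np.atleast_2d(np.cov(c, rowvar=False, bias=True))
        v = (n0 * v1 + n1 * v0) / (n0 + n1)     # weights n0/(n0+n1) and n1/(n0+n1)
        total += n1 * n0 / (n1 + n0) * diff @ np.linalg.solve(v, diff)
    dof = k * len(labels)                       # K (J - 1)
    return (total - dof) / np.sqrt(2.0 * dof)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, n_cells = 60_000, 150
    taus = np.array([0.25, 0.5, 0.75])
    cell = rng.integers(0, n_cells, size=n)
    z = rng.integers(0, 2, size=n)
    u = rng.uniform(size=n)                     # toy ranks, identical across Z
    print(large_j_statistic(u[:, None] <= taus, z, cell))
```

The statistic would then be compared with an upper-tail standard normal critical value, following the one-sided decision rule stated in Extension I. Cells with very few observations can make $\hat V_j$ singular, so a practical version would need a minimum cell size.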
Extension III: Testing for Conditional Rank Invariance or Similarity
Two main modifications are required:
1. First, estimate conditional quantiles conditional on some covariates $X_1$ of interest.
2. Second, use additional covariates $X_2$ other than the conditioning covariates in the first step to perform the test.
Feasible only when the conditioning set for the conditional quantiles is small.
E.g., we estimate quantiles of potential earnings, and perform tests, separately for male and female trainees, so the tests are essentially rank similarity tests for conditional ranks conditional on gender.

Conclusion
- Proposes nonparametric tests for rank invariance or similarity, assumptions popular in program evaluation and various QTE models.
- The tests explore whether the distribution (or features of it) of potential ranks remains the same among observationally equivalent individuals.
- Simulations show good size and power of the proposed tests in small samples.
- Empirical application to the JTPA training program:
  - Training causes some individuals to systematically change their ranks in the distribution of earnings.
  - Program effects are more complicated than suggested by standard QTEs.
  - One should be cautious in equating program impacts on the distribution of earnings with those on individual trainees.

(Incomplete) Reference List
QTE models: Abadie, Angrist and Imbens (2002), Chesher (2003, 2005), Chernozhukov and Hansen (2005, 2006, 2008), Firpo (2007), Firpo, Fortin and Lemieux (2007), Chernozhukov, Imbens and Newey (2007), Horowitz and Lee (2007), Imbens and Newey (2009), Rothe (2010), Frolich and Melly (2013), Powell (2013), Yu (2014), etc.
Other works in rank invariance/similarity testing:
- Frandsen and Lefgren (2015): a parametric test for rank similarity, testing the equality of mean ranks with or without treatment.
- Yu (2015): a test for rank invariance, assuming unconfoundedness.
JTPA: Abadie, Angrist and Imbens (2002), Chernozhukov and Hansen (2008), Orr et al. (1996), Heckman, Smith, and Clements (1997), etc.
Star: Krueger (1999), Krueger and Whitmore (2001), Chetty et al. (2010), Jackson and Page (2013), etc.

Thank you!