Testing for Rank Invariance or Similarity in Program Evaluation: The
Effect of Training on Earnings Revisited
Yingying Dong and Shu Shen
UC Irvine and UC Davis
Sept 2015 @ Chicago
Dong, Shen
Testing for Rank Invariance or Similarity in Program Evaluation
Introduction
Motivation: QTE/LQTE Literature
In program evaluations, applied researchers care about treatment effect heterogeneity and
often look at distributional/quantile effects of treatments.
In quantile treatment effect (QTE) models, rank invariance or rank similarity is required
either for identification:
e.g., the IVQR model of Chernozhukov and Hansen (2005, 2006, 2008), Chernozhukov, Imbens, and Newey (2007),
Horowitz and Lee (2007).
or for interpretation:
e.g., LQTE framework (Abadie, Angrist and Imbens, 2002). Also Frolich and Melly (2013), Firpo (2007), and
Imbens and Newey (2009).
This paper
studies the assumption of (unconditional) rank invariance and rank similarity.
provides identification of the distribution of individuals’ (unconditional) potential ranks conditional
on covariates.
proposes nonparametric tests that are applicable to both exogenous and endogenous treatments.
Motivation: Program Evaluation Applications
The Star Project: the effect of attending a small class (T ) in grade K on student outcome
(Y , grade K test score)
Figure: The Star Project. Score distributions (probability against total score) for Small Class, Regular Class, and Regular Class With Aid, together with the QTE.
JTPA (Job Training Partnership Act): the effect of job training (T ) on individual earnings
(Y ). Randomly assigned (Z ) treatment with about 60% compliance rate.
Figure: Two panels, "Potential Earnings Distributions Among Compliers, Male" and "Potential Earnings Distributions, Female" (probability against earnings), each showing the Treatment and Control distributions and the LQTE.
Definition of Rank Invariance
$Y_0$ and $Y_1$ are the potential outcomes under no treatment and under treatment, respectively.
$U_t = F_t(Y_t) \sim U(0, 1)$ is the rank of the potential outcome $Y_t$, where $F_t \equiv F_{Y_t}$ is the CDF of $Y_t$. $U_0$ and $U_1$ are
unconditional and are never observed at the same time.
Rank invariance is the condition that
$U_0 = U_1$.
Example: $Y_t = g_t(X, V)$, where $Y_t$ is a test score, $X$ is observed characteristics such as
gender and race, and $V$ is ability. If $(X, V) : \Omega \to W$, so that $U_t(\omega) = F_t(g_t(X(\omega), V(\omega)))$, then
rank invariance says that $U_0(\omega) = U_1(\omega)$ for all $\omega \in \Omega$.
Let $q_t(\tau) = F_{Y_t}^{-1}(\tau)$ and $QTE(\tau) = q_1(\tau) - q_0(\tau)$.
Rank invariance implies that $QTE(\tau)$ is the individual treatment effect for anyone who is at
quantile $\tau$.
Rank invariance is restrictive: it does not allow for random slippages in potential ranks (e.g.,
caused by luck).
Rank Similarity
Suppose $Y_t = g_t(X, V, S_t)$, where $X$ (gender, race) and $V$ (ability) determine the common
rank level, and $S_t$ (luck) is a random shock responsible for the random slippages.
$S_t$ is realized after a treatment is assigned.
Rank similarity is the condition that
$U_0 \mid (X = x, V = v) \sim U_1 \mid (X = x, V = v)$ for all $(x, v) \in W$.
If $(X, V) : \Omega \to W$, then rank similarity says that $U_0(\omega) \sim U_1(\omega)$ for all $\omega \in \Omega$.
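The difference between the two conditions can be illustrated with a small simulation. This is only a sketch: the functional forms ($Y_0 = V + S_0$, $Y_1 = 2 + V + S_1$), the normal shocks, the sample size, and the helper name `empirical_ranks` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def empirical_ranks(values):
    """Empirical rank U = F(Y) in (0, 1], computed from the sample ordering."""
    order = np.argsort(np.argsort(values))
    return (order + 1) / values.size

rng = np.random.default_rng(0)
n = 20_000
v = rng.standard_normal(n)            # ability V (assumed standard normal)
s0, s1 = rng.standard_normal((2, n))  # luck shocks S_0, S_1 (assumed i.i.d.)

# Without luck, both potential outcomes are increasing in V: ranks coincide.
u0_inv, u1_inv = empirical_ranks(v), empirical_ranks(2 + 3 * v)

# With luck, ranks differ person by person but share a distribution given V.
u0_sim, u1_sim = empirical_ranks(v + s0), empirical_ranks(2 + v + s1)
high_ability = v > 1
gap_in_mean_rank = u0_sim[high_ability].mean() - u1_sim[high_ability].mean()
```

In the first case rank invariance holds exactly; in the second only rank similarity holds, so individual ranks slip while the mean rank of the high-ability group stays close across treatment states.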
Implications of Rank Similarity
Lemma 1
Rank similarity implies that:
1. The distributions of observables and unobservables at the same rank are the same across
treatment states. That is, given rank similarity,
\[ F_{X,V|U_0}(x, v|\tau) = F_{X,V|U_1}(x, v|\tau) \quad \text{for all } \tau \in (0, 1),\ (x, v) \in W. \]
2. For any individual, her average treatment effect is a weighted average of the unconditional
QTEs, where the weights are the individual's probabilities of being at different quantiles.
That is, given rank similarity,
\[ E[Y_1 - Y_0 | X = x, V = v] = \int_0^1 QTE(\tau)\, dF_{U|X,V}(\tau|x, v) \quad \text{for all } (x, v) \in W. \]
3. (Main Testable Implication) Treatment should not affect the distribution of ranks among
observationally equivalent individuals. That is, given rank similarity,
\[ F_{U_0|X}(\tau|x) = F_{U_1|X}(\tau|x) \quad \text{for all } \tau \in (0, 1),\ x \in \mathcal{X}. \]
Identification
Exogenous Treatment
Identification: the Exogenous Treatment Case
If $T$ is exogenous, identification of $F_{U_1|X}(\tau|x) - F_{U_0|X}(\tau|x)$ for $\tau \in (0, 1)$ and $x \in \mathcal{X}$ is trivial:
\begin{align*}
& F_{U_1|X}(\tau|x) - F_{U_0|X}(\tau|x) \\
&= E[1(U_1 \le \tau) | X = x] - E[1(U_0 \le \tau) | X = x] \\
&= E[1(Y_1 \le q_1(\tau)) | X = x] - E[1(Y_0 \le q_0(\tau)) | X = x] \\
&= E[1(Y \le q_1(\tau)) | X = x, T = 1] - E[1(Y \le q_0(\tau)) | X = x, T = 0],
\end{align*}
where the marginal quantiles $q_1(\tau)$ and $q_0(\tau)$ are directly identified from the sub-samples with $T = 1$ and
$T = 0$, respectively.
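The last line of the derivation has a direct sample analogue. The sketch below is one way to compute it, assuming a discrete covariate stored in an array; the function name `rank_cdf_gap_exogenous` and its arguments are hypothetical, and `np.quantile` stands in for whatever quantile estimator one prefers.

```python
import numpy as np

def rank_cdf_gap_exogenous(y, t, x, x_value, tau):
    """Sample analogue of F_{U1|X}(tau|x) - F_{U0|X}(tau|x) for exogenous T."""
    y, t, x = map(np.asarray, (y, t, x))
    # Marginal quantiles q1(tau) and q0(tau) from the T = 1 and T = 0 subsamples.
    q1 = np.quantile(y[t == 1], tau)
    q0 = np.quantile(y[t == 0], tau)
    treated_cell = (t == 1) & (x == x_value)
    control_cell = (t == 0) & (x == x_value)
    # Share of each (X = x, T = t) cell at or below its own arm's tau-quantile.
    return np.mean(y[treated_cell] <= q1) - np.mean(y[control_cell] <= q0)
```

Under rank similarity the returned gap should be close to zero for every `x_value` and `tau`; a gap far from zero in some cell is what the tests are designed to detect.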
Endogenous Treatment
Identification: the Endogenous Treatment Case
If $T$ is endogenous, let $Z \in \{0, 1\}$ be an IV, and let $T_z$ for $z = 0, 1$ be the potential treatment status.
We are interested in testing for rank similarity among compliers ($T_1 > T_0$):
\[ F_{U_1|C,X}(\tau|x) = F_{U_0|C,X}(\tau|x) \quad \text{for all } \tau \in (0, 1) \text{ and } x \in \mathcal{X}_C, \]
where $\mathcal{X}_C = \{x \in \mathcal{X} : \Pr[T_1 > T_0 | X = x] > 0\}$.
Assumption 1 Let $(Y_t, T_t, X, Z)$, $t = 0, 1$, be random variables mapped from the common
probability space $(\Omega, \mathcal{F}, P)$. The following conditions hold jointly with probability one.
1. Independence: $(Y_0, Y_1, T_0, T_1) \perp Z \mid X$.
2. First stage: $E(T_1) \ne E(T_0)$.
3. Monotonicity: $\Pr(T_1 \ge T_0) = 1$.
4. Nontrivial assignment: $0 < \Pr(Z = 1 | X = x) < 1$ for all $x \in \mathcal{X}$.
Theorem 1
Let $I(\tau) \equiv 1\big(Y \le T q_{1|C}(\tau) + (1 - T) q_{0|C}(\tau)\big)$. Given Assumption 1, for all $\tau \in (0, 1)$,
$x \in \mathcal{X}_C$, and $t = 0, 1$, $F_{U_t|C,X}(\tau|x)$ is identified and is given by
\[ F_{U_t|C,X}(\tau|x) = \frac{E[I(\tau) 1(T = t) | Z = 1, X = x] - E[I(\tau) 1(T = t) | Z = 0, X = x]}{E[1(T = t) | Z = 1, X = x] - E[1(T = t) | Z = 0, X = x]}. \tag{1} \]
$F_{U_1|C,X}(\cdot|x) = F_{U_0|C,X}(\cdot|x)$ for $x \in \mathcal{X}_C$ if and only if for all $\tau \in (0, 1)$ and $x \in \mathcal{X}$,
\[ E[I(\tau) | Z = 1, X = x] = E[I(\tau) | Z = 0, X = x]. \tag{2} \]
Note:
$I(\tau)$ is a rank indicator.
Notice the change from $x \in \mathcal{X}_C$ to $x \in \mathcal{X}$ in the theorem. This is because Equation (2)
holds trivially for $x \in \mathcal{X} \setminus \mathcal{X}_C$.
Use the identification result of Equation (2) to test $H_0 : F_{U_1|C,X}(\cdot|x) = F_{U_0|C,X}(\cdot|x)$.
Use the identification result of Equation (1) to estimate $F_{U_1|C,X}(\cdot|x) - F_{U_0|C,X}(\cdot|x)$.
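Equation (1) is a Wald-type ratio, so within a single covariate cell it can be computed from four sample means. The sketch below assumes the data have already been restricted to $X = x$ and that the complier quantiles $q_{1|C}(\tau)$ and $q_{0|C}(\tau)$ are supplied by the caller; the function name `complier_rank_cdf` and its arguments are hypothetical.

```python
import numpy as np

def complier_rank_cdf(y, t, z, q1c, q0c, arm):
    """Sample analogue of Equation (1): F_{U_arm|C,X}(tau|x) within one X cell."""
    y, t, z = map(np.asarray, (y, t, z))
    # Rank indicator I(tau) = 1{Y <= T q_{1|C}(tau) + (1 - T) q_{0|C}(tau)}.
    ind = (y <= t * q1c + (1 - t) * q0c).astype(float)
    in_arm = (t == arm).astype(float)
    numerator = (ind * in_arm)[z == 1].mean() - (ind * in_arm)[z == 0].mean()
    denominator = in_arm[z == 1].mean() - in_arm[z == 0].mean()
    return numerator / denominator
```

The difference `complier_rank_cdf(..., arm=1) - complier_rank_cdf(..., arm=0)` then estimates the quantity in the last line of the theorem's note for that cell.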
Mean Test
Theorem 1 also gives identification of specific features of the potential rank distribution such
as the mean.
Rank similarity implies FU1 |C ,X (τ |x) = FU0 |C ,X (τ |x) which further implies
E [U1 |C , X = x] = E [U0 |C , X = x].
$E[U_1 | C, X = x] = E[U_0 | C, X = x]$ holds if and only if
\[ E[U | Z = 1, X = x] = E[U | Z = 0, X = x], \]
where
\[ U \equiv T U_1 + (1 - T) U_0 = \int_0^1 1\big(T q_{1|C}(\tau) + (1 - T) q_{0|C}(\tau) < Y\big)\, d\tau = 1 - \int_0^1 I(\tau)\, d\tau. \]
$U$ is identified because $I(\tau)$ is identified.
$E[U_1 | C, X = x] - E[U_0 | C, X = x]$ represents the average rank change for each subpopulation.
Star Project:
Figure: Small Classes vs. Regular Classes. Two panels, "Rank Distributions by Race" (Nonwhite and White, Small Class and Regular Class) and "Rank Distributions by Gender" (Boy and Girl, Small Class and Regular Class); probability against rank of total score.
Figure: Regular Class with Aid vs. Regular Classes. Two panels, "Rank Distributions by Race" and "Rank Distributions by Gender"; probability against rank of total score.
Empirical Example: JTPA
Figure: Female. Two panels, "Potential Rank Distributions, by Education" (<HS and HS, Treatment and Control) and "Potential Rank Distributions, by Employment Last Year" (<13 Weeks and >=13 Weeks, Treatment and Control); probability against rank (0 to 100).
Figure: Male. Two panels, "Potential Rank Distributions by Education" and "Potential Rank Distributions by Employment Last Year", with the same groups; probability against rank (0 to 100).
Null Hypothesis and Test Statistic
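Let $\mathcal{X} = \{x_1, x_2, ..., x_J\}$ and $\Omega = \{\tau_1, \tau_2, ..., \tau_K\}$.
\[ H_0 : m_j^0(\tau_k) = m_j^1(\tau_k) \quad \text{for } j = 1, ..., J - 1 \text{ and } k = 1, ..., K, \]
where, for $z = 0, 1$,
\[ m_j^z(\tau_k) \equiv E\big[1\big(Y \le T q_{1|C}(\tau_k) + (1 - T) q_{0|C}(\tau_k)\big) \,\big|\, Z = z, X = x_j\big]. \]
The estimator is
\[ \hat m_j^z(\tau_k) = \frac{1}{n_j^z} \sum_{Z_i = z, X_i = x_j} 1\big(Y_i \le T_i \hat q_{1|C}(\tau_k) + (1 - T_i) \hat q_{0|C}(\tau_k)\big), \]
with
\[ \big(\hat q_{0|C}(\tau_k), \hat q_{1|C}(\tau_k)\big) = \arg\min_{q_0, q_1} \frac{1}{n} \sum_{i=1}^n \rho_{\tau_k}\big(Y_i - q_0 (1 - T_i) - q_1 T_i\big)\, \hat\omega_i, \]
where
\[ \hat\omega_i \equiv \left(\frac{Z_i}{\hat\pi(X_i)} - \frac{1 - Z_i}{1 - \hat\pi(X_i)}\right)(2 T_i - 1), \]
$\hat\pi(x)$ is a consistent estimator of $\pi(x) = \Pr(Z = 1 | X = x)$, and $n_j^z = \sum_{i=1}^n 1(Z_i = z, X_i = x_j)$.
Wald-type test:
\[ W \equiv n\, (\hat m^1 - \hat m^0)' \hat V^{-1} (\hat m^1 - \hat m^0). \]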
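The estimation steps above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: because the weights $\hat\omega_i$ can be negative, the weighted check-function objective need not be convex, so the sketch simply searches over the observed outcome values in each treatment arm (the objective separates in $q_0$ and $q_1$). All function names are hypothetical, and `pi_hat` is assumed to be a scalar or an array of $\hat\pi(X_i)$ supplied by the caller.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

def weighted_quantile(y, w, tau):
    """Minimise sum_i w_i * rho_tau(y_i - q) over q by brute force.

    The objective is piecewise linear with kinks at the observed y values, so
    the search is restricted to those points (O(n^2) memory, small n only).
    """
    candidates = np.unique(y)
    losses = check_loss(y[None, :] - candidates[:, None], tau) * w[None, :]
    return candidates[np.argmin(losses.sum(axis=1))]

def complier_quantiles(y, t, z, pi_hat, tau):
    """Return (q0_hat, q1_hat), the estimated complier quantiles at tau."""
    w = (z / pi_hat - (1 - z) / (1 - pi_hat)) * (2 * t - 1)
    q0 = weighted_quantile(y[t == 0], w[t == 0], tau)
    q1 = weighted_quantile(y[t == 1], w[t == 1], tau)
    return q0, q1

def m_hat(y, t, z, x, q0, q1, z_value, x_value):
    """Cell mean of 1{Y <= T q1 + (1 - T) q0} over observations with Z = z, X = x_j."""
    cell = (z == z_value) & (x == x_value)
    return np.mean(y[cell] <= t[cell] * q1 + (1 - t[cell]) * q0)
```

Stacking `m_hat(...)` over cells and quantiles for `z_value = 1` and `z_value = 0` gives the vectors $\hat m^1$ and $\hat m^0$ that enter the Wald statistic.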
Assumption for Asymptotic Properties
Assumption 3
1. i.i.d. data: the data $(Y_i, T_i, Z_i, X_i)$ for $i = 1, ..., n$ is a random sample of size $n$ from
$(Y, T, Z, X)$.
2. For all $\tau \in \Omega = \{\tau_1, \tau_2, ..., \tau_K\}$, the random variables $Y_1$ and $Y_0$ are continuously distributed
with positive density in a neighborhood of $q_{0|C}(\tau)$ and $q_{1|C}(\tau)$ in the subpopulation of
compliers.
3. For all $j = 1, ..., J$, $\hat\pi(x_j)$ is consistent, i.e., $\hat\pi(x_j) \to_p \pi(x_j)$.
4. Let $f_{Y|T,Z,X}$ be the conditional density of $Y$ given $T$, $Z$ and $X$. For all $t, z = 0, 1$,
$j = 1, ..., J$ and $\tau \in \Omega$, $f_{Y|T,Z,X}(y|t, z, x_j)$ has bounded first derivative with respect to $y$ in a
neighborhood of $q_{t|C}(\tau)$. Let $f_{Y|X}(y|x)$ be the conditional density of $Y$ given $X$. For all
$\tau \in \Omega$ and $j = 1, ..., J$, $f_{Y|X}(\cdot|x_j)$ is positive and bounded in a neighborhood of $q_{t|C}(\tau)$.
Asymptotics
Let $\hat m^z$, $z = 0, 1$, be the $K(J - 1)$-dimensional vector whose $(K(j - 1) + k)$-th element is $\hat m_j^z(\tau_k)$.
Theorem 2
Given Assumptions 1 and 3,
\[ \sqrt{n}\,\big((\hat m^1 - \hat m^0) - (m^1 - m^0)\big) \Rightarrow N(0, V), \]
where $V$ is the $K(J - 1) \times K(J - 1)$ asymptotic variance-covariance matrix. For $j, j' = 1, ..., J - 1$, the
$(K(j - 1) + k,\ K(j' - 1) + k')$-th element of $V$ is equal to
\[ E\big[\big(\phi_j^1(\tau_k) - \phi_j^0(\tau_k)\big)\big(\phi_{j'}^1(\tau_{k'}) - \phi_{j'}^0(\tau_{k'})\big)\big] \]
with
\begin{align*}
\phi_j^z(\tau_k) \equiv \phi_j^z(\tau_k; Y, T, Z, X) ={}& \frac{I(\tau_k) - m_j^z(\tau_k)}{p_{Z,X}(z, x_j)}\, 1(Z = z, X = x_j) \\
& - \frac{f_{Y|T,Z,X}(q_{0|C}(\tau_k)|0, z, x_j)\,\big(1 - p_{T|Z,X}(z, x_j)\big)}{P_c\, f_{0|C}(q_{0|C}(\tau_k))}\, \psi_0(Y, T, Z, X) \\
& - \frac{f_{Y|T,Z,X}(q_{1|C}(\tau_k)|1, z, x_j)\, p_{T|Z,X}(z, x_j)}{P_c\, f_{1|C}(q_{1|C}(\tau_k))}\, \psi_1(Y, T, Z, X),
\end{align*}
where $\psi_0(Y, T, Z, X)$ and $\psi_1(Y, T, Z, X)$ are defined in the proof of Theorem 7 in Frolich and
Melly (2007), and restated in the proof of this theorem in the Appendix.
Asymptotic Properties of the Test
Recall the Wald-type test statistic
\[ W \equiv n\, (\hat m^1 - \hat m^0)' \hat V^{-1} (\hat m^1 - \hat m^0) \Rightarrow \chi^2(K(J - 1)) \]
under the null.
Bootstrap $\hat V$.
If $q_{0|C}(\tau_k)$ and $q_{1|C}(\tau_k)$ were known, $\phi_j^z(\tau_k; Y, T, Z, X)$ would reduce to
$\frac{I(\tau_k) - m_j^z(\tau_k)}{p_{Z,X}(z, x_j)}\, 1(Z = z, X = x_j)$, and the $(K(j - 1) + k,\ K(j' - 1) + k')$-th element of
$V$ would be equal to $\sum_{z = 0, 1} \big[ m_j^z(\tau_k \wedge \tau_{k'}) - m_j^z(\tau_k)\, m_j^z(\tau_{k'}) \big]$ if $j = j'$, and $0$ if $j \ne j'$.
If J is very large, then the first stage estimation error may be ignored and one can construct V̂ by
the analytic formula. Discussed in extensions where J → ∞ or X includes continuous variables.
The critical value cα is the (1 − α) × 100-th percentile of the χ2 (K (J − 1)) distribution.
The test is consistent for the null hypothesis H0
Once again, the test does NOT test the unobservable part (e.g. V or ability) of the rank
invariance assumption.
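Given the stacked cell means and an estimate of $V$, the Wald statistic and its p-value are straightforward to compute. The sketch below assumes `v_hat` is supplied by the caller (for example from a bootstrap, which is not shown) and, to stay NumPy-only, approximates the chi-squared tail probability by simulation rather than calling a statistics library. The function name `wald_test` and its arguments are hypothetical.

```python
import numpy as np

def wald_test(m1_hat, m0_hat, v_hat, n, n_draws=200_000, seed=0):
    """Return (W, p_value) for W = n * d' V^{-1} d with d = m1_hat - m0_hat."""
    diff = np.asarray(m1_hat, dtype=float) - np.asarray(m0_hat, dtype=float)
    v_hat = np.asarray(v_hat, dtype=float)
    w = float(n * diff @ np.linalg.solve(v_hat, diff))
    # Degrees of freedom equal the length of the stacked vector, K(J - 1).
    df = diff.size
    # Simulated chi-squared tail probability (an approximation, not exact).
    draws = np.random.default_rng(seed).chisquare(df, size=n_draws)
    return w, float(np.mean(draws >= w))
```

Rejecting when the p-value falls below $\alpha$ is equivalent, up to simulation error, to comparing $W$ with the critical value $c_\alpha$.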
The Mean Rank Similarity Test
Let $\bar m_j^z = E[U | Z = z, X = x_j]$ for $z = 0, 1$.
\[ H_{0,mean} : \bar m_j^0 = \bar m_j^1 \quad \text{for all } j = 1, ..., J - 1. \]
Let $\{\tau_s\}_{s=1}^S$ be $S$ random draws from $U(0, 1)$. $\hat U_i \equiv T_i \hat U_{1i} + (1 - T_i) \hat U_{0i}$ for $i = 1, ..., n$ can be
estimated by
\[ \hat U_i = \frac{1}{S} \sum_{s=1}^S 1\big(T_i \hat q_{1|C}(\tau_s) + (1 - T_i) \hat q_{0|C}(\tau_s) \le Y_i\big). \]
$\bar m_j^z$ can then be estimated by
\[ \ddot m_j^z = \frac{1}{n_j^z} \sum_{Z_i = z, X_i = x_j} \hat U_i. \]
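The Monte Carlo rank estimate and the cell means of ranks can be sketched as follows. The callables `q0_fn` and `q1_fn` are hypothetical stand-ins for estimated complier quantile functions (mapping an array of $\tau$ values to quantiles); the function names and the default number of draws are assumptions for illustration.

```python
import numpy as np

def estimated_ranks(y, t, q0_fn, q1_fn, n_draws=1000, seed=0):
    """U_hat_i: share of random tau draws whose arm-specific quantile is <= Y_i."""
    y, t = np.asarray(y, dtype=float), np.asarray(t)
    taus = np.random.default_rng(seed).uniform(size=n_draws)
    q0, q1 = np.asarray(q0_fn(taus)), np.asarray(q1_fn(taus))
    # Threshold T_i * q1(tau_s) + (1 - T_i) * q0(tau_s), one row per observation.
    thresholds = t[:, None] * q1[None, :] + (1 - t[:, None]) * q0[None, :]
    return np.mean(thresholds <= y[:, None], axis=1)

def mean_rank(u_hat, z, x, z_value, x_value):
    """Cell mean of estimated ranks over observations with Z = z and X = x_j."""
    cell = (np.asarray(z) == z_value) & (np.asarray(x) == x_value)
    return u_hat[cell].mean()
```

For example, with `q0_fn = lambda tau: tau` and `q1_fn = lambda tau: tau + 1.0`, an untreated observation with outcome 0.25 receives an estimated rank close to 0.25.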
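Corollary 3
Suppose Assumptions 1 and 3 hold for $\Omega = (0, 1)$. Under the null hypothesis where $\bar m^1 = \bar m^0$,
when $S, n \to \infty$,
\[ \sqrt{n}\,(\ddot m^1 - \ddot m^0) \Rightarrow N(0, V_{mean}), \]
where $V_{mean}$ is the $(J - 1) \times (J - 1)$ asymptotic variance-covariance matrix. The $(j, j')$-th
element of $V_{mean}$ is
\[ E\left[\left(\int_0^1 \phi_j^1(\tau)\, d\tau - \int_0^1 \phi_j^0(\tau)\, d\tau\right)\left(\int_0^1 \phi_{j'}^1(\tau)\, d\tau - \int_0^1 \phi_{j'}^0(\tau)\, d\tau\right)\right], \]
where
\begin{align*}
\int_0^1 \phi_j^z(\tau)\, d\tau ={}& - \frac{U - \bar m_j^z}{p_{Z,X}(z, x_j)}\, 1(Z = z, X = x_j) \\
& - \int_0^1 \frac{f_{Y|T,Z,X}(q_{0|C}(\tau)|0, z, x_j)}{f_{0|C}(q_{0|C}(\tau))}\, \frac{\big(1 - p_{T|Z,X}(z, x_j)\big)\, \psi_0(Y, T, Z, X)}{P_c}\, d\tau \\
& - \int_0^1 \frac{f_{Y|T,Z,X}(q_{1|C}(\tau)|1, z, x_j)}{f_{1|C}(q_{1|C}(\tau))}\, \frac{p_{T|Z,X}(z, x_j)\, \psi_1(Y, T, Z, X)}{P_c}\, d\tau.
\end{align*}
A Wald-type test statistic is then
\[ W_{mean} \equiv n\, (\ddot m^1 - \ddot m^0)' \ddot V^{-1} (\ddot m^1 - \ddot m^0) \Rightarrow \chi^2(J - 1) \]
as $N, J \to \infty$, $N/J \to \infty$.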
Simulation and JTPA
Simulation
Simulation: DGPs
DGP:
\[ Y_0 = X + V + S_0, \quad Y_1 = X + V + (1 - bXV) + S_1, \quad Y = Y_1 T + Y_0 (1 - T), \]
$\Pr(X = 0.4j) = 1/5$ for $j = 1, ..., 5$; $V, S_0, S_1 \sim N(0, 1)$; and $b = 0, 2$.
Exogenous treatment: $\Pr(T = t) = 1/2$, $t = 0, 1$.
Endogenous treatment: $\Pr(Z = z) = 1/2$, $z = 0, 1$, and $T = 1\big(0.15 (Y_1 - Y_0) + Z - 0.5 > 0\big)$.
Rank similarity holds when $b = 0$ but not when $b \ne 0$.
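The DGP above translates directly into code. The sketch below assumes that $X$, $V$, $S_0$, $S_1$ (and $Z$ or $T$) are drawn independently, which the slide does not state explicitly; the function name `simulate` and the returned dictionary layout are hypothetical.

```python
import numpy as np

def simulate(n, b, endogenous, seed=0):
    """Draw one sample from the simulation design described above."""
    rng = np.random.default_rng(seed)
    x = 0.4 * rng.integers(1, 6, size=n)    # Pr(X = 0.4 j) = 1/5, j = 1, ..., 5
    v, s0, s1 = rng.standard_normal((3, n))  # V, S0, S1 ~ N(0, 1), assumed independent
    y0 = x + v + s0
    y1 = x + v + (1 - b * x * v) + s1
    if endogenous:
        z = rng.integers(0, 2, size=n)       # Pr(Z = z) = 1/2
        t = (0.15 * (y1 - y0) + z - 0.5 > 0).astype(int)
    else:
        z = None
        t = rng.integers(0, 2, size=n)       # Pr(T = t) = 1/2
    y = y1 * t + y0 * (1 - t)
    return {"y": y, "t": t, "z": z, "x": x, "y0": y0, "y1": y1}
```

Calling `simulate(1000, b=0, endogenous=True)` gives a sample in which rank similarity holds, while any `b` different from zero makes the treatment effect depend on $XV$ and so shifts ranks systematically.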
Illustration of DGPs: Exogenous Treatment
Figure: Conditional distributions of potential ranks. Four panels (b = 0, b = 1, b = 2, b = 3) plot the conditional CDF against the potential rank for Y(1) and Y(0) at X = 0.4, 0.8, 1.2, 1.6, 2.0.
Simulation Results: Exogenous Treatment
Rejection rates by sample size N

b = 0                               N = 500   1000    1500    2000    2500
Test 1: Ω = {0.5}                   0.034     0.039   0.051   0.040   0.053
Test 2: Ω = {0.2, 0.3, 0.4}         0.013     0.013   0.025   0.021   0.023
Test 3: Ω = {0.5, 0.6, 0.7, 0.8}    0.014     0.014   0.023   0.023   0.018
Test 4: Ω = {0.2, 0.3, ..., 0.8}    0.006     0.010   0.013   0.013   0.013
Test 5: Mean Test                   0.051     0.044   0.048   0.041   0.067

b = 1                               N = 500   1000    1500    2000    2500
Test 1: Ω = {0.5}                   0.047     0.059   0.101   0.120   0.146
Test 2: Ω = {0.2, 0.3, 0.4}         0.018     0.038   0.065   0.099   0.152
Test 3: Ω = {0.5, 0.6, 0.7, 0.8}    0.022     0.050   0.134   0.148   0.260
Test 4: Ω = {0.2, 0.3, ..., 0.8}    0.009     0.044   0.102   0.144   0.283
Test 5: Mean Test                   0.063     0.092   0.144   0.176   0.251

b = 2                               N = 500   1000    1500    2000    2500
Test 1: Ω = {0.5}                   0.074     0.150   0.232   0.303   0.388
Test 2: Ω = {0.2, 0.3, 0.4}         0.269     0.776   0.968   0.994   1.000
Test 3: Ω = {0.5, 0.6, 0.7, 0.8}    0.151     0.581   0.857   0.962   0.991
Test 4: Ω = {0.2, 0.3, ..., 0.8}    0.287     0.910   0.996   1.000   1.000
Test 5: Mean Test                   0.103     0.213   0.278   0.424   0.500

b = 3                               N = 500   1000    1500    2000    2500
Test 1: Ω = {0.5}                   0.143     0.335   0.512   0.640   0.800
Test 2: Ω = {0.2, 0.3, 0.4}         0.817     0.999   1.000   1.000   1.000
Test 3: Ω = {0.5, 0.6, 0.7, 0.8}    0.306     0.880   0.992   1.000   1.000
Test 4: Ω = {0.2, 0.3, ..., 0.8}    0.836     0.999   1.000   1.000   1.000
Test 5: Mean Test                   0.340     0.659   0.853   0.941   0.971

Figure: Rejection rate against b (Sample Size = 1000) and rejection rate against sample size (b = 2), for Distributional Tests 1 to 4 and the Mean Test.
Simulation Results: Endogenous Treatment
Rejection rates by sample size N

b = 0                               N = 500   1000    1500    2000    2500
Ω = {0.5}                           0.025     0.036   0.041   0.038   0.057
Ω = {0.2, 0.3, 0.4}                 0.012     0.012   0.018   0.017   0.025
Ω = {0.5, 0.6, 0.7, 0.8}            0.006     0.013   0.016   0.022   0.015
Ω = {0.2, 0.3, ..., 0.8}            0.002     0.010   0.006   0.010   0.008
Mean Test                           0.054     0.050   0.051   0.045   0.057

b = 1                               N = 500   1000    1500    2000    2500
Ω = {0.5}                           0.036     0.040   0.064   0.073   0.085
Ω = {0.2, 0.3, 0.4}                 0.013     0.022   0.040   0.065   0.107
Ω = {0.5, 0.6, 0.7, 0.8}            0.006     0.016   0.034   0.047   0.074
Ω = {0.2, 0.3, ..., 0.8}            0.003     0.015   0.029   0.057   0.094
Mean Test                           0.037     0.050   0.054   0.063   0.068

b = 2                               N = 500   1000    1500    2000    2500
Ω = {0.5}                           0.084     0.242   0.379   0.522   0.615
Ω = {0.2, 0.3, 0.4}                 0.170     0.589   0.870   0.965   0.993
Ω = {0.5, 0.6, 0.7, 0.8}            0.021     0.150   0.340   0.600   0.764
Ω = {0.2, 0.3, ..., 0.8}            0.053     0.431   0.823   0.960   0.993
Mean Test                           0.152     0.322   0.481   0.622   0.709

b = 3                               N = 500   1000    1500    2000    2500
Ω = {0.5}                           0.113     0.293   0.441   0.617   0.700
Ω = {0.2, 0.3, 0.4}                 0.284     0.783   0.975   1.000   1.000
Ω = {0.5, 0.6, 0.7, 0.8}            0.020     0.198   0.450   0.704   0.865
Ω = {0.2, 0.3, ..., 0.8}            0.093     0.634   0.949   1.000   1.000
Mean Test                           0.191     0.441   0.602   0.772   0.843

Figure: Rejection rate against b (Sample Size = 1000) and rejection rate against sample size (b = 2), for Distributional Tests 1 to 4 and the Mean Test.
Empirical Examples: Testing Results
Star
Testing: the Exogenous Treatment Case
Table: Star Project: Test Results
                                  Total Test Score         Birthday (1st-31st)
Treatment type                    Test Stat   P-value      Test Stat   P-value
Small Class vs. Regular Class     232         0.033        21.23       0.439
Aid Class vs. Regular Class       266         0.001        13.25       0.905
Conclusions
Both treatments (small class and regular class with aid) improve the rank distribution of the
disadvantaged (boy, nonwhite)
Assigning a teaching aid to the regular class systematically changes students’ rank.
Researchers may want to reconsider the practice of using both regular class with and without
aid as the “control” group in analysis.
Empirical Example: JTPA
Y = 30 months’ earnings following assignment
T = receiving training services, Z = random assignment indicator,
X = black, Hispanic, HS or GED, married, worked at least 13 weeks the year before, AFDC
receipt (for women only) and 5 age category dummies. (Abadie, Angrist and Imbens, 2002)
JTPA
JTPA: First-Stage Unconditional LQTE Estimation
Table: First-stage estimates of unconditional QTEs of training on trainee earnings
            Female                               Male
Quantile    Y0        QTE      (Std. err.)       Y0        QTE      (Std. err.)
0.15        195       291      (341.88)          1,462     249      (713.36)
0.20        723       714      (358.31)*         2,733     390      (723.01)
0.25        1,458     1,200    (372.08)***       4,434     489      (746.85)
0.30        2,463     1,380    (399.21)***       6,993     340      (891.74)
0.35        3,784     1,705    (497.01)***       8,836     594      (1,042.40)
0.40        5,271     1,974    (669.75)***       11,010    723      (1,104.63)
0.45        6,726     2,451    (766.25)***       13,104    1,069    (1,144.28)
0.50        8,685     2,436    (829.29)***       15,374    1,291    (1,234.59)
0.55        11,007    2,089    (877.56)**        17,357    2,239    (1,295.79)*
0.60        12,618    2,729    (886.96)***       20,409    2,118    (1,418.40)
0.65        14,682    2,943    (920.45)***       23,342    2,319    (1,557.00)
0.70        16,971    2,772    (1,027.14)***     27,169    1,780    (1,606.66)
0.75        20,252    2,106    (1,152.35)*       30,439    2,408    (1,641.47)
0.80        23,064    2,331    (1,149.71)**      34,620    2,800    (1,701.90)*
0.85        26,735    1,762    (1,179.91)        39,233    3,955    (1,886.98)**
Note: Standard errors are in parentheses. All estimates control for covariates including dummies for black, Hispanic, high-school graduates (including GED holders), marital
status, whether the applicant worked at least 13 weeks in the 12 months preceding random assignment, and AFDC receipt (for women only), as well as 5 age group dummies. *
significant at the 10% level, ** significant at the 5% level, *** significant at the 1% level.
JTPA: Joint Test
Table: Rank similarity test jointly at all quantiles
             Female                                    Male
             I                   II                    I                   II
             (1)       (2)       (1)       (2)         (1)       (2)       (1)       (2)
Panel A: Dependent Var. Earnings
χ2           7,652.1   7,763.8   1,197.2   1,177.8     2,780.7   2,719.0   886.1     876.8
             (0.000)   (0.000)   (0.000)   (0.000)     (0.000)   (0.000)   (0.000)   (0.000)
d.f.         1,544     1,544     723       723         1,218     1,218     570       570
Panel B: Falsification test (Dependent Var. Age)
χ2           478.8     471.9     252.0     259.9       209.3     203.5     124.7     123.0
             (0.926)   (0.953)   (0.366)   (0.245)     (1.000)   (1.000)   (0.977)   (0.982)
d.f.         525       525       245       245         338       338       158       158
Note: Results are based on the Chi-squared test in Theorem 2. Variance-covariance matrices are bootstrapped with 2,000 replications. P-values are in
parentheses. Columns I report a joint test at 15 equally spaced quantiles from 0.15 to 0.85; Columns II report a joint test at 7 equally spaced quantiles
from 0.20 to 0.80. (1) controls for covariates in the first-stage unconditional QTE estimation, while (2) does not. X values with fewer than 5 observations
when either Z = 0 or Z = 1 are not used in the test to ensure the common support assumption.
JTPA: Test Individual Quantiles
Table: Rank similarity test at individual quantiles
            Panel A: Dependent Var. Earnings        Panel B: Falsification test (Dependent Var. Age)
            Female             Male                 Female             Male
Quantile    χ2      (P-value)  χ2      (P-value)    χ2      (P-value)  χ2      (P-value)
0.15        134.4   (0.012)    103.8   (0.045)      43.9    (0.144)    19.4    (0.561)
0.20        143.0   (0.004)    113.3   (0.010)      37.9    (0.340)    22.1    (0.391)
0.25        126.2   (0.060)    107.8   (0.025)      26.0    (0.863)    13.9    (0.907)
0.30        131.9   (0.034)    104.7   (0.039)      26.9    (0.834)    15.0    (0.861)
0.35        147.2   (0.003)    95.8    (0.142)      22.1    (0.956)    17.9    (0.712)
0.40        118.3   (0.160)    88.6    (0.291)      31.1    (0.659)    23.2    (0.447)
0.45        107.5   (0.387)    110.7   (0.019)      32.1    (0.611)    22.4    (0.497)
0.50        110.9   (0.304)    113.6   (0.012)      32.3    (0.599)    19.2    (0.692)
0.55        112.6   (0.266)    110.9   (0.019)      30.8    (0.673)    19.6    (0.664)
0.60        112.1   (0.276)    112.3   (0.015)      32.7    (0.581)    22.3    (0.503)
0.65        121.7   (0.113)    105.0   (0.044)      29.4    (0.734)    18.4    (0.735)
0.70        108.0   (0.375)    106.1   (0.038)      36.7    (0.388)    24.0    (0.402)
0.75        130.4   (0.035)    109.7   (0.018)      45.4    (0.112)    16.5    (0.831)
0.80        118.4   (0.128)    116.5   (0.005)      47.7    (0.074)    17.1    (0.802)
0.85        92.3    (0.697)    118.7   (0.002)      44.7    (0.125)    18.7    (0.716)
Note: Results are based on the Chi-squared test in Theorem 2. Variance-covariance matrices are bootstrapped with 2,000 replications. P-values are in
parentheses. Covariates are controlled for in the first-stage unconditional QTE estimation. X values with fewer than 5 observations when either Z = 1 or
Z = 0 are not used in the test to ensure the common support assumption.
JTPA: Test the Mean Rank
Table: Rank similarity test for the mean rank only
             Female                Male
             (1)       (2)         (1)       (2)
Panel A: Dependent Var. Earnings
χ2           123.1     123.1       115.2     115.2
             (0.098)   (0.098)     (0.009)   (0.009)
d.f.         104       104         82        82
Panel B: Falsification test (Dependent Var. Age)
χ2           30.6      30.6        18.4      18.4
             (0.683)   (0.683)     (0.736)   (0.736)
d.f.         35        35          23        23
Note: Results are based on the Chi-squared test for the mean ranks only. Variance-covariance matrices are bootstrapped with 2,000 replications. P-values
are in parentheses. (1) controls for covariates in the first-stage unconditional QTE estimation, while (2) does not. X values with fewer than 5
observations when either Z = 1 or Z = 0 are not used in the test to ensure the common support assumption.
Conclusion:
Training causes some individuals to systematically change their ranks in the earnings
distribution.
Should be cautious in equating the distributional impacts of training with the true effects on
individual trainees.
Results largely agree with Heckman, Smith and Clements (1997): “perfect positive
dependence across potential outcome distributions ... not credible.”
Extensions
Extension I: Covariates with Large Support
Assume that $J \to \infty$ as $n \to \infty$.
Assumption 4
1. i.i.d. data: the data $\{Y_i, T_i, Z_i, X_i\}$ for $i = 1, ..., n$ is a random sample of size $n$ of
$(Y, T, Z, X)$.
2. For all $\tau \in \Omega = \{\tau_1, \tau_2, ..., \tau_K\}$, the random variables $Y_1$ and $Y_0$ are continuously distributed
with positive density in a neighborhood of $q_{0|C}(\tau)$ and $q_{1|C}(\tau)$ in the subpopulation of
compliers.
3. Let $n_j = \sum_{i=1}^n 1(X_i = x_j)$. $n_j \asymp n/J$ uniformly over $j$, i.e., there exist $0 < c \le C < \infty$ such
that $c\, n/J \le n_j \le C\, n/J$ for all $j = 1, ..., J$.
4. $\hat\pi(x_j)$ is uniformly consistent, i.e., $\sup_{j = 1, ..., J} |\hat\pi(x_j) - \pi(x_j)| \to_p 0$ as $n, J \to \infty$ and $n/J \to \infty$.
5. For all $t, z = 0, 1$, $j = 1, ..., J$ and $\tau \in \Omega$, $f_{Y|T,Z,X}(\cdot|t, z, x_j)$ is bounded in a neighborhood of
$q_{t|C}(\tau)$. For all $\tau \in \Omega$ and $j = 1, ..., J$, $f_{Y|X}(\cdot|x_j)$ is positive and bounded in a neighborhood
of $q_{t|C}(\tau)$.
Let $\hat m_j^z = (\hat m_j^z(\tau_1), ..., \hat m_j^z(\tau_K))'$ and $m_j^z = (m_j^z(\tau_1), ..., m_j^z(\tau_K))'$ be $K \times 1$ vectors.
Corollary 4
Given Assumptions 1 and 4, we have
\[ \sqrt{\frac{n_j^1 n_j^0}{n_j^1 + n_j^0}}\, \big((\hat m_j^1 - \hat m_j^0) - (m_j^1 - m_j^0)\big) \Rightarrow Z_j \sim N(0, V_j), \]
where $Z_j$ for $j = 1, ..., J$ follow independent multivariate normal distributions; the $(k, k')$-th
element of the $K \times K$ variance-covariance matrix $V_j$ is
\[ V_{j;k,k'} = \pi(x_j)\, m_j^1(\tau_k \wedge \tau_{k'}) \big(1 - m_j^1(\tau_{k'})\big) + (1 - \pi(x_j))\, m_j^0(\tau_k \wedge \tau_{k'}) \big(1 - m_j^0(\tau_{k'})\big). \]
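For each $j = 1, ..., J$, define the Wald-type statistic
\[ w_j = \frac{n_j^1 n_j^0}{n_j^1 + n_j^0}\, (\hat m_j^1 - \hat m_j^0)' \hat V_j^{-1} (\hat m_j^1 - \hat m_j^0), \]
where $\hat V_j$ is a consistent estimator of $V_j$. The $(k, k')$-th element of $\hat V_j$ is
\[ \hat V_{j;k,k'} = \frac{n_j^0}{n_j^0 + n_j^1}\, \hat m_j^1(\tau_k \wedge \tau_{k'}) \big(1 - \hat m_j^1(\tau_{k'})\big) + \frac{n_j^1}{n_j^0 + n_j^1}\, \hat m_j^0(\tau_k \wedge \tau_{k'}) \big(1 - \hat m_j^0(\tau_{k'})\big). \]
The test statistic is then
\[ W_{largeJ} = \frac{\sum_{j=1}^{J-1} w_j - K(J - 1)}{\sqrt{2K(J - 1)}} \Rightarrow N(0, 1). \]
The one-sided decision rule of the test is to
"reject the null hypothesis $H_0$ if $W_{largeJ} > c_\alpha$".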
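The cell-level statistics and their standardized sum can be sketched as below. Two assumptions are made for illustration: the quantile grid is sorted in increasing order, so that the value at $\tau_k \wedge \tau_{k'}$ is the entry with the smaller index, and the variance formula is read as written for $k \le k'$ with the lower triangle filled in by symmetry. The function names `cell_wald` and `large_j_statistic` are hypothetical.

```python
import numpy as np

def cell_wald(m1, m0, n1, n0):
    """Wald-type statistic w_j for one covariate cell.

    m1, m0: length-K arrays of cell means at increasing quantiles tau_1 < ... < tau_K.
    n1, n0: numbers of observations in the cell with Z = 1 and Z = 0.
    """
    m1, m0 = np.asarray(m1, dtype=float), np.asarray(m0, dtype=float)
    idx = np.arange(m1.size)
    lo = np.minimum.outer(idx, idx)  # index of tau_k ^ tau_k'
    hi = np.maximum.outer(idx, idx)  # larger index, used to keep V_hat symmetric
    share0 = n0 / (n0 + n1)
    v_hat = (share0 * m1[lo] * (1 - m1[hi])
             + (1 - share0) * m0[lo] * (1 - m0[hi]))
    diff = m1 - m0
    return (n1 * n0 / (n1 + n0)) * float(diff @ np.linalg.solve(v_hat, diff))

def large_j_statistic(w_cells, k):
    """Standardized sum of the cell statistics included in the test."""
    w_cells = np.asarray(w_cells, dtype=float)
    df = k * w_cells.size
    return (w_cells.sum() - df) / np.sqrt(2 * df)
```

The resulting value is compared with the upper $\alpha$ quantile of the standard normal distribution, in line with the one-sided decision rule.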
Extension II: Continuous Covariates
Let $m_k^z(x) = E[I(\tau_k) | Z = z, X = x]$ for $z = 0, 1$. We are interested in testing
\[ H_0 : m_k^1(x) = m_k^0(x) \quad \text{for all } x \in \mathcal{X} \text{ and } k = 1, ..., K. \]
Apply Chernozhukov, Lee and Rosen (2013) and form a Kolmogorov-Smirnov type test
statistic:
\[ KS = \sup_{k, x} \left| \frac{\hat m_k^1(x) - \hat m_k^0(x)}{s_k(x)} \right|, \]
where $\hat m_k^z(x)$ is a local linear estimator and $s_k(x)$ is the standard error of $\hat m_k^1(x) - \hat m_k^0(x)$.
Construct the critical value $c_\alpha$ by multiplier bootstrap. Let $\hat m_k^*(x)$ be a multiplier process
such that
\[ \hat m_k^*(x) = \frac{\sum_{Z_i = 1} \eta_i\, \hat\epsilon_{k,i}\, K_{h_1}(X_i - x)}{\sum_{Z_i = 1} K_{h_1}(X_i - x)} - \frac{\sum_{Z_i = 0} \eta_i\, \hat\epsilon_{k,i}\, K_{h_0}(X_i - x)}{\sum_{Z_i = 0} K_{h_0}(X_i - x)}, \]
where $\{\eta_i\}_{i=1}^N$ is simulated from i.i.d. $N(0, 1)$ and is independent of the data, and
\[ \hat\epsilon_{k,i} = 1\big(Y_i \le \hat q_{1|C}(\tau_k) T_i + \hat q_{0|C}(\tau_k)(1 - T_i)\big) - \hat m_k^1(x_i) Z_i - \hat m_k^0(x_i)(1 - Z_i). \]
$c_\alpha$ is the $(1 - \alpha) \times 100\%$ percentile of the simulated process $\sup_{k, x} \left| \hat m_k^*(x) / s_k(x) \right|$.
Reject the null if $KS > c_\alpha$.
Extension III: Testing for Conditional Rank Invariance or Similarity
Two main modifications are required:
1. First, estimate conditional quantiles conditional on some covariates $X_1$ of interest.
2. Second, use additional covariates $X_2$, other than the conditioning covariates in the first step,
to perform the test.
Feasible only when the conditioning set for the conditional quantiles is small.
E.g., we estimate quantiles of potential earnings and perform tests separately for male and female trainees,
so the tests are essentially rank similarity tests for conditional ranks conditional on gender.
Conclusion
Proposes nonparametric tests for rank invariance or rank similarity, assumptions that are popular in program evaluation
and in various QTE models.
The tests explore whether the distribution (or features of it) of potential ranks remains the
same among observationally equivalent individuals.
Simulations show good size and power of the proposed tests in small samples.
Empirical application to the JTPA training program:
Training causes some individuals to systematically change their ranks in the distribution of earnings.
Program effects are more complicated than suggested by standard QTEs.
Should be cautious in equating program impacts on the distribution of earnings with those on
individual trainees.
(Incomplete) Reference List
QTE models: Abadie, Angrist and Imbens (2002), Chesher (2003, 2005), Chernozhukov and
Hansen (2005, 2006, 2008), Firpo (2007), Firpo, Fortin and Lemieux (2007), Chernozhukov,
Imbens and Newey (2007), Horowitz and Lee (2007), Imbens and Newey (2009), Rothe
(2010), Frolich and Melly (2013), Powell (2013), Yu (2014), etc.
Other works in rank invariance/similarity testing
Frandsen and Lefgren (2015): a parametric test for rank similarity, testing the equality of mean
ranks with or without treatment.
Yu (2015): a test for rank invariance, assuming unconfoundedness.
JTPA: Abadie, Angrist and Imbens (2002), Chernozhukov and Hansen (2008), Orr et al.
(1996), Heckman, Smith, and Clements (1997), etc.
Star: Krueger (1999), Krueger and Whitmore (2001), Chetty et al. (2010), Jackson and
Page (2013), etc.
Thank you!