"Sink or Swim:" What Happened to California's
Bilingual Students after Proposition 227?
Valentina A. Bali
California Institute of Technology
[email protected]
August 10th, 2000
Abstract

Proposition 227, passed in California in 1998, aimed to dismantle bilingual programs in the state's public schools. Using individual-level data from a southern California school district, I find that in 1998, before Proposition 227, limited-English-proficient (LEP) students enrolled in bilingual classes had lower scores in reading than LEP students not enrolled in bilingual classes: 2.4 points less on a scale from 1 to 99. In math these bilingual students scored 0.5 points higher than non-bilingual LEPs. But in 1999, after Proposition 227, the same set of students had scores no worse than non-bilingual LEP students in reading and were still 0.5 points higher in math. Proposition 227, which interrupted bilingual programs early and emphasized English instruction, therefore did not set bilingual students back relative to non-bilingual LEPs and may have even benefitted them.

I thank Matthew O. Jackson, R. Michael Alvarez, Christine Rossell and Fred Boehmke for helpful comments and discussion. I am grateful as well to William Bibbiani of the Research, Evaluation and Testing Department at Pasadena Unified School District for providing the data, and for insightful discussions.
1. INTRODUCTION
Proposition 227 passed in California's June 1998 primary election with an ample
margin of approval: 61% of voters supported the measure statewide, while only two
counties in the state, San Francisco and Alameda, voted against it. The main goal
of the initiative was to dismantle long-standing bilingual programs in public schools.
Specifically, after Proposition 227, a child could be kept in a bilingual education program only if the parents requested a waiver and it was approved by school authorities. This deceptively simple new standard was implemented to different degrees throughout the state. Many school districts fully complied with the new law while others unsuccessfully sought legal exemptions. As a result, the overall percent of LEP students enrolled in bilingual programs declined from 29% in 1998 to 11% in 1999.1
This shift of educational regimes, from one encouraging primary language instruction to one emphasizing early English instruction, is taking place in many of California's districts and may take place in other states as well. It is important, then, to understand its immediate impact on students.
The purpose of this paper is to examine the academic performance of limited-English-proficient (LEP) students after Proposition 227. In particular, the focus is on former bilingual LEP students: LEPs who were enrolled in bilingual programs in 1998 but not in 1999, and whose academic performance can therefore be compared before and after the reform.
The contributions in this paper are three-fold. First, from a substantive point of view, looking at individual-level data, I find that former bilingual students in a southern California district were not set back by Proposition 227 relative to non-bilingual LEPs. In particular, controlling for background characteristics, LEP students enrolled in bilingual classes in 1998 but not in 1999 had standardized scores 2.4 points lower in reading and 0.5 points higher in math (on a scale from 1 to 99) than LEP students not enrolled in bilingual classes in 1998, whom I refer to here as non-bilingual LEPs. After Proposition 227, I find that once former bilingual students were placed in English classrooms with special support, their scores were no worse than those of non-bilingual LEPs in reading and were still 0.5 points higher in math. In general, then, former bilingual LEPs caught up with non-bilingual LEPs after the implementation of Proposition 227.
The second contribution is methodological. A common, yet often ignored, problem in assessing English instruction programs is that students with little English skill may be exempted from taking standardized tests in English. Bilingual LEPs, in particular, are more likely to be exempted from testing. In those cases, when we compare bilingual LEP students, who are exempted more often, to non-bilingual LEP students, who are exempted less often, the conclusions can be biased toward favoring bilingual programs if mostly the better-performing bilingual students are being tested. In general, not accounting for the exemption process can yield inconsistent estimates of expected performance. In this paper I account for the selection process with a model that explicitly takes the test-taking process into consideration. If the selection model is not used, bilingual LEP students seemingly fare better when compared to non-bilingual LEPs.
Finally, to check the generalizability of the school district results, I look at county-level data. I find that counties with higher percentages of bilingual LEP students experienced gains in their test scores that are statistically no different from those of counties with fewer bilingual LEP students. On the other hand, counties with higher percentages of Hispanic students experienced larger gains than counties with smaller percentages of Hispanic students. More specifically, after racial and economic controls, when comparing counties with no Hispanics to counties with 40% Hispanics, such as Los Angeles, I find that 1.2% more students score above the 50th national percentile ranking (NPR). The fact that counties with more Hispanics experienced higher gains is consistent with Proposition 227 not setting English-learner students back.
These findings have important implications. Given that bilingual students do as well as non-bilingual students when bilingual programs are dismantled, this suggests that interrupting or shortening their stay in a bilingual program does not set these students back. More English instruction does not seem to hamper them, at least in the short run. Clearly, one-year effects are short term, and what matters are long-term effects. The next several years should provide ample opportunities to explore whether the long-term effects of this large reform are positive or negative. Understanding the effects of Proposition 227 is crucial given that similar measures are being considered in other states such as Arizona, Massachusetts and New York, and given the increasing size of the immigrant population.
The paper is organized as follows. The next section briefly reviews the passage of Proposition 227. Section 3 discusses the hypothesis while Section 4 reviews the data and the methods. Section 5 then presents the main results by comparing bilingual LEPs' scores in 1998 and 1999 to those of non-bilingual LEPs and non-LEPs. Section 6 looks at county-level results. Alternative specifications and caveats are in Section 7, while conclusions and discussion are in Section 8.
2. THE PASSAGE OF PROPOSITION 227
Proposition 227, sponsored by a citizen organization, passed in California with 61% of voter approval. Only two counties, San Francisco and Alameda, out of the 58 California counties, did not support the initiative. The level of voter approval across the state surprised observers at the time, given that many teacher unions and Hispanic organizations had strongly mobilized against Proposition 227. Moreover, many counties have large Hispanic populations. Los Angeles, for example, has over 45% Hispanics while the state overall has close to 30%.2 This remarkable uniformity of voter approval did not necessarily imply an informed consensus on the educational merits of bilingual instruction. Voters did not base their decision exclusively on their views on bilingual instruction. In a probit analysis of exit-poll voters, Alvarez (1999) shows that racial and ideological identifications were driving factors in the passage of Proposition 227, independent of opinions on the efficacy of bilingual instruction (see also Cornelius and Martinez (2000) and Ji (2000)).
Bilingual education is a racially and ideologically charged issue for voters, and this educational method is also controversial for researchers.3 The academic literature has not arrived at a general consensus on the benefits of bilingual programs. This lack of consensus stems at times from ideological biases, problematic methodology, or simply intellectual disagreement. Greene (1999) and Rossell and Baker (1996) provide the most recent comprehensive surveys of the efficacy of bilingual education, yet they reach quite opposite conclusions. Rossell and Baker review 72 methodologically acceptable studies from a pool of 300 studies. Their main conclusion is that the research evidence does not support bilingual programs as a strictly better form of instruction than English-as-a-second-language (ESL) programs or structured English immersion (SEI). Only in a minority of studies is bilingual instruction better than a regular English classroom.

2 See the Department of Finance of California website (http://www.dof.ca.gov) for demographic information on 2000 projections.

3 For a description of the different bilingual programs in the country and their methods see Faltis and Hudelson (1998). For a history of bilingual education and its politics see Crawford (1995).
But other researchers have concluded that bilingual programs can be as effective as English-only ones and sometimes even more effective (Willig (1985), Collier and Thomas (1989, 1997), Garcia (1991), Krashen (1998, 1999), Hakuta (1994)). In a meta-analysis of bilingual efficacy that used 11 studies, Greene (1999) finds that bilingual instruction is superior to programs emphasizing early English instruction. Similarly, Thomas and Collier (1997) concluded that bilingual instruction, particularly when literacy in both languages is emphasized, was better than any other program for LEP students. One of the better-known long-term studies was conducted by Ramirez (1992), who tracked students for over four years in various different programs of instruction for LEP students. Ramirez and his associates found that bilingual programs of the early-exit type (where the goal is to exit the students as soon as they learn English) are better than immersion (all-English special instruction) programs, but only in the early years. In later years the benefits from bilingual instruction disappear. Importantly, though, the Ramirez study did not statistically account for the fact that fewer bilingual students were tested than students in other programs. For example, only 29% of bilingual students were tested while 42% of the alternative immersion program students were tested. This can induce a favorable bias toward bilingual programs since only the better-performing bilingual students get tested (Rossell (1999)).
The majoritarian opinion of California's voters was for reform, but it was not based entirely on assessments of educational outcomes. Before Proposition 227 there was no broad agreement among voters, teachers or academics about the potential effects of this initiative. After the public release of 1999 aggregate school-level scores, which showed small increases, the reactions were mixed. Those who advocated bilingual instruction cautioned against ignoring across-the-board increases when looking at LEPs' gains in scores (Hakuta (1999)). Others compared the gains between districts which thoroughly complied with Proposition 227's mandate and those that maintained bilingual programs, and concluded that the initiative had worked (Amselle (1999)). The next sections, by focusing on individual data and county-level data, will hopefully provide further understanding of the impact of the initiative.
3. THE HYPOTHESIS
The main hypothesis to test is whether Proposition 227 had a negative impact in 1999 on former bilingual LEPs compared to former non-bilingual LEPs, relative to their baseline performances in 1998. To test the main hypothesis one needs to look at both 1998 and 1999 overall scores or, alternatively, individual 1998-1999 gains. We may expect that, overall, students enrolled in 1998 in bilingual classes had lower 1998 scores than non-bilingual LEPs and non-LEPs, since bilingual students were exposed to much less English, the standardized tests were fully in English and designed for fluent students, and different studies have suggested that full fluency can take from 5 to 7 years (National Research Council (1998), Collier and Thomas (1989)). Controlling for background information and school effects, I will test the first hypothesis:

H.1: Bilingual LEP students had statistically lower scores than non-bilingual LEP students in 1998.
Dismantling bilingual instruction can have a negative effect on bilingual students if bilingual instruction is a superior program or if, regardless of the merit of the program, interrupting it and expecting English competence in a short period of time harms them. Dismantling bilingual instruction can positively affect bilingual students if bilingual instruction is not a superior program (in and of itself or due to implementation) or if bilingual instruction is beneficial only for short periods of time. The goal of this paper is to estimate the effect of the reform or, more precisely, the effect of interrupting bilingual instruction. In particular, I will test the hypothesis held by bilingual advocates that LEP students would not benefit from the reform compared to other LEP students enrolled in less adequate programs or already exited from bilingual instruction.

H.2: The gap in test scores between former bilingual LEP students and continuing non-bilingual LEP students increased in 1999.
Ideally we would first compare the performance of former bilingual LEPs against continuing bilingual LEPs rather than continuing non-bilingual LEPs. This comparison would hold constant their 1998 bilingual background while varying their 1999 status. The problem with this comparison is the possibility of strong biases in determining who is assigned to each program alternative. The decision to place or continue a student in a bilingual instructional setting is most likely non-random. However, in the southern California district of my study, this potentially problematic comparison is actually not possible since only 200 bilingual students continued in bilingual instruction and they were all exempted from test-taking in 1999. Moreover, the continuing bilingual students were all from the same school, and their waivered status was mostly the result of activist teachers at that particular school. Therefore, although I cannot compare former and continuing bilingual LEPs, I can compare former bilingual students and non-bilingual students with confidence that there are no large biases in the composition of the groups due to Proposition 227-induced changes.
4. THE DATA AND THE METHODS
4.1 The District
Pasadena Unified School District (PUSD) is in Los Angeles county, southern California. In 1998-1999 it had a total population of approximately 22,000 students, of whom 18,300 were eligible for testing and took the Stanford 9 tests.4 Pasadena is quite diverse in its population, as can be seen in Table 1. Compared to California's total 1999 averages, Pasadena has a larger minority student body and its students come from backgrounds that are more disadvantaged, as seen in the percentages of students from families qualifying for Aid to Families with Dependent Children (AFDC) and Free Lunch programs. Pasadena's LEP percentage, 26.3, is, on the other hand, very similar to California's percentage of 24.6. In terms of academic performance Pasadena lags California in every grade as measured by national percentile rankings (NPR) from reading Stanford 9 scores. The last two rows in Table 1 show the difference for second and eleventh grades.

4 Stanford 9 tests are the standardized tests which, by law, all students in California in grades 1-11 must take since 1998. The tests cover math, reading, language and subject areas.
(Table 1 about here).
Overall, PUSD is a good representative school district in which to study the effects of Proposition 227 in that it has a large Hispanic and LEP student body, one which is comparable to the state's demographics. The fact that it has a possibly more disadvantaged student body and in general lower scores can make it more difficult for any reform to succeed, but if positive effects are found then they are all the more convincing.

The language of Proposition 227 implied that each district had to inform the parents about the new regime and of their option to request a waiver to keep their child in a bilingual program. In the 1998 academic year there were approximately 5,400 LEP students; 2,900 of these (or 16% of the student population including kindergarten) were enrolled in bilingual classes, primarily in grades K-4. Less than 3% of bilingual students were from an Armenian background, the other minority group which was offered bilingual instruction in Pasadena. By 1999, after the passage of Proposition 227, the district largely dismantled its bilingual programs after few requests for waivers from parents were received. The majority of the bilingual students were placed in structured-English-immersion (SEI) classes where English is taught at the students' level, while non-bilingual LEPs continued in regular classrooms or classrooms with some English-as-a-second-language support. Approximately 200 waivers were requested by parents and accepted, all coming from the most heavily Hispanic school. The district went from roughly two-thirds of its 30 schools offering bilingual programs in 1998 to just one school in 1999.5

5 Since continuing bilingual students are not included in this study, I will often refer in the remainder of this paper to former bilingual LEP students simply as bilingual LEPs. Similarly, I will refer to continuing non-bilingual LEPs as non-bilingual LEPs.
4.2 The Data
To test the hypothesis I use multivariate analysis in which the dependent variable is test scores in reading and math and the explanatory variables correspond to background and school information. To measure the performance of students I use their 1998 and 1999 Stanford 9 test scores. This is the test that all California students in grades 2-11 must take by law since 1998. Only math, reading and language are tested across all grade levels, and I will focus the analysis on total reading and math.6
In general the variables to be incorporated in the analysis can, somewhat arbitrarily, be grouped into three categories: individual, group and school variables. The individual variables are those describing a student's English proficiency classification. The group, or family, variables are: race, socioeconomic level, welfare (AFDC), Free Lunch program, and the residence indicators Both Parents, Mother and Father. Socioeconomic level can take three values, low, mid, and high. These levels are derived from relative real estate values at a student's address. AFDC is welfare for families with children, and Free Lunch captures students enrolled in the federally funded program of free/reduced-price lunches. Both Parents, Mother and Father indicate the guardians the student lives with in the household. In general, lower SES or welfare variables are expected to be associated with lower scores, while relatively more stable households composed of both parents are expected to be associated with higher scores.
6 The test scores are normal curve equivalent (NCE) scores, which are obtained by first scaling the raw score of a student given the difficulty of the questions, such that an increase of a point at one place in the scale is equal to a point increase anywhere else in the scale. Next, these scaled scores are translated into a national percentile rank (NPR), which is the percentage of the national norming sample who scored equal to or less than the student. Finally, the NPR is re-expressed as a value from a normal curve. The benefit of using NCE scores is that comparisons can be made across subjects and grades.
The school variables are Class Size, Percent Full Credentials and Magnet. Class
size is the average class size of a school, while Percent Full Credentials is the percent of credentials held in a school which are full credentials, as opposed to emergency or
interim credentials. Magnet is an indicator variable for the three magnet schools in
the district. I expect larger class sizes to have a negative impact on scores while I
expect higher percentages of full credentials to be associated with higher scores.
Below is a brief description of the variables that capture the level of English proficiency, which are the focus of most of the analysis.

LEP/Non-LEP:

A LEP student is not yet deemed proficient in English, as measured by a standardized evaluation.7 If a student was assessed as LEP in 1999 then they were also LEP in 1998. Non-LEP students are either fluent natives or former LEPs who have been redesignated as proficient. LEP students in general score significantly less than non-LEPs, and this gap is to be expected in the Pasadena district as well. Moreover, in any given year the gains of LEPs are expected to be larger than those of non-LEPs, since LEPs' gains include increased comprehension of English and not just expanded acquisition of the material (Rossell and Baker (1996)).
7 Standardized evaluations of LEP students can be problematic. The category of LEP does not exclusively include students learning English but may also include students who are now fluent but were not so previously, and in some cases even students who know no language other than English. What all LEP students have in common is a family member who does not speak English and low scores.
Former Bilingual LEP/Non-Bilingual LEP:

In 1998 a LEP student could be enrolled in bilingual classes or not. If they were enrolled in bilingual classes I refer to them as former bilingual LEP (or just bilingual LEP); otherwise they are non-bilingual LEP 1998. Note that non-bilingual LEPs may have been enrolled in bilingual classes before 1998. After Proposition 227, most bilingual LEPs were assigned to SEI classrooms or mainstream classrooms with some English support. Non-bilingual LEPs largely continued in their previous program (mostly regular classrooms with English support) unless redesignated as non-LEP, or fluent, in which case they attended regular classrooms.
4.3 Methods
The assignment in 1998 of students into a given English learning program was determined in great part by the district's assessment and subsequent recommendation to the parents. Students were clearly not randomly selected into bilingual or all-English programs, so we must address other factors, apart from enrollment in or out of bilingual classes, that may have influenced students' scores. In addition, out of a pool of 14,000 students enrolled in the district in both years, more than 1,000 students were exempted while close to 1,000 students skipped the reading and math tests (in the data set the students who were exempted and those who missed the tests are indistinguishable). If the exemptions and misses are not random, these underlying selection processes must be taken into account; otherwise the estimates of the coefficients will be inconsistent. A first conjecture with regard to the direction of the bias is that the less proficient LEPs are being exempted while the lower-achieving students are missing the tests.
In this paper I use a method of estimation, Heckman's selection model, that explicitly models the selection process. That is, two equations are actually estimated. The first equation is the one that explains test scores and is the one we are interested in. Without a selection process this equation would be estimated by standard ordinary least squares (OLS) techniques. The second equation, the selection equation, predicts whether a score is observed or not. Separately from the scores equation, this equation could be estimated as a discrete binary choice model (probit). In Heckman's model, the coefficients and parameters in both equations are estimated simultaneously by maximizing the likelihood of observing the data.8 An important parameter that is estimated is the correlation, ρ, between the errors (the non-deterministic components) in the two equations. If the correlation is significantly different from zero, this suggests the two processes, scores and test-taking, are interdependent and the selection model is an appropriate approach. Moreover, when a variable appears in both equations, the total marginal effect will depend on the effect in the scores equation plus a correction term that is linearly weighted by the correlation between the errors (see Appendix A for details of the model).
8 See Green (1998) or Maddala (1996) for a detailed explanation of the method.
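To make the estimation strategy concrete, below is a minimal sketch of the two-step (Heckit) variant of this model in Python with statsmodels. The paper estimates both equations jointly by maximum likelihood; the two-step version is shown only because it makes the correction term explicit. Every file and column name here is a hypothetical placeholder, since the PUSD data set is not public.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import norm

# Hypothetical student-level data: one row per student, with a missing score
# when the student was exempted from, or missed, the test.
df = pd.read_csv("pusd_students_1998.csv")
took_test = df["reading_nce_1998"].notna().astype(int)

# Selection equation: who gets tested.
W = sm.add_constant(df[["bilingual_lep", "lep", "hispanic", "black",
                        "magnet", "pct_hispanic_teachers"]])
# Scores equation: what explains the observed NCE scores.
X = sm.add_constant(df[["bilingual_lep", "lep", "hispanic", "black",
                        "low_ses", "both_parents", "class_size",
                        "pct_full_credentials"]])

# Step 1: probit for the probability of taking the test.
probit = sm.Probit(took_test, W).fit(disp=0)
index = np.asarray(W @ probit.params)
mills = norm.pdf(index) / norm.cdf(index)        # inverse Mills ratio

# Step 2: OLS on the tested subsample, adding the Mills ratio as a regressor.
tested = took_test == 1
X_step2 = X.loc[tested].copy()
X_step2["inv_mills"] = mills[tested]
scores = sm.OLS(df.loc[tested, "reading_nce_1998"], X_step2).fit()
# The coefficient on "inv_mills" estimates rho * sigma_eps; a value far from
# zero indicates that test-taking and scores are interdependent.
print(scores.summary())
```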
5. BEFORE AND AFTER PROPOSITION 227
5.1 Overall View
I begin the analysis by looking at average scores without controlling for background or school information. The student population studied is those enrolled in the district in both years, 1998 and 1999, and tested in 1999.9 Some of these students may be missing scores in either year, but they were enrolled in the district. Table 2 presents their average scores in 1998 and 1999 for reading and math by level of English proficiency. In parentheses the number of students who actually took the test and the standard deviation of their scores are also included.10 LEP students who were enrolled in bilingual classes in 1998 increased their average scores by 4.4 points in reading and 4 points in math in 1999. Non-bilingual LEPs, on the other hand, experienced smaller increases in 1999: 1.5 and 2.5 points in reading and math respectively. Further, bilingual LEPs' 1999 scores in reading and math are statistically indistinguishable from those of non-bilingual LEPs. Non-LEP students have much higher average scores than either bilingual or non-bilingual LEPs, but their average gains are much smaller: 0.7 points in reading and 1.2 in math. This preliminary break-down already suggests that bilingual LEP students caught up to non-bilingual LEP students after Proposition 227. These numbers, though, do not include statistical controls for background. Moreover, this simple inspection does not account for the fact that many bilingual students did not take the tests in 1998. The next section addresses these problems using a selection model that includes controls for background characteristics and estimates a scores equation and an equation explaining who is more likely to be tested.

9 As discussed in Section 6, the same qualitative results hold when comparing all students who took the tests in each year, without the present restrictions.

10 1998 scores include test-takers in grades 1-10 and 1999 scores include test-takers who have moved on to grades 2-11.
(Table 2 about here).
5.2 The Baseline in 1998
Table 3 below presents the results of the selection model for 1998. In the selection equation, several coefficients are significant at the 95% level. Being a former bilingual LEP, LEP, Hispanic, or black decreases a student's chances of taking the tests. Similarly, this probability decreases further if the school has a large percentage of Hispanic teachers. Students in magnet schools are more likely to be tested, as are those in grades greater than 3 (not shown).
(Table 3 about here).
I begin by looking at the background and school information variables that appear only in the scores equation. The coefficients for these variables have a straightforward interpretation, as in an OLS model, without any corrections. In general the signs of the effects are all in the expected direction. For example, all else equal, students with low SES backgrounds have lower scores in reading (-3.2) and math (-3.3) than students from high SES backgrounds. Family stability, on the other hand, corresponds to significant increases in scores: having both parents in the family is associated with 2.4 points more in reading and 3.15 points more in math in comparison to living with step parents, a foster family or in an institutional setting, the excluded categories. With regard to the school variables, the policy variables over which the district has direct influence, I find that the coefficient on Percent Full Credentials, 0.20, is positive and significant for both reading and math. All else equal, if we compare a school with 65% of its credentials being full, close to the district's average, with a hypothetical school with 100% of its credentials being full, the increase in the percentage of full credentials corresponds to an increase of 7 points in reading and math. These increases are large when compared to the average increases in scores, close to 3 percentile points in the national ranking, ascribed to the reduction in class size in California's schools (Los Angeles Times (June, 1999)).
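(The 7-point figure is simply the coefficient times the credential gap: 0.20 x (100 - 65) = 7 points.)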
I focus next on the variables LEP and Bilingual LEP, which are the subject of this analysis. As I have modeled the selection process, these variables can influence both a score and the probability of taking a test. The variable LEP is a discrete variable which is one when a student is bilingual LEP or non-bilingual LEP in 1998 and zero otherwise. Therefore, if the variable Bilingual LEP is significant, it means there is an extra effect for former bilingual students. Table 4 below summarizes the results (the exact determination of the total effects is included in Appendix B). The net effect of having been enrolled in bilingual instruction in 1998 for a LEP student is 2.4 points less in reading than a non-bilingual LEP and 0.5 points more in math. These differences are significant at the 95% level. While in bilingual classes, LEP students did worse in reading than non-bilingual LEP students, as would be expected given that they were exposed to less English. The fact that their scores in math are virtually the same suggests that the assignment into the two groups was based on language skills rather than academic skills. These effects are significant and confirm the intuition behind Hypothesis 1: bilingual students enrolled in 1998 had statistically lower scores than non-bilingual LEPs in subjects that stress English skills.
(Table 4 about here).
With regard to the other indicator of English proficiency, LEP 1998, I find that LEP students have much lower scores than non-LEP students, the excluded indicator variable. When we combine the effects of the LEP variable from both the scores and the selection equations, a representative (Hispanic and non-bilingual) LEP student scores 13.1 points less than a non-LEP student in reading and 9.7 less in math. These differences are large and significant at the 95% level. Difficulty with the English language, being a LEP or not, is the variable with the strongest impact on a student's score. Furthermore, after controlling for LEP status, Hispanic students still score close to 6.3 and 7.3 points less in reading and math than white students. For black students the gap with respect to white students is larger: 9.8 points in reading and 11.9 points in math. My present analysis does not include further controls, such as parents' education and at-home behavior, which could reduce the gap between white and Hispanic and black students. On the other hand, this gap is consistent with many similar findings in the literature of a persistent racial gap after ever more thorough controls.11
11 See Jencks and Phillips (1998) for an excellent account of the test-score gap between black and white students. See NCES report 95-767 on Hispanics in education for a discussion of their gap in scores with respect to white students.

5.3 Good News after Proposition 227?
Bilingual students had lower scores in reading than non-bilingual students most likely because they had less exposure to English, since their math scores were actually slightly higher than those of non-bilingual LEPs. In 1999, most bilingual LEP students were placed in structured-English-immersion classrooms where the content is, in theory, the same as in regular classrooms but the English is adapted to suit the student's level. That is, in 1999 many bilingual students had their educational program interrupted (especially if the student was entering second or third grade, given that the average stay in bilingual programs was above two years) and were placed in a different educational program that heavily emphasized English acquisition. What happened in 1999, after Proposition 227, to these former bilingual LEPs? I find that in 1999 former bilingual LEP students had scores in reading that were not statistically different from those of non-bilingual LEPs.
(Table 5 about here).
Table 5 presents the results from a selection model analysis in which scores from 1999 are explained by the same independent variables as in the scores equation for 1998. Complete results are included in Appendix C. As in the previous section, the students included in the analysis are those in the district in both years. In reading, bilingual LEPs had scores 0.37 points lower than non-bilingual LEPs (p-value > 0.2). That is, at the 95% level bilingual LEPs scored statistically indistinguishably from non-bilingual LEPs. Likewise for math, the positive coefficient, 0.49, is not significantly different from zero (p-value > 0.43). Non-bilingual students' scores also went up: in 1999 they are 12.8 points less than non-LEPs in reading, rather than 13.1 in 1998, and 9.4 points less in math rather than 9.8 points.12 From Table 2 we know that non-LEP scores went up as well, though by a much smaller amount. So the reduction in the gap between bilingual LEPs and non-bilingual LEPs is not due to the "top" performing students going down. Rather, it seems the "bottom" performing students caught up a bit. Exposing bilingual students to a program that emphasized English acquisition did not set these students back relative to non-bilingual LEP students. In this way we can refute the second hypothesis. Therefore: the gap in scores between bilingual LEP students and non-bilingual LEP students decreased in 1999.
An alternative way to analyze the impact of the reform is to look at individual gains in scores. Table 6 below presents the overall gains for students enrolled in the district and with test scores in both years. In reading, the gains of bilingual LEPs are statistically higher than those of non-bilingual LEPs at the 95% level. In math, the gains of the two groups of students are statistically indistinguishable at the 95% level. This preliminary inspection is consistent with the previous findings. A more thorough analysis taking the control variables into consideration runs into difficulties: gains are not well explained by the independent variables previously introduced (the adjusted R-squared values are 0.06 and 0.04 in reading and math). That is, apart from language proficiency, the other independent variables cannot explain much of the gains. This is consistent with the fact that, in theory, standardized tests are designed such that a student who has learned his grade-level material will test at the same percentile level as the previous year (Rossell and Baker (1996)).

12 With regard to the other independent variables in the model, we would not expect them to have a differential effect one year later, and they display essentially the same magnitude and direction as in 1998.
(Table 6 about here).
Finally, an interesting question remains with regard to alternative scenarios. Specifically, what would PUSD scores have been in 1999 if Proposition 227 had not been passed? Figure 1 below shows three different average scores: average predicted scores in 1999 for an alternative, counterfactual, scenario in which Proposition 227 was not implemented; average predicted scores from the selection model in 1999; and average actual scores in 1999.13 The average scores in reading and math for the scenario without implementation, 29.56 and 40.43, are lower than both the predicted and actual average scores. As might be expected, the difference is larger for reading than for math. These results suggest that although increases in scores were to be expected independently of Proposition 227, due to students becoming more comfortable with the test formats or teachers stressing preparation, if the measure had not been implemented then average scores would have been lower.
13 The average score for the alternative scenario is obtained by adding the marginal effect of bilingual instruction from the 1998 selection model to the predicted average score of a non-bilingual LEP in the 1999 selection model.
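As an illustration of the calculation described in footnote 13, the sketch below reconstructs the counterfactual reading average. The predicted 1999 value for a non-bilingual LEP is a placeholder (the paper does not report it directly); the -2.4-point effect comes from the 1998 model.

```python
# Sketch of the footnote-13 counterfactual: the 1999 prediction for a former
# bilingual LEP, had Proposition 227 not passed, is the 1999 selection-model
# prediction plus the 1998 marginal effect of bilingual instruction.
predicted_1999_reading = 31.96   # placeholder value, not reported in the paper
bilingual_effect_1998 = -2.4     # 1998 reading effect of bilingual enrollment (Table 4)
counterfactual_reading = predicted_1999_reading + bilingual_effect_1998
print(round(counterfactual_reading, 2))  # ~29.56, the no-implementation average in Figure 1
```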
(Figure 1 about here).
Looking at individual-level data from a representative district, I found that bilingual students caught up to non-bilingual students after bilingual instruction was dismantled. A counterfactual scenario suggests that if the measure had not been implemented, scores would have been lower. In general, the emphasis on English instruction did not seem to hamper former bilingual students' academic performance. Having said this, these findings are not conclusive, since they may be in part due to district effects not captured in my analysis or a possibly spurious short-term effect. The next section looks at county-level data to check for general trends.
6. COUNTY LEVEL RESULTS
The results I obtained are for PUSD, a district that I consider representative of California's districts. Yet it is important to check whether similar results hold at more general levels. To do so, I analyze county-level data for California in 1998 and 1999. The explanatory variables include the county percentages of bilingual LEPs in 1998 and of Hispanic, black, white, AFDC, Free Lunch, and LEP students. I also include the percentage of teachers with full credentials. The standard caveats for any aggregate analysis of educational data hold in this case: there is multicollinearity among variables and there is always the possibility of committing ecological fallacies. I view the aggregate analysis as a check of the previously observed results.
(Table 7 about here).
The fit of the analysis is 0.35 for reading and 0.19 for math, of a similar order to the fit from an OLS analysis of the PUSD data. The only coefficients that are significant at the 90% level are Percent Hispanic, Percent Black, and Percent Full Credentials. The coefficient on Percent Hispanic (0.029) is positive for reading, indicating that counties with larger percentages of Hispanics had more students scoring above the median. In particular, if the comparison is made between a county with no Hispanics and a county with 40% Hispanics, then 1.2% more students score above the 50th NPR in reading. For math the predicted increase is 5%. These values would be consistent with Proposition 227 not setting Hispanic students back, though a more definitive test would compare gains from other years, for example 1997-98, to those experienced in 1998-99. Systematic testing by law only started after 1998, so this comparison cannot be made.
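(The reading figure follows directly from the Percent Hispanic coefficient: 0.029 x 40 is roughly 1.2 percentage points of students above the 50th NPR.)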
For reading, the coefficient on Percent Bilingual LEP is negative, -0.01, but not significant at the 90% level (p-value = 0.29), and for math it is positive, 0.001, though also not significant (p-value = 0.96). The fact that the coefficient is negative for reading may partly be the result of only the higher-performing bilingual LEP students being tested in 1998. Across the state, 57% of LEPs were tested in 1998, while in 1999 the percentage tested ranged from 63% to 82% (the uncertainty in the latter is due to as yet unresolved problems by the test-makers in collecting language fluency information). These results (counties with more Hispanics experiencing higher gains, and counties with more bilingual students experiencing the same gains as those with fewer) suggest that Proposition 227 did not set LEP students back. But again the findings are not definitive, since one cannot compare the current gains to gains experienced in other years nor account for the bias induced by a more selective pool of 1998 test-takers.
7. ROBUSTNESS
There are several clarifications necessary when assessing educational reforms in general, and in this study in particular. These are addressed below.
Selection Model vs. OLS:

If a standard linear model (OLS) is estimated instead of the selection model estimated in Section 4, then bilingual LEPs score 1.8 points less in reading and 0.73 more in math than non-bilingual LEPs (p-values are 0.005 and 0.287 respectively). These numbers imply a smaller gap between the two groups than those obtained earlier with the selection model: 2.4 points less in reading and 0.5 points more in math. The differences between the two estimates (selection vs. OLS) are somewhat small considering that close to 50% of bilingual students did not take the tests compared to 11% of non-bilingual LEPs. The selection model is the correct methodology for analyzing the PUSD data (the estimates are consistent and the model is identified), but it may not be capturing the test-taking process completely. As a result, the selection model estimates are not dramatically different from the OLS estimates. Using a selection model, future research should incorporate further variables into the test-taking equation, such as English proficiency level or teacher certification and racial background.
Redesignation:
Bilingual and non-bilingual students were redesignated in 1999
as non-LEPs. This would have occurred with or without the reform. If I repeat the
same analysis done throughout this paper but excluding bilingual and non-bilingual
students who were redesignated as non-LEPs in 1999, then the same qualitative results
obtain. Bilingual LEPs (the reduced set) had lower scores in 1998 than non-bilingual
LEPs (also a reduced set), but in 1999 they caught up with them.
Stable population of students bias:

The analysis in this paper looks at students who were in the district in both years, before and after Proposition 227. The rationale was that in this way the general district-wide impact would be held constant. Newly arriving bilingual students may have had very different experiences in their bilingual instruction, further complicating the comparisons with non-bilingual LEPs. This choice of a more stable population can induce bias. The direction, though, is not clear, since anecdotal accounts often refer to the high mobility of low-income students yet the data suggest otherwise. In PUSD the group who left in 1998 had statistically the same percentage of LEP students and mean SES level as those who stayed. Moreover, the percentage of white students who left was slightly higher (significant at the 95% level) than of those who stayed. Repeating the analysis done in the paper, but without a restriction on enrollment for two consecutive years, yields the same qualitative results. In 1999, bilingual LEPs scored indistinguishably from non-bilingual LEPs.
Technical Issues:

The data do present some heteroskedasticity due to boundary effects. That is, larger errors occur in the estimation when predicting close to the boundaries of the scale. All estimations were done without including robust standard errors to minimize the chances of incorrectly concluding a variable had significant effects. If the 1999 results are checked with a probit analysis in which a one codes for scores above 50 and a zero otherwise, then again bilingual LEP students scored no differently than non-bilingual LEPs in 1999.
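A hedged sketch of this probit robustness check follows, reusing the hypothetical data frame and column names from the Section 4.3 sketch; the actual variable names in the PUSD data are not known.

```python
import statsmodels.api as sm

# Recode 1999 reading scores as 1 if above the 50th percentile, 0 otherwise,
# and fit a probit on the same background and school covariates.
tested_99 = df["reading_nce_1999"].notna()
above_50 = (df.loc[tested_99, "reading_nce_1999"] > 50).astype(int)
X99 = sm.add_constant(df.loc[tested_99, ["bilingual_lep", "lep", "hispanic", "black",
                                         "low_ses", "both_parents",
                                         "class_size", "pct_full_credentials"]])
probit_check = sm.Probit(above_50, X99).fit(disp=0)
# The quantity of interest is the "bilingual_lep" coefficient and its p-value.
print(probit_check.summary())
```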
8. DISCUSSION
In this paper I have shown that before Proposition 227, bilingual LEP students were scoring lower in reading than non-bilingual students, as would be expected since they had not yet been redesignated and they were taught with a heavy emphasis on their primary language. One year later, these former bilingual students had reading scores that are indistinguishable from those of non-bilingual LEP students, who in principle already had a better command of English. Non-bilingual LEP students themselves had better scores in 1999 than in 1998, implying that the lower-performing students, the bilingual LEPs, were catching up to them rather than the non-bilinguals doing less well. I conclude, then, that interrupting bilingual students' stay in bilingual programs did not set them back relative to non-bilingual LEP students. A counterfactual analysis of the Pasadena data further suggests that if Proposition 227 had not been implemented then bilingual students would have had lower scores. The effects of the initiative may have been positive after all.

The methodology used for the analysis is a selection model that estimates scores and the probability of taking a test. I argue this is an appropriate methodology given the potential bias of the population of test-takers toward higher-performing students. Further research will hopefully focus on the test-taking process and incorporate more predictors of this process.
At the county level I find that counties with larger Hispanic populations had higher gains than counties with fewer Hispanics. Counties with more bilingual students, on the other hand, had gains no different from those with fewer bilingual students. These findings are consistent with the initiative having no deleterious effects on students but, without comparisons to gains in other years, cannot be definitive.
Some caveats are in order. First, reforms that completely dismantle bilingual instruction may not be beneficial either. What this paper has suggested is that interrupting bilingual instruction does not set bilingual students back. The key to success may actually be in "small doses" of bilingual instruction early on, and in fact some studies have found that exposure to the primary language is beneficial (Rossell and Baker (1996)). Further, an important point to remember is that factors other than programs for English learning have a larger impact on student scores. Factors such as socioeconomic advantage cannot be modified by the schools or the district, but full credentialing can be required by schools. It is possible that on average an emphasis on teacher standards may help students (and English learners) more than an emphasis on finding the best language program (Hakuta (1999)). And finally, as often mentioned in the paper, long-term effects are what actually matter and, for this data set, the next three or four years will prove enlightening.
REFERENCES
[1] Alvarez, R. M. (1999). "Why did Proposition 227 pass?" Working paper 1062, Caltech.

[2] Amselle, J. (1999). "Teaching English Wins: An Analysis of California Test Scores After Proposition 227." Read Perspectives, Abstract, Fall.

[3] Bali, V. (2000). "Proposition 227 and California: The Efficacy of Bilingual Education Revisited." Mimeo, Caltech.

[4] Baker, K. and de Kanter, A. (1983). "Federal Policy and the Effectiveness of Bilingual Education." In K. Baker and A. de Kanter (Eds.), Bilingual Education, 33-86, Lexington, MA.

[5] Collier, V. (1992). "A synthesis of studies examining long-term language-minority student data on academic achievement." Bilingual Research Journal, 16 (1&2), 187-212.

[6] Collier, V. and Thomas, W. P. (1989). "How Quickly Can Immigrants Become Proficient in School English?" The Journal of Educational Issues of Language Minority Students, 5, Fall, 26-38.

[7] Cornelius, W. A. and Martinez, F. J., eds. (2000). "Educating California's Immigrant Children: The Origins and Implementation of Proposition 227." Monograph No. 2, La Jolla, Calif.: Center for Comparative Immigration Studies, University of California-San Diego.

[8] Crawford, J. (1995). "Bilingual Education: History, Politics, Theory and Practice." Crane, Trenton, New Jersey.

[9] Faltis and Hudelson (1998). "Bilingual Education in Elementary and Secondary Communities." Allyn and Bacon, Massachusetts.

[10] Garcia, E. (1991). "The Education of Linguistically and Culturally Diverse Students: Effective Instructional Practices." Report from the National Center for Research on Cultural Diversity and Second Language Learning.

[11] Greene, J. (1998). "A Meta-Analysis of the Effectiveness of Bilingual Education." Mimeo, University of Texas.

[12] Green, W. (1993). "Econometric Analysis." Prentice Hall, New Jersey.

[13] Hakuta, K. (1999). "What Legitimate Inferences can be Made from the 1999 Release of SAT-9 Scores with Respect to the Impact of Proposition 227 on the Performance of LEP Students?" Release on website http://www.stanford.edu/~hakuta/SAT9.

[14] Krashen, S. D. (1996). "Under Attack: The Case Against Bilingual Education." Language Education Associates, Culver City, California.

[15] Krashen, S. D. (1999). "Condemned without a Trial: Bogus Arguments Against Bilingual Education." Heinemann, Portsmouth, New Hampshire.

[16] Ji, Chang-Ho C. (2000). "Education and Ballot Measures in California: Reflections on the Thirty-Year Experience." Working Paper, University of California, Riverside.

[17] Maddala, G. (1983). "Limited Dependent and Qualitative Variables in Econometrics." Cambridge University Press, New York.

[18] National Research Council Report (1998). "Educating Language-Minority Children." National Academy Press, Washington, D.C.

[19] NCES Report 767 (1995). "The Condition of Education: The Educational Progress of Hispanic Students." National Center for Education Statistics.

[20] Ramirez, J. D. (1992). "Executive summary." Bilingual Research Journal, 16 (1&2), 1-61.

[21] Rossell, C. and Baker, K. (1996). "The educational effectiveness of bilingual education." Research in the Teaching of English, 30 (1), 7-74.

[22] Rossell, C. (1998). "Mystery on the Bilingual Express: A Critique of the Thomas and Collier Study." Read Perspectives, V(2), Fall, 5-32.

[23] Willig, A. (1985). "A Meta-Analysis of Selected Studies on the Effectiveness of Bilingual Education." Review of Educational Research, 55, 269-317.
APPENDIX A: ESTIMATION OF THE TOTAL MARGINAL EFFECT IN HECKMAN'S SELECTION MODEL

Following Green's notation (1998), the total effect is calculated as follows. Consider two equations, one for the selection process and the other for scores:

$$z_i^* = w_i'\gamma + u_i \qquad \text{(Selection)}$$

$$y_i = x_i'\beta + \epsilon_i \qquad \text{(Scores)}$$

Then, for an observed $y_i$, we have

$$E[y_i \mid y_i \text{ is observed}] = E[y_i \mid z_i^* > 0] = x_i'\beta + E[\epsilon_i \mid u_i > -w_i'\gamma] = x_i'\beta + \rho\sigma_\epsilon \, m_i(\alpha_u), \qquad (1)$$

where

$$\alpha_u = \frac{-w_i'\gamma}{\sigma_u}, \qquad m_i(\alpha_u) = \frac{\phi(w_i'\gamma/\sigma_u)}{\Phi(w_i'\gamma/\sigma_u)} = \text{inverse Mills ratio},$$

$\phi(\cdot)$ is the density of a standard normal and $\Phi(\cdot)$ is its cumulative distribution function. The term $\rho\sigma_\epsilon$ is often referred to in the economics literature as lambda ($\lambda$). If $x_k$ takes on values 0 or 1, then the marginal effect is

$$E[y_i \mid z_i^* > 0, x_k = 1] - E[y_i \mid z_i^* > 0, x_k = 0] = \beta_k + \rho\sigma_\epsilon \left[ m_i(x_k = 1) - m_i(x_k = 0) \right].$$
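A small numeric helper implementing the binary-regressor marginal effect above, assuming the usual probit normalization $\sigma_u = 1$; this is a sketch, not the estimation code used in the paper.

```python
import numpy as np
from scipy.stats import norm

def inv_mills(index):
    """Inverse Mills ratio phi(w'gamma) / Phi(w'gamma)."""
    return norm.pdf(index) / norm.cdf(index)

def total_marginal_effect(beta_k, lam, w, gamma, k):
    """Total effect of switching binary regressor k from 0 to 1.

    beta_k : coefficient of x_k in the scores equation
    lam    : rho * sigma_eps, the coefficient on the Mills ratio ("lambda")
    w      : selection covariates, held at their means/modes except entry k
    gamma  : selection-equation coefficients
    k      : position of the binary regressor within w
    """
    w1, w0 = np.array(w, dtype=float), np.array(w, dtype=float)
    w1[k], w0[k] = 1.0, 0.0
    return beta_k + lam * (inv_mills(w1 @ gamma) - inv_mills(w0 @ gamma))
```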
APPENDIX B: TOTAL MARGINAL EFFECTS FOR READING 1998

To calculate the total marginal effects we first obtain $m_i(x_k = 1) - m_i(x_k = 0)$, holding all other variables in $w_i$ other than $x_k$ at their mean value (or modal value if more appropriate). Below I present the estimates for the differences in the inverse Mills ratio as well as the total effect for the reading variables in 1998.

$$\text{Total Effect} = \text{Score Effect} + \lambda \left[ \text{Mills}(1) - \text{Mills}(0) \right]$$

$$\text{Bilingual LEP 1998:} \quad -2.4 = -9.2 + 15.8\,[0.65 - 0.22] \qquad (2)$$

$$\text{LEP 1998:} \quad -13.1 = -14.7 + 15.8\,[0.22 - 0.12] \qquad (3)$$

$$\text{Hispanic:} \quad -6.3 = -7.28 + 15.8\,[0.12 - 0.065] \qquad (4)$$

$$\text{Black:} \quad -9.8 = -11.06 + 15.8\,[0.14 - 0.065] \qquad (5)$$

In (2) the difference in the Mills ratio was obtained for a LEP and Hispanic student who went from non-bilingual to bilingual while all other variables in the selection equation were set at their median. For (3), Bilingual LEP was zero and Hispanic was one. For (4), Bilingual was zero and LEP was one. And for (5), Bilingual and LEP were zero.
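As a quick arithmetic check of the figures above (with signs reconstructed from the text, so treat the snippet as illustrative):

```python
# Recompute the reading-1998 total effects from the Score Effect, lambda, and
# the two Mills-ratio values reported in Appendix B.
lam = 15.8
cases = {
    "Bilingual LEP 1998": (-9.2, 0.65, 0.22),
    "LEP 1998":           (-14.7, 0.22, 0.12),
    "Hispanic":           (-7.28, 0.12, 0.065),
    "Black":              (-11.06, 0.14, 0.065),
}
for name, (score_effect, m1, m0) in cases.items():
    print(name, round(score_effect + lam * (m1 - m0), 1))
# Output: about -2.4, -13.1, -6.4, -9.9, matching (up to rounding) the
# total effects reported in the body of the paper.
```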