Ecological inference reversed: Estimating aggregate features of voter ideal-point distributions from individual-level data
Incomplete, comments welcome, version 0.1
Jeffrey B. Lewis
Department of Politics
Princeton University
July 13, 1999
Abstract
In the last decade a great deal of progress has been made in estimating spatial models of
legislative roll-call voting. There are now several well-known and effective methods of estimating the ideal points of legislators from their roll-call votes. Similar progress has not been
made in the empirical modeling of the distribution of preferences in the electorate. Progress
has been slower, not because the question is less important, but because of limitations of data
and a lack of tractable methods. In this paper, I describe the existing technologies for inferring
ideal points. I then develop a method for recovering the relative means and variances of the
voter ideal point distribution across two (or more) groups of voters from individual-level binary
response data. I extend the model to multiple dimensions and describe tests for dimensionality
and inter-group differences. I then present Monte Carlo results demonstrating the efficacy of
the method.
Prepared for presentation at the Annual Meetings of the Political Methodology Society, College
Station TX, July 1999. A previous related work was presented at the 1999 Midwest Political
Science Association Meetings and at the 1998 American Political Science Association meetings.
Addresses: Department of Politics, Corwin Hall, Princeton University, Princeton NJ 08544 and
[email protected]. This research was supported by a grant from the Woodrow Wilson School of Public
and International Affairs, Princeton University. Thanks to Liz Gerber and David Epstein for comments on a related
paper.
1 Introduction
Political analysts are often interested in whether there are differences in the distribution of preferences, ideology, or policy positions across two (or more) groups of voters or other political
actors. For example, we may wish to know whether incumbent Democratic house candidates are
on average more liberal than Democratic challengers. Or we may want to know if there is greater
heterogeneity in abortion preferences among men than there is among women. However, existing
methodologies for such inference are generally inadequate. In this paper, I draw on the psychometric literature on test taking to develop a method of inferring the relative differences in ideal point
distributions across groups using individual-level binary response data.
The method could be used to recover the distribution of latent characteristics other than ideal
points and the individual actors need not be voters. For example, in the psychometric literature
on item response, the individuals are usually “students” and the latent characteristic might be
“intelligence.” However, I will use the convention of referring to the actors as “voters” and to the
characteristic whose distribution we are inferring as ideal points.
At first glance, the problem seems rather trivial. Surveys such as the American National Election Study ask voters to place themselves on policy scales. If we take responses to such questions
as indicative of voters’ ideal points, tests of differences in the mean or the variance of ideal points
across groups are straightforward. However, such seemingly direct measures may be problematic.
It is unclear how people conceive of such scales. Does a mid-point of a 7-point scale reflect the
same degree of liberalness to all voters? This problem may be particularly acute for inter-group
comparisons. For example, a right-of-center Republican may consider herself to be somewhat
liberal by comparison to her largely Republican peers while a similarly right-wing Democrat may
see himself as moderately conservative. Moreover, it remains an open question whether voters
even conceive of such scales in a way that is consistent with the spatial voting model (see Hinich
& Enelow (1984) and Rabinowitz & MacDonald (1989)). Finally, responses are known to be unreliable (Achen, 19XX), a fact that is not surprising given the somewhat unnatural nature of the
question.
Alternatively, it might be preferable to infer voters' spatial positions from responses to a series
of policy questions. These questions might be votes on propositions (see Lewis (1999)) or they may
be answers to survey questions. All that is required is that in each case the voter be asked to express
her preference over a specific policy question. For example, abortion preferences might be inferred
from a series of yes/no responses on different conditions under which access to abortion should
be allowed.1 While these “questions” may be actual votes or answers to survey questions, I will
follow the convention of referring to the observed voter choices as votes. Such an approach has
the advantage that the basic items are, more or less, objective in the sense that they mean the same
thing to all voters. Of course, researchers have been estimating voter preferences by constructing
indices of binary policy questions for some time. However, such indices are rather crude and do
not (in general) have desirable statistical properties. More sophisticated factor analytic techniques
have also been employed.
These factor analytic models, including those specifically designed for binary choice items (like Poole & Rosenthal's NOMINATE), are fundamentally as problematic as the simple index for these data. In particular, all such methods require the estimation of each individual's ideal point. These individual estimates can then be used to test for differences in ideal point distributions across groups. However, as noted below, the individual-level ideal point estimates are (at best) only consistent as the number of questions asked of each respondent grows large. In most cases, we have only a very few decisions upon which to make inferences about voter preferences. The inconsistent estimates of each voter's ideal point lead to inconsistent estimates of the distribution of ideal points within each group.
One obvious alternative is to use the individual-level response data to estimate features of the group-level ideal-point distributions directly, without the intermediate step of estimating each voter's ideal point. While techniques such as covariance structure modeling (LISREL) have this flavor,
very few examples of this sort of approach have been employed in the Political Science literature.
On the other hand such approaches have been used extensively in the psychological literature on
test taking (see, for example, Lord & Novick (1968), Anderson & Madsen (1977), Bock & Aitken
(1981)).
In what follows, I will recast some of the insights from item response theory to the problem of recovering the distribution of voter ideal points. In particular, the model I develop allows me to estimate inter-group differences in the means and variances of ideal point distributions along two latent dimensions. I then demonstrate the efficacy of the technique through Monte Carlo simulation. The Monte Carlo experiments reveal that with as few as seven observed votes per voter, inter-group differences in ideal point distributions in a two-dimensional policy space can be recovered.
Footnote 1: The questions need not be binary, though I will only consider binary questions in what follows.
The paper unfolds as follows. In the next section, I review the literature on recovering ideal points from voting or other binary choice data. In sections three through five, I develop a model for estimating inter-group differences in the means and variances of voter ideal points across multiple spatial dimensions and illustrate it with simulated examples. In section six, I consider measures of fit and tests of dimensionality for the model. Section seven presents some preliminary Monte Carlo results. Section eight concludes.
2 Spatial models of legislators' preferences or spatial locations
The obvious place to start in any consideration of recovering spatial positions from binary choice
data is models of roll call voting in legislatures. Over the last twenty years there have been great
advances in the methods for mapping legislators in policy spaces based on their roll call voting
records. Unfortunately, these models are not directly importable to the question at hand. Almost
all existing models of roll call voting only have desirable statistical properties as the number of
observed votes grows large. In a legislative setting this is not particularly problematic because for
most legislatures the number of recorded votes is indeed quite large.
Despite the differences in the observed data, models of legislators' preferences are the natural place to start in constructing a technique for inferring voter preferences. Thinking about differences in legislators' voting records in a spatial way has a long history; perhaps the earliest example of this
sort of analysis is Thurstone (1931). The methods employed in this early work were not based on
any explicit theoretical model of preferences and were mainly concerned with identifying voting
coalitions in the Congress, various state legislatures, and even the United Nations [ADD CITES].
Recent advances in the spatial analysis of roll-call data are explicitly grounded in formal models of voting. In these spatial theories of voting (see Hinich & Enelow (1984)), public policy
alternatives are represented as points in space and legislators’ preferences over those points are
defined by the distance between the policy point that the legislator would most like to see implemented (the ideal point) and each of the alternatives. With this theoretical framework (articulated further below), estimated locations of the legislators take on a new significance. That is, the spatial locations of the legislators can be interpreted as their ideal points. As it turns out, a spatial voting model interpretation can be given (roughly speaking) to many of the earlier methods (Heckman & Snyder 1997).

Methods of Estimating Legislator/Voter Locations

Estimator                          Method type    Consistent as            Citation/example
Vote index                         NP             n/a                      ADA score
Guttman scaling                    NP             n/a                      Anderson et al. (1966)
Heckman–Snyder scores              GLS            $m \to \infty$           Heckman & Snyder (1997)
NOMINATE scores                    ML             $n, m, m/n \to \infty$   Poole & Rosenthal (1997)
Random proposal models             MML            $m \to \infty$           Londregan (n.d.)
Rasch models                       CML            $m \to \infty$           Lahda (1991)
Covariate models                   ML             $n \to \infty$           Peltzman (1985a)
Random effects covariate models    MML            $n \to \infty$           Bailey (1998)

Table 1: Table lists commonly used methods of ideal point estimation. NP = non-parametric, GLS = generalized least squares, ML = maximum likelihood, CML = conditional maximum likelihood, MML = marginal maximum likelihood; $n$ = number of legislators/voters, $m$ = number of votes.
Table 1 lists a number of techniques that have been used to estimate legislators’ spatial locations or ideal points.2 Each of these methods represents a potential approach to the problem of
estimating voter ideal points. In the end, none are particularly suited to the problem. However,
some consideration of their properties highlights the relevant data and statistical issues involved.
While some of these models are explicitly designed to place legislators in a multidimensional
space, others are restricted to a single dimension. A few can be extended to multiple dimensions
only with a great increase in computational cost (e.g. Rasch model and random-effects models).
The most important statistical issue is the consistency of the estimated ideal points. The problem is most easily seen from a maximum-likelihood perspective. Supposing a maximum-likelihood
estimator for each of the ideal point models could be written, it would have the form

$$ L(\theta, \beta \mid Y), $$

where $Y$ is an $n \times m$ matrix of observed votes (the data), $\beta$ is a vector of parameters describing characteristics of the proposals, and $\theta$ is a vector of parameters describing the distribution of ideal points. Suppose for starters that each of the $m$ proposals to be voted on has one or more elements of $\beta$ associated with it and that $\theta$ is a vector of ideal points. Consider the ML estimators for $\beta$ and $\theta$ as the size of the data matrix increases. As the number of legislators $n$ grows by one, so does the size of $\theta$. Similarly, as $m$ grows, so does the size of $\beta$. This proliferation of parameters as the sample size increases is well known to undermine the standard consistency results for ML estimators (Neyman & Scott 1948).
Footnote 2: In discussing the methods, I use the terms "location" and "ideal point" interchangeably.
One way around this problem has been to show that estimates of $\theta$ will be consistent under a so-called "triple" asymptotic condition (Haberman 1977). In these cases, $\theta$ can be consistently estimated if the following three conditions hold: (1) the number of roll calls goes to infinity, (2) the number of legislators goes to infinity, and (3) the ratio of votes to legislators goes to infinity. In other words, these estimators will work if you have a large legislature that takes a lot of roll call votes. While I have not extended Haberman's triple asymptotic result to Poole and Rosenthal's NOMINATE procedure, I believe their method is consistent under these conditions. From the standpoint of estimating voters' preferences (as opposed to legislators' preferences), this is certainly not going to be a compelling result. While it may be correct to think of the number of voters as approaching infinity, the number of votes is not large, and certainly in no way could we think of the number of votes over the number of voters as large.
In cases where the triple asymptotic condition seems unlikely to hold, several alternative solutions have been presented. The first is to specify the model in such a way that there exists a sufficient statistic ($t$) of the data for $\beta$. In this case the likelihood can be reformulated as

$$ L(Y \mid \theta, \beta) = L_2(Y \mid \theta, t)\, g(t \mid \beta). $$

Since the corresponding log likelihood,

$$ \log L(Y \mid \theta, \beta) = \log L_2(Y \mid \theta, t) + \log g(t \mid \beta), $$

is additively separable in $\theta$ and $\beta$, the value that maximizes $\log L$ over $\theta$ will be the same as the value that maximizes $\log L_2$ over $\theta$. Assuming $\log L_2$ meets the usual conditions for consistent ML estimation, we see that we can get consistent estimates of $\theta$ as $m$ grows large by maximizing the likelihood of $\theta$ conditional on $t$. This method is referred to as Conditional Maximum Likelihood. Both the Heckman–Snyder and (one-parameter) Rasch-type models take this approach. From the standpoint of estimating voter preferences, even these models require that a large number of votes be observed.
Footnote 3: It should be noted that under this triple asymptotic condition the vote-specific parameters will still not be consistently estimated.
The above approaches can all be put under the general heading of fixed-effects models. That is, they treat all of the vote characteristics and the ideal point parameters as fixed constants to be estimated. Another approach would be to treat the set of proposals as a draw from some distribution, $h(\beta)$. The likelihood $L(Y \mid \theta)$ can then be written as

$$ L(Y \mid \theta) = \int L(Y \mid \theta, \beta)\, h(\beta)\, d\beta. \qquad (1) $$

Integrating out the nuisance parameters ($\beta$) and maximizing (1) over $\theta$ is referred to as Marginal Maximum Likelihood. Londregan's (n.d.) random coefficient model takes this approach. Londregan presents a model in which $h(\beta)$ is conditioned on a set of proposal-specific covariates (e.g., party of the proposer). This use of auxiliary information weakens the importance of the arbitrary distributional assumption $h(\cdot)$. The coefficients on the covariates are also of direct substantive interest. The model is consistent as the number of votes grows large. Since Londregan's method still requires the number of proposals to grow large, it is not appropriate to the problem at hand, but it does suggest a possible course.
Rather than using auxiliary information to help identify the distribution of proposal characteristics, we could parameterize the ideal points using covariates. This is quite common in the literature. The most common form of the "covariate" models is that in which the legislators' ideal points are assumed to be a deterministic linear function of a set of covariates,

$$ x_i = z_i' \gamma, $$

where $x_i$ is legislator $i$'s ideal point and $z_i$ is a vector of observed legislator attributes (e.g., party, constituency characteristics, or previous occupation). These deterministic models are commonly used (at least implicitly) in both models of legislative voting and candidate elections. Generally these models involve running a probit or logit regression of a single roll call or vote choice on a vector of covariates. The ideal points can then be estimated as $\hat{x}_i = z_i' \hat{\gamma}$. While these models can be applied to situations in which many decisions are observed, the estimated ideal points are consistent with a single observed roll call as the number of voters or legislators grows large.
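As an illustration of this deterministic covariate approach (not part of the original analysis), the sketch below simulates a single roll call, fits the probit of the vote on a hypothetical covariate matrix by maximum likelihood, and forms the implied ideal point estimates $\hat{x}_i = z_i'\hat{\gamma}$. All variable names and values are invented for the example.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical covariates: intercept, party indicator, a district characteristic.
n = 500
Z = np.column_stack([np.ones(n), rng.integers(0, 2, n), rng.normal(size=n)])
gamma_true = np.array([0.2, 1.0, -0.5])
y = (Z @ gamma_true + rng.normal(size=n) > 0).astype(float)   # one observed roll call

def neg_loglik(gamma):
    # Probit log likelihood for a single vote regressed on covariates.
    p = np.clip(norm.cdf(Z @ gamma), 1e-10, 1 - 1e-10)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

res = minimize(neg_loglik, x0=np.zeros(Z.shape[1]), method="BFGS")
gamma_hat = res.x
x_hat = Z @ gamma_hat   # deterministic ideal point estimates, x_i = z_i' * gamma_hat
print(gamma_hat, x_hat[:5])
```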
The obvious problem with the deterministic covariate model is that it is highly restrictive to assume that the ideal points can be written as a deterministic function of a set of covariates. Surely, unobserved traits must also affect the values of $x$. Bailey (1998) generalizes the deterministic covariate model by assuming that the ideal points are a linear function of a set of covariates and a legislator-specific random shock,

$$ x_i = z_i' \gamma + \nu_i. $$

Bailey's model can be thought of as a mirror image of Londregan's. Bailey treats $\beta$ as a set of fixed effects to be estimated and integrates over the random $\nu$,

$$ L(Y \mid \beta, \gamma) = \prod_i \int L(y_i \mid \beta, \nu_i, \gamma)\, f(\nu_i)\, d\nu_i. $$

Having assumed a distribution for $\nu$, Bailey can then consistently estimate the distribution of legislator ideal points as the number of legislators grows large. To find a particular legislator's ideal point, the a posteriori expectation $E(x_i \mid y_i, \beta, \gamma)$ can be estimated using the estimated $\hat{\beta}$ and $\hat{\gamma}$ in place of $\beta$ and $\gamma$. The consistency of these estimates requires that the number of legislators and the number of roll calls grow large.
Because it is only the distribution of voter ideal points and not each individual’s ideal point that
is of interest, the random-effects covariate model seems promising. It provides estimates of the
parameters of the conditional ideal point distribution as the number of legislators (voters) grows
large. However, the voting data contains no covariates. The challenge is to develop a model that
will allow us to estimate consistently the distribution of ideal points without covariates.
3 The basic spatial model
My statistical model begins from the standard spatial model of voting (Hotelling 1929, Downs
1957, Black 1958, Hinich & Enelow 1984). In the spatial model, it is assumed that policy choices
can be represented by points in Euclidean space. Each voter is assumed to have a most preferred
policy position in a $d$-dimensional space, $x = (x_1, x_2, \ldots, x_d)$. A voter's utility for various policy alternatives is defined by a function of the distance between the position of the alternative and the voter's ideal point. Following the usual convention in the literature, I assume this function is a simple quadratic. In order to introduce uncertainty into the vote choice, I assume that voters' utilities for various alternatives are not solely determined by their spatial positions but are also determined by an additive idiosyncratic shock $\varepsilon$. Thus, the utility for a voter at $x$ from the implementation of a policy $p = (p_1, p_2, \ldots, p_d)$ is

$$ U(x, p) = -\sum_{k=1}^{d} (x_k - p_k)^2 + \varepsilon_p, $$

where, by assumption, $\varepsilon \sim N(0, \sigma^2)$ and i.i.d. across alternatives.

Assume that all choices are over exactly two alternatives (there is no abstention). The difference between the utility provided by any two alternatives $p$ and $q$ is

$$ U(x, p) - U(x, q) = -\sum_k (x_k - p_k)^2 + \sum_k (x_k - q_k)^2 + (\varepsilon_1 - \varepsilon_2) = \sum_k (q_k^2 - p_k^2) + 2 \sum_k (p_k - q_k)\, x_k + (\varepsilon_1 - \varepsilon_2). $$

Assuming sincere voting, in the sense that voters vote for the alternative that they prefer, the probability that a voter with ideal point $x$ votes for alternative $p$ over alternative $q$ is

$$ \Pr(\text{votes for } p \mid x) = \Pr\!\left( \sum_k (q_k^2 - p_k^2) + 2 \sum_k (p_k - q_k)\, x_k > \varepsilon_2 - \varepsilon_1 \right). $$

Footnote 4: Of course, since each voter has virtually no chance of changing the outcome of the election with her vote, voting for her preferred outcome is not a strictly dominant strategy.
Because $\varepsilon \sim N(0, \sigma^2)$, $\varepsilon_1 - \varepsilon_2 \sim N(0, 2\sigma^2)$, and so

$$ \Pr(\text{votes for } p \mid x) = \Phi\!\left( \frac{\sum_k (q_k^2 - p_k^2) + 2 \sum_k (p_k - q_k)\, x_k}{\sqrt{2}\,\sigma} \right), $$

where $\Phi(\cdot)$ is the standard normal cumulative distribution function. Letting $y_{ij}$ represent voter $i$'s choice over $\{p_j, q_j\}$, where $y_{ij} = 1$ denotes the choice of $p_j$ and $y_{ij} = 0$ denotes the choice of $q_j$, and letting

$$ \alpha_j = \frac{\sum_k (q_{jk}^2 - p_{jk}^2)}{\sqrt{2}\,\sigma} \qquad \text{and} \qquad \beta_{jk} = \frac{2\,(p_{jk} - q_{jk})}{\sqrt{2}\,\sigma}, $$

we find the familiar probit model,

$$ \Pr(Y_{ij} = y_{ij}) = \Phi(\alpha_j + \beta_j' x_i)^{\,y_{ij}} \left[ 1 - \Phi(\alpha_j + \beta_j' x_i) \right]^{1 - y_{ij}}. $$

Footnote 5: Defining $x_i = z_i' \gamma$, where $z_i$ is a vector of observed characteristics, we have the deterministic covariates model of ideal point estimation described above.
4 Estimating the model
Given this basic model of each vote choice, I now turn to the question of estimating the parameters
of the model. In order to go from the theoretical model presented above to a statistical model, I
must make some additional assumptions. First, I assume that, conditional on the parameters, the
vote choice probabilities are independent across votes and voters. That is, each decision made by
each voter is an independent draw. Thus, the likelihood of observing a vector of votes ($y_i$) by a voter $i$ located at $x_i$ is

$$ L(y_i \mid \alpha, \beta, x_i) \propto \prod_{j} \Phi(\alpha_j + \beta_j' x_i)^{\,y_{ij}} \left[ 1 - \Phi(\alpha_j + \beta_j' x_i) \right]^{1 - y_{ij}}. $$

As shown here, the likelihood is conditioned on the observed $x_i$. Following the random-effects approaches discussed above, I will place a multivariate density $g(\cdot)$ over the vector $x$ and then integrate out over this density:

$$ L(y_i \mid \alpha, \beta) = \int\!\!\int \cdots \int L(y_i \mid \alpha, \beta, x)\, g(x_1, x_2, \ldots, x_d)\, dx_1\, dx_2 \cdots dx_d. $$

The resulting likelihood expresses the observed data only in terms of the parameters of interest. Because the above integral is very unlikely to have a closed form, it is approximated by quadrature.
[Table 2 about here: the frequency of each observed voting pattern in the simulated data; the first three patterns listed occur 5, 2, and 5 times, the last three 13, 16, and 75 times, and the total number of observations is 1000.]

Table 2: A listing of the frequency of each voting pattern completely captures the information in the data. Thus, the estimation problem can be thought of as fitting a multinomial distribution with $2^m$ categories (one for each pattern). The ability to render the data in this multinomial way is very convenient, as it allows all the outer sums in the log likelihood to run over the $2^m$ categories rather than the $n$ voters. With large data sets the computational advantages can be very large.
That is, the integral is replaced by a weighted sum over a given number of points,

$$ L(y_i \mid \alpha, \beta) \approx \sum_{t} L(y_i \mid \alpha, \beta, x_t)\, w(x_t). \qquad (2) $$

Given that vote choices are assumed to be independent across voters, the total log likelihood can be written as

$$ \log L = \sum_{i=1}^{n} \log L(y_i \mid \alpha, \beta). $$

However, the situation can be greatly simplified. If $m$ (the number of items) is not overly large, the number of patterns in the data, $2^m$, is considerably smaller than the number of voters $n$. Since the vector of votes is all that I observe for each voter, the value of the likelihood is clearly the same for all voters who share the vote vector $y_r$. Thus, defining $n_r$ to be the total number of voters that share vote profile $y_r$, the likelihood can be rewritten as

$$ \log L = \sum_{r=1}^{2^m} n_r \log L(y_r \mid \alpha, \beta). \qquad (3) $$

Equation (3) can be maximized directly using standard numerical maximization techniques.
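A minimal sketch of how equations (2) and (3) might be implemented for a one-dimensional model, assuming a uniform ideal-point density approximated by an equally weighted seven-point grid and using simulated data; the parameter values and names are hypothetical.

```python
import numpy as np
from scipy.stats import norm
from collections import Counter

rng = np.random.default_rng(2)

# Hypothetical item parameters (alpha_j, beta_j) and simulated one-dimensional data.
alpha = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7])
beta = np.array([0.5, 0.3, 0.1, -0.4, 1.2, 0.5, 0.8])
n, m = 1000, len(alpha)
x = rng.uniform(-1.0, 1.0, size=n)                       # ideal points
Y = (rng.uniform(size=(n, m)) < norm.cdf(alpha + np.outer(x, beta))).astype(int)

# Collapse the data to vote-pattern counts (equation 3 runs over patterns, not voters).
counts = Counter(map(tuple, Y))
patterns = np.array(list(counts.keys()))                 # distinct patterns y_r
n_r = np.array(list(counts.values()))

# Quadrature points and weights approximating a uniform density on [-1, 1].
nodes = np.linspace(-1.0, 1.0, 7)
weights = np.full(7, 1.0 / 7)

def log_likelihood(alpha, beta):
    # P[t, j] = Pr(yes on item j | ideal point at node t)
    P = norm.cdf(alpha + np.outer(nodes, beta))
    # Likelihood of each pattern at each node, then the weighted sum over nodes (eq. 2).
    L_pattern_node = np.prod(np.where(patterns[:, None, :] == 1, P, 1 - P), axis=2)
    L_pattern = L_pattern_node @ weights
    return np.sum(n_r * np.log(L_pattern))               # equation (3)

print(log_likelihood(alpha, beta))
```

Collapsing to patterns means each likelihood evaluation scales with the number of distinct patterns rather than with the number of voters, which is the computational advantage noted in the caption to table 2.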
All that remains is the selection of a distribution $g$. The usual convention is to construct the distribution such that the ideal point dimensions are orthogonal. Because the likelihood is a function of a linear combination of the ideal point elements (i.e., $\alpha_j + \beta_j' x$), one cannot jointly estimate the covariance between the ideal point distribution and the $\beta$'s. Thus, the usual approach is to assume independence. Under independence, we have

$$ g(x) = g_1(x_1)\, g_2(x_2) \cdots g_d(x_d). $$

The quadrature estimates are then also written as a sum over the product of the weights along each dimension,

$$ L(y_i \mid \alpha, \beta) \approx \sum_{t_1} \sum_{t_2} \cdots \sum_{t_d} L(y_i \mid \alpha, \beta, x_{t_1}, x_{t_2}, \ldots, x_{t_d})\, w_1(x_{t_1})\, w_2(x_{t_2}) \cdots w_d(x_{t_d}). $$
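Under this independence assumption, a multidimensional quadrature grid can be assembled as the tensor product of one-dimensional nodes and weights, for example (an illustrative sketch with hypothetical seven-point grids):

```python
import numpy as np

# One-dimensional nodes and weights for each of two dimensions (uniform on [-1, 1]).
nodes_1d = np.linspace(-1.0, 1.0, 7)
weights_1d = np.full(7, 1.0 / 7)

# Tensor product: every combination of nodes, with weights multiplying across dimensions.
g1, g2 = np.meshgrid(nodes_1d, nodes_1d, indexing="ij")
grid = np.column_stack([g1.ravel(), g2.ravel()])          # 49 x 2 matrix of (x1, x2) points
w1, w2 = np.meshgrid(weights_1d, weights_1d, indexing="ij")
grid_weights = (w1 * w2).ravel()                          # product of per-dimension weights

print(grid.shape, grid_weights.sum())                     # (49, 2), weights sum to 1
```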
At this point, I have fully specified the model for a single group. Extending the model to test for intergroup differences is straightforward. The basic intuition is the following. Suppose that the distribution $g$ has parameters $\phi$. For example, if $g$ were taken to be spherical normal, $\phi$ would be comprised of the means and variances along each dimension. Further suppose that those parameters differ across groups $h = 1, 2, \ldots, H$. We can then write the log likelihood as

$$ \log L(\phi_1, \phi_2, \ldots, \phi_H) = \sum_{h=1}^{H} \sum_{r} n_{hr} \log \left[ \int \cdots \int L(y_r \mid \alpha, \beta, x)\, g(x \mid \phi_h)\, dx_1\, dx_2 \cdots dx_d \right]. $$

For identification, (at least some of) the elements of $\phi$ must be fixed for one of the groups. Notice that the data continue to have a multinomial form, where the number of "patterns" in the data will be $H \cdot 2^m$. Assuming a well-behaved $g$, the above can be estimated by MML.

Note that the leverage in this model comes from the restriction that the $\alpha$'s and $\beta$'s are the same across groups. Without this restriction no comparison between the group distributions would be possible. In effect, the item parameters (the $\alpha$'s and $\beta$'s) are used to place the groups in the same space.
Footnote 6: Alternatively, the sum or mean of the groups' $\phi$'s might be fixed.
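A sketch of how the grouped log likelihood might be evaluated, with shared item parameters and a group-specific shift and stretch of a baseline one-dimensional grid (the parameterization used in the examples of the next section); group one is fixed at shift 0 and stretch 1 for identification. Names and values are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def group_log_likelihood(alpha, beta, shift, stretch, patterns_by_group, counts_by_group,
                         base_nodes, base_weights):
    """Log likelihood summed over groups; group 0 is fixed at shift=0, stretch=1."""
    total = 0.0
    deltas = np.concatenate([[0.0], np.atleast_1d(shift)])
    sigmas = np.concatenate([[1.0], np.atleast_1d(stretch)])
    for h, (patterns, n_r) in enumerate(zip(patterns_by_group, counts_by_group)):
        nodes = base_nodes * sigmas[h] + deltas[h]         # stretched and shifted support
        P = norm.cdf(alpha + np.outer(nodes, beta))        # item response probs at each node
        L = np.prod(np.where(patterns[:, None, :] == 1, P, 1 - P), axis=2) @ base_weights
        total += np.sum(n_r * np.log(L))
    return total

# Tiny illustration with two items and two observed patterns per group.
alpha = np.array([0.1, -0.2]); beta = np.array([0.8, 0.5])
base_nodes = np.linspace(-1.0, 1.0, 7); base_weights = np.full(7, 1.0 / 7)
patterns_by_group = [np.array([[1, 0], [1, 1]]), np.array([[0, 0], [1, 1]])]
counts_by_group = [np.array([3, 2]), np.array([4, 1])]
print(group_log_likelihood(alpha, beta, shift=0.4, stretch=1.2,
                           patterns_by_group=patterns_by_group,
                           counts_by_group=counts_by_group,
                           base_nodes=base_nodes, base_weights=base_weights))
```

In practice this function would be handed to a numerical optimizer over the item parameters and the group-two shift and stretch jointly.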
5 An example assuming ideal points are distributed uniformly over rectangles
In this section, I consider two examples of the general model developed above. In these examples, the number of votes ($m$) is set at 7, and the number of dimensions is fixed at one for the first example
and two for the second. The number of groups is fixed at two. The data are simulated from a known
process.
The distribution of voter ideal points is assumed to be uniform over a rectangle in the two-dimensional case and uniform over a closed interval in the case of one dimension. The choice of the uniform distribution is restrictive. Much less restrictive distributions might be used. While spherical distributions such as the bivariate normal would perhaps be more appropriate for empirical applications, the rectangular distribution is preferable for the Monte Carlo experiments presented below. The problem with spherical distributions is that the dimensions are only defined up to an orthogonal rotation of the axes. In general this is not problematic and may even be an advantage, allowing the analyst to pick the rotation that, for example, captures the most inter-group difference along a single dimension. However, for Monte Carlo experiments invariance to rotation makes the results difficult to interpret, since the true data generating process can be any on a continuum of orthogonal rotations. This is not true for the uniform distribution over a rectangle. While this distribution is not invariant to rotation, the number of isomorphic transformations is very small and they are easy to compare to one another. In particular, the maximum likelihood can be achieved only by exchanging the axes, reflecting the axes, or some combination of the two. Thus, by the judicious selection of starting values, the parameters estimated in each iteration of the Monte Carlo can be directly compared to the parameters used to generate the data.
The distribution of ideal points for each group is defined as follows. Group one voters are assumed to be uniformly distributed over the interval $[-\lambda, \lambda]$ in the unidimensional case and uniformly distributed over the $[-\lambda, \lambda] \times [-\lambda, \lambda]$ square in the two-dimensional case. Group two members are assumed to be uniformly distributed over the interval $[-\lambda\sigma + \delta,\; \lambda\sigma + \delta]$ in the case of one dimension and uniformly over the $[-\lambda\sigma_1 + \delta_1,\; \lambda\sigma_1 + \delta_1] \times [-\lambda\sigma_2 + \delta_2,\; \lambda\sigma_2 + \delta_2]$ rectangle in the two-dimensional case.
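For concreteness, a sketch of this data-generating process in one dimension, using the true parameter values from table 3 for illustration ($\lambda = 1$ is an assumed half-width; it is not given in the text):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
lam, delta, sigma_stretch = 1.0, 0.4, 1.2        # half-width, shift, stretch
alpha = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7])
beta = np.array([0.5, 0.3, 0.1, -0.4, 1.2, 0.5, 0.8])
n_per_group = 500

x1 = rng.uniform(-lam, lam, n_per_group)                                   # group one
x2 = rng.uniform(-lam * sigma_stretch + delta,
                 lam * sigma_stretch + delta, n_per_group)                 # group two

def simulate_votes(x):
    probs = norm.cdf(alpha + np.outer(x, beta))
    return (rng.uniform(size=probs.shape) < probs).astype(int)

Y1, Y2 = simulate_votes(x1), simulate_votes(x2)
print(Y1.mean(axis=0), Y2.mean(axis=0))          # item-by-item yes rates differ by group
```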
Footnote 7: To avoid the issue of picking the optimal quadrature, the simulated data are drawn from the 7-point discrete distribution that corresponds to the 7-point quadrature approximation of the continuous uniform distributions described in the text.
Footnote 8: See Lewis (1999) for a semi-parametric estimator of ideal point distributions in the one-dimensional case.
[Figure 1 about here: possible distributions of ideal points for each of two groups.]

Figure 1: Graph shows possible distributions of ideal points for each of two groups. For identification, the location of one group must be fixed. Here we assume that the ideal points of members of group one are uniformly distributed over a $2\lambda \times 2\lambda$ square centered at the origin. Given this assumed distribution of group one ideal points (the solid square in the graph), we can then estimate the distribution of ideal points for group two. The three dashed boxes are examples of possible rectangles over which group two members might be uniformly distributed.
Results of MML estimation on simulated data: One dimension

Item parameters
             $\alpha_j$                $\beta_j$
Item     True      Estimated      True      Estimated
1        0.1       0.087          0.5       0.489
2        0.2       0.222          0.3       0.329
3        0.3       0.306          0.1       0.089
4        0.4       0.413         -0.4      -0.386
5        0.5       0.452          1.2       1.204
6        0.6       0.574          0.5       0.459
7        0.7       0.643          0.8       0.810

Group 2 parameters
            True      Estimated
Shift       0.4       0.433
Stretch     1.2       1.205

Number of observations: 1000

Table 3: Table shows the estimated parameters of the one-dimensional model. The data are simulated. The true parameters used to generate the data are given in the table.
The shift parameters ($\delta$) determine how far the mean group two ideal point is shifted away from the mean group one ideal point. The stretch parameters ($\sigma$) describe how stretched or compressed the group two ideal points are relative to the group one ideal points.

Examples of possible ideal point distributions for group two are shown in figure 1. The two groups could be found to have relatively similar central tendencies along both dimensions, with group two being much more homogeneous. Or, group two could be more heterogeneous along one dimension and more homogeneous along the other. On the other hand, group two could have much more extreme positions and be more homogeneous. Each of these possibilities corresponds to one of the dashed rectangles in the figure.

Table 3 shows the estimates of the model on simulated data with a single dimension. In this example, seven items were used. Notice that in general the estimates are very close to the true
values. The estimated $\alpha$'s and $\beta$'s are all quite good, as is the estimate of the stretch parameter. Somewhat farther off is the estimate of the mean shift. In the next section, Monte Carlo experiments will confirm that the mean shift parameter is estimated less precisely than the stretch parameter.
The two-dimensional model was similarly effective at recovering the true parameters of the model. In this case, we were able to correctly infer that members of group two had positions on dimension one that were higher on average and more variable than the positions of group one members. Similarly, on dimension two, the model correctly found that group two members had on average lower positions and about the same dispersion as group one members.
[STANDARD ERRORS FOR THE ESTIMATED PARAMETERS ARE IN PRINCIPLE STRAIGHTFORWARD TO CALCULATE FROM THE INVERSE OF THE HESSIAN. HOWEVER, I HAVE YET TO ACHIEVE WORKING CODE TO DO THIS.]
6 Assessing the fit of the model
By not estimating each voter's ideal point individually, criteria such as the percent of votes correctly classified are not available. That is, because we do not estimate a particular ideal point ($x_i$) for each voter, we cannot say how well we were able to predict that particular voter's votes. However,
the fit of the model can be considered both from a likelihood and from a graphical perspective.
Both will be considered below.
As noted above, the data are inherently multinomial in nature. That is, the estimation problem can be thought of as picking the parameter values that most closely reproduce the frequency distribution of the voting patterns in the data (as shown in table 2). Thus, the spatial models developed above can be thought of as nested within a general multinomial alternative. That is, we could estimate a general model in which we estimate the parameters of the multinomial distribution of the data directly. This distribution has $2^m - 1$ free parameters (one parameter describing the probability of falling into each of the first $2^m - 1$ patterns). The likelihood function for this general model is

$$ L(Y \mid \pi) = \prod_{r} \pi_r^{\,n_r}. $$

Noting that the ML estimator of each $\pi_r$ is $\hat{\pi}_r = n_r / n$, the value of the log likelihood at its maximum
Results of MML estimation on simulated data: Two dimensions

Item parameters
             $\alpha_j$                $\beta_{j1}$               $\beta_{j2}$
Item     True      Estimated      True      Estimated       True      Estimated
1        0.1       0.091          0.5       0.486           0.2       0.203
2        0.2       0.199          0.3       0.316          -0.2      -0.189
3        0.3       0.274          0.2       0.158           0.6       0.797
4        0.4       0.438         -0.4      -0.389           0.2       0.127
5        0.5       0.344          1.2       1.146           0.0       0.097
6        0.6       0.559          0.5       0.500           0.3       0.263
7        0.7       0.668          0.8       0.872           0.1       0.111

Group 2 parameters
                         True      Estimated
Shift, dimension 1        0.4       0.52
Shift, dimension 2       -0.2      -0.10
Stretch, dimension 1      1.2       1.21
Stretch, dimension 2      1.0       1.05

Number of observations: 1000

Table 4: Table shows the estimated parameters of the two-dimensional model. The data are simulated. The true parameters used to generate the data are given in the table.
will be

$$ \log \hat{L}_{MN} = \sum_{r} n_r \log\!\left( \frac{n_r}{n} \right). $$

Since the spatial models developed above (the MML estimated models) place restrictions on the feasible values of $\pi$, these models can be thought of as nested under the general model. If in fact the data were generated by the spatial model, then the restrictions should not bind and a similar likelihood would be achieved by the spatial model and the general multinomial alternative. On the other hand, if the data were generated by a different process (perhaps a spatial model with more dimensions), then the general model should produce a significantly higher likelihood.

The dominance of the general multinomial alternative over any particular spatial model can be tested for by constructing a familiar likelihood ratio statistic,

$$ G = 2 \left( \log \hat{L}_{MN} - \log \hat{L}_{S} \right), $$

where $\log \hat{L}_{S}$ is the maximized log likelihood of the spatial model and $G$ is distributed $\chi^2$ with degrees of freedom equal to the number of patterns minus the number of parameters used in the spatial model.

Computing this test statistic and its corresponding p-value for the examples in tables 3 and 4, we cannot reject the spatial model in favor of the general multinomial alternative in either case. This is, of course, what we would expect given that the data were indeed generated by the spatial models described above.
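A sketch of how this test might be computed, assuming the pattern counts, the fitted spatial log likelihood, and the number of free spatial parameters are already in hand; the inputs below are hypothetical.

```python
import numpy as np
from scipy.stats import chi2

def lr_test_vs_multinomial(n_r, loglik_spatial, k_spatial_params):
    """LR test of a fitted spatial model against the saturated multinomial alternative."""
    n_r = np.asarray(n_r, dtype=float)
    observed = n_r[n_r > 0]                                # empty cells contribute nothing
    n = n_r.sum()
    loglik_multinomial = np.sum(observed * np.log(observed / n))   # saturated maximum
    G = 2.0 * (loglik_multinomial - loglik_spatial)
    df = len(n_r) - k_spatial_params
    return G, df, chi2.sf(G, df)

# Hypothetical inputs: counts over 8 patterns, a spatial log likelihood, 5 free parameters.
G, df, p = lr_test_vs_multinomial([120, 80, 60, 200, 340, 90, 70, 40], -1847.0, 5)
print(G, df, p)
```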
Similarly, likelihood ratio tests can also be constructed to test for intergroup differences and for dimensionality. To see this, note that the one-dimensional model is nested within the two-dimensional model: the same likelihood could be achieved by the two-dimensional model as by the one-dimensional model simply by setting all of the second-dimension parameters equal to zero. Similarly, the models which constrain the groups to have the same ideal point distribution are nested within models that allow the distributions to differ.
The availability of statistical tests for dimensionality and parameter differences is an advantage of the MML approach over fixed-effects models such as NOMINATE. However, these tests do have some problems. If you have relatively few observations, the predicted cell probabilities for many of the multinomial cells will be very small. Since the distribution of the statistic is only asymptotically correct, these small predicted cell frequencies cast some doubt on the validity of the test. For example, in the Monte Carlo experiments below, it is not uncommon for as many as twenty of the patterns to be missing from the data, while many others appear fewer than 5 times. To get around this problem, the patterns could be grouped and tests run based on the grouped patterns. Unfortunately, there is no obviously best way to group the patterns, and moreover this grouping discards information, lowering the power of the test.
On the other hand, very large data sets introduce other problems. With very large data sets, the $G$ statistic will more closely approximate its limiting $\chi^2$ distribution. However, with such large data sets, it is very likely that the general multinomial alternative will always dominate the spatial models when they are applied to real world data. Whatever distribution we put over the voter ideal points and whatever independence assumptions we make about the questions are unlikely to hold exactly in practice. With very large data sets, we will likely reject our spatial model even when the violations of its assumptions are quite slight. In such a case, comparing the value of the spatial likelihood to the general multinomial alternative would be rather uninformative. On the other hand, with large data sets, tests for dimensionality and for intergroup differences should be very effective. Though again, in very large data sets, LR tests may find significant differences that are of little substantive importance.
The fit of the model can also be assessed using a graphical approach that is similar in spirit to
the LR test against the general multinomial model. Figures 2 and 3 show the predicted and actual
pattern of frequencies for each of the two examples presented above. Note that in both cases the
model fits the data quite nicely. Nearly all of the predicted frequencies fall very near to the 45
degree line.
The graphs also highlight the small cell frequency problem noted above. Much of the discrepancy between the true and estimated pattern frequencies occurs near the bottom left-hand corner of each graph. In these cases the discrepancies may in part be due not to poor estimated frequencies but, in a sense, to "poor" actual frequencies. That is, where we expect to find 3.2 cases but find none, it may be that 3.2 is a better estimate of how many we should find than is none.
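The plots in figures 2 and 3 can be reproduced with a few lines along the following (illustrative) pattern, assuming arrays of actual and model-predicted pattern counts are available; the counts below are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical pattern counts: actual from the data, predicted = n * fitted pattern probs.
actual = np.array([13, 16, 75, 5, 2, 5, 40, 844])
predicted = np.array([11.8, 17.2, 71.0, 6.1, 3.2, 4.4, 42.5, 843.8])

plt.scatter(predicted, actual)
lim = max(actual.max(), predicted.max()) * 1.05
plt.plot([0, lim], [0, lim])                    # 45 degree line
plt.xlabel("Estimated frequency")
plt.ylabel("True frequency")
plt.show()
```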
[Figure 2 about here: true pattern frequencies plotted against estimated pattern frequencies for the one-dimensional example.]

Figure 2: Plots the actual number of observations casting each of the roughly 256 vote patterns against the number predicted by the model. That almost all of the points fall close to the 45 degree line is indicative of the good fit of the model.

[Figure 3 about here: true pattern frequencies plotted against estimated pattern frequencies for the two-dimensional example.]

Figure 3: Plots the actual number of observations casting each of the roughly 256 vote patterns against the number predicted by the model. That almost all of the points fall close to the 45 degree line is indicative of the good fit of the model.
7 Monte Carlo results
In this section, I run Monte Carlo experiments on the one- and two-dimensional models developed above. Overall, the experiments show the method to be quite effective in recovering the parameters of the model. For comparison, I also estimate the $\alpha$'s and $\beta$'s directly by running probits of each vote on the usually unobserved $x$. While the sampling variation in the estimates made without knowledge of $x$ is larger, it is not as much larger as one would imagine. Even with as few as 7 observed "votes" per voter, the estimates of the $\alpha$'s and $\beta$'s are generally quite good.

Table 5 shows the results for the model with one spatial dimension. Overall, the technique seems to work well. In only a very few instances do the average estimated $\alpha$'s and $\beta$'s fall far from their true values, and the standard deviations of the estimates are modest. In many cases, the sampling variation in these estimates is only about one-third more than was found by estimating the $\alpha$'s and $\beta$'s by running probit directly. The only case where the estimation seems to have broken down is in the estimates of the parameters related to items (or votes) 5 and 7. In both of these cases, I believe the problem is one of numerical inaccuracy. In each case the true $\beta$'s were very large. With such large $\beta$'s, the probability of a yes vote is very close to zero or one for nearly all respondents. As this probability approaches zero or one, the exact value of the estimated $\beta$ is less and less important in the likelihood. [IN THE FUTURE, I WILL EITHER REDUCE THE VARIANCE OF THETA OR MAKE THESE BETAS SMALLER.]
The estimates of the intergroup differences between the two groups also seem very good on average. Somewhat surprisingly, the mean shift is much less accurately measured on average than is the variance stretch.

Table 6 shows the results of the Monte Carlo experiments using the two-dimensional model. As was the case with a single dimension, the $\alpha$'s and $\beta$'s are quite well estimated, with the exception of the very large values of $\beta$. As before, the estimates of the stretch parameters were considerably more accurate than those of the shift parameters.

A few more Monte Carlo experiments would be very instructive. Figure 4 shows how the sampling variation in the parameters falls as a function of the sample size. Other experiments would also be valuable; the most obvious would be one in which the number of votes per voter is manipulated. These are left for future work.
Monte Carlo results: One dimension

Item parameters ($\alpha_j$)
                    Estimated (MML)          Direct probit
Item     True       Mean      Std            Mean      Std
1        0.1        0.105     0.067          0.102     0.047
2        0.2        0.204     0.050          0.203     0.045
3        0.3        0.299     0.044          0.299     0.044
4        0.4        0.398     0.050          0.401     0.050
5        0.5        0.572     0.355          0.501     0.083
6        0.6        0.609     0.079          0.606     0.057
7        0.7        0.713     0.118          0.710     0.068

Item parameters ($\beta_j$)
                    Estimated (MML)          Direct probit
Item     True       Mean      Std            Mean      Std
1        0.5        0.498     0.036          0.502     0.026
2        0.3        0.299     0.027          0.301     0.021
3        0.1        0.100     0.022          0.100     0.019
4       -0.4       -0.399     0.032         -0.402     0.022
5        1.2       -1.290     0.383         -1.220     0.086
6        0.5       -0.503     0.038          0.506     0.028
7        0.8       -0.799     0.083          0.803     0.044

Group 2 parameters (Estimated)
            True      Mean      Std.
Shift       0.4       0.414     0.218
Stretch     1.2       1.216     0.077

Trials: 250

Table 5: Shows the results of 250 trials of a Monte Carlo experiment in which a data set with 1000 observations generated by the given parameters was fit using the MML method described in the text. The "Direct probit" columns show the results of fitting the model by running probit regressions of each vote directly on $x_i$, something which is, of course, not possible in real-life applications.
Monte Carlo results: Two dimensions

Item parameters ($\alpha_j$)
                    Estimated (MML)          Direct probit
Item     True       Mean      Std            Mean      Std
1        0.1        0.100     0.034          0.101     0.015
2        0.2        0.200     0.030          0.200     0.011
3        0.3        0.301     0.026          0.301     0.011
4        0.4        0.460     0.292          0.402     0.019
5        0.5        0.505     0.081          0.503     0.040
6        0.6        0.600     0.039          0.602     0.015
7        0.7        0.698     0.057          0.699     0.022

Item parameters ($\beta_{j1}$)
                    Estimated (MML)          Direct probit
Item     True       Mean      Std            Mean      Std
1        0.5        0.500     0.022          0.502     0.025
2        0.3        0.300     0.024          0.300     0.023
3        0.1        0.099     0.027          0.099     0.022
4       -0.4       -0.467     0.387         -0.400     0.033
5        1.2        1.217     0.107          1.204     0.040
6        0.5        0.499     0.024          0.501     0.028
7        0.8        0.801     0.039          0.800     0.035

Item parameters ($\beta_{j2}$)
                    Estimated (MML)          Direct probit
Item     True       Mean      Std            Mean      Std
1        0.2        0.200     0.038          0.200     0.013
2        0.4        0.400     0.026          0.401     0.012
3       -0.4       -0.401     0.024         -0.400     0.012
4        1.1        1.194     0.372          1.100     0.032
5        0.4        0.409     0.114          0.401     0.021
6        0.3        0.299     0.037          0.300     0.015
7        0.0        0.001     0.065         -0.001     0.065

Group 2 parameters (Estimated)
                         True      Mean      Std.
Shift, dimension 1        0.4       0.401     0.103
Shift, dimension 2        0.0      -0.002     0.055
Stretch, dimension 1      1.2       1.198     0.048
Stretch, dimension 2      1.0       0.997     0.043

Trials: 250

Table 6: Shows the results of 250 trials of a Monte Carlo experiment in which a data set with 1000 observations generated by the given parameters was fit using the MML method described in the text. The "Direct probit" columns show the results of fitting the model by running probit regressions of each vote directly on $x_i$, something which is, of course, not possible in real-life applications.
[Figure 4 about here: Monte Carlo standard deviations of the item parameters by sample size; each panel plots the standard deviations of the MML and direct probit estimates against the number of observations.]

Figure 4: Shows the average standard errors of the $\alpha$'s, $\beta_1$'s, and $\beta_2$'s across the 7 items for various sample sizes. The standard error estimates are based on Monte Carlo simulations. The model parameters are the same as those used in table 6. For each sample size, 200 simulations were conducted.
8 Conclusion
In this paper, I have developed a general model for estimating intergroup differences in ideal point distributions from individual-level binary choice (voting) data. The main advantage of the model is that it skips the usual intermediate step of estimating each individual's ideal point. This is desirable because, by most existing methods, these individual-level ideal point estimates are known to be inconsistent.

The Monte Carlo experiments show the promise of the model. While the applications are obvious, they have been omitted from this paper and will be the focus of future work. [MORE TO COME]
References
Achen, Christopher H. 1977. “Measuring Representation: Perils of the Correlation Coefficient.”
American Journal of Political Science. 21:805–815.
Achen, Christopher H. & W. Phillips Shively. 1995. Cross Level Inference. Chicago: University
of Chicago Press.
Alesina, Alberto & Howard Rosenthal. 1994. Partisan Politics, Divided Government, and the Economy. New York: Cambridge University Press.
Anderson, Erling B. 1972. "The Numerical Solution of a Set of Conditional Estimation Equations." XXXXX. XXX(1):42–54.
Anderson, Erling B. & Mette Madsen. 1977. "Estimating the Parameters of the Latent Population Distribution." Psychometrika. 42(3):357–374.
Anderson, Lee F., Meredith W. Watts & Allen Wilcox. 1966. Legislative Roll-Call Analysis. Evanston: Northwestern University Press.
Bailey, Michael. 1998. “A Random Effects Approach to Legislative Ideal Point Estimation.” Presented at the 1998 Midwest Political Science Association Meeting, Chicago.
Black, Duncan. 1958. The Theory of Committees and Elections. England: Cambridge University
Press.
Bock, R. Darrell & Marcus Lieberman. 1970. “Fitting a Response Model for n Dichotomously
Scored Items.” Psychometrika. 35(2):179–197.
Bock, R. Darrell & Murray Aitken. 1981. “Marginal Maximum Likelihood Estimation of Item
Parameters: Application of the EM algorithm.” Psychometrika. 46(4):443–459.
Brady, Henry E. 1989. “Factor and Ideal Point Analysis for Interpersonally Incomparable Data.”
Psychometrika. 54(2):181–202.
Butler, J. S. & Robert Moffitt. 1982. "A Computationally Efficient Quadrature Procedure for the One-Factor Multinomial Probit Model." Econometrica. 50(3):761–764.
Converse, Philip & Gregory A. Markus. 1979. "Plus ça change...: The New CPS Election Study Panel." American Political Science Review. 73(1):32–49.
Cressie, Noel & Paul W. Holland. 1983. “Characterizing the Manifest Probabilities of Latent Trait
Models.” Psychometrika. 48(1):129–143.
Dempster, A. P., D. B. Rubin & R. K. Tsutakawa. 1981. "Estimation in Covariance Components Models." Journal of the American Statistical Association. 76(June):341–353.
Dempster, A. P., N. M. Laird & D. B. Rubin. 1977. “Maximum Likelihood Estimation from
Incomplete Data via the EM Algorithm.” Journal of the Royal Statistical Society, Series B.
39(1):1–38.
Downs, Anthony. 1957. An Economic Theory of Democracy. New York: Harper & Row.
Dubin, Jeffrey A. & Elisabeth R. Gerber. 1992. “Patterns of Voting on Ballot Propositions: A
Mixture Model of Voter Types.”. Social Science Working Paper 795, California Institute of
Technology.
Fienberg, Stephen E. & Michael M. Meyer. 1983. “Loglinear Models and Categorical Data Analysis with Psychometric and Econometric Applications.” Journal of Econometrics. 22:191–214.
Fiorina, Morris. 1992. “An era of divided government.” Political Science Quarterly. 33(2):387–
410.
Follman, Dean. 1988. “Consistent Estimation in the Rasch Model Based on Nonparametric Margins.” Psychometrika. 53(4):553–562.
Follman, Dean A. & Diane Lambert. 1989. "Generalizing Logistic Regression by Nonparametric Mixing." Journal of the American Statistical Association. 84(March):295–300.
Gerber, Elisabeth R. 1991. Legislative Politics and the Direct Ballot: Comparing Policy Outcomes Across Institutional Arrangements. PhD thesis, University of Michigan, Ann Arbor.
Gerber, Elisabeth R. 1996a. “Legislative Response to the Threat of Popular Initiatives.” American
Journal of Political Science. 40:99–128.
Gerber, Elisabeth R. 1996b. “Legislatures, Initiatives, and Representation: The Effects of Institutions on Policy.” Political Research Quarterly. 49:263–286.
Gerber, Elisabeth R. & Adam S. Many. 1996. “Incumbent-Led Ideological Balancing: A Hybrid
Theory of Split-Ticket Voting." Presented at the 1996 Midwest Political Science Association
Annual Meetings, Chicago.
Groseclose, Tim. 1994. “The committee outlier debate: A review and a reexamination of some of
the evidence.” Public Choice. 80.
Haberman, Shelby J. 1977. “Maximum Likelihood Estimation in Exponential Response Model.”
The Annals of Statistics. 6(5):815–841.
Heckman, James J. & Bo E. Honore. 1990. “The Empirical Content of the Roy Model.” Econometrica. 58(5):1121–1149.
Heckman, James J. & James M. Snyder. 1997. “Linear Probability Models of the Demand for
Attributes with an Empirical Application to Estimating the Preferences of Legislators.” The
RAND Journal of Economics. 28:S142–169.
Heckman, James J. & Robert J. Willis. 1977. “A Beta-logistic Model for the Analysis of Sequential
Labor Force Participation by Married Women.” Journal of Political Economy. 85(1):27–58.
Heinen, Ton. 1996. Latent Class and Discrete Latent Trait Models. Thousand Oaks: Sage Publications.
Herron, Michael. 1998. "Some Consequences of the Lack of Micro-foundations in Aggregate Voting Data." Prepared for presentation at the 1998 annual meetings of the American Political Science Association, Boston MA.
Hinich, Melvin J. & James M. Enelow. 1984. The Spatial Theory of Voting: An Introduction. Cambridge, England: Cambridge University Press.
Hotelling, H. 1929. “Stability in Competition.” Economic Journal. 39:41–57.
Kiefer, J. & J. Wolfowitz. 1956. "Consistency of the Maximum Likelihood Estimator in the Presence of Infinitely Many Incidental Parameters." Annals of Mathematical Statistics. 27:887–906.
King, Gary. 1997. A Solution to the Ecological Inference Problem: Reconstructing Individual
Behavior from Aggregate Data. Princeton: Princeton University Press.
Kuklinski, James H. 1978. “Representation and Elections: A Policy Analysis.” American Political
Science Review. 72:165–177.
Lahda, Krishna K. 1991. “A Spatial Model of Legislative Voting with Perceptual Error.” Public
Choice. 68:151–174.
Laird, Nan. 1978. “Nonparametric Maximum Likelihood Estimation of a Mixing Distribution.”
Journal of the American Statistical Association. 73(December):805–811.
Lewis, Jeffrey B. 1996. "Referendums, Roll-calls, and Constituencies." Presented at the 1996 Annual Meetings of the American Political Science Association, San Francisco, CA.
Lewis, Jeffrey B. 1997a. “To the victors go the rollcalls.” Presented at the 1997 Annual meetings
of the Midwest Political Science Association, Chicago, IL.
Lewis, Jeffrey B. 1997b. Who Do Representatives Represent? The Importance of Electoral Coalition Preferences in California. PhD thesis, Massachusetts Institute of Technology, Cambridge, MA.
Lindsay, Bruce, Clifford C. Clogg & John Grego. 1991. "Semiparametric Estimation in the Rasch Model and Related Exponential Response Models, Including a Simple Latent Class Model for Item Analysis." Journal of the American Statistical Association. 86(March):96–107.
Londregan, John. n.d. “Estimating Preferred Points in Small Legislatures: The Case of the Chilean
Senate Committees.” Unpublished Working Paper, University of California, Los Angeles.
Londregan, John & James M. Snyder. 1994. “Comparing Committee and Floor Preferences.”
Legislative Studies Quarterly. 19(2):233–265.
Mislevy, Robert J. 1984. “Estimating Latent Distributions.” Psychometrika. 49(3):359–381.
Mroz, Thomas A. 1997. "Discrete Factor Approximations in Simultaneous Equations Models: Estimating the Impact of a Dummy Endogenous Variable on a Continuous Outcome." Unpublished working paper 97-2, Department of Economics, University of North Carolina, Chapel Hill.
Neyman, J. & Elizabeth L. Scott. 1948. "Consistent Estimates Based on Partially Consistent Observations." Econometrica. 16(1):1–32.
Peltzman, Sam. 1985a. “Constituent Interest and Congressional Voting.” Journal of Law & Politics.
27:181–210.
Peltzman, Sam. 1985b. “An Economic Interpretation of the History of Congressional Voting in the
Twentieth Century.” American Economic Review. 75:657–675.
Poole, Keith & Howard Rosenthal. 1985. “A Spatial Model for Legislative Roll Call Analysis.”
American Journal of Political Science. 29:355–384.
Poole, Keith T. & Howard Rosenthal. 1991. "Patterns of Congressional Voting." American Journal of Political Science. 35(1):228.
Poole, Keith T. & Howard Rosenthal. 1997. Congress: A Political-Economic History of Roll Call
Voting. Oxford: Oxford University Press.
Redner, Richard A. & Homer F. Walker. 1984. "Mixture Densities, Maximum Likelihood and the EM Algorithm." SIAM Review. 26(2):195–239.
Rigdon, Steven F. & Robert R. Tsutakawa. 1983. “Parameter Estimation in Latent Trait Models.”
Psychometrika. 48(4):567–574.
Sanathanan, Lalitha & Saul Blumenthal. 1978. "The Logistic Model and Estimation of Latent Structures." Journal of the American Statistical Association. 73(December):794–799.
Snyder, James M. 1996. “The Dimensions of Constituency Preferences: Evidence from California
Ballot Propositions, 1974-1990.” Legislative Studies Quarterly. 21(4):463–488.
Stroud, A. H. & Don Secrest. 1966. Gaussian Quadrature Formulas. Englewood Cliffs NJ:
Prentice-Hall.
Thurstone, L.L. 1931. “The isolation of Blocs in a Legislative Body by the Voting Records of Its
Members.” Journal of Social Psychology. 3:425–433.
Tong, Y. L. 1988. "Some Majorization Inequalities in Multivariate Statistical Analysis." SIAM Review. 30(4):602–622.
Zwinderman, Aeilko H. 1991. “A Generalized Rasch Model for Manifest Predictors.” Psychometrika. XX:589–599.