Tau-b or Not Tau-b:
Measuring Alliance Portfolio Similarity
Curtis S. Signorino
Jeffrey M. Ritter¹
Work in Progress
Comments Welcome
April 2, 1997
¹ Littauer Center North Yard, Harvard University. Email: [email protected] and [email protected].
Prepared for the 1997 annual meeting of the Midwest Political Science Association. Although they
may not agree with all our conclusions, we would like to thank Bruce Bueno de Mesquita and Scott
Bennett for providing us with data and with many helpful discussions, as well as Chris Gelpi and
the participants in the Harvard Government Department's Rational Choice Discussion Group for
their valuable comments.
Abstract
The pattern of alliance commitments among states is commonly assumed to reflect the extent to
which states have common or conflicting security interests. For the past twenty years, Kendall's τ_b
has been used to measure the similarity between two nations' "portfolios" of alliance commitments.
Widely employed indicators of systemic polarity, state utility, and state risk propensity all rely
upon τ_b. We demonstrate that τ_b is inappropriate for measuring the similarity of states' alliance
commitments. We develop an alternative measure of alliance portfolio similarity, S, which avoids
many of the problems associated with τ_b, and we use data on alliances among European states to
compare the effects of S versus τ_b in measures of utility and risk propensity. Finally, we identify
several problems with inferring state interest from alliance commitments and we provide a method to
overcome those problems using S in combination with data on alliances, trade, UN votes, diplomatic
missions, and other types of state interaction.
1 Introduction
In recent years, international relations scholars have devoted considerable effort to testing
hypotheses derived from systemic theories of international politics and from choice-theoretic models of
international interactions. For each of these purposes, researchers have attempted to measure and
compare the patterns of military alliances among states. Bruce Bueno de Mesquita first proposed
using Kendall's τ_b rank-order correlation coefficient as a measure of the similarity of states' alliance
portfolios in his article "Measuring Systemic Polarity" (Bueno de Mesquita 1975). Over the last
twenty years, it has become a common practice to rely upon τ_b as a measure of the similarity of
two states' alliance commitments with the other states in the system.
Those interested in testing systemic theories of international politics use the τ_b measure of
alliance portfolio similarity to identify alliance "clusters" and to measure the extent to which those
clusters are discrete or overlapping (e.g. Bueno de Mesquita 1975; Ostrom and Aldrich 1978;
Bueno de Mesquita 1978; Organski and Kugler 1980; Bueno de Mesquita 1981b; Stoll 1984; Stoll
and Champion 1985; Iusi-Scarborough 1988; Bueno de Mesquita & Lalman 1988; W. Kim 1989,
1991; C. Kim 1991). The method of constructing state utilities by treating alliance portfolios as
revealed preferences over security issues was developed in a pioneering article by Altfeld and Bueno
de Mesquita (Altfeld and Bueno de Mesquita 1979). Since then, many scholars have employed the
similarity of states' alliance portfolios as a useful indicator of the similarity of those states' security
interests. These authors subject choice-theoretic models of international conflict to empirical tests,
using τ_b as the basis for operational measures of states' willingness to take risks and of those
states' expected utilities for challenging each other (Bueno de Mesquita 1978; Altfeld and Bueno
de Mesquita 1979; Bueno de Mesquita 1980; Bueno de Mesquita 1981; Berkowitz 1983; Altfeld
1984; Bueno de Mesquita 1985; Altfeld 1985; Bueno de Mesquita & Lalman 1986; Lalman 1988;
Scarborough 1988; Bueno de Mesquita & Lalman 1988; Lalman and Newman 1991; C. Kim 1991;
W. Kim 1991; Bueno de Mesquita & Lalman 1992; Morrow and Kim 1992; Huth, Bennett and
Gelpi 1993).
The fundamental question we ask in this paper is: How well does τ_b measure the similarity
of two states' alliance commitments? We argue that while Kendall's τ_b is a useful measure of
association for ranked categorical data (e.g. Levy 1981), it is inappropriate to use Kendall's τ_b as
an indicator of the similarity of states' alliance portfolios. Our examination of this question also
raises some concerns about the practice of interpreting the similarity of alliance portfolios as a
measure of states' common interests.
By way of a roadmap, we begin by presenting the construct of the "alliance portfolio" in section
two. In section three, we present Kendall's τ_b and demonstrate the problem of applying it as a
measure of portfolio similarity. We develop an alternative measure S for the similarity of alliance
commitments in section four and show that it suffers from none of the problems associated with
τ_b. In section five, we provide empirical comparisons of S with τ_b using data on alliances between
European states. In section six, we identify several reasons why the similarity of states' alliance
portfolios is likely to be a poor measure of the similarity of their interests, but suggest a method
for obtaining better estimates of the similarity of states' interests using S. In section seven, we
conclude.

(a) 1816

       FRN   UK    GMY   AUH   RUS
FRN     3     0     0     0     0
UK      0     3     3     3     3
GMY     0     3     3     3     3
AUH     0     3     3     3     3
RUS     0     3     3     3     3

(b) 1905

       UK    FRN   GMY   AUH   ITA   RUS
UK      3     1     0     0     0     0
FRN     1     3     0     0     2     3
GMY     0     0     3     3     3     0
AUH     0     0     3     3     3     1
ITA     0     2     3     3     3     0
RUS     0     3     0     1     0     3

Figure 1: Major Power Alliances in 1816 and 1905. Each table element denotes the type of alliance
the column nation has with the row nation. A state's alliance portfolio is the column vector of its
alliances with each of the row nations. (0=no alliance, 1=entente, 2=neutrality pact, and 3=defense
pact).
2 Alliance "Portfolios"
To set the stage for further discussion of τ_b and other measures, we first need to identify the variable
of interest here: the alliance portfolio. The Correlates of War (COW) Alliances Data Set classifies
alliances into four types, which in this paper we code as follows: 0=no alliance, 1=entente,
2=neutrality or nonaggression pact, 3=mutual defense pact. We follow Bueno de Mesquita (1975:195)
in assuming that these categories represent increasing degrees of formal alliance commitment
between states and that it is therefore appropriate to treat the data as ordinal.¹
Let the states in the system for a given year be indexed k = 1, ..., N. Then state i's alliance
portfolio is an N × 1 vector A^i = [a_i1 a_i2 ... a_iN]', where each element a_ik ∈ {0, 1, 2, 3} is i's
alliance with state k. For the moment, consider a simple example in which we compare the major
European powers' alliance portfolios with each other. Figure 1 displays the alliances between major
powers in 1816 and 1905.² As an example of an alliance portfolio, France's portfolio in 1905 is
A^FRN = [1 3 0 0 2 3]'.³
To determine the similarity of alliance portfolios, we compare the portfolios of two nations i
and j. In 1816 for Germany and Austria-Hungary, this is not terribly difficult, since their portfolios
are exactly the same: both have no alliance with France and defense pacts with Great Britain and
Russia. However, it is not so clear for France and Italy in 1905, hence the need for a measure to
quantify the similarity of two nations' alliance portfolios.

¹ This assumption is certainly not unproblematic, but we postpone a discussion of it until section six.
² Major powers are based on COW codings. States are treated as having mutual defense pacts with themselves.
³ For the empirical analyses later in this paper, each state's portfolio will actually be taken over all the
states in the European system, not just the major powers.

(a) Joint rankings (rows: FRN, columns: ITA)

        0      1      2      3
0       -      -      -      GMY, AUH
1       UK     -      -      -
2       -      -      -      ITA
3       RUS    -      FRN    -

(b) Counts (rows: FRN, columns: ITA)

        0      1      2      3
0       0      0      0      2
1       1      0      0      0
2       0      0      0      1
3       1      0      1      0

Figure 2: Contingency tables based on France's and Italy's alliance portfolios over major powers in
1905. Table (a) shows the distribution of the bivariate alliance rankings (a_ik, a_jk). Table (b) shows
the corresponding distribution of counts. (The alliance categories along the top and left of the tables
are 0=no alliance, 1=entente, 2=neutrality pact, and 3=defense pact.)
An alternative (equivalent) way of conceptualizing the data, which will be used throughout
this paper, is to treat two alliance portfolios as cross-classifications of alliances and to represent
this in a 4 × 4 contingency table. If A^i and A^j represent states i's and j's vectors of alliance
commitments with N states indexed k = 1, ..., N, then the elements of the contingency table are
the joint rankings (a_ik, a_jk). Figure 2 gives an example of the square contingency table
that results from the alliance portfolios of France and Italy in 1905, the data for which are based on
their respective columns in Figure 1(b). Figure 2(a) displays the distribution of the joint rankings
(a_ik, a_jk). Figure 2(b) shows the corresponding distribution of counts.
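The cross-classification can be sketched in a few lines of Python. The portfolios below are read off the France and Italy columns of Figure 1(b); the function name `contingency_table` and the convention that rows index state i and columns index state j are our own assumptions for illustration.

```python
from collections import Counter

# Alliance codes used in the paper: 0 = no alliance, 1 = entente,
# 2 = neutrality pact, 3 = defense pact.
STATES = ["UK", "FRN", "GMY", "AUH", "ITA", "RUS"]

# 1905 portfolios taken from the France and Italy columns of Figure 1(b).
FRANCE = [1, 3, 0, 0, 2, 3]
ITALY = [0, 2, 3, 3, 3, 0]


def contingency_table(a_i, a_j, levels=4):
    """Count the joint rankings (a_ik, a_jk) in a levels x levels table.

    Rows index state i's alliance level, columns index state j's
    (an assumed orientation, hypothetical helper).
    """
    counts = Counter(zip(a_i, a_j))
    return [[counts[(r, c)] for c in range(levels)] for r in range(levels)]


table = contingency_table(FRANCE, ITALY)
for level, row in enumerate(table):
    print(level, row)
```

With France as state i and Italy as state j, the printed rows should match the counts in Figure 2(b).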
3 Kendall's τ_b as a Measure of Association
Since Bueno de Mesquita (1975), Kendall's τ_b has been used in the overwhelming majority of
published articles that attempt to measure the similarity of alliance portfolios. Kendall's τ_b is one
measure among a host of others that fall under the rubric of "measures of association." Chi-square
and proportional reduction in error measures for nominal data, the Goodman-Kruskal γ for ordinal
data, Spearman's ρ for interval data, and Pearson's product-moment correlation for continuous
data are all measures of this type.⁴

⁴ For general references to these measures (and others cited in this paper) see Bishop et al. (1975),
Kotz & Johnson (1988), Kendall & Stuart (1961), Kendall & Gibbons (1990), and Liebetrau (1983).
The appeal of Kendall's τ_b rests on two facts: it is specifically designed for measuring the
association between two sets of ordinal rankings when "tied" rankings are permitted, and it is
easily interpretable, varying from −1 for perfectly negative association to 1 for perfectly positive
association.⁵ A decade after Bueno de Mesquita first introduced this approach to measuring the
similarity of states' alliance portfolios, Michael Wallace called it "a major advance in every respect,"
that represents "a notable improvement in sophistication" over previous measures (Wallace
1985:102,103).⁶
Given two rankings x and y over N items, the calculation of τ_b is based on comparing
pairs of joint rankings (x_i, y_i) and (x_j, y_j) and determining whether each pair is
"concordant," "discordant," or tied. A pair of rankings (x_i, y_i) and (x_j, y_j) is considered concordant
if x_i > x_j and y_i > y_j or if x_i < x_j and y_i < y_j. It is considered discordant if x_i > x_j and
y_i < y_j or if x_i < x_j and y_i > y_j. If all pairings of joint rankings are concordant, then x and y are
perfectly positively associated. If all are discordant, x and y are perfectly negatively associated.
To calculate τ_b for two rankings x and y over N items (see e.g. Kendall & Stuart, 1961:562-3),
first define for x a matrix recording whether x_i is ranked below, tied with, or above x_j
for all items i, j:

$$
a_{ij} = \begin{cases} +1 & \text{if } x_i < x_j \\ \phantom{+}0 & \text{if } x_i = x_j \\ -1 & \text{if } x_i > x_j \end{cases} \tag{1}
$$

and similarly for all paired comparisons in y:

$$
b_{ij} = \begin{cases} +1 & \text{if } y_i < y_j \\ \phantom{+}0 & \text{if } y_i = y_j \\ -1 & \text{if } y_i > y_j \end{cases} \tag{2}
$$

The measure τ_b is then given by

$$
\tau_b = \frac{\sum_{i,j} a_{ij} b_{ij}}{\sqrt{\sum_{i,j} a_{ij}^2 \, \sum_{i,j} b_{ij}^2}}, \qquad i, j = 1, 2, \ldots, N; \; i \neq j \tag{3}
$$

When x and y share the same number of ordinal levels, the contingency table is square and the
τ_b measure of correlation takes on values in the interval [−1, 1], where τ_b = 1 represents complete
concordance in rankings, τ_b = −1 represents complete discordance in rankings, and τ_b = 0 represents
independence in rankings.
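Equations (1) through (3) translate directly into a short computation. The following is a minimal sketch in plain Python rather than a reference implementation: the function names are ours, the two example rankings are hypothetical, and returning `None` when the denominator is zero is our own convention for the undefined case.

```python
from math import sqrt


def sign(u, v):
    """Return +1 if u < v, 0 if u == v, and -1 if u > v (Equations 1 and 2)."""
    return (u < v) - (u > v)


def kendall_tau_b(x, y):
    """Kendall's tau-b per Equation 3; None when the denominator is zero."""
    n = len(x)
    num = sum_a2 = sum_b2 = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            a_ij = sign(x[i], x[j])
            b_ij = sign(y[i], y[j])
            num += a_ij * b_ij
            sum_a2 += a_ij * a_ij
            sum_b2 += b_ij * b_ij
    if sum_a2 == 0 or sum_b2 == 0:
        return None  # one ranking is constant, so tau-b is undefined
    return num / sqrt(sum_a2 * sum_b2)


# Hypothetical rankings of five items by two raters, with ties allowed.
x = [0, 1, 1, 2, 3]
y = [0, 1, 2, 2, 3]
print(kendall_tau_b(x, y))        # no discordant pairs, but some ties
print(kendall_tau_b(x, x[::-1]))  # the same ranking reversed
```

The double loop visits every ordered pair with i ≠ j, exactly as the sums in Equation (3) are written, so each unordered pair is counted twice in the numerator and in each sum of squares.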
⁵ Bueno de Mesquita (1975:198) offers no explicit justification for the use of τ_b other than it being one
of a variety of possible measures of association. Majeski and Sylvan, criticizing Bueno de Mesquita's use
of τ_b, appear to mistakenly believe that τ_b is measured on a cardinal or interval scale (Majeski and
Sylvan 1984:331).
⁶ Wallace nevertheless has strong reservations about using τ_b to measure systemic polarity and he
develops an alternate approach to measuring polarity.
3.1 τ_b as a Measure of Alliance Portfolio Similarity
The application of τ_b to measuring the similarity of alliance portfolios between states is
straightforward. In a system of N states, the τ_b statistic compares the way state i ranks its alliance
relationships with states 1, ..., N to the way state j ranks its alliance relationships with states
1, ..., N. Take the pair of alliance rankings (a_ik, a_jk) and (a_il, a_jl) of i and j for states k and l. When
i has a stronger (weaker) alliance commitment to k than it does to l and j has a stronger (weaker)
alliance commitment to k than it does to l, then the pair of rankings is concordant. When i has a
stronger (weaker) alliance with k than with l, while j has a weaker (stronger) alliance with k than
with l, the pair is said to be discordant. With the two portfolios A^i and A^j of i's and j's
alliance commitments to each nation k = 1, ..., N, we can then use Equation 3 to calculate the
association between i's and j's alliance commitments.
The question is whether this measure of association is also a good measure of the similarity of
alliance portfolios. We argue that it is not: there are critical differences between the rank-order
association measured by Kendall's τ_b and the "similarity" of alliance commitments. We
provide the following series of examples to show why a measure of association is not necessarily an
appropriate measure of similarity.
3.1.1 Perfect Association, But Imperfect Similarity
At first glance, one would expect that if two alliance portfolios are perfectly positively associated
then they should be similar. This is not necessarily the case. A τ_b score of 1 does not necessarily
mean that two states in fact have identical alliance portfolios. Figure 3 shows three hypothetical
contingency tables with different distributions of elements that produce a τ_b score of 1.⁷ It is true
that τ_b generates a score of 1 when all of the elements fall on the main positive diagonal of the
contingency table, as in Figure 3 (a). Kendall's τ_b also generates a score of 1 when all the elements
fall in one of the off-diagonals, as in Figure 3 (b) and (c). Despite their identical τ_b scores, there are
substantial differences between the three cases. In case (a), states i and j have identical types of
alliances with every state in the system, while in case (b) the states agree about the rank-order of
their alliances but have different types of alliances with every state in the system. Case (c) is similar
to case (b), except that the types of alliances i and j have with the other system members are even
less similar: their alliance types with the other system members differ by two ordinal categories
rather than one. In fact, in case (c) it is difficult to tell if the data should be interpreted as a positive
diagonal or as a slight deviation from a point on the main negative diagonal.

⁷ These cases are of i and j's alliances with four other nations.
(a) τ_b = 1

       0   1   2   3
0      1   0   0   0
1      0   1   0   0
2      0   0   1   0
3      0   0   0   1

(b) τ_b = 1

       0   1   2   3
0      0   0   0   0
1      2   0   0   0
2      0   1   0   0
3      0   0   1   0

(c) τ_b = 1

       0   1   2   3
0      0   0   0   0
1      0   0   0   0
2      2   0   0   0
3      0   2   0   0

Figure 3: Illustrations of different alliance rankings yielding τ_b = 1. The figure displays three cases
of state i's and state j's alliances with four other nations. In case (a), i and j have the same
alliances with the four nations and we would expect a measure of alliance portfolio similarity to
indicate perfect similarity. However, i's and j's alliances with the four nations diverge slightly in
case (b) and even more in case (c), yet τ_b still reflects perfect similarity of alliance portfolios. (The
alliance categories along the top and left of the tables are 0=no alliance, 1=entente, 2=neutrality
pact, and 3=defense pact.)
(a) τ_b = −1

       0   1   2   3
0      0   0   0   1
1      0   0   1   0
2      0   1   0   0
3      1   0   0   0

(b) τ_b = −1

       0   1   2   3
0      0   0   0   0
1      0   0   0   2
2      0   0   1   0
3      0   1   0   0

(c) τ_b = −1

       0   1   2   3
0      0   0   0   0
1      0   0   0   0
2      0   0   0   2
3      0   0   2   0

Figure 4: Illustrations of different alliance rankings yielding τ_b = −1. The figure displays three cases
of state i's and state j's alliances with four other nations. In case (a), i and j have completely
opposite alliances with the four nations and we would expect a measure of alliance portfolio similarity
to indicate perfect dissimilarity. However, i's and j's alliances with the four nations converge
slightly in case (b) and even more in case (c), yet τ_b still reflects perfect dissimilarity of alliance
portfolios. (The alliance categories along the top and left of the tables are 0=no alliance, 1=entente,
2=neutrality pact, and 3=defense pact.)
3.1.2 Perfect Negative Association, But Imperfect Dissimilarity
The hypothetical contingency tables in Figure 4 show that a similar problem exists at τ_b's lower
boundary value. In Figure 4 (a), i and j have opposing views about the ordinal relationship
of their alliance commitments to the four other system members. Nevertheless, i and j do not
really have antithetical alliance portfolios, in that they are mutually allied to the states in the
(1,2) and (2,1) cells, albeit at different levels of commitment. International relations researchers
examining alliance patterns should not be pleased that τ_b takes on a value of −1 whenever all of the
elements fall into the main negative diagonal of the contingency table, because truly antithetical
disagreement occurs only when all of the elements are concentrated in the (0,3) and (3,0) cells of
the table. In Figure 4 (b), i and j have diametrically opposing views on the correct rank-ordering
of their alliance commitments with the other states in the system, but they agree that they should
have some formal alliance commitment with all of the other states in the system. Moreover, they
agree about the specific type of alliance commitment they should have with one of the system
members, represented by the element in the (2,2) box. In Figure 4 (c), both i and j have relatively
high alliance commitments to the other states in the system, but they disagree over the ordinal
ranking of those commitments by one category. While it seems clear that the alliance portfolios
represented in Figure 4 (c) are more similar than the portfolios represented in Figure 4 (b), which
are themselves more similar than the portfolios represented in Figure 4 (a), all three contingency
tables generate a Kendall's τ_b of −1.
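As a worked check on the mildest of the three cases, Equation (3) can be applied by hand to Figure 4 (c). We read the table as two nations with joint ranking (2,3) and two with joint ranking (3,2); the arithmetic below is our own illustration. Of the 12 ordered pairs with i ≠ j, the 4 pairs drawn from the same cell are tied on both rankings and contribute nothing, while the 8 pairs drawn from different cells are discordant:

$$
\sum_{i,j} a_{ij} b_{ij} = 8 \times (-1) = -8, \qquad
\sum_{i,j} a_{ij}^2 = \sum_{i,j} b_{ij}^2 = 8, \qquad
\tau_b = \frac{-8}{\sqrt{8 \cdot 8}} = -1 .
$$

The score is driven entirely by the direction of the disagreement; the fact that the two states differ by only one category never enters the calculation.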
3.1.3 When τ_b Is Undefined

(a) τ_b = undefined

       0   1   2   3
0      0   0   0   1
1      0   0   0   1
2      0   0   0   1
3      0   0   0   1

(b) τ_b = undefined

       0   1   2   3
0      0   0   0   0
1      0   0   0   0
2      0   0   0   0
3      1   1   1   1

(c) τ_b = undefined

       0   1   2   3
0      0   0   0   0
1      0   0   0   0
2      0   0   0   0
3      0   0   0   4

Figure 5: Illustrations of different alliance rankings where τ_b is undefined. The figure displays three
cases of state i's and state j's alliances with four other nations. Most concerning of the three cases
is (c), where i and j have defense pacts with all four nations, indicating perfect similarity of alliance
portfolios, yet τ_b is undefined. (The alliance categories along the top and left of the tables are
0=no alliance, 1=entente, 2=neutrality pact, and 3=defense pact.)

(a) τ_b = −.09

       0   1   2   3
0     10   0   0   1
1      0   0   0   0
2      0   0   0   0
3      1   0   0   0

(b) τ_b = −1

       0   1   2   3
0      0   0   0   1
1      0  10   0   0
2      0   0   0   0
3      1   0   0   0

(c) τ_b = −1

       0   1   2   3
0      0   0   0   1
1      0   0   0   0
2      0   0  10   0
3      1   0   0   0

(d) τ_b = −.09

       0   1   2   3
0      0   0   0   1
1      0   0   0   0
2      0   0   0   0
3      1   0   0  10

Figure 6: Association changes when "similarity" has not. Cases (a) through (d) show how the association
between A^i and A^j can change, yielding a different τ_b, even though the alliance portfolios are no
more similar or dissimilar in any of the cases. In cases (b) and (c), the commitment of i and j
to the ten other countries increases jointly, but does not change with respect to each other. Yet,
τ_b = −1 would imply greater dissimilarity in cases (b) and (c) than in (a) or (d).

While Figures 3 and 4 provide examples of cases in which it is misleading to interpret identical τ_b
scores as representing identical degrees of similarity in states' alliance portfolios, Figure 5 displays
plausible hypothetical cases in which a comparison of two states' alliance portfolios yields a τ_b that
is undefined. Whenever one or both of the states rank all of the elements in the same ordinal category,
τ_b is undefined. Recall from Equations (1) and (2) that τ_b compares one state's ordinal rankings
of elements to the other's ordinal rankings of the same elements. If either i or j ranks all of the
elements in the same category, there is no "order" to its ranking of the elements. As a result, one
of the terms in the denominator of Equation (3) is zero, and no meaningful value exists for τ_b. In
cases (a) and (b) of Figure 5, one of the states has different alliance commitments to the other
members of the system, while the other state has identical levels of alliance commitments to all
the members of the system. The result is that the elements appear in a row or a column of the
contingency table. If either pair of alliance portfolios were to appear in the real world, it would
be impossible to measure their similarity using τ_b. The example in Figure 5 (c) is particularly
disturbing. When states i and j both have defense pacts with all the states in the system, it seems
reasonable to expect a valid measure of alliance portfolio similarity to indicate that i and j have
identical alliance portfolios. Kendall's τ_b, however, is undefined.
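The breakdown can be written out for Figure 5 (c), where both states place all four nations in category 3. Every comparison in Equations (1) and (2) is then a tie, so, on our reading of Equation (3),

$$
a_{ij} = b_{ij} = 0 \;\; \text{for all } i \neq j
\quad \Longrightarrow \quad
\tau_b = \frac{0}{\sqrt{0 \cdot 0}},
$$

which has no value. In cases (a) and (b) only one of the two sums of squares vanishes (for four nations the other equals 12), but the product under the square root is still zero and the ratio is equally undefined.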
3.1.4 Constant Similarity With Varying Association
Figure 5 (c) calls our attention to another characteristic of τ_b that reduces its attractiveness as a
measure of the similarity of alliance portfolios: because the τ_b measure of association is sensitive to
cells in which many elements are concentrated, the value of τ_b may vary over pairs of states with
equally similar alliance portfolios. Figure 6 demonstrates how this behavior can affect the value
of τ_b. We imagine a system of twelve states, in which states i and j are treated as having mutual
defense pacts with themselves, no alliance with each other, and identical alliance commitments to
the other ten states. The cases in Figure 6 differ only in the type of alliance commitments i and
j have with the other ten states in the system: "no alliance" in case (a), ententes in case (b),
neutrality/non-aggression pacts in case (c), and defense pacts in case (d). Notice that cases (a) and
(d) produce τ_b scores of −.09, indicating mild dissimilarity, even though i and j are in perfect
agreement about the type of alliance commitment they should have with 10 of the 12 states in the
system. Moreover, cases (b) and (c) generate τ_b scores of −1, indicating complete disagreement,
despite the fact that i and j remain in complete agreement about the type of alliance commitment
they should have with 10 of the 12 states in the system and despite the fact that the extent of their
disagreement over the other two relationships is no greater than it is in (a) or (d). In cases (b) and
(c), i and j agree that they should have increasingly binding alliance commitments with the vast
majority of states in the system, but their τ_b rating of alliance portfolio similarity is lower than in
case (a), where i and j have no alliances with each other or with any other state.
The results in Figure 6 are due to the manner in which τ_b handles tied rankings. In all four
cases, the pair comparisons among the ten states in the same cell contribute nothing to the τ_b score,
since the a_ij and b_ij terms of Equation 3 are all zero. However, in cases (a) and (d), there are partial
ties when the ten states in the same cell are paired with i's and j's rankings of their alliances with
each other, providing some non-zero a_ij and b_ij terms. In cases (b) and (c), all of the pairings (that
are not among the ten states in the same cell) are completely discordant. Since the pairings of
the rankings in the same cell do not contribute anything to the score, only the discordant pairings
contribute, yielding τ_b = −1. But clearly we cannot say that (a) displays more similar portfolios
than (b) when in fact i and j jointly increase their commitment to the ten states in (b) and (c).
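The two distinct scores in Figure 6 can be reproduced by hand from Equation (3); the following is our own arithmetic, counting ordered pairs with i ≠ j. In case (a) the joint rankings are ten copies of (0,0) plus one (0,3) and one (3,0). The only pair that differs on both rankings is the last two elements, which are discordant and contribute −2 to the numerator. Each ranking has eleven elements at 0 and one at 3, giving 2 × 11 = 22 untied ordered pairs:

$$
\tau_b^{(a)} = \frac{-2}{\sqrt{22 \cdot 22}} = -\frac{1}{11} \approx -.09 .
$$

In case (b) the ten shared elements sit at (1,1), strictly between 0 and 3 on both rankings, so each of them is discordant with both (0,3) and (3,0). That gives 10 × 2 + 1 = 21 discordant unordered pairs and no partial ties:

$$
\tau_b^{(b)} = \frac{-42}{\sqrt{42 \cdot 42}} = -1 .
$$

Cases (d) and (c) follow in the same way, with the ten shared elements at (3,3) and (2,2) respectively.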
There are two critical lessons to be drawn from this example. First, measures of linear
association may obscure the actual "similarity" of states' alliance portfolios precisely because they
focus on linear association rather than actual agreement. Second, coding rules that determine the
domain of other possible alliance partners for i and j are extremely important, since definitions
of systemic membership that include larger numbers of states that are "irrelevant" to i's and j's
security interests will generally result in a larger number of elements falling into the (0,0) corner of
the contingency table, skewing the value of τ_b.
3.1.5 "Not Tau-b"
Kendall's τ_b is a rank-order correlation coefficient; it is designed to measure the linear association
between i's and j's ordinal rankings of the elements in a square contingency table, which is not the
same thing as the "similarity" of i's and j's relationships with the elements in the table. In fact, τ_b
can seriously misrepresent the degree to which two states' alliance portfolios are similar. It seems
reasonable to claim that i's and j's alliance portfolios are more similar when i and j have identical
types of alliances with each of the other system members than they are when i and j have different
types of alliances with each of the other system members but identical ordinal rankings of those
commitments. Kendall's τ_b is insensitive to the different degrees of alliance portfolio similarity and
dissimilarity exhibited in the difference between elements falling into the main diagonals or the
off-diagonals of the table. Equally disturbingly, τ_b is undefined
whenever i or j ranks all of its potential alliance partners in the same ordinal category, and τ_b
is undefined whenever elements are concentrated at the extreme corners of the contingency table,
even when such concentrations should clearly be interpreted as representing completely identical or
opposite alliance portfolios. These characteristics render τ_b inappropriate as a measure of alliance
portfolio similarity. This indictment is not limited to τ_b alone: any of the measures of association or
correlation mentioned at the beginning of this section would be inappropriate as an indicator of the
similarity of two states' alliance portfolios. The fact that τ_b can seriously misrepresent the degree to
which two states' alliance portfolios are similar, combined with the absence of any convenient and
effective substitute for τ_b, leads us to seek an alternative approach to measuring alliance portfolio
similarity.
3.2 Be Not A Borrower: Agreement and Cohen's κ
Political Science is often called a borrowing discipline, and in this case it seems reasonable to ask
our sister social sciences if they could loan us a better measure of alliance portfolio similarity.
Social psychologists and educational testing statisticians are very interested in the extent to which
two observers "agree" about how a collection of elements should be rated. Measures of agreement,
notably Cohen's κ and its variants, do have some features that make them more appealing than
measures of association as indicators of the similarity of alliance portfolios. Unfortunately, these
measures suffer from at least two problems that render them inappropriate for the task at hand.
Recall that a measure of association quantifies the extent to which two variables are dependent
and the direction of that dependence. If two variables are perfectly associated, the measure of
association is only required to predict the category of one variable given the category of the other
(Bishop et al., 1975:394). This ability to predict the category of one variable given the category of
the other is most clearly shown by the perfectly positive and perfectly negative association results
in Figures 3 and 4, respectively. In both cases, the association is such that knowing the value of
one category allows us to determine the category of the second, even when the portfolios are not
what we would consider perfectly "similar."
Measures of agreement are a special case of measures of association, assuming more of the
structure of the contingency table. Measures of agreement are particularly appropriate for data
where x and y represent rankings (or categorizations) over N items by two different raters using
the same classifications.⁸ The question answered by measures of agreement is not the extent to
which the ratings are associated, but the extent to which the raters actually agree in their ratings.
This places more restrictions on the structure of the contingency table because the emphasis is on
the extent to which the elements of the contingency table diverge from the positive diagonal (i.e.,
where the cells represent agreement between the raters).

⁸ Measures of association do not require that x and y have similar categories, which allows for J × K tables.
One of the earliest and most widely used measures of agreement is Cohen's (1960) κ for nominal
data. It measures the extent to which contingency table elements fall on the positive diagonal,
corrected for the probability of falling on the positive diagonal due to chance. One drawback of
κ is that it is asymmetrically oriented towards measuring agreement and not disagreement.
International relations researchers would like a measure that is symmetric, allowing for both complete
agreement and complete disagreement. Unlike τ_b, which varies from −1 for complete discordance to
1 for complete concordance, κ does not have a similarly symmetric character, ranging between
complete disagreement and complete agreement.
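The asymmetry can be illustrated with the standard unweighted form of Cohen's κ, (p_o − p_e) / (1 − p_e), where p_o is the observed share of elements on the positive diagonal and p_e is the share expected by chance from the table margins. The sketch below is ours, not a calculation from the sources cited; the function name is hypothetical, and it reuses three of the hypothetical tables from Figures 3 and 4.

```python
def cohens_kappa(table):
    """Unweighted Cohen's kappa for a square contingency table of counts.

    Returns None if chance agreement is already 1, in which case the
    ratio (p_o - p_e) / (1 - p_e) is undefined.
    """
    n = sum(sum(row) for row in table)
    size = len(table)
    p_observed = sum(table[k][k] for k in range(size)) / n
    row_share = [sum(table[r]) / n for r in range(size)]
    col_share = [sum(table[r][c] for r in range(size)) / n for c in range(size)]
    p_chance = sum(r * c for r, c in zip(row_share, col_share))
    if p_chance == 1:
        return None
    return (p_observed - p_chance) / (1 - p_chance)


# Tables taken from Figure 3 (a), Figure 4 (a), and Figure 3 (b).
IDENTICAL = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
REVERSED = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
SHIFTED = [[0, 0, 0, 0], [2, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]

for name, table in [("identical", IDENTICAL), ("reversed", REVERSED), ("shifted", SHIFTED)]:
    print(name, cohens_kappa(table))
```

In these examples complete agreement yields κ = 1, but the two tables with no agreement at all yield different negative values, neither of them −1, so κ offers no fixed anchor for complete disagreement of the kind τ_b has at its lower bound.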
Although a number of variations of Cohen's κ have been developed to account for partial
agreement and for use with other data types, none are appropriate for our task. Cohen's (1968)
weighted κ for nominal data allows the weighting of cells off the positive diagonal to give partial
credit for rankings that are "close" to being in agreement. Davies & Fleiss (1982) develop a
generalized κ-like statistic for nominal data. For interval data, κ-like measures or generalizations
of κ have been developed by Schouten (1982), O'Connell & Dobson (1984), and Berry & Mielke
(1988, 1990). While some of these claim to be measures of agreement for ordinal data, all require
the use of a distance metric or scoring that imposes interval assumptions on the rankings. To
our knowledge, no measure of agreement has been developed for ordinal data that respects the
ordinality limitations. This may simply be an information problem inherent to the data: granting
partial credit for "close" agreement requires specification of the extent of that credit, which
cannot be done with the information available in rank-order data. Finally, with respect to measuring
alliance portfolio similarity, all of the aforementioned measures of agreement suffer from the same
problem as κ in that they are oriented towards measuring agreement per se and not towards being
symmetric measures of agreement versus disagreement.
In sum, measures of agreement take us closer to "similarity" conceptually, but not far enough.
Unfortunately, there are no measures of agreement appropriate for ordinal alliance data, and
existing (non-ordinal) measures of agreement are incapable of reflecting antithetical disagreement in
a convenient form. As we are unable to borrow a method for measuring the similarity of alliance
portfolios, we are forced to develop one ourselves.
4
S:
A Distance-Based Measure of Similarity
The approach we use in this paper is instead based on "measures of similarity and dissimilarity,"
which have been widely employed (e.g. Kotz & Johnson, 1988: 397-405) to assess the extent to
which two vectors of (nominal, ordinal, interval, or continuous) data differ from each other. For
our proposed solution we ask, literally: How "far" are two portfolios from each other?
Assuming there are N states in the system, we let a state i's alliance portfolio vector $A^i$ represent
a point in an N-dimensional, discrete data space. We define the similarity S of states i and j's
alliance portfolios $A^i = [\, a^i_1 \; a^i_2 \; \cdots \; a^i_N \,]$ and $A^j = [\, a^j_1 \; a^j_2 \; \cdots \; a^j_N \,]$, respectively, as

\[
S(A^i, A^j, W, L) = 1 - 2\,\frac{d(A^i, A^j, W, L)}{d^{\max}(W, L)} \tag{4}
\]

where $W = [\, w_1 \; w_2 \; \cdots \; w_N \,]$ is a vector of weights over the N dimensions,
$L = [\, l_1 \; l_2 \; \cdots \; l_N \,]$ is a vector of scoring rules for the ranks within each of the
N dimensions, $d(A^i, A^j, W, L)$ is the distance metric

\[
d(A^i, A^j, W, L) = \sum_{k=1}^{N} w_k \,\bigl|\, l_k(a^i_k) - l_k(a^j_k) \,\bigr| \tag{5}
\]

and $d^{\max}(W, L)$ is the maximum distance possible in the N-dimensional space (given the dimension
weights W and scoring rule L) denoted by

\[
d^{\max}(W, L) = \sum_{k=1}^{N} w_k \,\bigl( l_k^{\max} - l_k^{\min} \bigr) \tag{6}
\]

where $l_k^{\max}$ and $l_k^{\min}$ are the maximum and minimum scores possible, respectively, for dimension k.
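Equations (4) through (6) translate almost line for line into code. The following is a minimal sketch rather than a reference implementation. The function name `similarity`, its argument names, and the simplification of using one scoring rule for every dimension (the text allows a separate rule l_k per dimension) are our own assumptions for illustration.

```python
from typing import Callable, Optional, Sequence


def similarity(
    a_i: Sequence[int],
    a_j: Sequence[int],
    weights: Optional[Sequence[float]] = None,
    score: Callable[[int], float] = float,
    levels: Sequence[int] = (0, 1, 2, 3),
) -> float:
    """Return S = 1 - 2 * d / d_max for two alliance portfolios.

    Assumption: a single scoring rule `score` is shared by all N dimensions,
    and `levels` lists every alliance rank that can occur.
    """
    if len(a_i) != len(a_j):
        raise ValueError("portfolios must cover the same N states")
    w = [1.0] * len(a_i) if weights is None else list(weights)
    if len(w) != len(a_i):
        raise ValueError("need exactly one weight per state")

    # l^max - l^min for the shared scoring rule
    scored_levels = [score(level) for level in levels]
    span = max(scored_levels) - min(scored_levels)

    # Equation (5): weighted absolute distance between the scored portfolios
    d = sum(wk * abs(score(x) - score(y)) for wk, x, y in zip(w, a_i, a_j))
    # Equation (6): largest distance the weights and scores allow
    d_max = sum(wk * span for wk in w)
    if d_max == 0:
        raise ValueError("d_max is zero, so S is undefined for these inputs")

    # Equation (4)
    return 1.0 - 2.0 * d / d_max
```

With the default unit weights and rank-valued scores, identical portfolios return 1 and portfolios at opposite ends of the data space return -1, matching the endpoints described in the text.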
There are a number of advantages to this measure of similarity. Unlike τ_b, S is defined for all
possible alliance patterns. S does not measure the linearity of the association of ordinal rankings
of elements, so it can distinguish more subtle shades of similarity and dissimilarity than τ_b can
recognize. (The next section provides extensive examples showing why S does not suffer from
many of the shortcomings of τ_b.) Specified as above, the similarity measure S ranges from -1,
corresponding to two portfolios at "opposite" ends of the data space, to 1, corresponding to identical
portfolios. This (arbitrary) standardization allows us to compare the scores of S and τ_b for pairs of
portfolios whenever they are used to measure the "similarity" of alliance portfolios, since both have
the same range and the same substantive meaning at the endpoints of that range. It also allows
us to substitute S into any of the various applications for which τ_b has been used in the study of
international politics (e.g., in calculating choice-theoretic risk scores) and to compare the resulting
scores when S is substituted for τ_b. Care should be taken, however, in direct comparisons of S
and τ_b, since they measure different characteristics of a contingency table (or the pair of rankings
on which the table is based). For example, τ_b = 0 indicates independence between two vectors of
rankings, while S = 0 indicates that the distance between the scored rankings is half the maximum
it could be. Sometimes these coincide; sometimes they do not.
Another advantage of S is that one can specify a weighting W of the dimensions, where each
dimension represents a country with whom the referent nation could be allied. While the weighting
could be as simple as w_k = 1 ∀k (i.e., an alliance with one country is worth the same as with
another), there might be theoretical reasons to assume that not all alliances should be considered
equal. For example, if states ally to increase their security, then it might be appropriate to weight
the states proportionally to their military power in order to avoid exaggerating the importance of
small states. In our empirical examples, we will use the notation S_w when referring to a version of
S that weights each state according to its national material capabilities as a share of the sum of
the capabilities of all the states in the system.[9]
The similarity measure S does force two restrictions on the data, which may not always be
valid, although the measure allows for flexibly modeling these. First, just as with the measures of
agreement, we are not aware of any measure of similarity that respects the ordinal limitations of
rank-order data. Most often, the ranks are converted to intervals through some scoring rule, which
is what is done here through the scoring rule L. This may be as simple as setting the intervals to
the rank values: l(a^i_k) = a^i_k ∀k. With no other information available, this may be an acceptable
scoring rule.[10] However, if one does not believe, for example, that the difference in commitment
between defense pacts and neutrality pacts is the same as the difference between ententes and no
alliances, then one can specify any number of theoretically or empirically informed scoring rules for
the ranks. Moreover, one may specify different scoring rules for each of the N dimensions, i.e.,
for each nation with which one could be allied.
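To make the role of the scoring rule concrete, the sketch below scores the same pair of portfolios twice: once with the rank values themselves and once with a rule that treats a defense pact as a much larger step than the other categories. The function name `similarity_with_scores`, the three-state portfolios, and the "steep" scores are hypothetical values chosen for illustration only. They are not drawn from the COW data or from any estimate of commitment levels.

```python
def similarity_with_scores(a_i, a_j, scores):
    """Unweighted S where `scores` maps each alliance rank to an interval score."""
    span = max(scores.values()) - min(scores.values())
    d = sum(abs(scores[x] - scores[y]) for x, y in zip(a_i, a_j))
    d_max = len(a_i) * span
    return 1.0 - 2.0 * d / d_max


# 0 = no alliance, 1 = entente, 2 = neutrality pact, 3 = defense pact
rank_scores = {0: 0, 1: 1, 2: 2, 3: 3}    # l(a) = a, the rule used in the text
steep_scores = {0: 0, 1: 1, 2: 2, 3: 6}   # hypothetical: defense pacts count double

a_i = [3, 1, 0]  # hypothetical portfolio of state i over three partners
a_j = [2, 0, 0]  # hypothetical portfolio of state j

s_rank = similarity_with_scores(a_i, a_j, rank_scores)    # 1 - 2 * 2/9  = 5/9
s_steep = similarity_with_scores(a_i, a_j, steep_scores)  # 1 - 2 * 5/18 = 4/9
```

Under the steeper rule the gap between a defense pact and a neutrality pact weighs more heavily, so the same two portfolios look somewhat less similar.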
The second assumption placed on the data is that nations are assumed to view the alliance
categories in the same way. In other words, every nation has a similar conception of the commitment
embodied in an entente, in a neutrality pact, and in a defense pact. We do not feel this is an heroic
assumption to make. In fact, the COW Alliance Dataset coding rules for the alliance types imply
a common understanding between two states as to the responsibilities owed to the other alliance
partner. A more heroic assumption along these lines concerns the weights W, which are assumed
shared by all countries. This might be the case in the hypothetical situation noted above, where
nations weight a possible alliance partner by the capabilities it would bring to the table. However,
one can easily think of two nations having different weights W^i and W^j for alliance partners based
on language, culture, or other nonmaterial factors. The similarity measure S does not allow for
such a heterogeneous weighting scheme. We believe that would be an interesting avenue of future
research.
4.1 Comparison of S to τ_b
Although S is not available in canned statistics programs, it is actually easier to implement than
τ_b, and it adheres to the convention of varying from -1 to 1 in order to reflect complete dissimilarity
and complete similarity of states' alliance portfolios. Moreover, we contend that in addition to
preserving the welcome characteristics of τ_b, S is free of many of τ_b's liabilities. As a first step
toward establishing that S does a better job than τ_b of capturing the actual similarity of states'
alliance portfolios, we re-examine the hypothetical examples of Figures 3-6 in order to determine
how S performs when τ_b provides misleading results. The results of this comparison appear in
Figure 7.

[9] We refer to capabilities as measured by the COW National Material Capabilities data set.
[10] In fact, this is what we use in the empirical examples that follow.

Figure 7: Comparison of S with τ_b. Panels (a) through (m) each pair a contingency table of i's and
j's alliance commitments (levels 0 to 3) with the corresponding values of τ_b and S.
Consider the examples in Figure 7, tables (a)-(c). In (a), states i and j have exactly identical
alliance commitments with the other states in the system, and both τ_b and S produce values of
1. In table (b), i and j agree on the ordinal ranking of their alliance commitments with the other
states in the system, but they do not have identical levels of alliance commitments with any of
those states. τ_b produces a misleading value of 1, while S produces a score of .33, reflecting
the fact that the states' alliance portfolios are similar but not identical as they are in table (a). In
table (c), i is strongly allied with two states that are not allied to j, and j is strongly allied with
two states that are not allied to i. τ_b reflects only the perfect similarity of ordinal rankings, while S
accurately characterizes this table as representing significant dissimilarity (-.33) between the row
and column states' alliance portfolios.
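The contrast between rank agreement and level agreement can be reproduced with a few lines of code. The sketch below is illustrative only: the two four-state portfolios are hypothetical vectors of our own choosing (they are not the tables in Figure 7), `kendall_tau_b` is a plain pairwise implementation of the usual tie-corrected τ_b, and `similarity` is the unweighted S with rank-valued scores.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Tie-corrected Kendall tau-b; returns None when it is undefined."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue                    # tied on both: drops out of both factors
        elif dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt((concordant + discordant + tied_x_only)
                 * (concordant + discordant + tied_y_only))
    if denom == 0:
        return None                     # at least one portfolio has no ordering
    return (concordant - discordant) / denom


def similarity(a_i, a_j, max_score=3, min_score=0):
    """Unweighted S = 1 - 2 * d / d_max with l(a) = a."""
    d = sum(abs(x - y) for x, y in zip(a_i, a_j))
    return 1.0 - 2.0 * d / (len(a_i) * (max_score - min_score))


a_i = [0, 0, 1, 1]        # hypothetical portfolio of i
close_j = [1, 1, 2, 2]    # same ordering, every commitment one level higher
far_j = [2, 2, 3, 3]      # same ordering, every commitment two levels higher

tau_close, s_close = kendall_tau_b(a_i, close_j), similarity(a_i, close_j)
tau_far, s_far = kendall_tau_b(a_i, far_j), similarity(a_i, far_j)
```

In both comparisons τ_b equals 1, because the orderings agree perfectly, while S falls from 1/3 to -1/3 as the levels of commitment move further apart.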
In section 3.1.2, we noted that τ_b takes on a value of -1 in cases like that pictured in Figure 7
(d), misleadingly implying that i and j have antithetically opposed alliance portfolios when in fact
both i and j have some formal alliance with the states in the (1,2) and (2,1) cells. The value of S,
in contrast, implies less-than-complete dissimilarity between the portfolios.[11] In table (e), i and j
disagree over a narrower range of alliance commitments than they do in table (d), and they actually
agree on the exact level of one of the alliance commitments. τ_b again takes on a value of -1, while
S correctly indicates that the portfolios in (e) are less dissimilar than the portfolios depicted in
(d). In table (f), i and j have relatively strong alliance commitments with all of the other four
states in the system, but they disagree as to which pair of states they should ally themselves with
more strongly. S produces a value of .33, suggesting the states have moderately similar alliance
portfolios, while τ_b would once again imply complete disagreement.
S is not a measure of association between rank-orderings, and so it is defined even in those cases
in which one state has identical alliance commitments to all of the other states in the system. For
example, in Figure 7, tables (g) and (h) display cases in which no value of τ_b exists simply because
there is no "order" to the ranking of alliance commitments by i and by j, respectively. S is defined
even when all of the elements in the table fall into a single cell. When both i and j have mutual
defense pacts with all of the other states in the system, their alliance portfolios are identical. As
Figure 7 (i) shows, the value of S accurately reflects this complete similarity, while the value of τ_b
is undefined.
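As a worked illustration of the undefined case, consider a hypothetical four-state system (our own example, not one of the panels in Figure 7) in which i holds defense pacts with every state, $A^i = [3, 3, 3, 3]$, while $A^j = [0, 0, 3, 3]$. We assume the usual tie-corrected form of τ_b, with C concordant pairs, D discordant pairs, and $T_i$ ($T_j$) the number of pairs tied only in i's (j's) portfolio, and we assume unit weights and rank-valued scores for S.

```latex
\begin{align*}
C = D = 0, \quad T_i = 4, \quad T_j = 0
  &\;\Rightarrow\;
  \tau_b = \frac{C - D}{\sqrt{(C + D + T_i)(C + D + T_j)}}
         = \frac{0}{\sqrt{4 \cdot 0}}
  \quad \text{(undefined)} \\[4pt]
d = |3-0| + |3-0| + |3-3| + |3-3| = 6, \quad d^{\max} = 4 \cdot 3 = 12
  &\;\Rightarrow\;
  S = 1 - 2 \cdot \frac{6}{12} = 0
\end{align*}
```

Because i's portfolio contains no ordering at all, every pair of states is tied on i's side and the denominator of τ_b vanishes, whereas the distance on which S is based remains well defined.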
Figure 7 tables (j) through (m) reproduce the perplexing example from Figure 6. Recall that
in all four tables, i and j have mutual defense pacts with themselves, no alliance with each other,
and identical alliance commitments to the other ten states in the system. Although this is true
in each of the four tables, the value of τ_b varies as we alter the type of alliance i and j have with
the other ten states. This example was particularly striking because τ_b actually indicates that the
states' alliance portfolios are less similar in table (l), where i and j share nonaggression pacts with
ten states, than in table (j), where neither i nor j has any formal alliance commitment to any state
but itself. In contrast, S behaves just as we would like: it reflects only the similarity of i's and j's
patterns of alliances.[12] In sum, S has produced meaningful measures of alliance portfolio similarity
in every one of these hypothetical cases, whereas τ_b produced scores that were misleading.

[11] S does take on a value of -1 when all of the elements fall into the (0,3) and (3,0) cells of the table.
5 Empirical Examples of S at Work
Thus far, we have considered several reasons why a measure of rank-order correlation like Kendall's
τ_b is not the correct tool for measuring the similarity of states' alliance portfolios. We have
examined a variety of hypothetical examples which illustrate how τ_b may mislead researchers. We have
developed a measure which avoids many of the problems associated with τ_b. And we have
demonstrated that it is a reliable guide to the actual degree of alliance portfolio similarity even when τ_b
is not. Nevertheless, one might still ask whether S would actually alter one's interpretations of
empirical, rather than hypothetical, alliance patterns. If in practice states tend to adopt only those
patterns of alliances for which τ_b and S yield very similar portfolio similarity scores, then one may
use τ_b without worry.
Using data on European major power alliances, we demonstrate in this section that the values
of τ_b generated by empirical data deviate substantially from the values of S. In the following
examples, we identify the type of alliance commitment each of the European major powers had
with every other state in Europe from 1816 to 1965 using the Correlates of War Alliances data
set. We derive the τ_b measure of alliance portfolio similarity according to the procedure developed
by Bueno de Mesquita (1975). We also calculate our standard similarity score S and a weighted
similarity score S_w. As discussed above in Section 4, we calculate S_w by weighting the entries in
the contingency table according to each state's material capabilities as a share of the total material
capabilities of all the states in Europe in that year. We offer examples of how the raw values of
τ_b and S_w differ, and we examine the impact of these differences on measures of states' attitudes
towards risky ventures.
[12] i's and j's alliance portfolios are perfectly similar if they have the same types of alliances with
all the states in the system, regardless of the level of alliance commitments they agree upon. Thus, S
should not increase steadily from table (j) to table (m). The level of overall commitment may be an
interesting variable, but it is not pertinent to measuring portfolio similarity.

5.1 The Difference Between S_w and τ_b in Practice
For every pair of major European powers we graphed, there were substantial differences between
the annual values of τ_b and S_w. Two typical examples appear in Figure 8. Figure 8 (top) plots the
similarity of the United Kingdom's and Germany's alliance portfolios for every year from 1816 to
1965, while Figure 8 (bottom) displays the similarity of France's and Germany's alliance portfolios.

Figure 8: Comparison of S and τ_b from 1816 to 1965. The top graph shows the similarity of Great
Britain and Germany's alliance portfolios as measured by S_w and τ_b for each year from 1816 to
1965. The bottom graph displays the similarity of France and Germany's alliance portfolios for the
two measures. As the two graphs show, there is often great difference empirically between the two
measures of similarity.

Although they draw on identical data, S_w and τ_b draw very different pictures of the similarity
of these states' alliance portfolios. The differences in the values of τ_b and S_w are not localized in
specific time periods, and the magnitude of those differences is often quite substantial. Although
S_w and τ_b are not scaled identically, they are only moderately correlated (in these two examples,
the correlation is .64 for U.K.-Germany and .52 for France-Germany), indicating that the values of
τ_b stray from those of S_w in tendency as well as in scale. For scholars testing systemic theories of
international politics, it is particularly noteworthy that S_w often has a different sign than τ_b, as
this is likely to affect the composition and characteristics of alliance "clusters."

(a) System portfolios

          System    Portfolios
Nation    Cap       GMY   RUS
GMY       .25        3     0
RUS       .21        0     3
UK        .20        0     1
FRN       .08        0     3
AUH       .08        3     0
ITA       .05        3     1
BEL       .03        0     0
SPN       .03        0     0
TUR       .02        0     0
NTH       .01        0     0
SWD       .01        0     0
RUM       .01        3     0
POR       .01        0     0
SWZ       .01        0     0
GRC       .00        0     0
DEN       .00        0     0
YUG       .00        0     0
BUL       .00        0     0
NOR       .00        0     0
ALB       .00        0     0

(b) Contingency table

                 RUS
            0    1    2    3
GMY   0    13    1    0    2
      1     0    0    0    0
      2     0    0    0    0
      3     3    1    0    0

τ_b = .03    S = .4    S_w = -.5

Figure 9: Comparison of alliance similarity for Germany and Russia, 1914. Table (a) shows Germany
and Russia's alliance portfolios with all the European nations in 1914, along with the nations'
proportion of system capabilities. Table (b) displays the contingency table based on those portfolios
and the values of τ_b and S. (S is unweighted similarity. S_w is similarity weighted by capability
share.)
Figure 9 (a) displays the alliance portfolios of Germany and Russia in 1914. The leftmost column
lists the states identified by the Correlates of War as members of the European region in 1914, plus
Turkey. The second column indicates each state's material capabilities measured as a proportion
of the sum of all states' capabilities. This information is used in the weighting of S_w as described
in section 4. The remaining two columns display Germany's and Russia's alliance portfolios across
the European region. Figure 9 (b) shows the contingency table generated by plotting Germany's
alliance portfolio against Russia's, along with the τ_b, S, and S_w indicators of the similarity of those
alliance portfolios.
The value of τ_b is easy to understand in light of our earlier discussions. There is little linearity
to the ordinal ranking of the elements in the table in Figure 9, so that the value of τ_b approaches
0. This is clearly misleading, in that the elements are not distributed randomly through the cells
of the table. S does a better job of capturing the similarity of the portfolios, insofar as Germany
and Russia agree precisely on the type of alliance commitment they share with thirteen of their
twenty potential partners, and they very nearly agree on their relationship with a fourteenth.[13]
Nevertheless, the thirteen states they agree upon are not nearly as strategically important as the
handful of states over which they disagree. S_w = -.5 because the four entries in the lower-left corner
of the contingency table represent Germany, Austria, Rumania, and Italy, while the two entries in
the upper-right corner of the table represent Russia and France. Thus, where S is positive because
Germany and Russia appear to agree about their alliance commitments to fourteen of the twenty
states, S_w is solidly negative because it reflects the fact that Germany and Russia strongly disagree
about their alliance commitments to states representing approximately 58% of the total material
capabilities of Europe.
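The Germany-Russia scores can be checked directly from the portfolios in Figure 9 (a). The sketch below is a minimal illustration: the rows are transcribed from that figure, the capability shares are the two-decimal values printed there (so the weighted score can only be reproduced approximately), and the function name `similarity` is ours.

```python
def similarity(a_i, a_j, weights=None, max_score=3, min_score=0):
    """S = 1 - 2 * d / d_max, using the alliance ranks themselves as scores."""
    if weights is None:
        weights = [1.0] * len(a_i)
    d = sum(w * abs(x - y) for w, x, y in zip(weights, a_i, a_j))
    d_max = sum(w * (max_score - min_score) for w in weights)
    return 1.0 - 2.0 * d / d_max


# (state, capability share, Germany's commitment, Russia's commitment), 1914
rows = [
    ("GMY", .25, 3, 0), ("RUS", .21, 0, 3), ("UK", .20, 0, 1), ("FRN", .08, 0, 3),
    ("AUH", .08, 3, 0), ("ITA", .05, 3, 1), ("BEL", .03, 0, 0), ("SPN", .03, 0, 0),
    ("TUR", .02, 0, 0), ("NTH", .01, 0, 0), ("SWD", .01, 0, 0), ("RUM", .01, 3, 0),
    ("POR", .01, 0, 0), ("SWZ", .01, 0, 0), ("GRC", .00, 0, 0), ("DEN", .00, 0, 0),
    ("YUG", .00, 0, 0), ("BUL", .00, 0, 0), ("NOR", .00, 0, 0), ("ALB", .00, 0, 0),
]
cap = [r[1] for r in rows]
gmy = [r[2] for r in rows]
rus = [r[3] for r in rows]

s_unweighted = similarity(gmy, rus)            # d = 18, d_max = 60
s_weighted = similarity(gmy, rus, weights=cap) # d = 2.19, d_max = 3.00
print(round(s_unweighted, 2), round(s_weighted, 2))
```

With these inputs the unweighted score is .4, as reported, and the weighted score is roughly -.46, which agrees with the reported -.5 up to the rounding of the capability shares.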
Figure 10 displays the alliance portfolios and similarity scores of the U.K. and France in 1921.
France has mutual defense pacts with itself, Poland, and Belgium, while Britain has no alliance
with those states. Britain, on the other hand, has mutual defense pacts with itself and with
Portugal. Neither Britain nor France has any formal alliance with the other twenty-four states
that qualify as members of the European region in 1921.
τ_b takes on a slightly negative value because Britain and France do not share each other's
alliances with Poland, Belgium, and Portugal. In this case, as in the previous example, S is
positive because Britain and France share exactly the same type of alliance commitment with
twenty-four states out of the twenty-nine. S_w also indicates that these alliance portfolios are
very similar, because the twenty-four states with which Britain and France share identical alliance
relationships account for 89% of the total material capabilities of Europe.
[13] It may seem counterintuitive to treat "no-alliance" relationships as points of agreement, but as
we discuss below, this is a logical consequence of treating the four alliance types as ordinal
categories and not a problem attributable to the similarity measure.

(a) System portfolios

          System    Portfolios
Nation    Cap       UK    FRN
USA       .25        0     0
RUS       .22        0     0
GMY       .12        0     0
UK        .10        3     0
FRN       .07        0     3
ITA       .04        0     0
POL       .03        0     3
SPN       .02        0     0
CZE       .02        0     0
RUM       .02        0     0
BEL       .01        0     3
TUR       .01        0     0
AUS       .01        0     0
YUG       .01        0     0
HUN       .01        0     0
NTH       .01        0     0
SWD       .01        0     0
GRC       .01        0     0
POR       .01        3     0
LUX       .00        0     0
DEN       .00        0     0
SWZ       .00        0     0
BUL       .00        0     0
FIN       .00        0     0
LIT       .00        0     0
NOR       .00        0     0
LAT       .00        0     0
EST       .00        0     0
ALB       .00        0     0

(b) Contingency table

                 FRN
            0    1    2    3
UK    0    24    0    0    3
      1     0    0    0    0
      2     0    0    0    0
      3     2    0    0    0

τ_b = -.09    S = .66    S_w = .62

Figure 10: Comparison of alliance similarity for Great Britain and France, 1921. Table (a) shows
Great Britain and France's alliance portfolios with all the European nations in 1921, along with the
nations' proportion of system capabilities. Table (b) displays the contingency table based on those
portfolios and the values of τ_b and S. (S is unweighted similarity. S_w is similarity weighted by
capability share.)

(a) System portfolios

          System    Portfolios
Nation    Cap       USA   RUS
USA       .46        3     0
RUS       .17        0     3
UK        .14        0     3
FRN       .04        0     3
ITA       .03        0     0
POL       .02        0     3
SPN       .02        0     0
CZE       .01        0     3
BEL       .01        0     0
NTH       .01        0     0
TUR       .01        0     0
SWD       .01        0     0
YUG       .01        0     3
RUM       .01        0     0
HUN       .01        0     0
POR       .00        0     0
SWZ       .00        0     0
GRC       .00        0     0
DEN       .00        0     0
BUL       .00        0     0
NOR       .00        0     0
LUX       .00        0     0
FIN       .00        0     0
IRE       .00        0     0
ALB       .00        0     0
ICE       .00        0     0

(b) Contingency table

                 RUS
            0    1    2    3
USA   0    19    0    0    6
      1     0    0    0    0
      2     0    0    0    0
      3     1    0    0    0

τ_b = -.11    S = .46    S_w = -.74

Figure 11: Comparison of alliance similarity for the United States and Russia, 1946. Table (a)
shows the United States' and Russia's alliance portfolios with all the European nations in 1946,
along with the nations' proportion of system capabilities. Table (b) displays the contingency table
based on those portfolios and the values of τ_b and S. (S is unweighted similarity. S_w is similarity
weighted by capability share.)

Finally, Figure 11 displays the same information for the United States and the Soviet Union in
1946. Interestingly, according to the COW Alliances data set, the U.S. had no formal alliances with
any of its wartime allies in 1946, but the U.S.S.R. had mutual defense pacts with Britain, France,
Czechoslovakia, Poland, and Yugoslavia. The pattern of formal alliances in this particular year is
probably a poor indicator of states' actual loyalties; it is wise to remember that any measure of
the similarity of formal alliance commitments is destined to overlook the effects of informal, tacit
alliance commitments, as we discuss below.
Because so many of the elements are grouped in one cell of the contingency table, τ_b once again
finds very limited evidence of linear order to the rankings of the elements. τ_b is only very slightly
more negative than it was when comparing the U.K. and France in 1921. S is once again solidly
positive. By treating all states as equally important, S appears likely to generate positive scores as
a general rule, since the majority of alliance relationships in any given year consist of "no alliance."
S_w, however, shifts the emphasis from the number of states with which the U.S. and the U.S.S.R.
have identical alliance commitments to the share of the total material resources with which they
have similar alliance commitments. S_w thus appears strongly negative, as we would intuitively
expect for these two states in 1946.
In summary, S does a consistently better job than τ_b at measuring the extent to which alliance
portfolios are similar, and S_w does a consistently better job than either S or τ_b at indicating the
extent to which states agree over the allotment of security resources, which is its intended purpose.
τ_b measures neither of these characteristics well.
5.2 S, τ_b, and Attitudes Toward Risk
It is increasingly common for researchers to use a measure of alliance portfolio similarity as an
indicator of the extent to which the states have common or conflicting interests. This is then used
as the operationalization of the "utilities" states expect to receive by making demands on each
other. More recently, Bruce Bueno de Mesquita (1985) and James Morrow (1986) have suggested
how the similarity of alliance commitments can be used to measure how willing a state is to take
risks in an uncertain environment. In this section we examine how the significant differences in the
values of S_w and τ_b affect measures of states' risk-taking propensities.
The approach to measuring a state's risk propensity is given in Bueno de Mesquita (1985:157).
Define nation i's "security level" as $\sum_{j \neq i} E(U_{ji})$, where $E(U_{ji})$ is the expected utility each state j
expects to receive from conflict with i. The greater this sum, the more utility i believes its adversaries
expect to derive from challenging it. These utilities, in turn, depend on how similar state i's alliance
portfolio is with each state j. Formally, following Bueno de Mesquita (1985) and Bueno de Mesquita
(1981a):

\[
\sum_{j \neq i} E(U_{ji}) = \sum_{j \neq i} \left[ P_j (1 - U_{ji}) + (1 - P_j)(U_{ji} - 1)
  + \sum_{k \neq i,j} \frac{c_k (U_{kj} - U_{ki})}{c_i + c_j + c_k} \right] \tag{7}
\]

where $U_{ji}$ is the value of the τ_b or S_w measure of the similarity of j's and i's alliance portfolios, $c_i$ is
i's material capabilities as a share of the total capabilities of all the states, and $P_j$ is j's probability
of beating i in a bilateral war, operationalized as $P_j = c_j / (c_i + c_j)$ (Bueno de Mesquita 1981:58; Bueno
de Mesquita, Newman, & Rabushka 1985:46,50).
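The security level in equation (7) is a double sum over adversaries j and third parties k, which is straightforward to compute once a matrix of pairwise similarity scores is available. The sketch below follows the equation as we read it here. The function name `security_level`, the indexing convention `U[j][i]` for the similarity of j's and i's portfolios, and the toy inputs are assumptions for illustration.

```python
def security_level(i, U, c):
    """Sum over j != i of E(U_ji), following equation (7).

    U[j][i] is the similarity (tau-b or S_w) of j's and i's portfolios;
    c[k] is state k's share of total system capabilities.
    """
    n = len(c)
    total = 0.0
    for j in range(n):
        if j == i:
            continue
        p_j = c[j] / (c[i] + c[j])            # j's chance of beating i bilaterally
        bilateral = p_j * (1 - U[j][i]) + (1 - p_j) * (U[j][i] - 1)
        third_party = sum(
            c[k] * (U[k][j] - U[k][i]) / (c[i] + c[j] + c[k])
            for k in range(n)
            if k != i and k != j
        )
        total += bilateral + third_party
    return total


# Toy two-state system: j is three times as capable as i and their
# portfolios are antithetical (U = -1), so j expects a net gain of 1.
c_two = [0.25, 0.75]
U_two = [[1.0, -1.0], [-1.0, 1.0]]
level_two = security_level(0, U_two, c_two)

# Toy three-state system with identical portfolios: nobody expects to gain.
c_three = [0.5, 0.3, 0.2]
U_three = [[1.0] * 3 for _ in range(3)]
level_three = security_level(0, U_three, c_three)
```

A larger value means i's adversaries expect more from challenging it, so i is less secure. In the second toy system every term vanishes because all portfolios are identical.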
Holding the alliance portfolios of all the other states in the system constant, i's security level
will vary as it alters its alliance commitments. We use Signorino & Ritter's (1997) genetic algorithm
method to identify the hypothetical alliance commitments that i could adopt that would minimize
and maximize Σ_{j≠i} E(U_ji). i's maximum security level would be attained if i were to adopt the profile of
alliance commitments that minimizes the sum of j's expected gains from conflict with i. Similarly,
i's minimum security level would be attained if i were to adopt the profile of alliance commitments
that maximizes the sum of j's expected gains from conflict with i.
We can then measure i's willingness to take risks by comparing the actual security level provided
by its observed alliance portfolio to the hypothetical minimum and maximum security levels i could
have achieved with some alternate alliance portfolio, according to Bueno de Mesquita (1985):

\[
R_i = \frac{2 \sum_{j \neq i} E(U_{ji}) - \sum_{j \neq i} E(U_{ji})_{\max} - \sum_{j \neq i} E(U_{ji})_{\min}}
           {\sum_{j \neq i} E(U_{ji})_{\max} - \sum_{j \neq i} E(U_{ji})_{\min}} \tag{8}
\]

which yields values on the interval [-1, 1], where -1 represents complete risk aversion, 0 represents
risk neutrality, and 1 represents complete risk acceptance.
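Equation (8) is a simple rescaling of the observed security sum onto [-1, 1]. A minimal sketch, assuming the three sums have already been computed (the function name `risk_score` and the numerical inputs are hypothetical):

```python
def risk_score(actual, e_max, e_min):
    """R_i from equation (8), given the three sums of E(U_ji).

    actual: the sum under i's observed alliance portfolio
    e_max:  the largest sum any portfolio yields (i least secure)
    e_min:  the smallest sum any portfolio yields (i most secure)
    """
    if e_max == e_min:
        raise ValueError("max and min security sums coincide; R_i is undefined")
    return (2 * actual - e_max - e_min) / (e_max - e_min)


# Hypothetical sums: the feasible range runs from -2.0 to 3.0.
r_averse = risk_score(-2.0, 3.0, -2.0)     # observed portfolio is the safest one
r_neutral = risk_score(0.5, 3.0, -2.0)     # observed portfolio sits at the midpoint
r_acceptant = risk_score(3.0, 3.0, -2.0)   # observed portfolio is the riskiest one
```

A state whose observed portfolio coincides with its security-maximizing portfolio therefore scores -1, which is the situation described for Britain in 1950.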
We offer two examples of how S_w alters the measure of states' risk propensities obtained using
τ_b. Figure 12 displays the information used to calculate the risk attitude of Britain in 1950.
The first column of Figure 12 lists the states that qualify as members of the European region in
1950, including the U.S. and Turkey. The second column displays each state's share of the total
material capabilities in the system, which reveals how the states are weighted in S_w. The next four
columns display the alliance portfolios of the four most powerful states in Europe, including Britain's
actual alliance commitments. The remaining columns show the hypothetical alliance portfolios that
Britain would adopt to minimize or maximize its security depending on whether the similarity of
alliance portfolios is measured by τ_b or S_w. The "Max" portfolios maximize Σ_{j≠i} E(U_ji), which
is to say they minimize Britain's security. The "Min" portfolios minimize Σ_{j≠i} E(U_ji), maximizing
Britain's security. Figure 12 shows Britain's attitude toward risk as being quite risk-averse according
to either of the τ_b- or S_w-based measures, with the S_w-based measure indicating complete risk
aversion. Britain's S_w-based Min portfolio shows that Britain could maximize its security not
by allying with all the most powerful countries, but rather by allying with a subset of the major
powers: those in a particular bloc. That is exactly what Britain has done, so R_i = -1, indicating
complete risk aversion.
          Share of
          System         Actual Portfolios       UK τ_b Portfolios   UK S_w Portfolios
Nation    Capabilities   USA   UK   FRN   RUS    Max   Min           Max   Min
USA       0.41            3    3     3     0      1     2             0     3
RUS       0.24            0    0     0     3      1     2             3     0
UK        0.10            3    3     3     0      3     3             3     3
FRN       0.05            3    3     3     0      1     2             0     3
ITA       0.04            3    3     3     0      1     2             0     3
POL       0.03            0    0     0     3      1     2             3     0
SPN       0.02            0    0     0     0      3     1             3     0
CZE       0.02            0    0     0     3      1     2             3     0
BEL       0.01            3    3     3     0      1     2             0     3
YUG       0.01            0    0     0     0      3     1             3     0
NTH       0.01            3    3     3     0      1     2             0     3
TUR       0.01            0    0     0     0      3     1             3     0
RUM       0.01            0    0     0     3      1     2             3     0
SWD       0.01            0    0     0     0      3     1             3     0
HUN       0.01            0    0     0     3      3     1             3     0
POR       0.00            3    3     3     0      1     2             0     3
GRC       0.00            0    0     0     0      3     1             3     0
SWZ       0.00            0    0     0     0      3     1             3     0
DEN       0.00            3    3     3     0      1     2             0     3
BUL       0.00            0    0     0     3      1     2             3     0
NOR       0.00            3    3     3     0      1     2             0     3
LUX       0.00            3    3     3     0      1     2             0     3
FIN       0.00            0    0     0     3      1     2             3     0
IRE       0.00            0    0     0     0      3     1             3     0
ALB       0.00            0    0     0     0      3     1             3     0
ICE       0.00            3    3     3     0      1     2             0     3

R_i (τ_b-based) = -.7    R_i (S_w-based) = -1

Figure 12: Alliance portfolios of European major powers and United States, 1950. The table shows
the actual alliance portfolios of the European major powers and the United States over the European
state system, the τ_b-based max and min portfolios for Great Britain (corresponding to the Σ E(U_ji)
terms in equation 8), the S_w-based max and min portfolios, and the associated risk scores. There is
some divergence between the τ_b and S_w risk scores for Great Britain (-.7 versus -1, respectively).
Comparing the τ_b and S_w max and min portfolios, the S_w portfolios seem to be more reasonable.

Figure 13 presents the same type of information for Italy in 1926. In this case, however, the
τ_b and S_w measures of Italy's risk attitude diverge substantially. The major powers had very few
alliances at all in 1926, and even fewer with each other. As a result, S_w implies that Italy can
minimize its exposure to risk by holding no alliances with any other state, thereby guaranteeing
that its alliance portfolio will be nearly identical to the portfolios of all the other states in the
system, with almost all the cell entries in each dyadic portfolio comparison table falling into the
(0,0) cell. Substantively, this is a rather odd result, but it makes perfect sense given the assumption
that (0,0) represents an ordered category such that when two states i and j have no alliance with
state k, their portfolios are similar in exactly the same sense as if they both held mutual defense
pacts with k. Given that few states have alliances with anyone, it seems peculiar that the τ_b results
imply that Italy's alliance portfolio would be "most similar" to the alliance portfolios of the other
states if Italy were to adopt some formal alliance commitment with every other state in the system.

          Share of
          System               Actual Portfolios              ITA τ_b Portfolios   ITA S_w Portfolios
Nation    Capabilities   USA   UK   FRN   GMY   ITA   RUS    Max   Min            Max   Min
USA       .30             3    0     0     0     0     0      0     3              3     0
RUS       .16             0    0     0     2     0     3      0     3              3     0
UK        .10             0    3     0     0     0     0      1     2              3     0
GMY       .10             0    0     0     3     0     2      0     3              3     0
FRN       .08             0    0     3     0     0     0      2     2              3     0
ITA       .05             0    0     0     0     3     0      3     3              3     3
POL       .03             0    0     3     0     0     0      2     2              3     0
SPN       .03             0    0     0     0     0     0      2     1              3     0
CZE       .02             0    0     3     0     1     0      2     2              3     0
BEL       .02             0    0     3     0     0     0      2     2              3     0
RUM       .01             0    0     2     0     1     0      2     2              3     0
TUR       .01             0    0     0     0     0     2      1     2              3     0
YUG       .01             0    0     0     0     2     0      2     1              3     0
NTH       .01             0    0     0     0     0     0      2     1              3     0
AUS       .01             0    0     0     0     0     0      2     1              3     0
GRC       .01             0    0     0     0     0     0      2     1              3     0
HUN       .01             0    0     0     0     0     0      2     1              3     0
SWD       .01             0    0     0     0     0     0      2     1              3     0
POR       .01             0    3     0     0     0     0      1     2              3     0
LUX       .00             0    0     0     0     0     0      2     1              3     0
DEN       .00             0    0     0     0     0     0      2     1              3     0
FIN       .00             0    0     0     0     0     0      2     1              3     0
SWZ       .00             0    0     0     0     0     0      2     1              3     0
BUL       .00             0    0     0     0     0     0      2     1              3     0
IRE       .00             0    0     0     0     0     0      2     1              3     0
NOR       .00             0    0     0     0     0     0      2     1              3     0
LAT       .00             0    0     0     0     0     0      2     1              3     0
LIT       .00             0    0     0     0     0     2      1     2              3     0
EST       .00             0    0     0     0     0     0      2     1              3     0
ALB       .00             0    0     0     0     1     0      2     1              3     0

R_i (τ_b-based) = .16    R_i (S_w-based) = -.94

Figure 13: Alliance portfolios of European major powers and United States, 1926. The table shows
the actual alliance portfolios of the European major powers and the United States over the European
state system, the τ_b-based max and min portfolios for Italy (corresponding to the Σ E(U_ji) terms in
equation 8), the S_w-based max and min portfolios, and the associated risk scores. There is extreme
divergence between the τ_b and S_w risk scores for Italy, with the τ_b-based score (.16) indicating slight
risk acceptance and the S_w-based score (-.94) indicating almost complete risk aversion. Comparing
the τ_b and S_w max and min portfolios, the S_w portfolios seem to be more appropriate, given the
ordinality assumption imposed on the alliance data.
6 From Alliances to Similarity of Interests
Up to this point we have focused solely on the technical issue of measuring the similarity of two
vectors of alliance data. As we mentioned at the beginning of this paper, in practice the similarity
of two states' alliance portfolios is generally used to make inferences about the similarity of their
interests. That forces us to ask the next logical question: Even if we are able to measure similarity
of alliance portfolios perfectly, is that a good indicator of similarity of interests? We have our
doubts.
One of the main problems with using the current alliance data as an indicator of similarity
of interests centers on the 0 = "no alliance" category of the data and two problems it poses.
First, two alliance portfolios may be very "similar" but composed almost entirely of zeros, e.g.,
two nations with no alliances with anyone. This is nicely illustrated by the similarity of Great
Britain and France's 1921 alliance portfolios in Figure 10. Neither has an alliance with the other
and both have almost no alliances with anyone else. Because of the predominance of zeros in their
portfolios, Su reflects a fair amount of similarity. Because neither had alliances with any of the
stronger nations, Sw also reflects high similarity. Giving their portfolios the ocular test, they do
look quite similar. Yet, if we were to ask whether their portfolios reflect similarity of interest, we
could say "yes" in the negative sense, if we meant they were interested in remaining free of any
alliances. However, in the positive sense, we would have to say "no": from the alliance data they
appear to share no joint interest in each other or common positive goals. Much the same could be
said concerning Italy's 1926 Sw-based E(Uji)-minimizing portfolio in Figure 13. The portfolio that
minimizes the expected utility others would derive from attacking Italy is not a portfolio where
Italy allies with them all, but a portfolio that is most similar to theirs: one with almost all zeros. In
this case too, similarity of portfolios does not necessarily imply similarity of interests in a positive
sense. Not all cases are like these. Great Britain's 1950 Sw-based E(Uji)-minimizing portfolio in
Figure 12 is one based mostly on similarity with the United States and France's portfolios in a
positive sense, i.e., based on defense pacts with those countries.
An important issue this raises is the effect of irrelevant states in the comparison of alliance
portfolios. States that are irrelevant to the referent state's decisionmaking will tend to show up as
zeros. The more such irrelevant states the portfolios contain, the more the similarity scores shift
upward toward one. While weighting by capabilities can help alleviate this, the real solution is
appropriate specification of the domain of the portfolios. If particular states are not relevant to
decisionmaking in some region, they should not be included.
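The upward shift can be illustrated with a minimal sketch. It assumes that S takes the form 1 - 2d/d_max, where d is a weighted sum of absolute differences in alliance levels, each scaled by the maximum possible difference; the function name `similarity_s`, the 0-to-3 coding, and all data values are hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence


def similarity_s(
    p_i: Sequence[float],
    p_j: Sequence[float],
    weights: Optional[Sequence[float]] = None,
    max_distance: float = 3.0,
) -> float:
    """Assumed form of S: 1 - 2 * d / d_max, with d a weighted absolute distance."""
    if len(p_i) != len(p_j):
        raise ValueError("portfolios must cover the same states")
    if weights is None:
        weights = [1.0] * len(p_i)  # unweighted case
    if len(weights) != len(p_i):
        raise ValueError("one weight is needed per state")
    # Distance between the two portfolios, each term scaled to lie in [0, w].
    d = sum(w * abs(a - b) / max_distance for w, a, b in zip(weights, p_i, p_j))
    d_max = sum(weights)  # largest attainable distance under these weights
    return 1.0 - 2.0 * d / d_max


# Hypothetical coding: 0 = no alliance, 3 = defense pact.
core_i = [3, 0, 3, 0]  # state i's ties to four relevant states
core_j = [0, 3, 0, 3]  # state j holds exactly the opposite ties
n_irrelevant = 12      # states with which neither i nor j has any alliance

s_core = similarity_s(core_i, core_j)

padded_i = core_i + [0] * n_irrelevant
padded_j = core_j + [0] * n_irrelevant
s_padded = similarity_s(padded_i, padded_j)

# Hypothetical capability-style weights: irrelevant states count for little.
cap_weights = [1.0] * len(core_i) + [0.05] * n_irrelevant
s_weighted = similarity_s(padded_i, padded_j, cap_weights)

print(s_core, s_padded, round(s_weighted, 3))
```

With these made-up portfolios, the four relevant states alone give S = -1, while padding with twelve jointly unallied states raises the unweighted score to 0.5. Down-weighting the irrelevant states pulls the score back toward -1 but does not remove the distortion, which is consistent with the point that the domain itself must be specified appropriately.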
The second problem the "no alliance" category poses concerns the ordinality assumption. The
alliance categories are generally assumed to be ordinal from the lowest level of commitment, "no
alliance," to "entente" to "neutrality pact" to the highest level of commitment, "defense pact."
However, for any given portfolio, the states in the "no alliance" category fall along a continuum of
varying degrees of commitment towards the referent country. For the sake of argument, we divide these
into three categories: those who have no alliance with the referent country because of hostility,
those who have no alliance because of indifference, and those who have no alliance because of an
implicit alignment with that country. While the first two types of nations can be grouped in the
"no alliance" category without destroying the ordinality of the data, those that have an implicit
alignment with the referent country clearly have a higher level of commitment, perhaps as high
as a defense pact.
An example of this can be seen in Figure 9. There, the similarity of 1914 Germany and Russia's
portfolios is Su = .4 and Sw = -.5: in the former case because they jointly have no alliances with
the majority of states in the system and in the latter case because they have dissimilar alliance
levels with the strongest states in the system. What is not shown in the alliance data is that Russia
was the informal protector of the Balkan Slavs. Alignments existed between Russia and Yugoslavia,
Rumania, and Bulgaria. However, no formal treaties existed, so the three Balkan countries show
up in the (0,0) cell of the contingency table, inflating the values of all the similarity measures, given
the joint ranking of the other alliances. Other examples like this easily come to mind: pre-World
War II US and Great Britain, current US and Israel, and current US and Taiwan.
This problem can be alleviated in a manner that should be undertaken regardless of the ordinality issue: more (different) data should be brought to bear in determining similarity of interests.
Data exist on UN votes, diplomatic missions, trade, and disputes. Such data could (1) fill in the
alliance "gaps" of states that are aligned with the referent state but do not have a formal treaty
and (2) bring a richer array of information with which to determine the interests of states.
It turns out that the measure of similarity S we develop here is well suited for such multidimensional data. Simply code the UN votes, diplomatic missions, trade amounts, and disputes into
their own issue portfolios similar to the alliance data, create a stacked vector of these issue portfolios,
provide intra- and inter-issue weights, and then use S to calculate the similarity of any two
multi-issue stacked portfolios. For the many reasons just cited, we suspect this will provide a much
better method of arriving at similarity of interests than the current method based on Kendall's τ_b
and alliance ties.
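The stacking procedure can be sketched as follows. This is one possible reading, not a definitive implementation: it assumes S has the form 1 - 2d/d_max, that intra-issue weights are normalized within each issue, and that each normalized weight is then multiplied by the issue's inter-issue weight. The class `IssuePortfolio`, the function `stacked_similarity`, the coding schemes, and all data values are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class IssuePortfolio:
    """One issue dimension for a pair of states i and j (hypothetical layout)."""
    name: str
    scores_i: List[float]       # state i's coded scores toward each third party
    scores_j: List[float]       # state j's coded scores toward each third party
    intra_weights: List[float]  # relative importance of entries within the issue
    max_distance: float         # largest possible |score_i - score_j| on this issue
    issue_weight: float         # inter-issue weight


def stacked_similarity(issues: Sequence[IssuePortfolio]) -> float:
    """Assumed S = 1 - 2 * d / d_max computed over the stacked issue portfolios."""
    d = 0.0
    d_max = 0.0
    for issue in issues:
        n = len(issue.scores_i)
        if len(issue.scores_j) != n or len(issue.intra_weights) != n:
            raise ValueError(f"length mismatch in issue {issue.name!r}")
        total = sum(issue.intra_weights)
        for a, b, w in zip(issue.scores_i, issue.scores_j, issue.intra_weights):
            w_k = issue.issue_weight * w / total  # combined stacked weight
            d += w_k * abs(a - b) / issue.max_distance
            d_max += w_k
    return 1.0 - 2.0 * d / d_max


# Hypothetical alliance issue: 0 = no alliance, 3 = defense pact.
alliances = IssuePortfolio(
    name="alliances",
    scores_i=[0, 0, 3], scores_j=[0, 0, 0],
    intra_weights=[1.0, 1.0, 1.0], max_distance=3.0, issue_weight=0.5,
)
# Hypothetical UN vote issue: 1 = yes, 0 = abstain, -1 = no, one entry per vote.
un_votes = IssuePortfolio(
    name="un_votes",
    scores_i=[1, 1, -1, 1], scores_j=[1, 1, -1, 1],
    intra_weights=[1.0, 1.0, 1.0, 1.0], max_distance=2.0, issue_weight=0.5,
)

s_alliances_only = stacked_similarity([alliances])
s_stacked = stacked_similarity([alliances, un_votes])
print(round(s_alliances_only, 3), round(s_stacked, 3))
```

In this toy case the two states differ on one of three alliance entries but vote identically, so adding the vote dimension raises the score above the alliance-only value. How the inter-issue weights should be chosen is left open here, as it is in the text.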
7 Concluding Remarks
We have made two claims. First, contrary to twenty years' practice, Kendall's τ_b should not be used
as an indicator of the similarity of states' alliance portfolios. τ_b measures the linear association
of two alliance portfolio rankings, which is not at all the same as measuring the similarity of
their alliance portfolios. We have demonstrated through hypothetical and empirical examples that
τ_b is inappropriate for this task. Rather than simply identifying these problems, we have also
developed an alternative measure of alliance portfolio similarity, S, which avoids many of the
pitfalls associated with τ_b, and we have employed data on alliances among European states to
compare the effects of S versus τ_b in measures of utility and risk propensity.
Secondly, we have also claimed that inferring state interest from alliance commitments is problematic, even given a perfect measure of portfolio similarity. However, we believe our method in
combination with alternative sources of data (e.g., on UN votes, diplomatic missions, trade, and
disputes) can provide leverage on the difficult problem of estimating states' similarity of interests.
Finally, we certainly do not mean to suggest that the inappropriate use of Kendall's τ_b renders
decades of published work worthless. Quite the contrary, we feel the theoretical sophistication of
many of the studies using τ_b has advanced far beyond the point where τ_b might be accepted as
"good enough." We hope that our efforts here will contribute to the improvement of several very
valuable research programs.
References
[1] Altfeld, Michael F. 1984. "The Decision to Ally: A Theory and Test." Western Political Quarterly 37(4): 523–544.
[2] Altfeld, Michael F., and Bruce Bueno de Mesquita. 1979. "Choosing Sides in Wars." International Studies Quarterly 23(2): 87–112.
[3] Berkowitz, Bruce D. 1983. "Realignment in International Treaty Organizations." International Studies Quarterly 27(1): 77–96.
[4] Berry, Kenneth J., and Paul W. Mielke, Jr. 1988. "A Generalization of Cohen's Kappa Agreement Measure to Interval Measurement and Multiple Raters." Educational and Psychological Measurement 48(4): 921–33.
[5] Berry, Kenneth J., and Paul W. Mielke, Jr. 1990. "A Generalized Agreement Measure." Educational and Psychological Measurement 50(1): 123–5.
[6] Bishop, Yvonne M. M., Stephen E. Fienberg, and Paul W. Holland. 1975. Discrete Multivariate Analysis: Theory and Practice. Cambridge, MA: MIT Press.
[7] Bueno de Mesquita, Bruce. 1975. "Measuring Systemic Polarity." Journal of Conflict Resolution 19(2): 187–216.
[8] Bueno de Mesquita, Bruce. 1978. "Systemic Polarization and the Occurrence and Duration of War." Journal of Conflict Resolution 22(2): 241–267.
[9] Bueno de Mesquita, Bruce. 1980. "An Expected Utility Theory of International Conflict." American Political Science Review 74(4): 917–931.
[10] Bueno de Mesquita, Bruce. 1981a. The War Trap. New Haven: Yale University Press.
[11] Bueno de Mesquita, Bruce. 1981b. "Risk, Power Distributions, and the Likelihood of War." International Studies Quarterly 25(4): 541–568.
[12] Bueno de Mesquita, Bruce. 1985. "The War Trap Revisited: A Revised Expected Utility Model." American Political Science Review 79(1): 156–76.
[13] Bueno de Mesquita, Bruce, and David Lalman. 1986. "Reason and War." American Political Science Review 80: 113–131.
[14] Bueno de Mesquita, Bruce, and David Lalman. 1988. "Empirical Support for Systemic and Dyadic Explanations of International Conflict." World Politics 41(1): 1–20.
[15] Bueno de Mesquita, Bruce, and David Lalman. 1992. War and Reason: Domestic and International Imperatives. New Haven, CT: Yale University Press.
[16] Bueno de Mesquita, Bruce, David Newman, and Alvin Rabushka. 1984. Forecasting Political Events: Hong Kong's Future. New Haven, CT: Yale University Press.
[17] Cohen, Jacob. 1960. "A Coefficient of Agreement for Nominal Scales." Educational and Psychological Measurement 20: 37–46.
[18] Cohen, Jacob. 1968. "Weighted Kappa: Nominal Scale Agreement with Provision for Scaled Disagreement or Partial Credit." Psychological Bulletin 70(4): 213–20.
[19] Davies, Mark, and Joseph L. Fleiss. 1982. "Measuring Agreement for Multinomial Data." Biometrics 38: 1047–51.
[20] Huth, Paul, D. Scott Bennett, and Christopher Gelpi. 1993. "System Uncertainty, Risk Propensity, and International Conflict Among the Great Powers." Journal of Conflict Resolution 36(3): 478–517.
[21] Iusi-Scarborough, Grace. 1988. "Polarity, Power and Risk in International Disputes." Journal of Conflict Resolution 32(3): 511–533.
[22] Kendall, Maurice G., and Alan Stuart. 1961. The Advanced Theory of Statistics, Vol. 2. London: Charles Griffin and Company Limited.
[23] Kendall, Maurice G., and Jean Dickinson Gibbons. 1990. Rank Correlation Methods. 5th Edition. London: Edward Arnold.
[24] Kim, Chae-Han. 1991. "Third-Party Participation in Wars." Journal of Conflict Resolution 35(4): 659–677.
[25] Kim, Woosang. 1989. "Power, Alliance, and Major Wars, 1816–1975." Journal of Conflict Resolution 33(2): 255–273.
[26] Kim, Woosang. 1991. "Alliance Transitions and Great Power War." American Journal of Political Science 35(4): 833–50.
[27] Kim, Woosang, and James D. Morrow. 1992. "When Do Power Shifts Lead to War?" American Journal of Political Science 36(4): 896–922.
[28] King, Gary. 1989. Unifying Political Methodology: The Likelihood Theory of Statistical Inference. New York: Cambridge University Press.
[29] Kotz, Samuel, and Norman L. Johnson. 1988. Encyclopedia of Statistical Sciences. New York: Wiley.
[30] Lalman, David. 1988. "Conflict Resolution and Peace." American Journal of Political Science 32: 590–613.
[31] Lalman, David, and David Newman. 1991. "Alliance Formation and National Security." International Interactions 16(4): 239–253.
[32] Levy, Jack S. 1981. "Alliance Formation and War Behavior." Journal of Conflict Resolution 25: 581–614.
[33] Liebetrau, Albert M. 1983. Measures of Association. Newbury Park: Sage.
[34] Majeski, Stephen J., and David J. Sylvan. 1984. "Simple Choices and Complex Calculations: A Critique of The War Trap." Journal of Conflict Resolution 28(2): 316–340.
[35] Morrow, James D. 1987. "On The Theoretical Basis of a Measure of National Risk Attitudes." International Studies Quarterly 31(3): 423–438.
[36] O'Connell, Dianne L., and Annette J. Dobson. 1984. "General Observer-Agreement Measures on Individual Subjects and Groups of Subjects." Biometrics 40: 973–983.
[37] Organski, A.F.K., and Jacek Kugler. 1980. The War Ledger. Chicago, IL: University of Chicago Press.
[38] Ostrom, Charles W., Jr., and John H. Aldrich. 1978. "The Relationship Between Size and Stability in the Major Power International System." American Journal of Political Science 22(4): 743–771.
[39] Schouten, H. J. A. 1982. "Measuring Pairwise Interobserver Agreement when All Subjects Are Judged by the Same Observers." Statistica Neerlandica 36(2): 45–61.
[40] Signorino, Curtis S., and Jeffrey M. Ritter. 1997 (in progress). Calculating State Risk Propensities: A Genetic Algorithm Method for a Combinatorial Optimization Problem. Mimeo. Harvard University.
[41] Stoll, Richard J. 1984. "Bloc Concentration and the Balance of Power." Journal of Conflict Resolution 28(1): 25–50.
[42] Stoll, Richard J., and Michael Champion. 1985. "Capability Concentration, Alliance Bonding, and Conflict Among the Major Powers." In Alan Ned Sabrosky (ed.), Polarity and War: The Changing Structure of International Conflict. Boulder, CO: Westview Press.
[43] Wallace, Michael. 1985. "Polarization: Towards a Scientific Conception." In Alan Ned Sabrosky (ed.), Polarity and War: The Changing Structure of International Conflict. Boulder, CO: Westview Press.