Tau-b or Not Tau-b: Measuring Alliance Portfolio Similarity

Curtis S. Signorino and Jeffrey M. Ritter [1]

Work in Progress. Comments Welcome.

April 2, 1997

[1] Littauer Center North Yard, Harvard University. Email: [email protected] and [email protected]. Prepared for the 1997 annual meeting of the Midwest Political Science Association. Although they may not agree with all our conclusions, we would like to thank Bruce Bueno de Mesquita and Scott Bennett for providing us with data and with many helpful discussions, as well as Chris Gelpi and the participants in the Harvard Government Department's Rational Choice Discussion Group for their valuable comments.

Abstract

The pattern of alliance commitments among states is commonly assumed to reflect the extent to which states have common or conflicting security interests. For the past twenty years, Kendall's $\tau_b$ has been used to measure the similarity between two nations' "portfolios" of alliance commitments. Widely employed indicators of systemic polarity, state utility, and state risk propensity all rely upon $\tau_b$. We demonstrate that $\tau_b$ is inappropriate for measuring the similarity of states' alliance commitments. We develop an alternative measure of alliance portfolio similarity, S, which avoids many of the problems associated with $\tau_b$, and we use data on alliances among European states to compare the effects of S versus $\tau_b$ in measures of utility and risk propensity. Finally, we identify several problems with inferring state interest from alliance commitments and we provide a method to overcome those problems using S in combination with data on alliances, trade, UN votes, diplomatic missions, and other types of state interaction.

1 Introduction

In recent years, international relations scholars have devoted considerable effort to testing hypotheses derived from systemic theories of international politics and from choice-theoretic models of international interactions.
For each of these purposes, researchers have attempted to measure and compare the patterns of military alliances among states. Bruce Bueno de Mesquita first proposed using Kendall's $\tau_b$ rank-order correlation coefficient as a measure of the similarity of states' alliance portfolios in his article "Measuring Systemic Polarity" (Bueno de Mesquita 1975). Over the last twenty years, it has become common practice to rely upon $\tau_b$ as a measure of the similarity of two states' alliance commitments with the other states in the system. Those interested in testing systemic theories of international politics use the $\tau_b$ measure of alliance portfolio similarity to identify alliance "clusters" and to measure the extent to which those clusters are discrete or overlapping (e.g. Bueno de Mesquita 1975; Ostrom and Aldrich 1978; Bueno de Mesquita 1978; Organski and Kugler 1980; Bueno de Mesquita 1981b; Stoll 1984; Stoll and Champion 1985; Iusi-Scarborough 1988; Bueno de Mesquita & Lalman 1988; W. Kim 1989, 1991; C. Kim 1991).

The method of constructing state utilities by treating alliance portfolios as revealed preferences over security issues was developed in a pioneering article by Altfeld and Bueno de Mesquita (Altfeld and Bueno de Mesquita 1979). Since then, many scholars have employed the similarity of states' alliance portfolios as an indicator of the similarity of those states' security interests. These authors subject choice-theoretic models of international conflict to empirical tests, using $\tau_b$ as the basis for operational measures of states' willingness to take risks and of those states' expected utilities for challenging each other (Bueno de Mesquita 1978; Altfeld and Bueno de Mesquita 1979; Bueno de Mesquita 1980; Bueno de Mesquita 1981; Berkowitz 1983; Altfeld 1984; Bueno de Mesquita 1985; Altfeld 1985; Bueno de Mesquita & Lalman 1986; Lalman 1988; Scarborough 1988; Bueno de Mesquita & Lalman 1988; Lalman and Newman 1991; C. Kim 1991; W.
Kim 1991; Bueno de Mesquita & Lalman 1992; Morrow and Kim 1992; Huth, Bennett and Gelpi 1993).

The fundamental question we ask in this paper is: How well does $\tau_b$ measure the similarity of two states' alliance commitments? We argue that while Kendall's $\tau_b$ is a useful measure of association for ranked categorical data (e.g. Levy 1981), it is inappropriate to use Kendall's $\tau_b$ as an indicator of the similarity of states' alliance portfolios. Our examination of this question also raises some concerns about the practice of interpreting the similarity of alliance portfolios as a measure of states' common interests.

By way of a roadmap, we begin by presenting the construct of the "alliance portfolio" in section two. In section three, we present Kendall's $\tau_b$ and demonstrate the problem of applying it as a measure of portfolio similarity. We develop an alternative measure S for the similarity of alliance commitments in section four and show that it suffers from none of the problems associated with $\tau_b$. In section five, we provide empirical comparisons of S with $\tau_b$ using data on alliances between European states. In section six, we identify several reasons why the similarity of states' alliance portfolios is likely to be a poor measure of the similarity of their interests, but suggest a method for obtaining better estimates of the similarity of states' interests using S. In section seven, we conclude.

    (a) 1816
           FRN   UK  GMY  AUH  RUS
    FRN      3    0    0    0    0
    UK       0    3    3    3    3
    GMY      0    3    3    3    3
    AUH      0    3    3    3    3
    RUS      0    3    3    3    3

    (b) 1905
            UK  FRN  GMY  AUH  ITA  RUS
    UK       3    1    0    0    0    0
    FRN      1    3    0    0    2    3
    GMY      0    0    3    3    3    0
    AUH      0    0    3    3    3    1
    ITA      0    2    3    3    3    0
    RUS      0    3    0    1    0    3

Figure 1: Major Power Alliances in 1816 and 1905. Each table element denotes the type of alliance the column nation has with the row nation. A state's alliance portfolio is the column vector of its alliances with each of the row nations. (0=no alliance, 1=entente, 2=neutrality pact, and 3=defense pact.)
2 Alliance "Portfolios"

To set the stage for further discussion of $\tau_b$ and other measures, we first need to identify the variable of interest here: the alliance portfolio. The Correlates of War (COW) Alliances Data Set classifies alliances into four types, which in this paper we code as follows: 0=no alliance, 1=entente, 2=neutrality or nonaggression pact, 3=mutual defense pact. We follow Bueno de Mesquita (1975:195) in assuming that these categories represent increasing degrees of formal alliance commitment between states and that it is therefore appropriate to treat the data as ordinal. [1]

Let the states in the system for a given year be indexed k = 1, ..., N. Then state i's alliance portfolio is an N × 1 vector $A^i = [a_{i1}\ a_{i2}\ \ldots\ a_{iN}]'$, where each element $a_{ik} \in \{0, 1, 2, 3\}$ is i's alliance with state k. For the moment, consider a simple example in which we compare the major European powers' alliance portfolios with each other. Figure 1 displays the alliances between major powers in 1816 and 1905. [2] As an example of an alliance portfolio, France's portfolio in 1905 is $A^{FRN} = [1\ 3\ 0\ 0\ 2\ 3]'$. [3]

[1] This assumption is certainly not unproblematic, but we postpone a discussion of it until section six.
[2] Major powers are based on COW codings. States are treated as having mutual defense pacts with themselves.
[3] For the empirical analyses later in this paper, each state's portfolio will actually be taken over all the states in the European system, not just the major powers.

To determine the similarity of alliance portfolios, we compare the portfolios of two nations i and j. In 1816 for Germany and Austria-Hungary, this is not terribly difficult, since their portfolios are exactly the same: both have no alliance with France and defense pacts with Great Britain and Russia. However, it is not so clear for France and Italy in 1905, hence the need for a measure to quantify the similarity of two nations' alliance portfolios.

An alternative (equivalent) way of conceptualizing the data, which will be used throughout this paper, is to treat two alliance portfolios as cross-classifications of alliances and to represent this in a 4 × 4 contingency table. If $A^i$ and $A^j$ represent states i and j's vectors of alliance commitments with N states indexed k = 1, ..., N, then the elements of the contingency table are the joint rankings $(a_{ik}, a_{jk})$. Figure 2 gives an example of the square contingency table that results from the alliance portfolios of France and Italy in 1905, the data for which are taken from their respective columns in Figure 1(b). Figure 2(a) displays the distribution of the joint rankings $(a_{ik}, a_{jk})$. Figure 2(b) shows the corresponding distribution of counts.

    (a) Joint rankings (rows: FRN, columns: ITA)
              0      1      2      3
    0         .      .      .      GMY, AUH
    1         UK     .      .      .
    2         .      .      .      ITA
    3         RUS    .      FRN    .

    (b) Counts (rows: FRN, columns: ITA)
              0      1      2      3
    0         0      0      0      2
    1         1      0      0      0
    2         0      0      0      1
    3         1      0      1      0

Figure 2: Contingency tables based on France's and Italy's alliance portfolios over major powers in 1905. Table (a) shows the distribution of the bivariate alliance rankings $(a_{ik}, a_{jk})$. Table (b) shows the corresponding distribution of counts. (The alliance categories along the top and left of the tables are 0=no alliance, 1=entente, 2=neutrality pact, and 3=defense pact.)

3 Kendall's $\tau_b$ as a Measure of Association

Since Bueno de Mesquita (1975), Kendall's $\tau_b$ has been used in the overwhelming majority of published articles that attempt to measure the similarity of alliance portfolios. Kendall's $\tau_b$ is one measure among a host of others that fall under the rubric of "measures of association." Chi-square and proportional reduction in error measures for nominal data, the Goodman-Kruskal $\gamma$ for ordinal data, Spearman's $\rho$ for interval data, and Pearson's product-moment correlation for continuous data are all measures of this type. [4]

[4] For general references to these measures (and others cited in this paper) see Bishop et al.
(1975), Kotz & Johnson (1988), Kendall & Stuart (1961), Kendall & Gibbons (1990), and Liebetrau (1983).

The appeal of Kendall's $\tau_b$ rests on two facts: it is specifically designed for measuring the association between two sets of ordinal rankings when "tied" rankings are permitted, and it is easily interpretable, varying from $-1$ for perfectly negative association to $1$ for perfectly positive association. [5] A decade after Bueno de Mesquita first introduced this approach to measuring the similarity of states' alliance portfolios, Michael Wallace called it "a major advance in every respect" that represents "a notable improvement in sophistication" over previous measures (Wallace 1985:102,103). [6]

Given two rankings x and y over N items, the calculation of $\tau_b$ is based on comparing pairs of joint rankings $(x_i, y_i)$ and $(x_j, y_j)$ and determining whether each pair is "concordant," "discordant," or tied. A pair of rankings $(x_i, y_i)$ and $(x_j, y_j)$ is considered concordant if $x_i > x_j$ and $y_i > y_j$ or if $x_i < x_j$ and $y_i < y_j$. The pair is considered discordant if $x_i > x_j$ and $y_i < y_j$ or if $x_i < x_j$ and $y_i > y_j$. If all pairings of joint rankings are concordant, then x and y are perfectly positively associated. If all are discordant, x and y are perfectly negatively associated. To calculate $\tau_b$ for two rankings x and y over N items (see e.g.
Kendall & Stuart, 1961:562-3), first define for x a matrix recording, for every pair of items i, j, whether $x_i$ is ranked below, tied with, or above $x_j$:

$$a_{ij} = \begin{cases} +1 & \text{if } x_i < x_j \\ 0 & \text{if } x_i = x_j \\ -1 & \text{if } x_i > x_j \end{cases} \tag{1}$$

and similarly for all paired comparisons in y:

$$b_{ij} = \begin{cases} +1 & \text{if } y_i < y_j \\ 0 & \text{if } y_i = y_j \\ -1 & \text{if } y_i > y_j \end{cases} \tag{2}$$

The measure $\tau_b$ is then given by

$$\tau_b = \frac{\sum_{i,j} a_{ij} b_{ij}}{\sqrt{\sum_{i,j} a_{ij}^2 \, \sum_{i,j} b_{ij}^2}}, \qquad i, j = 1, 2, \ldots, n;\ i \neq j \tag{3}$$

When x and y share the same number of ordinal levels, the contingency table is square and the $\tau_b$ measure of correlation takes on values in the interval $[-1, 1]$, where $\tau_b = 1$ represents complete concordance in rankings, $\tau_b = -1$ represents complete discordance in rankings, and $\tau_b = 0$ represents independence in rankings.

[5] Bueno de Mesquita (1975:198) offers no explicit justification for the use of $\tau_b$ other than it being one of a variety of possible measures of association. Majeski and Sylvan, criticizing Bueno de Mesquita's use of $\tau_b$, appear to mistakenly believe that $\tau_b$ is measured on a cardinal or interval scale (Majeski and Sylvan 1984:331).
[6] Wallace nevertheless has strong reservations about using $\tau_b$ to measure systemic polarity and he develops an alternate approach to measuring polarity.

3.1 $\tau_b$ as a Measure of Alliance Portfolio Similarity

The application of $\tau_b$ to measuring the similarity of alliance portfolios between states is straightforward. In a system of N states, the $\tau_b$ statistic compares the way state i ranks its alliance relationships with states 1, ..., N to the way state j ranks its alliance relationships with states 1, ..., N. Take the pair of alliance rankings $(a_{ik}, a_{jk})$ and $(a_{il}, a_{jl})$ of i and j for states k and l. When i has a stronger (weaker) alliance commitment to k than it does to l and j has a stronger (weaker) alliance commitment to k than it does to l, then the pair of rankings is concordant.
When i has a stronger (weaker) alliance with k than with l, while j has a weaker (stronger) alliance with k than with l, their rankings are said to be discordant. With the two portfolios $A^i$ and $A^j$ of i's and j's alliance commitments to each nation k = 1, ..., N, we can then use Equation 3 to calculate the association between i's and j's alliance commitments.

The question is whether this measure of association is also a good measure of the similarity of alliance portfolios. We argue that it is not: there are critical differences between the rank-order association measured by Kendall's $\tau_b$ and the "similarity" of alliance commitments. We provide the following series of examples to show why a measure of association is not necessarily an appropriate measure of similarity.

3.1.1 Perfect Association, But Imperfect Similarity

At first glance, one would expect that if two alliance portfolios are perfectly positively associated then they should be similar. This is not necessarily the case. A $\tau_b$ score of 1 does not necessarily mean that two states in fact have identical alliance portfolios. Figure 3 shows three hypothetical contingency tables with different distributions of elements that produce a $\tau_b$ score of 1. [7] It is true that $\tau_b$ generates a score of 1 when all of the elements fall on the main positive diagonal of the contingency table, as in Figure 3 (a). Kendall's $\tau_b$ also generates a score of 1 when all the elements fall in one of the off-diagonals, as in Figure 3 (b) and (c). Despite their identical $\tau_b$ scores, there are substantial differences between the three cases. In case (a), states i and j have identical types of alliances with every state in the system, while in case (b) the states agree about the rank-order of their alliances but have different types of alliances with every state in the system.
Case (c) is similar to case (b), except that the types of alliances i and j have with the other system members are even less similar: their alliance types with the other system members differ by two ordinal categories rather than one. In fact, it is difficult to tell if the data should be interpreted as a positive diagonal or as a slight deviation from a point on the main negative diagonal.

[7] These cases are of i and j's alliances with four other nations.

    (a) $\tau_b = 1$        (b) $\tau_b = 1$        (c) $\tau_b = 1$
       0  1  2  3              0  1  2  3              0  1  2  3
    0  1  0  0  0           0  0  0  0  0           0  0  0  0  0
    1  0  1  0  0           1  2  0  0  0           1  0  0  0  0
    2  0  0  1  0           2  0  1  0  0           2  2  0  0  0
    3  0  0  0  1           3  0  0  1  0           3  0  2  0  0

Figure 3: Illustrations of different alliance rankings yielding $\tau_b = 1$. The figure displays three cases of state i's and state j's alliances with four other nations. In case (a), i and j have the same alliances with the four nations and we would expect a measure of alliance portfolio similarity to indicate perfect similarity. However, i's and j's alliances with the four nations diverge slightly in case (b) and even more in case (c), yet $\tau_b$ still reflects perfect similarity of alliance portfolios. (The alliance categories along the top and left of the tables are 0=no alliance, 1=entente, 2=neutrality pact, and 3=defense pact.)

    (a) $\tau_b = -1$       (b) $\tau_b = -1$       (c) $\tau_b = -1$
       0  1  2  3              0  1  2  3              0  1  2  3
    0  0  0  0  1           0  0  0  0  0           0  0  0  0  0
    1  0  0  1  0           1  0  0  0  2           1  0  0  0  0
    2  0  1  0  0           2  0  0  1  0           2  0  0  0  2
    3  1  0  0  0           3  0  1  0  0           3  0  0  2  0

Figure 4: Illustrations of different alliance rankings yielding $\tau_b = -1$. The figure displays three cases of state i's and state j's alliances with four other nations. In case (a), i and j have completely opposite alliances with the four nations and we would expect a measure of alliance portfolio similarity to indicate perfect dissimilarity. However, i's and j's alliances with the four nations converge slightly in case (b) and even more in case (c), yet $\tau_b$ still reflects perfect dissimilarity of alliance portfolios.
(The alliance categories along the top and left of the tables are 0=no alliance, 1=entente, 2=neutrality pact, and 3=defense pact.)

3.1.2 Perfect Negative Association, But Imperfect Dissimilarity

The hypothetical contingency tables in Figure 4 show that a similar problem exists at $\tau_b$'s lower boundary value. In Figure 4 (a), i and j have opposing views about the ordinal relationship of their alliance commitments to the four other system members. Nevertheless, i and j do not really have antithetical alliance portfolios, in that they are mutually allied to the states in the (1,2) and (2,1) cells, albeit at different levels of commitment. International relations researchers examining alliance patterns should not be pleased that $\tau_b$ takes on a value of $-1$ whenever all of the elements fall into the main negative diagonal of the contingency table, because truly antithetical disagreement occurs only when all of the elements are concentrated in the (0,3) and (3,0) cells of the table.

In Figure 4 (b), i and j have diametrically opposing views on the correct rank-ordering of their alliance commitments with the other states in the system, but they agree that they should have some formal alliance commitment with all of the other states in the system. Moreover, they agree about the specific type of alliance commitment they should have with one of the system members, represented by the element in the (2,2) box. In Figure 4 (c), both i and j have relatively high alliance commitments to the other states in the system, but they disagree over the ordinal ranking of those commitments by one category. While it seems clear that the alliance portfolios represented in Figure 4 (c) are more similar than the portfolios represented in Figure 4 (b), which are themselves more similar than the portfolios represented in Figure 4 (a), all three contingency tables generate a Kendall's $\tau_b$ of $-1$.
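The boundary cases in Figures 3 and 4 can be checked with a minimal sketch of Equations (1) to (3). The function names and the encoding of each table as a list of (row category, column category) pairs are illustrative assumptions, not part of the paper. Because $\tau_b$ is symmetric in its two arguments, it does not matter which state is assigned to the rows.

```python
from itertools import permutations
from math import sqrt

def sign(u, v):
    """Equations (1)-(2): +1 if u < v, 0 if tied, -1 if u > v."""
    return (u < v) - (u > v)

def kendall_tau_b(x, y):
    """Equation (3) over all ordered pairs i != j; returns None when undefined."""
    pairs = list(permutations(range(len(x)), 2))
    a = [sign(x[i], x[j]) for i, j in pairs]
    b = [sign(y[i], y[j]) for i, j in pairs]
    num = sum(p * q for p, q in zip(a, b))
    den = sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den else None

# Each hypothetical table is written as the (row, column) cell of each of the
# four other nations, read off the counts in Figures 3 and 4.
figure3 = {
    "a": [(0, 0), (1, 1), (2, 2), (3, 3)],
    "b": [(1, 0), (1, 0), (2, 1), (3, 2)],
    "c": [(2, 0), (2, 0), (3, 1), (3, 1)],
}
figure4 = {
    "a": [(0, 3), (1, 2), (2, 1), (3, 0)],
    "b": [(1, 3), (1, 3), (2, 2), (3, 1)],
    "c": [(2, 3), (2, 3), (3, 2), (3, 2)],
}

def tau_for(cells):
    """Split the cell list into the two rankings and apply Equation (3)."""
    x = [r for r, _ in cells]
    y = [c for _, c in cells]
    return kendall_tau_b(x, y)

fig3_scores = {case: tau_for(cells) for case, cells in figure3.items()}
fig4_scores = {case: tau_for(cells) for case, cells in figure4.items()}
print(fig3_scores)  # every case scores 1.0
print(fig4_scores)  # every case scores -1.0
```

All three panels of each figure collapse to the same score, even though the portfolios differ by zero, one, or two categories on every nation.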
3.1.3 When $\tau_b$ Is Undefined

While Figures 3 and 4 provide examples of cases in which it is misleading to interpret identical $\tau_b$ scores as representing identical degrees of similarity in states' alliance portfolios, Figure 5 displays plausible hypothetical cases in which a comparison of two states' alliance portfolios yields a $\tau_b$ that is undefined. Whenever one or both of the states rank all of the elements in the same ordinal category, $\tau_b$ is undefined. Recall from Equations (1) and (2) that $\tau_b$ compares one state's ordinal rankings of elements to the other's ordinal rankings of the same elements. If either i or j ranks all of the elements in the same category, there is no "order" to its ranking of the elements. As a result, one of the terms in the denominator of Equation (3) is zero, and no meaningful value exists for $\tau_b$.

In cases (a) and (b) of Figure 5, one of the states has different alliance commitments to the other members of the system, while the other state has identical levels of alliance commitments to all the members of the system. The result is that the elements appear in a row or a column of the contingency table. If either pair of alliance commitments were to appear in the real world, it would be impossible to measure their similarity using $\tau_b$. The example in Figure 5 (c) is particularly disturbing. When states i and j both have defense pacts with all the states in the system, it seems reasonable to expect a valid measure of alliance portfolio similarity to indicate that i and j have identical alliance portfolios. Kendall's $\tau_b$, however, is undefined.

    (a) $\tau_b$ undefined  (b) $\tau_b$ undefined  (c) $\tau_b$ undefined
       0  1  2  3              0  1  2  3              0  1  2  3
    0  0  0  0  1           0  0  0  0  0           0  0  0  0  0
    1  0  0  0  1           1  0  0  0  0           1  0  0  0  0
    2  0  0  0  1           2  0  0  0  0           2  0  0  0  0
    3  0  0  0  1           3  1  1  1  1           3  0  0  0  4

Figure 5: Illustrations of different alliance rankings where $\tau_b$ is undefined. The figure displays three cases of state i's and state j's alliances with four other nations. Most concerning of the three cases is (c), where i and j have defense pacts with all four nations, indicating perfect similarity of alliance portfolios, yet $\tau_b$ is undefined. (The alliance categories along the top and left of the tables are 0=no alliance, 1=entente, 2=neutrality pact, and 3=defense pact.)

3.1.4 Constant Similarity With Varying Association

Figure 5 (c) calls our attention to another characteristic of $\tau_b$ that reduces its attractiveness as a measure of the similarity of alliance portfolios: because the $\tau_b$ measure of association is sensitive to cells in which many elements are concentrated, the value of $\tau_b$ may vary over pairs of states with equally similar alliance portfolios. Figure 6 demonstrates how this behavior can affect the value of $\tau_b$.

    (a) $\tau_b = -.09$     (b) $\tau_b = -1$       (c) $\tau_b = -1$       (d) $\tau_b = -.09$
       0   1   2   3           0   1   2   3           0   1   2   3           0   1   2   3
    0  10  0   0   1        0  0   0   0   1        0  0   0   0   1        0  0   0   0   1
    1  0   0   0   0        1  0   10  0   0        1  0   0   0   0        1  0   0   0   0
    2  0   0   0   0        2  0   0   0   0        2  0   0   10  0        2  0   0   0   0
    3  1   0   0   0        3  1   0   0   0        3  1   0   0   0        3  1   0   0   10

Figure 6: Association changes when "similarity" has not. Cases (a)-(d) show how the association between $A^i$ and $A^j$ can change, yielding a different $\tau_b$, even though the alliance portfolios are no more similar or dissimilar in any of the cases. In cases (b) and (c), the commitment of i and j to the ten other countries increases jointly, but does not change with respect to each other. Yet $\tau_b = -1$ would imply greater dissimilarity in cases (b) and (c) than in (a) or (d).

We imagine a system of twelve states, in which states i and j are treated as having mutual defense pacts with themselves, no alliance with each other, and identical alliance commitments to the other ten states. The cases in Figure 6 differ only in the type of alliance commitments i and j have with the other ten states in the system: "no alliance" in case (a), ententes in case (b), neutrality/non-aggression pacts in case (c), and defense pacts in case (d).
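The twelve-state system just described can be built directly as two portfolios and scored with Equation (3). This is a sketch under stated assumptions: the function names and the ordering of the portfolio (state i first, state j second, then the ten other states) are illustrative choices, not taken from the paper.

```python
from math import sqrt

def sign(u, v):
    """+1 if u < v, 0 if tied, -1 if u > v, as in Equations (1)-(2)."""
    return (u < v) - (u > v)

def kendall_tau_b(x, y):
    """Equation (3) over all ordered pairs i != j; None when the denominator is zero."""
    n = len(x)
    num = sum_a2 = sum_b2 = 0
    for i in range(n):
        for j in range(n):
            if i != j:
                a, b = sign(x[i], x[j]), sign(y[i], y[j])
                num += a * b
                sum_a2 += a * a
                sum_b2 += b * b
    den = sqrt(sum_a2 * sum_b2)
    return num / den if den else None

def figure6_portfolios(level):
    """Portfolios of i and j over [i, j, ten other states] for a shared alliance level."""
    a_i = [3, 0] + [level] * 10  # defense pact with itself, no alliance with j
    a_j = [0, 3] + [level] * 10  # no alliance with i, defense pact with itself
    return a_i, a_j

fig6_scores = {}
for case, level in zip("abcd", range(4)):
    fig6_scores[case] = kendall_tau_b(*figure6_portfolios(level))
    print(case, round(fig6_scores[case], 2))
# prints:
# a -0.09
# b -1.0
# c -1.0
# d -0.09
```

In cases (a) and (d) the numerator of Equation (3) is $-2$ and the denominator is 22, so $\tau_b = -1/11$. In cases (b) and (c) both are 42 in absolute value, giving $\tau_b = -1$.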
Notice that cases (a) and (d) produce $\tau_b$ scores of $-.09$, indicating mild dissimilarity, even though i and j are in perfect agreement about the type of alliance commitment they should have with 10 of the 12 states in the system. Moreover, cases (b) and (c) generate $\tau_b$ scores of $-1$, indicating complete disagreement, despite the fact that i and j remain in complete agreement about the type of alliance commitment they should have with 10 of the 12 states in the system and despite the fact that the extent of their disagreement over the other two relationships is no greater than it is in (a) or (d). In cases (b) and (c), i and j agree that they should have increasingly binding alliance commitments with the vast majority of states in the system, but their $\tau_b$ rating of alliance portfolio similarity is lower than in case (a), where i and j have no alliances with each other or with any other state.

The results in Figure 6 are due to the manner in which $\tau_b$ handles tied rankings. In all four cases, the pair comparisons among the ten states in the same cell contribute nothing to the $\tau_b$ score, since the $a_{ij}$ and $b_{ij}$ terms of Equation 3 are all zero. However, in cases (a) and (d), there are partial ties when the ten states in the same cell are paired with i's and j's rankings of their alliances with each other: in each such pair only one of $a_{ij}$ and $b_{ij}$ is non-zero. These partial ties add to the denominator of Equation 3 but not to its numerator, which pulls $\tau_b$ toward zero. In cases (b) and (c), all of the pairings (other than those among the ten states in the same cell) are completely discordant. Since the pairings of the rankings in the same cell do not contribute anything to the score, only the discordant pairings contribute, yielding $\tau_b = -1$. But clearly we cannot say that (a) displays more similar portfolios than (b) when in fact i and j jointly increase their commitment to the ten states in (b) and (c). There are two critical lessons to be drawn from this example.
First, measures of linear association may obscure the actual "similarity" of states' alliance portfolios precisely because they focus on linear association rather than actual agreement. Second, coding rules that determine the domain of other possible alliance partners for i and j are extremely important, since definitions of systemic membership that include larger numbers of states that are "irrelevant" to i's and j's security interests will generally result in a larger number of elements falling into the (0,0) corner of the contingency table, skewing the value of $\tau_b$.

3.1.5 "Not Tau-b"

Kendall's $\tau_b$ is a rank-order correlation coefficient; it is designed to measure the linear association between i's and j's ordinal rankings of the elements in a square contingency table, which is not the same thing as the "similarity" of i's and j's relationships with the elements in the table. In fact, $\tau_b$ can seriously misrepresent the degree to which two states' alliance portfolios are similar. It seems reasonable to claim that i's and j's alliance portfolios are more similar when i and j have identical types of alliances with each of the other system members than they are when i and j have different types of alliances with each of the other system members but identical ordinal rankings of those commitments. Kendall's $\tau_b$ is insensitive to the different degrees of alliance portfolio similarity and dissimilarity that separate elements falling on the main diagonals from elements falling on the off-diagonals of the table.
Equally disturbingly, $\tau_b$ is undefined whenever i or j ranks all of its potential alliance partners in the same ordinal category, and $\tau_b$ is undefined whenever elements are concentrated in one of the extreme corners of the contingency table, even when such concentrations should clearly be interpreted as representing completely identical or opposite alliance portfolios. These characteristics render $\tau_b$ inappropriate as a measure of alliance portfolio similarity. This indictment is not limited to $\tau_b$ alone: any of the measures of association or correlation mentioned at the beginning of this section would be inappropriate as an indicator of the similarity of two states' alliance portfolios. The fact that $\tau_b$ can seriously misrepresent the degree to which two states' alliance portfolios are similar, combined with the absence of any convenient and effective substitute for $\tau_b$, leads us to seek an alternative approach to measuring alliance portfolio similarity.

3.2 Be Not A Borrower: Agreement and Cohen's $\kappa$

Political Science is often called a borrowing discipline, and in this case it seems reasonable to ask our sister social sciences if they could loan us a better measure of alliance portfolio similarity. Social psychologists and educational testing statisticians are very interested in the extent to which two observers "agree" about how a collection of elements should be rated. Measures of agreement, notably Cohen's $\kappa$ and its variants, do have some features that make them more appealing than measures of association as indicators of the similarity of alliance portfolios. Unfortunately, these measures suffer from at least two problems that render them inappropriate for the task at hand.

Recall that a measure of association quantifies the extent to which two variables are dependent and the direction of that dependence. If two variables are perfectly associated, the measure of association is only required to predict the category of one variable given the category of the other (Bishop et al., 1975:394).
This ability to predict the category of one variable given the category of the other is most clearly shown by the perfectly positive and perfectly negative association results in Figures 3 and 4, respectively. In both cases, the association is such that knowing the category of one variable allows us to determine the category of the second, even when the portfolios are not what we would consider perfectly "similar."

Measures of agreement are a special case of measures of association, assuming more of the structure of the contingency table. Measures of agreement are particularly appropriate for data where x and y represent rankings (or categorizations) over N items by two different raters using the same classifications. [8] The question answered by measures of agreement is not the extent to which the ratings are associated, but the extent to which the raters actually agree in their ratings. This places more restrictions on the structure of the contingency table because the emphasis is on the extent to which the elements of the contingency table diverge from the positive diagonal (i.e., where the cells represent agreement between the raters).

[8] Measures of association do not require that x and y have similar categories, which allows for J × K tables.

One of the earliest and most widely used measures of agreement is Cohen's (1960) $\kappa$ for nominal data. It measures the extent to which contingency table elements fall on the positive diagonal, corrected for the probability of falling on the positive diagonal due to chance. One drawback of $\kappa$ is that it is asymmetrically oriented towards measuring agreement and not disagreement. International relations researchers would like a measure that is symmetric, allowing for both complete agreement and complete disagreement. Unlike $\tau_b$, which varies from $-1$ for complete discordance to $1$ for complete concordance, $\kappa$ does not have a similarly symmetric character, ranging between complete disagreement and complete agreement.
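To make the asymmetry concrete, the following sketch applies the standard unweighted form of Cohen's $\kappa$, $(p_o - p_e)/(1 - p_e)$, to three of the hypothetical tables from Figures 3 and 4. The function name and the use of those tables here are illustrative assumptions; the paper itself does not report $\kappa$ values for them.

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square table of counts; None if undefined."""
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed share of elements on the positive diagonal.
    p_obs = sum(table[c][c] for c in range(k)) / n
    # Chance agreement from the row and column marginals.
    row = [sum(table[r]) / n for r in range(k)]
    col = [sum(table[r][c] for r in range(k)) / n for c in range(k)]
    p_exp = sum(row[c] * col[c] for c in range(k))
    if p_exp == 1:
        return None  # all elements sit in one diagonal cell
    return (p_obs - p_exp) / (1 - p_exp)

# Figure 3 (a): identical alliances with all four nations.
identical = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
# Figure 4 (a): elements on the main negative diagonal.
opposite = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
# Figure 4 (c): rankings differ by only one category.
near_miss = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 2], [0, 0, 2, 0]]

kappa_identical = cohen_kappa(identical)
kappa_opposite = cohen_kappa(opposite)
kappa_near_miss = cohen_kappa(near_miss)
print(kappa_identical, kappa_opposite, kappa_near_miss)
```

Under these assumptions, the identical portfolios score 1, the main negative diagonal scores only $-1/3$, and the near-miss table of Figure 4 (c) scores $-1$. The lower end of $\kappa$ therefore depends on the marginals rather than on how far apart the portfolios are.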
Although a number of variations of Cohen's $\kappa$ have been developed to account for partial agreement and for use with other data types, none are appropriate for our task. Cohen's (1968) weighted $\kappa$ for nominal data allows the weighting of cells off the positive diagonal to give partial credit for rankings that are "close" to being in agreement. Davies & Fleiss (1982) develop a generalized $\kappa$-like statistic for nominal data. For interval data, $\kappa$-like measures or generalizations of $\kappa$ have been developed by Schouten (1982), O'Connell & Dobson (1984), and Berry & Mielke (1988, 1990). While some of these claim to be measures of agreement for ordinal data, all require the use of a distance metric or scoring that imposes interval assumptions on the rankings. To our knowledge, no measure of agreement has been developed for ordinal data that respects the ordinality limitations. This may simply be an information problem inherent to the data: granting partial credit for "close" agreement requires specification of the extent of that credit, which cannot be done with the information available in rank-order data. Finally, with respect to measuring alliance portfolio similarity, all of the aforementioned measures of agreement suffer from the same problem as $\kappa$ in that they are oriented towards measuring agreement per se and not towards being symmetric measures of agreement versus disagreement.

In sum, measures of agreement take us closer to "similarity" conceptually, but not far enough. Unfortunately, there are no measures of agreement appropriate for ordinal alliance data, and existing (non-ordinal) measures of agreement are incapable of reflecting antithetical disagreement in a convenient form. As we are unable to borrow a method for measuring the similarity of alliance portfolios, we are forced to develop one ourselves.

4 S: A Distance-Based Measure of Similarity

The approach we use in this paper is instead based on "measures of similarity and dissimilarity," which have been widely employed (e.g.
Kotz & Johnson, 1988: 397–405) to assess the extent to which two vectors of (nominal, ordinal, interval, or continuous) data differ from each other. For our proposed solution we ask, literally: How "far" are two portfolios from each other? Assuming there are N states in the system, we let state i's alliance portfolio vector $A^i$ represent a point in an N-dimensional, discrete data space. We define the similarity S of states i's and j's alliance portfolios $A^i = [\, a^i_1 \;\; a^i_2 \;\; \cdots \;\; a^i_N \,]'$ and $A^j = [\, a^j_1 \;\; a^j_2 \;\; \cdots \;\; a^j_N \,]'$, respectively, as

$$ S(A^i, A^j, W, L) = 1 - 2\,\frac{d(A^i, A^j, W, L)}{d^{\max}(W, L)} \tag{4} $$

where $W = [\, w_1 \;\; w_2 \;\; \cdots \;\; w_N \,]'$ is a vector of weights over the N dimensions, $L = [\, l_1 \;\; l_2 \;\; \cdots \;\; l_N \,]'$ is a vector of scoring rules for the ranks within each of the N dimensions, $d(A^i, A^j, W, L)$ is the distance metric

$$ d(A^i, A^j, W, L) = \sum_{k=1}^{N} w_k \,\bigl| l_k(a^i_k) - l_k(a^j_k) \bigr| \tag{5} $$

and $d^{\max}(W, L)$ is the maximum distance possible in the N-dimensional space (given the dimension weights W and scoring rule L), denoted by

$$ d^{\max}(W, L) = \sum_{k=1}^{N} w_k \,\bigl( l_k^{\max} - l_k^{\min} \bigr) \tag{6} $$

where $l_k^{\max}$ and $l_k^{\min}$ are the maximum and minimum scores possible, respectively, for dimension k.

There are a number of advantages to this measure of similarity. Unlike τ_b, S is defined for all possible alliance patterns. S does not measure the linearity of the association of ordinal rankings of elements, so it can distinguish more subtle shades of similarity and dissimilarity than τ_b can recognize. (The next section provides extensive examples showing why S does not suffer from many of the shortcomings of τ_b.) Specified as above, the similarity measure S ranges from −1, corresponding to two portfolios at "opposite" ends of the data space, to 1, corresponding to identical portfolios. This (arbitrary) standardization allows us to compare the scores of S and τ_b for pairs of portfolios whenever they are used to measure the "similarity" of alliance portfolios, since both have the same range and the same substantive meaning at the endpoints of that range.
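As an illustration of equations (4) through (6), the following is a minimal Python sketch, not the authors' implementation. The function name `similarity`, the `ranks` argument, and the use of a single scoring rule for every dimension (rather than a separate $l_k$ per dimension) are assumptions made for brevity; the example portfolios are hypothetical.

```python
from typing import Callable, Optional, Sequence


def similarity(
    a_i: Sequence[int],
    a_j: Sequence[int],
    weights: Optional[Sequence[float]] = None,
    score: Optional[Callable[[int], float]] = None,
    ranks: Sequence[int] = (0, 1, 2, 3),
) -> float:
    """Return S = 1 - 2 * d / d_max for two alliance portfolios.

    Assumption: one scoring rule `score` is shared by all dimensions.
    By default the score of a rank is the rank itself and all weights are 1.
    """
    if len(a_i) != len(a_j):
        raise ValueError("portfolios must have the same length")
    w = [1.0] * len(a_i) if weights is None else list(weights)
    l = (lambda r: float(r)) if score is None else score

    # Equation (5): weighted sum of absolute score differences.
    d = sum(wk * abs(l(x) - l(y)) for wk, x, y in zip(w, a_i, a_j))

    # Equation (6): largest possible distance given weights and scores.
    scored = [l(r) for r in ranks]
    d_max = sum(wk * (max(scored) - min(scored)) for wk in w)

    # Equation (4): rescale so that S lies in [-1, 1].
    return 1.0 - 2.0 * d / d_max


# Hypothetical four-state portfolios (0 = no alliance, 3 = defense pact).
print(similarity([3, 0, 2, 1], [3, 0, 2, 1]))  # identical portfolios
print(similarity([3, 0, 3, 0], [0, 3, 0, 3]))  # opposite ends of the space
print(similarity([3, 3, 0, 0], [3, 3, 3, 3]))  # half the maximum distance
```

Passing a different `score` function (for example, one that spaces the four alliance types unevenly) changes both the distance and the maximum distance, so the result stays within the interval from −1 to 1.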
This standardization also allows us to substitute S into any of the various applications for which τ_b has been used in the study of international politics (e.g., in calculating choice-theoretic risk scores) and to compare the resulting scores when S is substituted for τ_b. Care should be taken, however, in direct comparisons of S and τ_b, since they measure different characteristics of a contingency table (or the pair of rankings on which the table is based). For example, τ_b = 0 indicates independence between two vectors of rankings, while S = 0 indicates that the distance between the scored rankings is half the maximum it could be. Sometimes these coincide; sometimes they do not.

Another advantage of S is that one can specify a weighting W of the dimensions, where each dimension represents a country with whom the referent nation could be allied. While the weighting could be as simple as $w_k = 1 \;\forall k$ (i.e., an alliance with one country is worth the same as with another), there might be theoretical reasons to assume that not all alliances should be considered equal. For example, if states ally to increase their security, then it might be appropriate to weight the states proportionally to their military power in order to avoid exaggerating the importance of small states. In our empirical examples, we will use the notation S_w when referring to a version of S that weights each state according to its national material capabilities as a share of the sum of the capabilities of all the states in the system.9

The similarity measure S does force two restrictions on the data, which may not always be valid, although the measure allows for flexibly modeling these. First, just as with the measures of agreement, we are not aware of any measure of similarity that respects the ordinal limitations of rank-order data. Most often, the ranks are converted to intervals through some scoring rule, which is what is done here through the scoring rule L.
This may be as simple as setting the intervals to the rank values: $l(a^i_k) = a^i_k \;\forall k$. With no other information available, this may be an acceptable scoring rule.10 However, if one does not believe, for example, that the difference in commitment between defense pacts and neutrality pacts is the same as the difference between ententes and no alliances, then one can specify any number of theoretically or empirically informed scoring rules for the ranks. Moreover, one may specify different scoring rules for each of the N dimensions, i.e., for each nation with which one could be allied.

The second assumption placed on the data is that nations are assumed to view the alliance categories in the same way. In other words, every nation has a similar conception of the commitment embodied in an entente, in a neutrality pact, and in a defense pact. We do not feel this is a heroic assumption to make. In fact, the COW Alliance Dataset coding rules for the alliance types imply a common understanding between two states as to the responsibilities borne toward the other alliance partner.

A more heroic assumption along these lines concerns the weights W, which are assumed shared by all countries. This might be the case in the hypothetical situation noted above, where nations weight a possible alliance partner by the capabilities it would bring to the table. However, one can easily think of two nations having different weights $W^i$ and $W^j$ for alliance partners based on language, culture, or other nonmaterial factors. The similarity measure S does not allow for such a heterogeneous weighting scheme. We believe that would be an interesting avenue of future research.

4.1 Comparison of S to τ_b

Although S is not available in canned statistics programs, it is actually easier to implement than τ_b, and it adheres to the convention of varying from −1 to 1 in order to reflect complete dissimilarity and complete similarity of states' alliance portfolios.
Moreover, we contend that in addition to preserving the welcome characteristics of τ_b, S is free of many of τ_b's liabilities. As a first step toward establishing that S does a better job than τ_b of capturing the actual similarity of states' alliance portfolios, we re-examine the hypothetical examples of Figures 3–6 in order to determine how S performs when τ_b provides misleading results. The results of this comparison appear in Figure 7.

9 We refer to capabilities as measured by the COW National Material Capabilities data set.

10 In fact, this is what we use in the empirical examples that follow.

[Figure 7: Comparison of S with τ_b. Panels (a) through (m), each a 4 × 4 contingency table over alliance levels 0 to 3, annotated with its τ_b and S values.]

Consider the examples in Figure 7, tables (a)–(c). In (a), states i and j have identical alliance commitments with the other states in the system, and both τ_b and S produce values of 1. In table (b), i and j agree on the ordinal ranking of their alliance commitments with the other states in the system, but they do not have identical levels of alliance commitments with any of those states.
τ_b would produce a misleading value of 1, while S produces a score of .33, reflecting the fact that the states' alliance portfolios are similar but not identical as they are in table (a). In table (c), i is strongly allied with two states that are not allied to j, and j is strongly allied with two states that are not allied to i. τ_b reflects only the perfect similarity of ordinal rankings, while S accurately characterizes this table as representing significant dissimilarity (−.33) between the row and column states' alliance portfolios.

In section 3.1.2, we noted that τ_b takes on a value of −1 in cases like that pictured in Figure 7 (d), misleadingly implying that i and j have antithetically opposed alliance portfolios when in fact both i and j have some formal alliance with the states in the (1,2) and (2,1) cells. The value of S, in contrast, implies less-than-complete dissimilarity between the portfolios.11 In table (e), i and j disagree over a narrower range of alliance commitments than they do in table (d), and they actually agree on the exact level of one of the alliance commitments. τ_b again takes on a value of −1, while S correctly indicates that the portfolios in (e) are less dissimilar than the portfolios depicted in (d). In table (f), i and j have relatively strong alliance commitments with all of the other four states in the system, but they disagree as to which pair of states they should ally themselves with more strongly. S produces a value of .33, suggesting the states have moderately similar alliance portfolios, while τ_b would once again imply complete disagreement.

S is not a measure of association between rank-orderings, and so it is defined even in those cases in which one state has identical alliance commitments to all of the other states in the system. For example, in Figure 7, tables (g) and (h) display cases in which no value of τ_b exists simply because there is no "order" to the ranking of alliance commitments by i and by j, respectively.
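The contrast between an undefined τ_b and a well-defined S can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names `kendall_tau_b` and `unweighted_s` are assumptions, the portfolios are hypothetical four-state examples rather than the exact tables of Figure 7, and S is computed with equal weights and rank-valued scores.

```python
import math
from itertools import combinations
from typing import Optional, Sequence


def kendall_tau_b(x: Sequence[int], y: Sequence[int]) -> Optional[float]:
    """Kendall's tau-b over paired rankings; None when it is undefined."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1  # pair tied on the first ranking
        if dy == 0:
            ties_y += 1  # pair tied on the second ranking
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    denom = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    if denom == 0:
        return None  # one ranking is constant, so no order exists
    return (concordant - discordant) / denom


def unweighted_s(a_i: Sequence[int], a_j: Sequence[int], max_rank: int = 3) -> float:
    """S with equal weights and scores equal to the ranks 0..max_rank."""
    d = sum(abs(p - q) for p, q in zip(a_i, a_j))
    return 1.0 - 2.0 * d / (max_rank * len(a_i))


# State i holds defense pacts with every state; state j varies its commitments.
constant = [3, 3, 3, 3]
varied = [0, 1, 2, 3]
print(kendall_tau_b(constant, varied), unweighted_s(constant, varied))

# Both states hold defense pacts with every state.
print(kendall_tau_b(constant, constant), unweighted_s(constant, constant))
```

In both cases the τ_b helper returns `None` because every pair is tied on at least one ranking, while S still returns a value.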
S is defined even when all of the elements in the table fall into a single cell. When both i and j have mutual defense pacts with all of the other states in the system, their alliance portfolios are identical. As Figure 7 (i) shows, the value of S accurately reflects this complete similarity, while the value of τ_b is undefined.

Figure 7 tables (j) through (m) reproduce the perplexing example from Figure 6. Recall that in all four tables, i and j have mutual defense pacts with themselves, no alliance with each other, and identical alliance commitments to the other ten states in the system. Although this is true in each of the four tables, the value of τ_b varies as we alter the type of alliance i and j have with the other ten states. This example was particularly striking because τ_b actually indicates that the states' alliance portfolios are less similar in table (l), where i and j share nonaggression pacts with ten states, than in table (j), where neither i nor j has any formal alliance commitment to any state but itself. In contrast, S behaves just as we would like: it reflects only the similarity of i's and j's patterns of alliances.12 In sum, S has produced meaningful measures of alliance portfolio similarity in every one of these hypothetical cases, whereas τ_b produced scores that were misleading.

11 S does take on a value of −1 when all of the elements fall into the (0,3) and (3,0) cells of the table.

5 Empirical Examples of S at Work

Thus far, we have considered several reasons why a measure of rank-order correlation like Kendall's τ_b is not the correct tool for measuring the similarity of states' alliance portfolios. We have examined a variety of hypothetical examples which illustrate how τ_b may mislead researchers. We have developed a measure which avoids many of the problems associated with τ_b. And we have demonstrated that it is a reliable guide to the actual degree of alliance portfolio similarity even when τ_b is not.
Nevertheless, one might still ask whether S would actually alter one's interpretations of empirical, rather than hypothetical, alliance patterns. If in practice states tend to adopt only those patterns of alliances for which τ_b and S yield very similar portfolio similarity scores, then one may use τ_b without worry. Using data on European major power alliances, we demonstrate in this section that the values of τ_b generated by empirical data deviate substantially from the values of S.

In the following examples, we identify the type of alliance commitment each of the European major powers had with every other state in Europe from 1816 to 1965 using the Correlates of War Alliances data set. We derive the τ_b measure of alliance portfolio similarity according to the procedure developed by Bueno de Mesquita (1975). We also calculate our standard similarity score S and a weighted similarity score S_w. As discussed above in Section 4, we calculate S_w by weighting the entries in the contingency table according to each state's material capabilities as a share of the total material capabilities of all the states in Europe in that year. We offer examples of how the raw values of τ_b and S_w differ, and we examine the impact of these differences on measures of states' attitudes towards risky ventures.

5.1 The Difference Between S_w and τ_b in Practice

For every pair of major European powers we graphed, there were substantial differences between the annual values of τ_b and S_w. Two typical examples appear in Figure 8. Figure 8 (top) plots the

12 i's and j's alliance portfolios are perfectly similar if they have the same types of alliances with all the states in the system, regardless of the level of alliance commitments they agree upon. Thus, S should not increase steadily from table (j) to table (m). The level of overall commitment may be an interesting variable, but it is not pertinent to measuring portfolio similarity.

Figure 8: Comparison of S and τ_b from 1816 to 1965.
The top graph shows the similarity of Great Britain and Germany's alliance portfolios as measured by S_w and τ_b for each year from 1816 to 1965. The bottom graph displays the similarity of France and Germany's alliance portfolios for the two measures. As the two graphs show, there is often great difference empirically between the two measures of similarity.

(a) Portfolios

Nation   System Cap   GMY   RUS
GMY         .25        3     0
RUS         .21        0     3
UK          .20        0     1
FRN         .08        0     3
AUH         .08        3     0
ITA         .05        3     1
BEL         .03        0     0
SPN         .03        0     0
TUR         .02        0     0
NTH         .01        0     0
SWD         .01        0     0
RUM         .01        3     0
POR         .01        0     0
SWZ         .01        0     0
GRC         .00        0     0
DEN         .00        0     0
YUG         .00        0     0
BUL         .00        0     0
NOR         .00        0     0
ALB         .00        0     0

(b) Contingency table

                 GMY
            0    1    2    3
RUS   0    13    0    0    3
      1     1    0    0    1
      2     0    0    0    0
      3     2    0    0    0

τ_b = .03    S = .4    S_w = −.5

Figure 9: Comparison of alliance similarity for Germany and Russia, 1914. Table (a) shows Germany and Russia's alliance portfolios with all the European nations in 1914, along with the nations' proportion of system capabilities. Table (b) displays the contingency table based on those portfolios and the values of τ_b and S. (S is unweighted similarity. S_w is similarity weighted by capability share.)

similarity of the United Kingdom's and Germany's alliance portfolios for every year from 1816 to 1965, while Figure 8 (bottom) displays the similarity of France's and Germany's alliance portfolios. Although they draw on identical data, S_w and τ_b draw very different pictures of the similarity of these states' alliance portfolios. The differences in the values of τ_b and S_w are not localized in specific time periods, and the magnitude of those differences is often quite substantial. Although S_w and τ_b are not scaled identically, they are only moderately correlated (in these two examples, the correlation is .64 for U.K.-Germany and .52 for France-Germany), indicating that the values of τ_b stray from those of S_w in tendency as well as in scale.
For scholars testing systemic theories of international politics, it is particularly noteworthy that S_w is often a different sign than τ_b, as this is likely to affect the composition and characteristics of alliance "clusters."

Figure 9 (a) displays the alliance portfolios of Germany and Russia in 1914. The leftmost column lists the states identified by the Correlates of War as members of the European region in 1914, plus Turkey. The second column indicates each state's material capabilities measured as a proportion of the sum of all states' capabilities. This information is used in the weighting of S_w as described in section 4. The remaining two columns display Germany's and Russia's alliance portfolios across the European region. Figure 9 (b) shows the contingency table generated by plotting Germany's alliance portfolio against Russia's, along with the τ_b, S, and S_w indicators of the similarity of those alliance portfolios.

The value of τ_b is easy to understand in light of our earlier discussions. There is little linearity to the ordinal ranking of the elements in the table in Figure 9, so that the value of τ_b approaches 0. This is clearly misleading, in that the elements are not distributed randomly through the cells of the table. S does a better job of capturing the similarity of the portfolios, insofar as Germany and Russia agree precisely on the type of alliance commitment they share with thirteen of their twenty potential partners, and they very nearly agree on their relationship with a fourteenth.13 Nevertheless, the thirteen states they agree upon are not nearly as strategically important as the handful of states over which they disagree. S_w = −.5 because the four entries in the lower-left corner of the contingency table represent Germany, Austria, Rumania, and Italy, while the two entries in the upper-right corner of the table represent Russia and France.
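The Germany-Russia scores can be checked directly from the portfolios listed in Figure 9 (a). The sketch below is illustrative only: the function name `similarity` is an assumption, scores are taken to equal the ranks 0 to 3, and the capability shares are the two-decimal values printed in the figure, so the weighted result only approximates the reported S_w = −.5.

```python
from typing import Optional, Sequence


def similarity(
    a_i: Sequence[int],
    a_j: Sequence[int],
    weights: Optional[Sequence[float]] = None,
    max_rank: int = 3,
) -> float:
    """S = 1 - 2 * d / d_max with scores equal to the ranks 0..max_rank."""
    w = [1.0] * len(a_i) if weights is None else list(weights)
    d = sum(wk * abs(p - q) for wk, p, q in zip(w, a_i, a_j))
    d_max = sum(wk * max_rank for wk in w)
    return 1.0 - 2.0 * d / d_max


# Rows of Figure 9 (a), in the order GMY, RUS, UK, FRN, AUH, ITA, BEL, SPN,
# TUR, NTH, SWD, RUM, POR, SWZ, GRC, DEN, YUG, BUL, NOR, ALB.
cap = [.25, .21, .20, .08, .08, .05, .03, .03, .02, .01,
       .01, .01, .01, .01, .00, .00, .00, .00, .00, .00]
gmy = [3, 0, 0, 0, 3, 3, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0]
rus = [0, 3, 1, 3, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

s = similarity(gmy, rus)                  # every state counts equally
s_w = similarity(gmy, rus, weights=cap)   # states weighted by capability share

print(round(s, 2))    # unweighted similarity
print(round(s_w, 2))  # capability-weighted similarity (rounded inputs)
```

With equal weights the distance is 18 out of a possible 60, which gives S = .4 as reported. With the rounded capability shares the weighted distance is 2.19 out of a possible 3.00, which gives roughly −.46.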
In short, S is positive because Germany and Russia appear to agree about their alliance commitments to fourteen of the twenty states, whereas S_w is solidly negative because it reflects the fact that Germany and Russia strongly disagree about their alliance commitments to states representing approximately 58% of the total material capabilities of Europe.

Figure 10 displays the alliance portfolios and similarity scores of the U.K. and France in 1921. France has mutual defense pacts with itself, Poland, and Belgium, while Britain has no alliance with those states. Britain, on the other hand, has mutual defense pacts with itself and with Portugal. Neither Britain nor France has any formal alliance with the other twenty-four states that qualify as members of the European region in 1921. τ_b takes on a slightly negative value because Britain and France do not share each other's alliances with Poland, Belgium, and Portugal. In this case, as in the previous example, S is positive because Britain and France share exactly the same type of alliance commitment with twenty-four states out of the twenty-nine. S_w also indicates that these alliance portfolios are very similar, because the twenty-four states with which Britain and France share identical alliance relationships account for 89% of the total material capabilities of Europe.

Finally, Figure 11 displays the same information for the United States and the Soviet Union in

13 It may seem counterintuitive to treat "no-alliance" relationships as points of agreement, but as we discuss below, this is a logical consequence of treating the four alliance types as ordinal categories and not a problem attributable to the similarity measure.
(a) Portfolios

Nation   System Cap   UK    FRN
USA         .25        0     0
RUS         .22        0     0
GMY         .12        0     0
UK          .10        3     0
FRN         .07        0     3
ITA         .04        0     0
POL         .03        0     3
SPN         .02        0     0
CZE         .02        0     0
RUM         .02        0     0
BEL         .01        0     3
TUR         .01        0     0
AUS         .01        0     0
YUG         .01        0     0
HUN         .01        0     0
NTH         .01        0     0
SWD         .01        0     0
GRC         .01        0     0
POR         .01        3     0
LUX         .00        0     0
DEN         .00        0     0
SWZ         .00        0     0
BUL         .00        0     0
FIN         .00        0     0
LIT         .00        0     0
NOR         .00        0     0
LAT         .00        0     0
EST         .00        0     0
ALB         .00        0     0

(b) Contingency table

                 UK
            0    1    2    3
FRN   0    24    0    0    2
      1     0    0    0    0
      2     0    0    0    0
      3     3    0    0    0

τ_b = −.09    S = .66    S_w = .62

Figure 10: Comparison of alliance similarity for Great Britain and France, 1921. Table (a) shows Great Britain and France's alliance portfolios with all the European nations in 1921, along with the nations' proportion of system capabilities. Table (b) displays the contingency table based on those portfolios and the values of τ_b and S. (S is unweighted similarity. S_w is similarity weighted by capability share.)

(a) Portfolios

Nation   System Cap   USA   RUS
USA         .46        3     0
RUS         .17        0     3
UK          .14        0     3
FRN         .04        0     3
ITA         .03        0     0
POL         .02        0     3
SPN         .02        0     0
CZE         .01        0     3
BEL         .01        0     0
NTH         .01        0     0
TUR         .01        0     0
SWD         .01        0     0
YUG         .01        0     3
RUM         .01        0     0
HUN         .01        0     0
POR         .00        0     0
SWZ         .00        0     0
GRC         .00        0     0
DEN         .00        0     0
BUL         .00        0     0
NOR         .00        0     0
LUX         .00        0     0
FIN         .00        0     0
IRE         .00        0     0
ALB         .00        0     0
ICE         .00        0     0

(b) Contingency table

                 USA
            0    1    2    3
RUS   0    19    0    0    1
      1     0    0    0    0
      2     0    0    0    0
      3     6    0    0    0

τ_b = −.11    S = .46    S_w = −.74

Figure 11: Comparison of alliance similarity for the United States and Russia, 1946. Table (a) shows the United States and Russia's alliance portfolios with all the European nations in 1946, along with the nations' proportion of system capabilities. Table (b) displays the contingency table based on those portfolios and the values of τ_b and S. (S is unweighted similarity. S_w is similarity weighted by capability share.)

1946. Interestingly, according to the COW Alliances data set, the U.S. had no formal alliances with any of its wartime allies in 1946, but the U.S.S.R. had mutual defense pacts with Britain, France, Czechoslovakia, Poland, and Yugoslavia.
The pattern of formal alliances in this particular year is probably a poor indicator of states' actual loyalties; it is wise to remember that any measure of the similarity of formal alliance commitments is destined to overlook the effects of informal, tacit alliance commitments, as we discuss below. Because so many of the elements are grouped in one cell of the contingency table, τ_b once again finds very limited evidence of linear order to the rankings of the elements. τ_b is only very slightly more negative than it was when comparing the U.K. and France in 1921. S is once again solidly positive. By treating all states as equally important, S appears likely to generate positive scores as a general rule, since the majority of alliance relationships in any given year consist of "no alliance." S_w, however, shifts the emphasis from the number of states with which the U.S. and the U.S.S.R. have identical alliance commitments to the share of the total material resources with which they have similar alliance commitments. S_w thus appears strongly negative, as we would intuitively expect for these two states in 1946.

In summary, S does a consistently better job than τ_b at measuring the extent to which alliance portfolios are similar, and S_w does a consistently better job than either S or τ_b at indicating the extent to which states agree over the allotment of security resources, which is its intended purpose. τ_b measures neither of these characteristics well.

5.2 S, τ_b, and Attitudes Toward Risk

It is increasingly common for researchers to use a measure of alliance portfolio similarity as an indicator of the extent to which the states have common or conflicting interests. This is then used as the operationalization of the "utilities" states expect to receive by making demands on each other.
More recently, Bruce Bueno de Mesquita (1985) and James Morrow (1986) have suggested how the similarity of alliance commitments can be used to measure how willing a state is to take risks in an uncertain environment. In this section we examine how the significant differences in the values of S_w and τ_b affect measures of states' risk-taking propensities.

The approach to measuring a state's risk propensity is given in Bueno de Mesquita (1985:157). Define nation i's "security level" as $\sum_{j \neq i} E(U_{ji})$, where $E(U_{ji})$ is the expected utility each state j expects to receive from conflict with i. The greater this sum, the more utility i believes its adversaries expect to derive from challenging it. These utilities, in turn, depend on how similar state i's alliance portfolio is with each state j. Formally, following Bueno de Mesquita (1985) and Bueno de Mesquita (1981a):

$$ \sum_{j \neq i} E(U_{ji}) = \sum_{j \neq i} \left[ P_j (1 - U_{ji}) + (1 - P_j)(U_{ji} - 1) + \sum_{k \neq i,j} \frac{c_k}{c_i + c_j + c_k} \,(U_{kj} - U_{ki}) \right] \tag{7} $$

where $U_{ji}$ is the value of the τ_b or S_w measure of the similarity of j's and i's alliance portfolios, $c_i$ is i's material capabilities as a share of the total capabilities of all the states, and $P_j$ is j's probability of beating i in a bilateral war, operationalized as $P_j = \frac{c_j}{c_i + c_j}$ (Bueno de Mesquita 1981:58; Bueno de Mesquita, Newman, & Rabushka 1985:46,50).

Holding the alliance portfolios of all the other states in the system constant, i's security level will vary as it alters its alliance commitments. We use Signorino & Ritter's (1997) genetic algorithm method to identify the hypothetical alliance commitments that i could adopt that would minimize and maximize $\sum_{j \neq i} E(U_{ji})$. i's maximum security level would be attained if i were to adopt the profile of alliance commitments that minimizes the sum of j's expected gains from conflict with i.
Similarly, i's minimum security level would be attained if i were to adopt the profile of alliance commitments that maximizes the sum of j's expected gains from conflict with i. We can then measure i's willingness to take risks by comparing the actual security level provided by its observed alliance portfolio to the hypothetical minimum and maximum security levels i could have achieved with some alternate alliance portfolio, according to Bueno de Mesquita (1985):

$$ R_i = \frac{2 \sum_{j \neq i} E(U_{ji}) - \sum_{j \neq i} E(U_{ji})_{\max} - \sum_{j \neq i} E(U_{ji})_{\min}}{\sum_{j \neq i} E(U_{ji})_{\max} - \sum_{j \neq i} E(U_{ji})_{\min}} \tag{8} $$

which yields values on the interval [−1, 1], where −1 represents complete risk aversion, 0 represents risk neutrality, and 1 represents complete risk acceptance.

We offer two examples of how S_w alters the measure of states' risk propensities obtained using τ_b. Figure 12 displays the information used to calculate the risk attitude of Britain in 1950. The first column of Figure 12 lists the states that qualify as members of the European region in 1950, including the U.S. and Turkey. The second column displays each state's share of the total material capabilities in the system, which reveals how the states are weighted in S_w. The next four columns display the alliance portfolios of the four most powerful states in Europe, including Britain's actual alliance commitments. The remaining columns show the hypothetical alliance portfolios that Britain would adopt to minimize or maximize its security depending on whether the similarity of alliance portfolios is measured by τ_b or S_w. The "Max" portfolios maximize $\sum_{j \neq i} E(U_{ji})$, which is to say they minimize Britain's security. The "Min" portfolios minimize $\sum_{j \neq i} E(U_{ji})$, maximizing Britain's security.

Figure 12 shows Britain's attitude toward risk as being quite risk-averse according to either of the τ_b- or S_w-based measures, with the S_w-based measure indicating complete risk aversion.
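The security level in equation (7) and the risk score in equation (8) can be sketched in Python as follows. This is a sketch under stated assumptions, not the procedure used to produce the figures: the function names `security_level` and `risk_propensity` are hypothetical, `U` is assumed to be a full matrix of similarity scores with `U[j][i]` holding the similarity of j's and i's portfolios, and the search for the minimizing and maximizing portfolios (done in the text with a genetic algorithm) is not implemented, so the minimum and maximum sums are taken as given inputs.

```python
from typing import Sequence


def security_level(i: int, U: Sequence[Sequence[float]], c: Sequence[float]) -> float:
    """Sum over j != i of E(U_ji), following equation (7)."""
    n = len(c)
    total = 0.0
    for j in range(n):
        if j == i:
            continue
        p_j = c[j] / (c[i] + c[j])  # j's probability of beating i
        e_ji = p_j * (1 - U[j][i]) + (1 - p_j) * (U[j][i] - 1)
        for k in range(n):
            if k == i or k == j:
                continue
            # Third-party term, weighted by k's share of the three capabilities.
            e_ji += c[k] / (c[i] + c[j] + c[k]) * (U[k][j] - U[k][i])
        total += e_ji
    return total


def risk_propensity(actual: float, e_max: float, e_min: float) -> float:
    """R_i from equation (8); inputs are the actual, max, and min sums."""
    if e_max == e_min:
        raise ValueError("maximum and minimum security sums must differ")
    return (2 * actual - e_max - e_min) / (e_max - e_min)


# Hypothetical two-state system: portfolios are complete opposites and
# state 1 holds three quarters of the capabilities.
U = [[1.0, -1.0], [-1.0, 1.0]]
c = [0.25, 0.75]
print(security_level(0, U, c))

# Hypothetical security sums for a state whose actual portfolio is the Min one.
print(risk_propensity(actual=1.0, e_max=3.0, e_min=1.0))
```

When the observed sum equals the minimum attainable sum, the score is −1 (complete risk aversion); when it equals the maximum, the score is 1; the midpoint gives 0.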
Britain's S_w-based Min portfolio shows that Britain could maximize its security not by allying with all the most powerful countries, but rather by allying with a subset of the major powers, namely those in a particular bloc. That is exactly what Britain has done, so R_i = −1, indicating complete risk aversion.

Figure 13 presents the same type of information for Italy in 1926. In this case, however, the

         Share of      UK τ_b        Actual Portfolios      UK S_w
         System        Portfolios                           Portfolios
Nation   Capabilities  Max   Min     USA   UK   FRN   RUS   Max   Min
USA        0.41         1     2       3    3    3     0      0     3
RUS        0.24         1     2       0    0    0     3      3     0
UK         0.10         3     3       3    3    3     0      3     3
FRN        0.05         1     2       3    3    3     0      0     3
ITA        0.04         1     2       3    3    3     0      0     3
POL        0.03         1     2       0    0    0     3      3     0
SPN        0.02         3     1       0    0    0     0      3     0
CZE        0.02         1     2       0    0    0     3      3     0
BEL        0.01         1     2       3    3    3     0      0     3
YUG        0.01         3     1       0    0    0     0      3     0
NTH        0.01         1     2       3    3    3     0      0     3
TUR        0.01         3     1       0    0    0     0      3     0
RUM        0.01         1     2       0    0    0     3      3     0
SWD        0.01         3     1       0    0    0     0      3     0
HUN        0.01         3     1       0    0    0     3      3     0
POR        0.00         1     2       3    3    3     0      0     3
GRC        0.00         3     1       0    0    0     0      3     0
SWZ        0.00         3     1       0    0    0     0      3     0
DEN        0.00         1     2       3    3    3     0      0     3
BUL        0.00         1     2       0    0    0     3      3     0
NOR        0.00         1     2       3    3    3     0      0     3
LUX        0.00         1     2       3    3    3     0      0     3
FIN        0.00         1     2       0    0    0     3      3     0
IRE        0.00         3     1       0    0    0     0      3     0
ALB        0.00         3     1       0    0    0     0      3     0
ICE        0.00         1     2       3    3    3     0      0     3

R_i (τ_b-based) = −.7    R_i (S_w-based) = −1

Figure 12: Alliance portfolios of European major powers and United States, 1950. The table shows the actual alliance portfolios of the European major powers and the United States over the European state system, the τ_b-based max and min portfolios for Great Britain (corresponding to the E(U_ji) terms in equation 8), the S_w-based max and min portfolios, and the associated risk scores. There is some divergence between the τ_b and S_w risk scores for Great Britain (−.7 versus −1, respectively). Comparing the τ_b and S_w max and min portfolios, the S_w portfolios seem to be more reasonable.
         Share of      ITA τ_b       Actual Portfolios                  ITA S_w
         System        Portfolios                                       Portfolios
Nation   Capabilities  Max   Min     USA   UK   FRN   GMY   ITA   RUS   Max   Min
USA         .30         0     3       3    0    0     0     0     0      3     0
RUS         .16         0     3       0    0    0     2     0     3      3     0
UK          .10         1     2       0    3    0     0     0     0      3     0
GMY         .10         0     3       0    0    0     3     0     2      3     0
FRN         .08         2     2       0    0    3     0     0     0      3     0
ITA         .05         3     3       0    0    0     0     3     0      3     3
POL         .03         2     2       0    0    3     0     0     0      3     0
SPN         .03         2     1       0    0    0     0     0     0      3     0
CZE         .02         2     2       0    0    3     0     1     0      3     0
BEL         .02         2     2       0    0    3     0     0     0      3     0
RUM         .01         2     2       0    0    2     0     1     0      3     0
TUR         .01         1     2       0    0    0     0     0     2      3     0
YUG         .01         2     1       0    0    0     0     2     0      3     0
NTH         .01         2     1       0    0    0     0     0     0      3     0
AUS         .01         2     1       0    0    0     0     0     0      3     0
GRC         .01         2     1       0    0    0     0     0     0      3     0
HUN         .01         2     1       0    0    0     0     0     0      3     0
SWD         .01         2     1       0    0    0     0     0     0      3     0
POR         .01         1     2       0    3    0     0     0     0      3     0
LUX         .00         2     1       0    0    0     0     0     0      3     0
DEN         .00         2     1       0    0    0     0     0     0      3     0
FIN         .00         2     1       0    0    0     0     0     0      3     0
SWZ         .00         2     1       0    0    0     0     0     0      3     0
BUL         .00         2     1       0    0    0     0     0     0      3     0
IRE         .00         2     1       0    0    0     0     0     0      3     0
NOR         .00         2     1       0    0    0     0     0     0      3     0
LAT         .00         2     1       0    0    0     0     0     0      3     0
LIT         .00         1     2       0    0    0     0     0     2      3     0
EST         .00         2     1       0    0    0     0     0     0      3     0
ALB         .00         2     1       0    0    0     0     1     0      3     0

R_i (τ_b-based) = .16    R_i (S_w-based) = −.94

Figure 13: Alliance portfolios of European major powers and United States, 1926. The table shows the actual alliance portfolios of the European major powers and the United States over the European state system, the τ_b-based max and min portfolios for Italy (corresponding to the E(U_ji) terms in equation 8), the S_w-based max and min portfolios, and the associated risk scores. There is extreme divergence between the τ_b and S_w risk scores for Italy, with the τ_b-based score (.16) indicating slight risk acceptance and the S_w-based score (−.94) indicating almost complete risk aversion. Comparing the τ_b and S_w max and min portfolios, the S_w portfolios seem to be more appropriate, given the ordinality assumption imposed on the alliance data.

τ_b and S_w measures of Italy's risk attitude diverge substantially. The major powers had very few alliances at all in 1926, and even fewer with each other.
As a result, S_w implies that Italy can minimize its exposure to risk by holding no alliances with any other state, thereby guaranteeing that its alliance portfolio will be nearly identical to the portfolios of all the other states in the system, with almost all the cell entries in each dyadic portfolio comparison table falling into the (0,0) cell. Substantively, this is a rather odd result, but it makes perfect sense given the assumption that (0,0) represents an ordered category such that when two states i and j have no alliance with state k, their portfolios are similar in exactly the same sense as if they both held mutual defense pacts with k. Given that few states have alliances with anyone, it seems peculiar that the τ_b results imply that Italy's alliance portfolio would be "most similar" to the alliance portfolios of the other states if Italy were to adopt some formal alliance commitment with every other state in the system.

6 From Alliances to Similarity of Interests

Up to this point we have focused solely on the technical issue of measuring the similarity of two vectors of alliance data. As we mentioned at the beginning of this paper, in practice the similarity of two states' alliance portfolios is generally used to make inferences about the similarity of their interests. That forces us to ask the next logical question: Even if we are able to measure similarity of alliance portfolios perfectly, is that a good indicator of similarity of interests? We have our doubts.

One of the main problems with using the current alliance data as an indicator of similarity of interests centers around the 0 = "no alliance" category of the data and two problems it poses. First, two alliance portfolios may be very "similar" but almost totally comprised of zeros, e.g., two nations with no alliances with anyone. This is nicely illustrated by the similarity of Great Britain and France's 1921 alliance portfolios in Figure 10.
Neither has an alliance with the other and both have almost no alliances with anyone else. Because of the predominance of zeros in their portfolios, S_u reflects a fair amount of similarity. Because neither had alliances with any of the stronger nations, S_w also reflects high similarity. Giving their portfolios the ocular test, they do look quite similar. Yet, if we were to ask whether their portfolios reflect similarity of interest, we could say "yes" only in the negative sense, meaning that both were interested in remaining free of any alliances. In the positive sense, we would have to say "no": from the alliance data they appear to share no joint interest in each other and no common positive goals.

Much the same could be said concerning Italy's 1926 S_w-based E(U_ji)-minimizing portfolio in Figure 13. The portfolio that minimizes the expected utility others would derive from attacking Italy is not a portfolio where Italy allies with them all, but a portfolio that is most similar to theirs: one with almost all zeros. In this case too, similarity of portfolios does not necessarily imply similarity of interests in a positive sense.

Not all cases are like these. Great Britain's 1950 S_w-based E(U_ji)-minimizing portfolio in Figure 12 is one based mostly on similarity with the United States' and France's portfolios in a positive sense, i.e., based on defense pacts with those countries.

An important issue this raises is the effect of irrelevant states in the comparison of alliance portfolios. States that are irrelevant to the referent state's decisionmaking will tend to show up as zeros. The more of these irrelevant states there are in the portfolios, the more they will shift the similarity scores upwards towards one. While weighting by capabilities can help alleviate this, the real solution is appropriate specification of the domain of the portfolios. If particular states are not relevant to decisionmaking in some region, they should not be included.
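The upward drift caused by irrelevant states can be illustrated with a minimal sketch. It assumes that S takes the distance-based form S = 1 - 2d/d_max, with d the weighted sum of absolute differences scaled by the largest possible difference (3 for the four alliance categories). The function name `similarity` and the portfolios are hypothetical and are not taken from the alliance data.

```python
def similarity(p_i, p_j, weights=None, max_diff=3.0):
    """Assumed form of S: 1 - 2 * d / d_max, using scaled absolute differences."""
    if len(p_i) != len(p_j):
        raise ValueError("portfolios must have the same length")
    if weights is None:
        weights = [1.0] * len(p_i)  # equal weights: the unweighted case (S_u)
    d = sum(w * abs(a - b) / max_diff for w, a, b in zip(weights, p_i, p_j))
    d_max = sum(weights)
    return 1.0 - 2.0 * d / d_max


# Hypothetical portfolios over four relevant states (0 = no alliance, 3 = defense pact).
# States i and j hold exactly opposite commitments.
core_i = [3, 0, 3, 0]
core_j = [0, 3, 0, 3]

for n_irrelevant in (0, 4, 16, 96):
    # Irrelevant states: neither i nor j has an alliance with them.
    padding = [0] * n_irrelevant
    s_u = similarity(core_i + padding, core_j + padding)
    print(n_irrelevant, round(s_u, 3))
```

With these made-up portfolios the unweighted score climbs from -1 with no irrelevant states to 0.92 with 96 of them, even though the commitments of i and j toward the four relevant states never change.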
The second problem the "no alliance" category poses concerns the ordinality assumption. The alliance categories are generally assumed to be ordinal from the lowest level of commitment, "no alliance," to "entente" to "neutrality pact" to the highest level of commitment, "defense pact." However, for any given portfolio, the states in the "no alliance" category fall along a continuum of varying degrees of commitment towards the referent country. For the sake of argument, we divide these into three categories: those who have no alliance with the referent country because of hostility, those who have no alliance because of indifference, and those who have no alliance because of an implicit alignment with that country. While the first two types of nations can be grouped in the "no alliance" category without destroying the ordinality of the data, those that have an implicit alignment with the referent country clearly have a higher level of commitment, perhaps as high as a defense pact.

An example of this can be seen in Figure 9. There, the similarity of 1914 Germany and Russia's portfolios is S_u = .4 and S_w = -.5, in the former case because they jointly have no alliances with the majority of states in the system and in the latter case because they have dissimilar alliance levels with the strongest states in the system. What is not shown in the alliance data is that Russia was the informal protector of the Balkan Slavs. Alignments existed between Russia and Yugoslavia, Rumania, and Bulgaria. However, no formal treaties existed, so the three Balkan countries show up in the (0,0) cell of the contingency table, inflating the values of all the similarity measures, given the joint ranking of the other alliances. Other examples like this easily come to mind: pre-World War II US and Great Britain, current US and Israel, and current US and Taiwan.
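The effect of unrecorded alignments can be sketched in the same way. The example below again assumes the distance-based form S = 1 - 2d/d_max, and every number in it (the eight-state system, the capability shares, and the portfolios) is invented for illustration rather than drawn from the 1914 data. It shows an unweighted and a capability-weighted score diverging, and both falling once three informal alignments are recoded from 0 to 3.

```python
def similarity(p_i, p_j, weights=None, max_diff=3.0):
    """Assumed form of S: 1 - 2 * d / d_max, using scaled absolute differences."""
    if weights is None:
        weights = [1.0] * len(p_i)  # equal weights: the unweighted case (S_u)
    d = sum(w * abs(a - b) / max_diff for w, a, b in zip(weights, p_i, p_j))
    return 1.0 - 2.0 * d / sum(weights)


# Hypothetical eight-state system: two strong states followed by six weak ones.
capabilities = [0.35, 0.35, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05]

# Formal treaties only: A and B each hold one defense pact, with different strong states.
state_a = [3, 0, 0, 0, 0, 0, 0, 0]
state_b = [0, 3, 0, 0, 0, 0, 0, 0]

# Same system after recoding B's informal alignments with three weak states
# as commitments on par with a defense pact (an illustrative coding choice).
state_b_recoded = [0, 3, 3, 3, 3, 0, 0, 0]

s_u = similarity(state_a, state_b)
s_w = similarity(state_a, state_b, capabilities)
s_u_recoded = similarity(state_a, state_b_recoded)
s_w_recoded = similarity(state_a, state_b_recoded, capabilities)

print("formal treaties only:", round(s_u, 2), round(s_w, 2))
print("with recoded alignments:", round(s_u_recoded, 2), round(s_w_recoded, 2))
```

In this toy system the joint zeros keep the unweighted score positive while the weighted score is negative, and recoding the three informal alignments lowers both. The size of the drop depends entirely on the commitment level one is willing to assign to an informal alignment.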
This problem can be alleviated in a manner that should be undertaken regardless of the ordinality issue: More (different) data should be brought to bear in determining similarity of interests. Data exist on UN votes, diplomatic missions, trade, and disputes. Such data could (1) fill in the alliance "gaps" of states that are aligned with the referent state but do not have a formal treaty and (2) bring a richer array of information with which to determine the interests of states. It turns out that the measure of similarity S we develop here is well suited for such multidimensional data. Simply code the UN votes, diplomatic missions, trade amounts, and disputes into their own issue portfolios similar to the alliance data, create a stacked vector of these issue portfolios, provide intra- and inter-issue weights, and then use S to calculate the similarity of any two multi-issue stacked portfolios. For the many reasons just cited, we suspect this will provide a much better method of arriving at similarity of interests than the current method based on Kendall's τ_b and alliance ties.

7 Concluding Remarks

We have made two claims. First, contrary to twenty years' practice, Kendall's τ_b should not be used as an indicator of the similarity of states' alliance portfolios. τ_b measures the linear association of two alliance portfolio rankings, which is not at all the same as measuring the similarity of their alliance portfolios. We have demonstrated through hypothetical and empirical examples that τ_b is inappropriate for this task. Rather than simply identifying these problems, we have also developed an alternative measure of alliance portfolio similarity, S, which avoids many of the pitfalls associated with τ_b, and we have employed data on alliances among European states to compare the effects of S versus τ_b in measures of utility and risk propensity.
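The multi-issue procedure outlined in Section 6 (code each issue as its own portfolio, stack the portfolios, apply intra- and inter-issue weights, then compute S) might be sketched as follows. This is only an illustration under stated assumptions: S is taken to have the form 1 - 2d/d_max, each issue is scaled by its own maximum possible difference, and intra-issue weights are normalized within an issue before the inter-issue weight is applied. The function name, the dictionary keys, the coding of UN votes, and all values are hypothetical.

```python
def stacked_similarity(issues):
    """Assumed S = 1 - 2 * d / d_max computed over a stack of issue portfolios."""
    d = 0.0
    d_max = 0.0
    for issue in issues:
        intra_total = sum(issue["intra_weights"])
        for a, b, w in zip(issue["state_i"], issue["state_j"], issue["intra_weights"]):
            # Weight of one entry: inter-issue weight times normalized intra-issue weight.
            weight = issue["inter_weight"] * w / intra_total
            d += weight * abs(a - b) / issue["max_diff"]
            d_max += weight
    return 1.0 - 2.0 * d / d_max


# Hypothetical alliance portfolios (0-3 scale) over three states, weighted by capabilities.
alliances = {
    "state_i": [3, 0, 0],
    "state_j": [3, 0, 0],
    "intra_weights": [0.5, 0.3, 0.2],
    "max_diff": 3.0,
    "inter_weight": 0.5,
}

# Hypothetical UN vote portfolios over four resolutions (0 = no, 1 = abstain, 2 = yes).
un_votes = {
    "state_i": [2, 2, 0, 1],
    "state_j": [2, 0, 0, 1],
    "intra_weights": [1.0, 1.0, 1.0, 1.0],
    "max_diff": 2.0,
    "inter_weight": 0.5,
}

s_alliances_only = stacked_similarity([alliances])
s_multi_issue = stacked_similarity([alliances, un_votes])
print(round(s_alliances_only, 2), round(s_multi_issue, 2))
```

Here the two states have identical alliance portfolios, so the alliance-only score is 1, while one opposed UN vote pulls the stacked score down to 0.75. How the inter-issue weights should be chosen is a substantive question that this sketch does not address.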
Secondly, we have also claimed that inferring state interest from alliance commitments is problematic, even given a perfect measure of portfolio similarity. However, we believe our method in combination with alternative sources of data (e.g., on UN votes, diplomatic missions, trade, and disputes) can provide leverage on the difficult problem of estimating states' similarity of interests.

Finally, we certainly do not mean to suggest that the inappropriate use of Kendall's τ_b renders decades of published work worthless. Quite the contrary, we feel the theoretical sophistication of many of the studies using τ_b has advanced far beyond the point where τ_b might be accepted as "good enough." We hope that our efforts here will contribute to the improvement of several very valuable research programs.

References

[1] Altfeld, Michael F. 1984. "The Decision to Ally: A Theory and Test." Western Political Quarterly 37(4): 523-544.
[2] Altfeld, Michael F., and Bruce Bueno de Mesquita. 1979. "Choosing Sides in Wars." International Studies Quarterly 23(2): 87-112.
[3] Berkowitz, Bruce D. 1983. "Realignment in International Treaty Organizations." International Studies Quarterly 27(1): 77-96.
[4] Berry, Kenneth J., and Paul W. Mielke, Jr. 1988. "A Generalization of Cohen's Kappa Agreement Measure to Interval Measurement and Multiple Raters." Educational and Psychological Measurement 48(4): 921-33.
[5] Berry, Kenneth J., and Paul W. Mielke, Jr. 1990. "A Generalized Agreement Measure." Educational and Psychological Measurement 50(1): 123-5.
[6] Bishop, Yvonne M. M., Stephen E. Fienberg, and Paul W. Holland. 1975. Discrete Multivariate Analysis: Theory and Practice. Cambridge, MA: MIT Press.
[7] Bueno de Mesquita, Bruce. 1975. "Measuring Systemic Polarity." Journal of Conflict Resolution 19(2): 187-216.
[8] Bueno de Mesquita, Bruce. 1978. "Systemic Polarization and the Occurrence and Duration of War." Journal of Conflict Resolution 22(2): 241-267.
[9] Bueno de Mesquita, Bruce. 1980. "An Expected Utility Theory of International Conflict." American Political Science Review 74(4): 917-931.
[10] Bueno de Mesquita, Bruce. 1981a. The War Trap. New Haven: Yale University Press.
[11] Bueno de Mesquita, Bruce. 1981b. "Risk, Power Distributions, and the Likelihood of War." International Studies Quarterly 25(4): 541-568.
[12] Bueno de Mesquita, Bruce. 1985. "The War Trap Revisited: A Revised Expected Utility Model." American Political Science Review 79(1): 156-76.
[13] Bueno de Mesquita, Bruce, and David Lalman. 1986. "Reason and War." American Political Science Review 80: 113-131.
[14] Bueno de Mesquita, Bruce, and David Lalman. 1988. "Empirical Support for Systemic and Dyadic Explanations of International Conflict." World Politics 41(1): 1-20.
[15] Bueno de Mesquita, Bruce, and David Lalman. 1992. War and Reason: Domestic and International Imperatives. New Haven, CT: Yale University Press.
[16] Bueno de Mesquita, Bruce, David Newman, and Alvin Rabushka. 1984. Forecasting Political Events: Hong Kong's Future. New Haven, CT: Yale University Press.
[17] Cohen, Jacob. 1960. "A Coefficient of Agreement for Nominal Scales." Educational and Psychological Measurement 20: 37-46.
[18] Cohen, Jacob. 1968. "Weighted Kappa: Nominal Scale Agreement with Provision for Scaled Disagreement or Partial Credit." Psychological Bulletin 70(4): 213-20.
[19] Davies, Mark, and Joseph L. Fleiss. 1982. "Measuring Agreement for Multinomial Data." Biometrics 38: 1047-51.
[20] Huth, Paul, D. Scott Bennett, and Christopher Gelpi. 1993. "System Uncertainty, Risk Propensity, and International Conflict Among the Great Powers." Journal of Conflict Resolution 36(3): 478-517.
[21] Iusi-Scarborough, Grace. 1988. "Polarity, Power and Risk in International Disputes." Journal of Conflict Resolution 32(3): 511-533.
[22] Kendall, Maurice G., and Alan Stuart. 1961. The Advanced Theory of Statistics, Vol. 2. London: Charles Griffin and Company Limited.
[23] Kendall, Maurice G., and Jean Dickinson Gibbons. 1990. Rank Correlation Methods. 5th Edition. London: Edward Arnold.
[24] Kim, Chae-Han. 1991. "Third-Party Participation in Wars." Journal of Conflict Resolution 35(4): 659-677.
[25] Kim, Woosang. 1989. "Power, Alliance, and Major Wars, 1816-1975." Journal of Conflict Resolution 33(2): 255-273.
[26] Kim, Woosang. 1991. "Alliance Transitions and Great Power War." American Journal of Political Science 35(4): 833-50.
[27] Kim, Woosang, and James D. Morrow. 1992. "When Do Power Shifts Lead to War?" American Journal of Political Science 36(4): 896-922.
[28] King, Gary. 1989. Unifying Political Methodology: The Likelihood Theory of Statistical Inference. New York: Cambridge University Press.
[29] Kotz, Samuel, and Norman L. Johnson. 1988. Encyclopedia of Statistical Sciences. New York: Wiley.
[30] Lalman, David. 1988. "Conflict Resolution and Peace." American Journal of Political Science 32: 590-613.
[31] Lalman, David, and David Newman. 1991. "Alliance Formation and National Security." International Interactions 16(4): 239-253.
[32] Levy, Jack S. 1981. "Alliance Formation and War Behavior." Journal of Conflict Resolution 25: 581-614.
[33] Liebetrau, Albert M. 1983. Measures of Association. Newbury Park: Sage.
[34] Majeski, Stephen J., and David J. Sylvan. 1984. "Simple Choices and Complex Calculations: A Critique of The War Trap." Journal of Conflict Resolution 28(2): 316-340.
[35] Morrow, James D. 1987. "On The Theoretical Basis of a Measure of National Risk Attitudes." International Studies Quarterly 31(3): 423-438.
[36] O'Connell, Dianne L., and Annette J. Dobson. 1984. "General Observer-Agreement Measures on Individual Subjects and Groups of Subjects." Biometrics 40: 973-983.
[37] Organski, A.F.K., and Jacek Kugler. 1980. The War Ledger. Chicago, IL: University of Chicago Press.
[38] Ostrom, Charles W., Jr., and John H. Aldrich. 1978. "The Relationship Between Size and Stability in the Major Power International System." American Journal of Political Science 22(4): 743-771.
[39] Schouten, H. J. A. 1982. "Measuring Pairwise Interobserver Agreement when All Subjects Are Judged by the Same Observers." Statistica Neerlandica 36(2): 45-61.
[40] Signorino, Curtis S., and Jeffrey M. Ritter. 1997 (in progress). "Calculating State Risk Propensities: A Genetic Algorithm Method for a Combinatorial Optimization Problem." Mimeo. Harvard University.
[41] Stoll, Richard J. 1984. "Bloc Concentration and the Balance of Power." Journal of Conflict Resolution 28(1): 25-50.
[42] Stoll, Richard J., and Michael Champion. 1985. "Capability Concentration, Alliance Bonding, and Conflict Among the Major Powers." In Alan Ned Sabrosky (ed.) Polarity and War: The Changing Structure of International Conflict. Boulder, CO: Westview Press.
[43] Wallace, Michael. 1985. "Polarization: Towards a Scientific Conception." In Alan Ned Sabrosky (ed.) Polarity and War: The Changing Structure of International Conflict. Boulder, CO: Westview Press.