Revisiting Profitability: Firm, Industry, Country and Network Effects
Paul Kattuman*, Diego Rodriguez**, Dmitry Sharapov* and F. Javier Velázquez**

* Cambridge Judge Business School, University of Cambridge
** Universidad Complutense de Madrid and GRIPICO

(draft version; November 2010)
(Preliminary and incomplete)

Abstract: This paper assesses the determinants of heterogeneity in firm profitability. It makes two main contributions. First, in contrast to the previous literature, which has focused on United States corporations, we analyze differences across firms in a wide range of European countries (twenty-five) and include effects related to networks of firms (business groups). These effects are included jointly with year and industry effects. Additionally, the database (Amadeus) covers all types of firms (not only publicly listed ones) in all industries. The second contribution is methodological. We propose a new approach to variance decomposition, the Shapley Value method, which avoids some of the problems of the standard ANOVA procedure. We adapt this method to deal with the perfect multicollinearity that emerges when firm-specific effects are included, and we first evaluate it against alternatives using simulation methods. The preliminary results, based on repeated sampling that takes into account representativeness coefficients across industries, size classes and countries, suggest that business group effects are relevant and that their relevance seems to decrease with network size. Additionally, firm effects dominate, and industry effects appear to be much smaller than estimated in previous work. The differences among European countries, once the remaining effects are controlled for, seem moderate.

Keywords: firm profitability, Shapley value, ANOVA, country effects, business-group effects.

1. Introduction

Understanding the determinants of heterogeneity in firm profitability is one of the most recurrent fields of analysis for both economists and strategic management researchers. The literature has tried to disentangle this heterogeneity by measuring the relative contribution of different sets of observable characteristics. Three have been used most frequently: year effects, industry effects and a supra-firm effect captured by corporate-parent effects. Year effects capture profitability changes over the economic cycle that are common to all firms. Increasing market openness as a consequence of economic integration, exchange rate fluctuations and aggregate demand shocks are examples. Industry effects capture idiosyncratic and persistent industry characteristics that lead to different average profits across sectors. For example, differences in entry barriers can generate persistent differences in average profitability between the tobacco and the meat processing industries. Finally, corporate-parent effects capture the influence on firm-level profitability of membership of a business group whose structural or managerial characteristics are controlled by the corporate parent. The extent of business group diversification and the degree of centralization of the management structure are examples of such corporate-parent characteristics.

The key discussion in the debate to which this study seeks to contribute has been about the relative importance of industry and ‘firm’ effects in explaining differences in performance between firms.[1] The definition of ‘firm’ effects has varied with the characteristics of the database used in the empirical analysis. Both Schmalensee (1985) (widely considered the seminal paper in this area) and Rumelt (1991) used the Federal Trade Commission (FTC) Line-of-Business Survey.
The main conclusion of Rumelt (1991) was that firm effects (there called business-unit effects) were very important and that, in contrast to Schmalensee’s findings, industry effects were relatively small. Additionally, unlike Schmalensee, he found significant (though small) corporate-parent effects.

[1] To see this one need only look at the titles of some of the papers in this debate: ‘Do Markets Differ Much?’ (Schmalensee, 1985), ‘How Much Does Industry Matter?’ (Rumelt, 1991), ‘How Much Does Industry Matter, Really?’ (McGahan and Porter, 1997).

The second database used in the literature is the Compustat Business Segments Reports. Its main difference from the FTC database is that Compustat reports corporate profitability by specific industry (4-digit SIC codes). For this reason, McGahan and Porter (1997) point out that the observed statistical unit, which they call the business segment, may be an aggregation of several firms owned by the same corporate parent and operating in the same industry. As we explain in more detail later, the database used in this study is closer in its characteristics to the FTC Line-of-Business Survey than to the Compustat Reports.

Two main empirical methods have been used in the literature to decompose the observed variability in profitability: Components of Variance (COV) and ANOVA estimations. The COV method does not provide coefficient estimates but uses summary statistics to judge the importance of an effect in terms of its contribution to the overall variance in profitability. By contrast, ANOVA is a fixed effects model that uses exclusion F-tests, going from the null to the full model, to test the significance of each effect. Rumelt (1991) and McGahan and Porter (1997) use nested ANOVA, while simultaneous ANOVA is used in McGahan and Porter (2002). McGahan and Porter (2002) offer an excellent review of the literature on variation in profitability, reconciling the results of the main previous studies and providing new evidence.
In doing so they claim that “the most direct opportunities for further research reside in exploring new data. Reliable and comparable data on the accounting profits of firms in other parts of the world would yield insight on questions about the relationships between the national economic environment and industrial performance. Data on the profitability of privately held firms would provide results more representative of the entire economy. Opportunities lie in exploring additional measures of firm performance, including stock-market return and market share” (p. 849).

This paper takes on that task and contributes to the literature in two specific ways. First, in contrast to the papers mentioned above, it assesses profitability using firms located in a wide range of countries. Specifically, it uses the Amadeus database of publicly listed as well as privately held firms operating in all industrial and service sectors in 25 European countries. This allows us to estimate the contribution of country effects to the variance of profitability. The analysis is therefore more general than in the previous literature, which mostly relies on data for publicly traded corporations in the manufacturing sector. Because of the huge size of the database, the empirical analysis uses random sampling and recursive estimation to assess the robustness of the results. We use an external source, the Structural Business Statistics, to develop a stratified sampling procedure that avoids the size, industry and country-related biases of the original database. In this way, the results can be considered representative of cross-country industrial populations.

Second, the paper contributes to the literature by complementing previous approaches (COV and ANOVA) with a Shapley Value method. This method allows us to calculate the relative contribution of each effect as the weighted average of the increases in R² due to its inclusion along all possible paths from the null to the full model.
This approach is computationally intensive in the context of the sampling procedure mentioned above. However, it provides a more robust way of assessing the relative importance of the different sets of effects than considering a single path from the null to the full model, or presenting the results of several paths (as done in previous papers) without aggregating them in a consistent manner. Additionally, to assess the effectiveness of the Shapley value and other approaches when the data are affected by covariance between effects, we first perform a simulation analysis in a Monte Carlo framework.

The structure of the paper is as follows. Section 2 discusses definition and measurement issues that are key to understanding the results of previous studies and the expected results of this paper. It also introduces the Shapley value method, which is then compared with alternative approaches in a simulation framework. Section 3 describes the sampling procedure and provides an exploratory analysis of profitability across European countries. The main results and their discussion are presented in Section 4. Finally, Section 5 concludes by pointing out the main implications of the results.

2. The decomposition of accounting profitability

2.1. The empirical model

It is useful to discuss the main issues in the literature starting from the specification of a full model in which the profitability of a statistical unit in a given period is explained by a set of characteristics. In this study the statistical unit is a firm f with its main activity in industry i. The firm is located in country c and belongs to business group g. For that firm, we observe accounting profits in year y.
The specification of a full model in which all these characteristics enter additively is then:

$$ r_{yicgf} = \gamma_y + \alpha_i + \delta_c + \beta_g + \phi_f + \varepsilon_{yicgf}, \qquad y \in Y,\; i \in I,\; c \in C,\; g \in G,\; f \in F \qquad (1) $$

where $r_{yicgf}$ is accounting profitability, $\gamma_y$, $\alpha_i$, $\delta_c$, $\beta_g$ and $\phi_f$ are the year, industry, country, business-group and firm effects, and $\varepsilon_{yicgf}$ is the error term.

To interpret the results, it is important to understand the differences between the statistical unit used in this paper, the firm, and the units used in previous papers based on the US Federal Trade Commission Line of Business data and the Compustat Reports. In this study, as is usual in statistical practice, each firm is “assigned” to an industry according to its main activity. Although we know whether the firm is product diversified, the information on profits refers to the firm as a whole, not to a specific product or activity. A similar procedure is followed with the FTC database, in which each business unit is assigned the classification of its main industry. With Compustat, however, each statistical unit is a business segment (i.e., a specific industry) in which a corporation operates.

A main difference of this study is that the business group is built from the bottom up, using information on share ownership links among firms. For that reason the business group is closer to the concept of a network of firms than to that of a corporation. It is then possible, and even likely, that two or more firms in the same business group are located in the same industry. That is not possible with Compustat, where each business segment of a corporate parent is placed in a different industry. This multidivisional characteristic allows industry and corporate-parent effects to be identified in the Compustat data framework. This implies that, though similar in general terms, corporate effects and business-group effects are not strictly comparable. In both cases an important contribution to explaining profitability variance is expected. However, it is likely that the links between business segments in a corporation are stronger than those between firms that share ownership links but are not necessarily under the common umbrella of a corporation.
If that is true, the expected business-group effects would be smaller than corporate effects. The different statistical units could also have consequences for the expected relevance of industry effects. In particular, in the Compustat data framework the expected industry effect may be lower insofar as a business segment comprises different business units.[2] The opposite occurs with the Amadeus database, where diversified firms are assigned to their “main” activity. There are no data on sales across activities for diversified firms, though we can check whether industry effects are lower for them than for non-diversified firms.

The last set of effects included is firm effects.[3] Here a statistical problem emerges: any model that includes firm and year effects will have the same explanatory power as a fully specified model with the five sets of effects (year, country, industry, business-group and firm). In other words, after including business-specific (firm) effects and years, it is not possible to increase the explanatory power of the model. Accordingly, F-tests do not reject the restriction that the additional effects (business-group, industry or country) are equal to zero. This is because industry, business-group and country effects are collinear by design with firm effects. The same issue emerges in the Compustat framework with the business-specific effects, which are collinear by design with industry and corporate-parent effects, as McGahan and Porter (2002) point out.

[2] This was pointed out by McGahan and Porter (1997) as a difference between their study and those using the FTC Line of Business data.

[3] The seminal paper by Schmalensee (1985) used a single year of data. Accordingly, firm effects were not included; they were partially captured by introducing business-unit market share as an explanatory variable. Some caution is needed because corporate-parent effects were labeled ‘firm’ effects by the author.

2.2. The Shapley Value approach

The literature has taken two main approaches to the decomposition of variance in business-specific (firm, in our case) accounting profits. Several papers, including the seminal paper by Schmalensee (1985), have used the Components of Variance (COV) approach. This method assumes that the effects are random draws made before the sample is selected and constant thereafter. It also allows for covariance between effects, but needs to assume that these covariances are also random. The procedure does not provide significance tests on the effects, but it gives the contribution of each effect to total variance.

The second approach is an ANOVA procedure based on regression analysis. The strategy consists of analyzing the increase in explanatory power that emerges when different sets of effects are introduced into the model. A main problem with the ANOVA approach is that it would require considering all possible paths from the null to the full model. If there is covariance between effects, as has been discussed in the literature (Schmalensee, 1985: p. 344), the estimates of the significance and importance of effects will differ depending on the order in which effects are introduced when building up the econometric model from the null specification to the full one. Being aware of this issue and its implications, the authors of previous papers have presented estimation results for a number of paths from the null to the full model. For example, McGahan and Porter (1997) show two specific paths in Table 5.[4] In the first path (A), industry effects are calculated using the residuals of a previous regression in which only year dummies have been introduced. In path (B) the ordering is slightly different: industry effects are included in the third step, after year and corporate-parent effects have been considered consecutively. Since Schmalensee (1985), no paper has considered all possible paths. Previous work also makes no attempt to aggregate the results of the different paths followed.
This leads to the possibility that the identification of the contributions of the different effects is confounded by an inequitable attribution of the covariance between effects to one or more of them, at the expense of the others.

[4] McGahan and Porter (1997) assume serial correlation in the errors of eq. 1. They first estimate the full model to obtain an estimate of the intertemporal persistence of all effects (regardless of source). This is then used to estimate the null model controlling for persistent shocks.

In this paper we propose that, in order to get more accurate results, regressions corresponding to all possible routes from the null to the full model should be run, and that the results of these regressions, specifically the marginal increase in adjusted R² due to the inclusion of an effect, should be aggregated into a single measure of the importance of that effect in accounting for firm performance. To achieve this we use a concept from co-operative game theory, the Shapley value, to interpret the results of our regressions. Consider year, country, group, industry and firm effects to be players in a co-operative game, with the outcome of the game being the proportion of variance in firm profitability, the adjusted R², accounted for by any given coalition of these effects. The Shapley value of any given effect is then defined as the weighted average of its contributions to all possible coalitions. More formally,

$$ S_j = \sum_{M \subseteq N \setminus \{j\}} \frac{m!\,(n-m-1)!}{n!} \left[ \bar{R}^2(M \cup \{j\}) - \bar{R}^2(M) \right] $$

Here, the Shapley value $S_j$ of effect j is the weighted sum of the differences in adjusted R² between a coalition including j and the other effects in M, $M \cup \{j\}$, and the coalition M, which includes the same other effects but not j. The weight assigned to the increase in adjusted R² due to the introduction of effect j into the coalition is $m!\,(n-m-1)!/n!$, where n is the total number of effects being considered in the study (the set N) and m is the number of effects present in coalition M, excluding j.
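To make the weighting concrete, the following is a minimal Python sketch of the aggregation step only. It assumes the adjusted R² of every coalition of effects has already been estimated and stored in a dictionary; the function name `shapley_values` and the numbers in the example are hypothetical and are not taken from the paper's estimates.

```python
from itertools import combinations
from math import factorial


def shapley_values(effects, r2):
    """Shapley value of each effect.

    effects: list of effect names (the players).
    r2: dict mapping frozenset(coalition) -> adjusted R2 of that coalition,
        with an entry for every subset of effects, including the empty set.
    """
    n = len(effects)
    values = {}
    for j in effects:
        others = [e for e in effects if e != j]
        total = 0.0
        for m in range(n):  # m = size of the coalition that excludes j
            weight = factorial(m) * factorial(n - m - 1) / factorial(n)
            for coalition in combinations(others, m):
                without_j = frozenset(coalition)
                with_j = without_j | {j}
                total += weight * (r2[with_j] - r2[without_j])
        values[j] = total
    return values


# Hypothetical adjusted R2 values for three sets of effects (illustration only).
effects = ["year", "industry", "group"]
r2 = {
    frozenset(): 0.00,
    frozenset({"year"}): 0.02,
    frozenset({"industry"}): 0.10,
    frozenset({"group"}): 0.14,
    frozenset({"year", "industry"}): 0.12,
    frozenset({"year", "group"}): 0.16,
    frozenset({"industry", "group"}): 0.20,
    frozenset({"year", "industry", "group"}): 0.22,
}
shares = shapley_values(effects, r2)
```

In this illustration the industry and group effects overlap (jointly they add 0.20, less than 0.10 + 0.14), and the Shapley values split that overlap evenly while still summing to the adjusted R² of the full model.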
The Shapley value approach then consists of calculating all possible increases in adjusted R² due to the inclusion of an effect,[5] assigning a weight to each of these marginal increases, and adding up the weighted results to produce an overall estimated contribution of the effect to explaining the variance in firm profitability. Doing this for all effects gives each effect an equal chance to contribute to R² along all possible paths from the null to the full model, and thus an equal chance to claim any covariance between effects.

[5] To apply this methodology to our context we should keep in mind that, in contrast to the initial proposal by Lipovetsky and Conklin (2001), the variables are not included one by one but in complete sets, one for each type of effect.

2.3. Simulation

To test the performance of the standard Shapley value method and of the version corrected for collinearity by design, in comparison with the ANOVA and COV methods, we use a Monte Carlo simulation approach. We generate a dataset of industry, corporate parent, business segment and year effects with a defined variance and covariance structure and apply the different methods to it. Our full model is thus:

$$ r_{ikt} = \mu + \alpha_i + \beta_k + \phi_{ik} + \gamma_t + \varepsilon_{ikt} $$

where $r_{ikt}$ is the profitability in year t of corporate parent k’s business unit in industry i, $\mu$ is the overall average profitability, $\alpha_i$ and $\beta_k$ are industry and corporate parent dummy variables that are correlated with each other, $\phi_{ik}$ are business segment dummy variables (an interaction of $\alpha_i$ and $\beta_k$), $\gamma_t$ are year dummy variables and $\varepsilon_{ikt}$ is a normally distributed error term that is uncorrelated with any of the effects. The business segment dummy variables are perfectly collinear with both the industry and the corporate parent effects, which makes these data suitable for examining the benefits of using a Shapley value procedure with a correction for collinearity by design.
We evaluate the performance of the different methods by comparing the proportions of total variance that they assign to the effects with the theoretical values, which we calculate using the parameters set in our data generating process. Consider the variance decomposition equation:

$$ \sigma_r^2 = \sigma_\alpha^2 + \sigma_\beta^2 + 2\sigma_{\alpha\beta} + \sigma_\phi^2 + \sigma_\gamma^2 + \sigma_\varepsilon^2 $$

where $\sigma_{\alpha\beta}$ is the covariance between industry and corporate parent effects. If we know the variance and covariance parameters of the data generating process we can calculate the theoretical proportion of total variance attributable to each effect. For year and segment effects these are $\sigma_\gamma^2/\sigma_r^2$ and $\sigma_\phi^2/\sigma_r^2$ respectively. To calculate the theoretical proportion of total variance attributable to the industry and corporate parent effects we must divide their covariance between them. The fairest way to do so is to split the covariance term evenly between the two effects. Thus the theoretical proportions of total variance attributable to industry and corporate parent effects are $(\sigma_\alpha^2 + \sigma_{\alpha\beta})/\sigma_r^2$ and $(\sigma_\beta^2 + \sigma_{\alpha\beta})/\sigma_r^2$, respectively. Having calculated these theoretical proportions, we can compare them with the proportions of total variance attributed to each effect by the different variance decomposition methods. The method most suitable for this kind of analysis is then the one that produces estimates closest to the theoretical values.

We construct our simulation data in the following way. We want to generate effects for 500 corporate parents operating in 250 industries over 4 years, with each corporate parent operating in two industries, resulting in 1,000 business segments. First we draw 250 industry effects and 250 average corporate parent effects within an industry from a bivariate normal distribution with mean zero and standard deviation 5 for both effects, and a fixed correlation between them. To examine the performance of the methods on data with different correlation structures, we allow the correlation in the data generating process to vary from 0 to 0.9.
Next we generate 2 individual corporate parent effects for each industry by adding a normally distributed error term with mean zero and variance one to the average corporate parent effects. We now have 500 corporate parents, each operating in a primary industry. To assign a secondary industry to each corporate parent while maintaining the correlation structure between industry and corporate parent effects, we generate a hypothetical secondary industry effect for each corporate parent by adding an error term drawn randomly from a standard normal distribution to its primary industry effect. We then match this hypothetical secondary industry effect to the closest existing industry effect different from the corporate parent’s primary industry effect. The data now consist of 1,000 observations of corporate parent profitability in 250 industries, making up 1,000 business segments.

We next generate the business segment effects by drawing them from a normal distribution with mean 0 and variance 100. The business segment effects are thus drawn independently of the industry and corporate parent effects. To construct the year effects we create an additional 3 copies of the dataset and assign a year effect drawn randomly from a normal distribution with mean 0 and variance 1 to each of the 4 copies of the data. The final step is to construct the profitability measure as the sum of industry, corporate parent, business segment and year effects, plus a normally distributed error term with mean zero and variance 100. We run the data generating process 100 times for each value of the correlation between industry and corporate parent effects, ranging from 0 to 0.9 in steps of 0.1, making 1,000 runs of the simulated data generating process in total.
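The data generating process described above can be sketched in Python with NumPy as follows. This is an illustrative reading of the text rather than the authors' code: the function names, the seed handling and the array layout are assumptions, and the default corporate parent variance of 26 (25 plus the unit-variance individual error) ignores the small distortion introduced by the matching step.

```python
import numpy as np


def simulate_panel(rho, seed=0, n_ind=250, n_years=4):
    """One run of the simulated panel for a given industry/parent correlation."""
    rng = np.random.default_rng(seed)
    # Step 1: industry and average parent effects, sd 5, correlation rho.
    cov = 25.0 * np.array([[1.0, rho], [rho, 1.0]])
    draws = rng.multivariate_normal([0.0, 0.0], cov, size=n_ind)
    ind_eff, avg_parent = draws[:, 0], draws[:, 1]
    # Step 2: two corporate parents per primary industry (variance-one noise).
    primary = np.repeat(np.arange(n_ind), 2)
    n_par = primary.size
    parent_eff = avg_parent[primary] + rng.normal(0.0, 1.0, n_par)
    # Step 3: secondary industry = closest industry effect to a noisy target,
    # excluding the parent's own primary industry.
    target = ind_eff[primary] + rng.normal(0.0, 1.0, n_par)
    dist = np.abs(target[:, None] - ind_eff[None, :])
    dist[np.arange(n_par), primary] = np.inf
    secondary = dist.argmin(axis=1)
    # Step 4: business segments (parent x industry), effects with variance 100.
    parent_id = np.tile(np.arange(n_par), 2)
    industry_id = np.concatenate([primary, secondary])
    n_seg = parent_id.size
    seg_eff = rng.normal(0.0, 10.0, n_seg)
    # Step 5: year effects (variance 1) and error term (variance 100).
    year_eff = rng.normal(0.0, 1.0, n_years)
    year = np.repeat(np.arange(n_years), n_seg)
    base = ind_eff[industry_id] + parent_eff[parent_id] + seg_eff
    r = np.tile(base, n_years) + year_eff[year] + rng.normal(0.0, 10.0, n_seg * n_years)
    return {
        "parent": np.tile(parent_id, n_years),
        "industry": np.tile(industry_id, n_years),
        "year": year,
        "r": r,
    }


def theoretical_shares(rho, var_ind=25.0, var_parent=26.0, var_seg=100.0,
                       var_year=1.0, var_err=100.0):
    """Theoretical variance shares, splitting the covariance term evenly."""
    cov = 25.0 * rho  # covariance of industry and parent effects (approximate)
    total = var_ind + var_parent + 2.0 * cov + var_seg + var_year + var_err
    return {
        "industry": (var_ind + cov) / total,
        "parent": (var_parent + cov) / total,
        "segment": var_seg / total,
        "year": var_year / total,
        "error": var_err / total,
    }


data = simulate_panel(rho=0.5)
shares = theoretical_shares(rho=0.5)
```

Repeating `simulate_panel` over seeds and over correlation values from 0 to 0.9 would reproduce the structure of the 1,000 runs described in the text, under the assumptions stated above.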
Given our data generating process, we can calculate the components of the variance decomposition equation and substitute them into it. From the parameters stated above, $\sigma_\alpha^2 = 25$, $\sigma_\beta^2 \approx 26$ (the variance of the average corporate parent effect plus the variance of the error term used to generate individual corporate parent effects, plus some small distortion due to the matching process used to select the second industry), $\sigma_\phi^2 = 100$, $\sigma_\gamma^2 = 1$ and $\sigma_\varepsilon^2 = 100$.

The simulation results can be seen in Figure 1. The Shapley value estimate of the proportion of total variance attributable to each effect appears to be consistently close to the theoretical value for every effect. This is not true of the other methods.

Figure 1. Industry and Corporate Parent estimated shares of explanatory power

3. Data and sampling

The Amadeus database, compiled by Bureau Van Dijk, reports balance sheet and additional data for about 14 million firms in forty European countries.[6] The database provides information on several profitability measures but, for comparability with previous studies, we limit our assessment to return on assets (ROA). This variable is defined as operating profits, including extraordinary charges, over total assets, the latter including both tangible and intangible assets. We have restricted the initial sample to firms that provide the full information needed to compute ROA, together with industrial activity and the number of employees, for the period 2001-2006 (balanced panel).[7] Although it would be possible to include information for more recent years, this would mean losing information for some countries. This six-year period is slightly shorter than that of Roquebert et al. (1996), who used a 7-year period, but longer than that of Rumelt (1991), who used a 4-year series (1974-1977), and similar to McGahan and Porter (1997 and 2002), in which the average time series of each economic unit was 5.7 years, in an unbalanced panel covering a 14-year period.
Industries are defined at the four-digit NACE level, and the country is the one in which the firm is located and reports. We describe next the procedure followed to construct the business groups and the sampling method implemented to run the regression analysis on this huge database.

[6] The Amadeus database currently covers all European Union countries, Belarus, Bosnia-Herzegovina, Croatia, Iceland, Liechtenstein, Macedonia, Moldova, Montenegro, Norway, Russian Federation, Serbia, Switzerland and Ukraine.

[7] The information on the number of employees is required because we control for size classes in the sampling procedure, as described below.

3.1 The construction of business groups

In the previous section we mentioned some characteristics of the Amadeus database that distinguish it from the Compustat and FTC databases. As explained there, our statistical unit is the firm. We have information about firms’ subsidiaries (other firms), if they exist, and the corresponding share of ownership. Using that information, we define a business group as the set of companies connected by ownership links. A link is considered when the percentage of ownership of a subsidiary held by the main firm is greater than 50%.[8] The main firm can have one or more subsidiaries, and it can also be controlled (i.e., its shares can be owned) by another firm. In this way, the business group is defined as the network of paired firms that surpass the 50% threshold. This is similar to the definition of Korean business groups in Chang and Hong (2002), though the chaebol definition uses a 30% threshold.[9] Given that the on-line version of the Amadeus database only offers a snapshot of the last available information on ownership structure (the last wave), we have used seven waves (one a year) to measure the shape of the network more precisely.
Among the observed pairs of firms, 887,443 distinct links surpassing the 50% threshold were identified in the period 2002-2006.[10] We then add the industry code of each firm (either main or subsidiary), defined by the 4-digit NACE rev 1.1 classification. In doing so, we should keep in mind that we lose some links in which the subsidiary is not included in the Amadeus database. This is most evident in the case of non-European firms. We also drop some industries that amplify the size of the network but consist of purely instrumental companies. In particular, we drop firms in financial industries (divisions 65 to 67), two particular business-service industries linked to financial services (7415 and 7487) and non-service sectors.[11] The final number of links is then 450,782, involving a total of 628,055 firms. Of all firms, 28.7% were only main firms, 66.1% were only subsidiaries, and the remaining 5.1% were simultaneously main (they have at least one subsidiary) and subsidiary (they have one main firm).

[8] In some cases there is no information about the percentage of ownership. Those links have not been used to construct the business groups.

[9] Chang and Hong (2002) find that Korean business groups (chaebols) have a strong effect in explaining the variance of profits. The economic relevance of these conglomerates (40 percent of Korea’s total output in 1996) is peculiar. Our analysis for a wide set of European countries allows us to include a heterogeneous set of institutional settings, though the Korean case has no counterpart in any European country.

[10] This represents 42.1% of the 2,107,422 pairs initially observed. An additional condition is imposed in the process: the ownership link larger than 50% must be observed for at least two years. If that condition is not imposed, the number of links increases to 1.4 million.

[11] This is similar to the exclusion of depositary institutions in papers using the Compustat database.
Table 1. Business groups size (number of observed links)

Group size (links)    Number observed    Percentage
1                     113,626             25.21
2                      59,964             13.30
3                      37,131              8.24
4                      26,524              5.88
5                      20,275              4.50
6                      15,390              3.41
7                      11,998              2.66
8                      10,648              2.36
9                       9,153              2.03
10 or more            146,073             32.40
Total                 450,782            100.00

An algorithm is implemented to identify the business groups, defined as networks of connected links. The number of identified networks is 179,089. Table 1 shows the size distribution of business groups (i.e., networks) according to the number of links. As can be observed, the majority of business groups are very simple: they consist of only one main firm and one subsidiary. This occurs in 25.2% of the observed networks. In almost half of the observed networks (specifically, 46.75% of cases) the network consists of three or fewer links. The distribution of larger networks is fairly smooth, and the biggest network has 1,096 links.

The shape of a business group depends on the specific structure of the links in the network. Figure 2 shows the shape of two business groups. The first corresponds to a randomly chosen network with 10 links, and the second (right panel) corresponds to the biggest business group.

Figure 2. Business network shape. Left panel: network with 10 links. Right panel: network with 1,096 links (biggest network).

Additionally, the majority of links (83.5%) occur between two firms that are located in the same country. Of course, that is more likely in a very small group with only one link. In that case, 92.0% of business groups have the subsidiary in the same country as the main firm. From the perspective of the groups as a whole, 66.1% of all groups have all their firms in the same country. The number of firms in different countries is, as expected, increasing with business group size. The biggest group (the group with 1,096 links) has 712 links in which the main firm and the subsidiary are located in different countries.
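The paper does not describe the algorithm used to identify the networks. One standard way to group firms connected by ownership links is a union-find (disjoint-set) pass over the links, sketched below in Python. The function name, the tuple layout of `links` and the example data are hypothetical; the sketch applies only the 50% threshold and the exclusion of links with missing ownership shares, not the additional two-year persistence condition mentioned in the text.

```python
from collections import Counter


def business_groups(links, threshold=50.0):
    """Group firms into networks of connected ownership links.

    links: iterable of (main_firm, subsidiary, ownership_pct) tuples, where
           ownership_pct may be None when the share is not reported.
    Returns (list of firm sets, Counter of links per group root).
    """
    parent = {}

    def find(x):
        # Find the representative of x's group, compressing the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Keep only links with a reported share above the threshold.
    kept = [(a, b) for a, b, pct in links if pct is not None and pct > threshold]
    for a, b in kept:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two networks

    groups = {}
    for firm in list(parent):
        groups.setdefault(find(firm), set()).add(firm)
    links_per_group = Counter(find(a) for a, _ in kept)
    return list(groups.values()), links_per_group


# Hypothetical ownership links (firm identifiers are placeholders).
links = [
    ("A", "B", 60.0),
    ("B", "C", 100.0),
    ("D", "E", 51.0),
    ("F", "G", 40.0),   # below the 50% threshold: ignored
    ("H", "I", None),   # ownership share missing: ignored
]
groups, links_per_group = business_groups(links)
```

Here firms A, B and C form one group with two links even though A does not own C directly, which mirrors the idea that a group is a network of connected pairs rather than a single parent with its direct subsidiaries.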
Figure 3 displays the number of main and subsidiary firms in all links across countries. On average, there are 2.5 links per business group. The average size is larger (6.2) for multinational groups (i.e., those business groups whose firms are located in more than one country).

Figure 3. Country of the main and subsidiary firm (links). [Bar chart: number of links by country (GB, NL, SE, NO, IT, FI, PT, CH, IE, PL, RU, Other), with separate series for main and subsidiary firms.]

Finally, we should mention that even though business groups can be identified as networks of firms, they do not capture strategic networks. These “represent an attempt to achieve shared goals through collective efforts by multiple participants, each of which also have their own strategic interests that are not necessarily always aligned” (Wincent et al, 2010, p. 599). The analysis of strategic networks is a related literature that requires a different approach, usually focusing on a small set of networks in an industry, defined and analyzed with precision.

3.2 The sampling procedure

The huge size of the Amadeus database makes it computationally infeasible to use all available data in a regression analysis if individual firm effects are included. Therefore, a sampling procedure is implemented to extract samples of firms that can be used in the empirical analysis. There is an additional reason that supports this sampling procedure. Some recent papers have used the Amadeus database because no other database currently provides homogeneous micro-level information for different European countries.[12] However, Amadeus has not been designed to fulfill representativeness criteria across countries, industries or size classes. This raises the question of whether the results can be generalized to the whole population of firms or whether, on the contrary, they should be interpreted in the context of the specific characteristics of the firms included.[13] In particular, it is well known that the distribution of firms included in the database is biased towards larger average sizes. However, to our knowledge, none of the previous papers that have used the Amadeus database has addressed this issue.

[12] The only exception is the Community Innovation Survey. The European Statistical Office provides restricted access to this survey on its premises in Luxembourg. However, that survey is designed to measure and assess innovation activities and does not contain any information about performance variables.

To construct representative samples we have to reproduce the population of firms in each stratum, defined by the intersection of country, industry and size class. That population is defined using the information contained in the Structural Business Statistics (SBS), compiled by the Statistical Office of the European Commission (Eurostat) and freely available on its web server. The SBS provides information on the number of firms in each EU-27 country and Norway by industry (NACE Revision 1.1) and size class (<10, 10-19, 20-49, 50-249, 250 and more employees). Therefore, using the SBS implies restricting the study to European Union countries plus Norway.

The sampling procedure is implemented so as to obtain balanced samples of firms that belong to business groups and firms that do not. In the first case, we drop all business groups for which only one firm has been included in the sample, because in those cases we could not distinguish business group effects from firm effects. We should note that the construction of the business groups, as defined in the previous subsection, is done before implementing this procedure. This allows us to know that two firms belong to the same business group even when the link between them runs through a firm that is not included in the initial sample.
For example, a German firm could have a subsidiary in Russia, which in turn has a subsidiary in Sweden. Even though Russian firms are not included in the sample, the German and Swedish firms are both considered as belonging to the same business group.

13 Of course, we could run the analysis without worrying about that issue. In that case the results would be conditional on the specific sample covered by Amadeus, in the same way that papers using Compustat provide results for the specific set of firms covered by that database (i.e., publicly traded companies).

4. Results (very preliminary)

As was explained in Section 2.2, in contrast to nested ANOVA procedures, the Shapley value method allows us to obtain a more accurate estimate of each set of effects. This is done by calculating all possible increases in adjusted R² due to the inclusion of each specific effect, assigning a weight to each of these marginal increases, and adding up the weighted results to produce an overall estimated contribution of the effect to explaining the variance in firm profitability. In other words, it requires estimating all possible routes from the null model to the full model.

Additionally, as was explained in Section 2.1, any model that includes firm and year effects achieves the same explanatory power as the full model (equation 1) or as any restricted model that includes both sets of effects. This has two consequences for the method. The first is that some of the regressions are not required because the explanatory power cannot be improved, so the marginal contribution is equal to zero. For example, once firm effects are included, adding industry, business-group or country effects will not increase the adjusted R².
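The weighting scheme described above can be sketched as an exact Shapley computation over all subsets of effects. This is only a minimal illustration under stated assumptions, not the authors' code: the function `shapley_contributions` is a hypothetical helper, and the adjusted R² values in `toy_r2` are made-up numbers for just three sets of effects.

```python
from itertools import combinations
from math import factorial


def shapley_contributions(effects, fit):
    """Exact Shapley decomposition of a goodness-of-fit measure.

    `effects` lists the sets of effects (the players); `fit` maps a
    frozenset of effects to the adjusted R2 of the model containing
    exactly those effects (the empty set is the null model).
    """
    n = len(effects)
    shares = {}
    for effect in effects:
        others = [e for e in effects if e != effect]
        total = 0.0
        for size in range(n):
            # Shapley weight for a coalition of this size joined by `effect`.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                marginal = fit(coalition | {effect}) - fit(coalition)
                total += weight * marginal
        shares[effect] = total
    return shares


# Hypothetical adjusted R2 values for every combination of three effects.
toy_r2 = {
    frozenset(): 0.00,
    frozenset({"year"}): 0.01,
    frozenset({"industry"}): 0.10,
    frozenset({"group"}): 0.14,
    frozenset({"year", "industry"}): 0.11,
    frozenset({"year", "group"}): 0.15,
    frozenset({"industry", "group"}): 0.19,
    frozenset({"year", "industry", "group"}): 0.20,
}

toy_shares = shapley_contributions(
    ["year", "industry", "group"], toy_r2.__getitem__
)
```

By construction the shares add up to the fit of the full model minus that of the null model, so no explanatory power is left unassigned or counted twice, whatever the correlation between the sets of effects.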
The second consequence is that, because business-group, industry and country effects are by design linearly dependent on firm effects, it is not easy to infer the relative contribution of each one: the Shapley Value estimate of an effect is downward biased. For this reason we implement two versions of the Shapley Value method. The first is a direct application of the procedure explained in Section 2.2, which we call the uncorrected SV. The second is a modified version, which we call the corrected SV. In this case a two-step procedure is applied whenever firm effects enter the game. For example, assume that we want to know the increase in adjusted R² of (firm and country) effects with respect to (country) effects. We first run a regression with only industry and group as explanatory variables. We denote the adjusted R² of this first regression by $R^2_{1F}$, so that $(1 - R^2_{1F})$ is the unexplained part. We then use the residuals of that first regression as the dependent variable in a second regression that includes firm and country effects as explanatory variables (actually, in this case only firm effects are required). The adjusted R² of this second step is $R^2_{2F}$. Then $R^2_F = (1 - R^2_{1F}) \times R^2_{2F}$. The intuition is that the firm effect explains a portion (identified by $R^2_{2F}$) of what is not explained by the other variables with which firm effects are collinear (identified by $1 - R^2_{1F}$). The marginal effect of firm in this case would be $R^2_F - R^2_C$.

Both approaches are initially implemented using 100 samples of 5000 firms, obtained according to the sampling procedure explained in Section 3.2. We should mention that perfect multicollinearity between firm effects and each of the other types of effects (except year effects) would not exist if firms could change country, industry or business group during the observed period. All three possibilities are excluded in the analysis, though for different reasons.
Firstly, a firm cannot change country because in that case it would be, de facto, a different firm. Secondly, even though a firm can have activities in more than one industry, only its main activity throughout the whole observed period has been taken into account. Hence no transient industry effects (as in Rumelt, 1991) have been considered. Finally, business group effects have been calculated using the maximum network of ownership links throughout the whole period (see Section 3.2). Thus, either a firm belongs to the same network in all years considered or it is not part of any network. These two possibilities are similar to diversified/non-diversified corporations in the context of the McGahan and Porter (1997, 2002) papers.

Table 2 shows the results obtained using three levels of aggregation for industry effects: two, three and four digits of the NACE Rev. 1.1 classification. This gives an initial view of the sensitivity of the results to the level of industrial aggregation.

Table 2. Contribution to explanatory power by type of effects (simple average of Shapley Values for 100 samples)

            2-digit industries         3-digit industries         4-digit industries
            Uncorrected  Corrected     Uncorrected  Corrected     Uncorrected  Corrected
Country     0.00590      0.01206       0.00585      0.01197       0.00576      0.01179
Industry    0.01204      0.04433       0.01872      0.05707       0.02842      0.07581
Group       0.06013      0.13192       0.05994      0.13159       0.06007      0.13191
Year        0.00070      0.00070       0.00070      0.00071       0.00070      0.00071
Firm        0.40175      0.31129       0.39531      0.29675       0.38557      0.27476
Total       0.48052      0.50031       0.48052      0.49809       0.48052      0.49498

Most of the empirical literature using micro-data analyzes the manufacturing sector, usually because of greater data availability. That is not the case in this study: the results shown above combine all sectors, with the exception of agricultural activities.
However, McGahan and Porter (1997) pointed to possible differences between manufacturing and other sectors. Their claim is based on a (visual) comparison of the COV estimates, and the COV approach does not allow for any formal test. McGahan and Porter (2002) run the procedure again for manufacturers (with the three alternative treatments of serial correlation). The observed differences are partially due to the larger serial correlation observed for manufacturers, which suggests a higher rate of persistence.

Table 3 shows the results obtained for manufacturing and services industries. To have samples of similar size to those used in Table 2, two new sets of 100 samples of 5000 firms (again, with half of the firms belonging to business groups) are used, one for manufacturing industries and the other for services.

Table 3. Contribution to explanatory power by type of effects: manufacturing vs services (simple average of Shapley Values for 100 samples)

            Manufacturing              Services
            Uncorrected  Corrected     Uncorrected  Corrected
Country     0.0054       0.0111        0.0068       0.0138
Industry    0.0217       0.0644        0.0249       0.0676
Group       0.0529       0.1170        0.0662       0.1435
Year        0.0010       0.0010        0.0016       0.0016
Firm        0.4276       0.3332        0.3997       0.2887
Total       0.5086       0.5267        0.4991       0.5153

Finally, Roquebert et al (1996) and McGahan and Porter (2002) find that the effect of diversified corporations decreases with the degree of diversification. “Diversified” means, in the context of the Compustat database, corporations with more business units. This is analogous to the number of firms in each group as defined here (see Table 4).
Table 4. Contribution to explanatory power by type of effects: size of business groups (simple average of Shapley Values for 100 samples)

            All sizes                  Business groups with at least two links
            Uncorrected  Corrected     Uncorrected  Corrected
Country     0.00576      0.01179       0.0054       0.0111
Industry    0.02842      0.07581       0.0311       0.0799
Group       0.06007      0.13191       0.0590       0.1291
Year        0.00070      0.00071       0.0017       0.0017
Firm        0.38557      0.27476       0.3844       0.2729
Total       0.48052      0.49498       0.4816       0.4947

5. Conclusions (very preliminary)

This paper analyzes the relative contribution of several sets of characteristics in explaining the heterogeneity in firm-specific accounting profits. It uses a huge database of European firms in all non-agricultural industries (Amadeus) and analyzes the period 2002-2006.

The paper has two main contributions. On the one hand, it proposes a different approach, the Shapley Value method. This method deals with multicollinearity between explanatory variables, which implies that the precise sequence in which each one enters a more complete model can result in a different assessment of their relative importance. For example, McGahan and Porter (2002) point out that the relative importance of corporate effects is about 9%, because that is the increase in the adjusted R² of a model with Year & Industry & Corporate effects over a model that only includes Year & Industry effects. However, industry effects already capture part of the corporate effects that are added later. If, for example, the order changes and Corporate effects are included just after Year effects, comparing a model that includes both effects with a model that includes only Year effects would suggest that the increase in explanatory power due to Corporate effects is about 14%. What is the correct answer? Obviously, insofar as the explanatory variables are correlated with each other, neither of them is the correct value.
The Shapley Value deals with that issue by calculating the contribution of each variable across all possible models, that is, across all possible combinations of predictors.

The (very) preliminary results suggest that business group effects are relevant. These effects seem to decrease with the size of the firm network. They are bigger than industry effects, even when industry is defined at four digits. However, differences across countries have little relevance in explaining the observed heterogeneity in profitability. Year effects are negligible in all cases while, as expected, the relative importance of firm effects is smaller when the corrected version of the Shapley Value method is implemented. Finally, some differences between manufacturing and services seem to emerge. In particular, idiosyncratic firm effects are smaller in services, while the relative contribution of the other effects, in particular business group effects, is larger.

References:

Bowman EH, Helfat CE. 2001. Does corporate strategy matter? Strategic Management Journal 22(1): 1–23.
Chang SJ, Singh H. 2000. Corporate and industry effects on business unit competitive position. Strategic Management Journal 21(7): 739–752.
Lipovetsky S, Conklin M. 2001. Analysis of regression in game theory approach. Applied Stochastic Models in Business and Industry 17: 319–330.
McGahan AM, Porter ME. 1997. How much does industry matter, really? Strategic Management Journal, Summer Special Issue 18: 15–30.
McGahan AM, Porter ME. 2002. What do we know about variance in accounting profitability? Management Science 48: 834–851.
Misangyi VF, Elms H, Greckhamer T, Lepine JA. 2006. A new perspective on a fundamental debate: a multilevel approach to industry, corporate, and business unit effects. Strategic Management Journal 27(6): 571–590.
Roquebert JA, Phillips RL, Westfall PA. 1996. Markets vs. management: what ‘drives’ profitability? Strategic Management Journal 17(8): 653–664.
Rumelt RP. 1991. How much does industry matter? Strategic Management Journal 12(3): 167–185.
Schmalensee R. 1985. Do markets differ much? American Economic Review 75: 341–351.
Wincent J, Anokhin S, Örtqvist D, Autio E. 2010. Quality Meets Structure: Generalized Reciprocity and Firm-Level Advantage in Strategic Networks. Journal of Management Studies 47(4): 597–624.

Appendix:

Table A1. Descriptive statistics (average values across 100 samples*)

                              Average   Minimum   Maximum
Countries                     25        25        25
Industries
  2 digits                    44        44        44
  3 digits                    192       183       200
  4 digits                    415       400       435
Business groups (networks)    1072      1052      1089
Firms in networks             2504      2495      2505
Firms per network             2.11      2         14
Industries per network
  2 digits                    1.59      1         6
  3 digits                    1.73      1         8
  4 digits                    1.80      1         9
Countries per network         1.20      1         7

*Samples used in Table 2. Average values of samples used in Tables 3 and 4 are very similar.

Figure A1. Distribution of ROA (full sample). [Density plot; horizontal axis: roa; vertical axis: Density.]

Figure A2. Distribution of relative contribution to Shapley Value (average contribution across 100 firms, 2 digits, corrected SV). [Density plots of corrected Shapley Values for each set of effects; panels: Country, Industry, Group, Year, Firm.]

           Country      Industry     Group        Year         Firm
Std.dev    0.0029127    0.002143     0.003249     0.000299     0.0099978
Min        0.0171007    0.0409424    0.0064913    0.0000813    0.3674617
Max        0.0305791    0.0495361    0.0221965    0.0015417    0.4151278