Revisiting Profitability: Firm, Industry, Country and Network Effects

Paul Kattuman*, Diego Rodriguez**, Dmitry Sharapov* and F. Javier Velázquez**
* Cambridge Judge Business School, University of Cambridge
** Universidad Complutense de Madrid and GRIPICO
(draft version; November 2010)
(Preliminary and incomplete)
Abstract: This paper assesses the determinants of heterogeneity in firm profitability.
The paper makes two main contributions. First, in contrast to the previous literature,
which has focused on United States corporations, we analyze differences across
firms in a wide range of European countries (twenty-five) and include effects
related to networks of firms (business groups). These effects are included jointly with
year and industry effects. Additionally, the database (Amadeus) includes all types of
firms (not only publicly listed ones) in all industries. The second contribution is
methodological. We propose a new approach to variance decomposition, the Shapley
Value method, which avoids some of the problems of the standard ANOVA
procedure. We use this method to deal with the perfect multicollinearity that emerges with
the inclusion of firm-specific effects, after first evaluating it against alternatives
by means of simulation. The preliminary results, based on repeated sampling
that takes into account representativeness across industries, size classes and countries,
suggest that business group effects are relevant, although their relevance seems to decrease with
network size. Additionally, firm effects dominate and industry effects appear to be
much smaller than estimated in previous work. The differences among European
countries, once the remaining effects are controlled for, seem moderate.
Keywords: firm profitability, Shapley value, ANOVA, country effects, business-group
effects.
1. Introduction
Understanding the determinants of heterogeneity in firm profitability is likely
one of the most recurrent topics of analysis for both economists and strategic
management researchers. The literature has tried to disentangle such heterogeneity by
measuring the relative contribution of different sets of observable characteristics. In
particular, three features have been the most frequently used: year, industry and a supra-firm effect captured by corporate-parent effects. Year effects capture profitability
changes throughout the economic cycle that are common to all firms. Increasing market
openness as a consequence of economic integration, exchange rate fluctuations or
aggregate demand shocks are examples in that context. Industry effects capture
idiosyncratic and persistent industry characteristics that lead to different average values
of profits across sectors. For example, differences in entry barriers among sectors can
generate persistent differences in average profitability between the tobacco and the meat
food processing industries. Finally, corporate-parent effects capture the influence of
firm membership of a business group with heterogeneous structural or managerial
characteristics controlled by the corporate parent on firm-level profitability outcomes.
The extent of business group diversification and the degree of management structure
centralization are examples of such corporate-parent characteristics.
The key discussion in the debate to which this study seeks to contribute has been about
the relative importance of industry and ‘firm’ effects in explaining the differences in
performance between different firms.1 The definition of ‘firm’ effects has varied with
the characteristics of the database used in the empirical analysis. Both Schmalensee
(1985) (widely considered the seminal paper in this area) and Rumelt (1991) used the
Federal Trade Commission (FTC) Line-of-Business Survey. The main conclusion of
Rumelt (1991) was that firm effects (which he called business-unit effects) were very relevant and that,
in contrast to Schmalensee’s findings, industry effects were relatively small.
Additionally, unlike Schmalensee, he found significant (though small) corporate-parent
effects.
1
To see this one need only look at the titles of some of the papers in this debate: ‘Do Markets Differ
Much?’ (Schmalensee, 1985), ‘How Much Does Industry Matter?’ (Rumelt, 1991), ‘How Much Does
Industry Matter, Really?’ (McGahan and Porter, 1997).
The second database used in the literature is the Compustat Business Segments Reports.
Its main difference from the FTC database is that Compustat provides information
on corporate profitability by specific industry (4-digit SIC codes). For this reason,
McGahan and Porter (1997) point out that the observed statistical unit, which they call the
business segment, may be an aggregation of several firms owned by the same corporate
parent and operating in the same industry. As we will explain in more detail later, the
database used in this study is closer in its characteristics to the FTC Line-of-Business Survey than to the Compustat Reports.
Two main empirical methods have been used in the literature to decompose the
observed variability in profitability: Components of Variance (COV) and ANOVA
estimations. The COV method does not provide coefficient estimates but uses summary
statistics to judge effect importance, in terms of its contribution to overall variance in
profitability. By contrast, ANOVA is a fixed effects model that uses exclusion F-tests
going from the null to the full model to test effect significance. Rumelt (1991) and
McGahan and Porter (1997) use nested ANOVA, while simultaneous ANOVA is used
in McGahan and Porter (2002).
McGahan and Porter (2002) offer an excellent review of the literature on variation in
profitability, reconciling the results of the main previous studies and providing new
evidence. In doing so they claim that “the most direct opportunities for further research
reside in exploring new data. Reliable and comparable data on the accounting profits of
firms in other parts of the world would yield insight on questions about the relationships
between the national economic environment and industrial performance. Data on the
profitability of privately held firms would provide results more representative of the
entire economy. Opportunities lie in exploring additional measures of firm performance,
including stock-market return and market share” (p. 849). This paper takes on that task
and contributes to the literature in two specific ways.
First, in contrast to the papers mentioned above, this paper assesses profitability
using firms located in a wide range of countries. Specifically, it uses the Amadeus
database of publicly listed as well as privately held firms operating in all industrial
and service sectors in 25 European countries. This allows for the estimation of the
contribution of country effects to profitability variance. In this way, the analysis is more general
than in the previous literature, which mostly uses data on publicly traded corporations
in the manufacturing sector. Due to the huge size of the database, the empirical analysis
uses random sampling and recursive estimations to assess the robustness of results. We
use an external source, the Structural Business Statistics, to develop a stratified
sampling procedure that avoids the size, industry and country-related biases of the
original database. In this way, the results can be considered representative of cross-country industrial populations.
Second, the paper contributes to the literature by complementing previous approaches
(COV and ANOVA) with a Shapley Value method. This method allows us to calculate
the relative contribution of each effect as the weighted average of the increases in R² due
to the inclusion of that effect along all possible paths from the null to the full model. This approach is
computationally intensive in the context of the sampling procedure mentioned
above. However, it provides a more robust way of assessing the relative
importance of the different sets of effects than considering a single path from the
null to the full model, or presenting the results of several paths (as done in previous
papers) without aggregating them in a consistent manner. Additionally, to assess the
effectiveness of the Shapley value and other approaches in a data context affected by
covariance between effects, we first perform a simulation analysis in a Monte
Carlo framework.
The structure of the paper is as follows. Section 2 discusses definition and measurement
issues that are key to understanding the results of previous studies, and the
expected results of this paper in particular. It also introduces the Shapley value method,
which is then compared with alternative approaches using a simulation framework.
Section 3 describes the sampling procedure and provides an exploratory analysis of
profitability across European countries. The main results and their discussion are presented in
Section 4. Finally, Section 5 concludes by pointing out the main implications of the results.
2. The decomposition of accounting profitability
2.1. The empirical model
It is useful to discuss the main issues in the literature starting from the
specification of a full model in which the profitability of a statistical unit in a given period is
explained by a set of characteristics. In this study the statistical unit is a firm f
whose main activity is in industry i. The firm is located in a country c and belongs to a
business group g. For that firm, we observe accounting profits in year y. The
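specification of a full model in which all these characteristics enter additively is then:
$$ r_{yicgf} = \mu + \sum_{y} \gamma_{y} Y_{y} + \sum_{i} \alpha_{i} I_{i} + \sum_{c} \delta_{c} C_{c} + \sum_{g} \beta_{g} G_{g} + \sum_{f} \phi_{f} F_{f} + \varepsilon_{yicgf} \tag{1} $$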
To interpret the results, it is important to understand the differences between the statistical unit used in this
paper, the firm, and that used in previous papers based on the US Federal Trade Commission Line of
Business data and the Compustat Reports. In this study, as is usual in
statistical practice, each firm is “assigned” to an industry according to its main
activity. Although we know whether the firm is product diversified, the information on profits
refers to the firm as a whole, not to a specific product or activity. A similar procedure is
followed with the FTC database, in which each business unit is assigned the industry
classification corresponding to its main industry. However, with Compustat each
statistical unit is a business segment (i.e., a specific industry) in which a corporation
operates.
A main difference of this study is that the business group is constructed bottom-up using
information on share ownership links among firms. For that reason the business group is
closer to the concept of a firm network than to that of a corporation. It is then
possible, and even likely, that two or more firms in the same business group are located
in the same industry. That is not possible with Compustat, where each business segment
of a corporate parent is placed in a different industry. This multidivisional characteristic
allows industry and corporate-parent effects to be identified in the Compustat data
framework. This implies that, though similar in general terms, corporate-parent effects and
business-group effects are not strictly comparable. In both cases an important contribution to
explaining profitability variance is expected. However, it is likely that the links between
business segments in a corporation are stronger than those between firms that share
ownership links but are not necessarily under the common umbrella of a
corporation. If that is true, business-group effects would be expected to be smaller
than corporate-parent effects.
The different statistical units could also have consequences for the expected relevance of
industry effects. In particular, in the Compustat data framework the expected industry
effect may be lower insofar as a business segment comprises different business units.2
The opposite occurs with the Amadeus database. In this case, diversified firms are
assigned to their “main” activity. There are no data on sales across activities for
diversified firms, though we can check whether industry effects are lower for them than for
non-diversified firms.
The last set of effects included refers to firm effects.3 Their inclusion raises a statistical
problem: any model that includes firm and year effects will have the same explanatory
power as a fully specified model with the five sets of effects (year, country, industry,
business group and firm). In other words, after including business-specific (here, firm-specific) effects and
year effects, it is not possible to increase the explanatory power of the model. Accordingly, F-tests do not reject the restriction that the additional effects (business group, industry or
country) are equal to zero. This is because industry, business-group and country effects
are collinear by design with firm effects. The same issue emerges in the Compustat
framework with the business-specific effects, which are collinear by design with industry
and corporate-parent effects, as McGahan and Porter (2002) point out.
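The collinearity-by-design problem can be illustrated numerically. The following is a minimal sketch in Python with NumPy on made-up data (12 firms, 3 industries, 4 years; all names and values are illustrative assumptions, not taken from Amadeus). Because each firm belongs to exactly one industry, the industry dummies are linear combinations of the firm dummies, so adding them changes neither the rank of the design matrix nor the R2.

```python
import numpy as np

def r_squared(X, y):
    """R-squared of a least-squares fit of y on X (X includes a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # handles rank-deficient X
    resid = y - X @ beta
    tss = np.sum((y - y.mean()) ** 2)
    return 1.0 - (resid @ resid) / tss

def dummies(codes):
    """Indicator (one-hot) matrix for an integer-coded categorical variable."""
    codes = np.asarray(codes)
    levels = np.unique(codes)
    return (codes[:, None] == levels[None, :]).astype(float)

# Hypothetical balanced panel: every firm is observed in every year.
rng = np.random.default_rng(0)
n_firms, n_years = 12, 4
firm = np.repeat(np.arange(n_firms), n_years)
year = np.tile(np.arange(n_years), n_firms)
industry_of_firm = np.arange(n_firms) % 3      # one industry per firm
industry = industry_of_firm[firm]
y = rng.normal(size=firm.size) + 0.5 * industry + 0.3 * firm

const = np.ones((firm.size, 1))
X_firm_year = np.hstack([const, dummies(firm), dummies(year)])
X_full = np.hstack([X_firm_year, dummies(industry)])

r2_firm_year = r_squared(X_firm_year, y)
r2_full = r_squared(X_full, y)
same_rank = np.linalg.matrix_rank(X_firm_year) == np.linalg.matrix_rank(X_full)
print(r2_firm_year, r2_full, same_rank)
```

In this toy panel the two R2 values coincide up to numerical precision, which is why exclusion tests on the added industry dummies carry no information once firm effects are in the model.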
2
This was pointed out by M&P (1997) as a difference between their study and those using the
FTC Line of Business data.
3
The seminal paper by Schmalensee (1985) used a single year of data. Accordingly, firm effects were
not included and were only partially captured by introducing business-unit market share as an explanatory
variable. Some caution is needed because corporate-parent effects were labeled ‘firm’ effects by the
author.
2.2. The Shapley Value approach
The literature has taken two main approaches to the decomposition of variance
in business-specific (firm-specific, in our case) accounting profits. Several papers, including the
seminal paper by Schmalensee (1985), have used the Components of Variance
(COV) approach. This method assumes that the effects are randomly drawn before the sample is selected
and are constant thereafter. It also allows for covariance between effects, but needs to assume that
these covariances are also random. This procedure does not provide significance tests for the effects, but
it does provide the contribution of each effect to total variance.
The second approach is an ANOVA procedure based on regression analysis.
The strategy consists of analyzing the increase in explanatory power that emerges
when different sets of effects are introduced into the model. A main problem with the
ANOVA approach is that it would require considering all possible paths from the null to the
full model. If there is covariance between effects, as has been discussed in the literature
(Schmalensee, 1985: p. 344), the estimates of the significance and importance of effects
will differ depending on the order in which the effects are introduced in building up
the econometric model from the null specification to the full one. Being aware of this
issue and its implications, the authors of previous papers have presented estimation
results from a number of paths from the null to the full model. For example, McGahan and
Porter (1997) show two specific paths in Table 5.4 In the first path (A), industry effects
are calculated using the residuals of a previous regression in which only year dummies
have been introduced. In path (B) the ordering is slightly different and industry effects
are included in the third step, after consecutively considering year and corporate-parent
effects. Since Schmalensee (1985), no paper has considered all possible paths. Previous
work also makes no attempt to aggregate the results of the different paths
followed. As a result, the identification of the contributions of
different effects may be confounded by an inequitable attribution of the covariance
between the effects to one or more of the effects, at the expense of the others.
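The order dependence described above can be seen in a small numerical example. The sketch below, in Python with NumPy, uses two correlated regressors as stand-ins for two correlated sets of effects (the data, coefficients and variable names are illustrative assumptions). The increase in R2 credited to the first regressor is large when it enters first and small when it enters last, because whichever regressor enters first claims the shared covariance.

```python
import numpy as np

def r_squared(X, y):
    """R-squared of a least-squares fit of y on X (X includes a constant)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(1)
n = 5000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)   # correlation with x1 of about 0.8
y = x1 + x2 + rng.normal(size=n)

const = np.ones(n)
r2_1 = r_squared(np.column_stack([const, x1]), y)
r2_2 = r_squared(np.column_stack([const, x2]), y)
r2_12 = r_squared(np.column_stack([const, x1, x2]), y)

# Path A: x1 enters first. Path B: x1 enters after x2.
inc_x1_first = r2_1
inc_x1_last = r2_12 - r2_2
inc_x2_first = r2_2
inc_x2_last = r2_12 - r2_1

# Averaging each regressor's increment over both orderings yields
# contributions that add up exactly to the R-squared of the full model.
avg_x1 = 0.5 * (inc_x1_first + inc_x1_last)
avg_x2 = 0.5 * (inc_x2_first + inc_x2_last)
print(inc_x1_first, inc_x1_last, avg_x1 + avg_x2, r2_12)
```

With only two sets of effects, averaging over the two orderings is the simplest case of aggregating results across paths in a consistent manner.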
4
M&P (1997) assume serial correlation in the errors of eq. 1. They therefore first estimate the full
model to obtain an estimate of the intertemporal persistence of all effects (regardless of source). This is
then used to obtain an estimate of the null model controlling for persistent shocks.
In this paper we propose that, in order to get more accurate results, regressions
corresponding to all possible routes from the null to the full model should be run, and the
results of these regressions, specifically the marginal increase in adjusted R² due to the
inclusion of an effect, should be aggregated into a single measure of the importance of that effect in
accounting for firm performance. To achieve this we use a concept from co-operative
game theory, the Shapley value, to interpret the results of our regressions.
Consider year, country, group, industry and firm effects to be players in a co-operative
game, with the outcome of the game being the proportion of variance in firm
profitability, the adjusted R², accounted for by any given coalition of these effects. The
Shapley value of any given effect is then defined as the weighted average of its
contribution to all possible coalitions. More formally,
$$ Sh_j = \sum_{M \subseteq N \setminus \{j\}} \frac{m!\,(n-m-1)!}{n!} \left[ \bar{R}^2(M \cup \{j\}) - \bar{R}^2(M) \right] $$
Here, N is the set of all effects and the Shapley value of effect j is the weighted sum of the differences in adjusted R²
between a coalition of effects including j, $M \cup \{j\}$, and a coalition M which includes the
other effects in $M \cup \{j\}$ but not j. The weight assigned to the increase in adjusted R² due
to the introduction of effect j into the coalition is
$$ \frac{m!\,(n-m-1)!}{n!} $$
where n is the total number of effects being considered in the study and m is the number
of effects present in the Mth coalition, excluding j.
The Shapley value approach then consists of calculating all possible increases in
adjusted R² due to the inclusion of an effect,5 assigning a weight to each of these
marginal increases, and adding up these weighted results to produce an overall
estimated contribution of the effect to explaining the variance in firm profitability.
Doing this for all effects allows each effect an equal chance to contribute to R² in all
possible paths from the null to the full model, thus giving each effect an equal chance to claim
any covariance between effects.
5
To apply this methodology to our context we should keep in mind that, in contrast to the initial proposal
by Lipovetsky & Conklin (2001), the variables are not included one by one but in complete sets
reflecting each type of effect.
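As an illustration of the weighting scheme, the following Python sketch computes standard Shapley values for sets of effects from a table of adjusted R2 values, one per coalition. The function name and the numbers in the table are hypothetical and chosen only so that the arithmetic is easy to follow; they are not results from this paper, and the sketch does not implement the correction for collinearity by design.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Standard Shapley value of each player for a coalition value function."""
    n = len(players)
    result = {}
    for j in players:
        others = [p for p in players if p != j]
        total = 0.0
        for m in range(n):                      # m = coalition size, excluding j
            weight = factorial(m) * factorial(n - m - 1) / factorial(n)
            for coalition in combinations(others, m):
                without_j = frozenset(coalition)
                with_j = without_j | {j}
                total += weight * (value(with_j) - value(without_j))
        result[j] = total
    return result

# Made-up adjusted R-squared for every coalition of three sets of effects.
# Firm effects absorb industry effects, mimicking collinearity by design.
adj_r2 = {
    frozenset(): 0.00,
    frozenset({"year"}): 0.02,
    frozenset({"industry"}): 0.10,
    frozenset({"firm"}): 0.40,
    frozenset({"year", "industry"}): 0.12,
    frozenset({"year", "firm"}): 0.42,
    frozenset({"industry", "firm"}): 0.40,
    frozenset({"year", "industry", "firm"}): 0.42,
}

sv = shapley_values(["year", "industry", "firm"], adj_r2.__getitem__)
print(sv)
```

In this toy table the three values are 0.02 (year), 0.05 (industry) and 0.35 (firm), which add up to the adjusted R2 of the full model, 0.42. The industry effect is credited with half of the explanatory power it shares with the firm effect, even though it adds nothing once firm effects are included.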
2.3. Simulation
To test the performance of the standard Shapley value method and of its version corrected
for collinearity by design in comparison to the ANOVA and COV methods, we use a Monte
Carlo simulation approach. We generate a dataset of industry, corporate parent, business
segment and year effects with a defined variance and covariance structure and apply the
different methods to it. Our full model is thus:
$$ r_{ikt} = \mu + \alpha_i + \beta_k + \phi_{ik} + \gamma_t + \varepsilon_{ikt} $$
where $r_{ikt}$ is the profitability of corporate parent k’s business unit in industry i in year t, $\mu$ is the
overall average profitability, $\alpha_i$ and $\beta_k$ are industry and corporate parent dummy
variables that are correlated with each other, $\phi_{ik}$ are business segment dummy variables
(an interaction of $\alpha_i$ and $\beta_k$), $\gamma_t$ are year dummy variables and $\varepsilon_{ikt}$ is a normally
distributed error term which is uncorrelated with any of the effects. The business
segment dummy variables are perfectly collinear with both the industry and corporate
parent effects, making this data suitable for examining the benefits of using a Shapley
value procedure with correction for collinearity by design. We evaluate the performance
of the different methods by comparing the proportions of total variance assigned to the
effects by the methods with their theoretical values, which we calculate using the
parameters set in our data generating process.
Consider the variance decomposition equation:
$$ \sigma^2_r = \sigma^2_\alpha + \sigma^2_\beta + \sigma^2_\phi + \sigma^2_\gamma + \sigma^2_\varepsilon + 2 C_{\alpha\beta} $$
where $C_{\alpha\beta}$ denotes the covariance between industry and corporate parent effects.
If we know the variance and covariance parameters of the data generating process we
can calculate the theoretical proportion of total variance attributable to each effect. For
year and segment effects these are $\sigma^2_\gamma / \sigma^2_r$ and $\sigma^2_\phi / \sigma^2_r$, respectively. To calculate the
theoretical proportion of total variance attributable to the industry and corporate parent
effects we must divide the covariance between them. The fairest way to do so is to
split the covariance term evenly between the two effects. Thus the theoretical
proportions of total variance attributable to industry and corporate parent effects are
$(\sigma^2_\alpha + C_{\alpha\beta}) / \sigma^2_r$ and $(\sigma^2_\beta + C_{\alpha\beta}) / \sigma^2_r$, respectively. Having calculated these theoretical
proportions, we can compare them to the proportions of total variance attributable to
each effect as estimated by the different variance decomposition methods. The method
most suitable for this kind of analysis will then be the one that produces estimates
closest to the theoretical values.
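The even split of the covariance term can be written as a short calculation. The Python sketch below is illustrative only: the function name and the parameter values are assumptions made for the example (in particular, the covariance of 12.5 simply corresponds to a correlation of 0.5 between two effects with standard deviation 5, and is not a value reported in the paper).

```python
def theoretical_shares(variances, cov_industry_parent):
    """Share of total variance per effect, splitting the industry/parent
    covariance term (which enters total variance twice) evenly between them."""
    total = sum(variances.values()) + 2.0 * cov_industry_parent
    shares = {name: v / total for name, v in variances.items()}
    shares["industry"] = (variances["industry"] + cov_industry_parent) / total
    shares["parent"] = (variances["parent"] + cov_industry_parent) / total
    return shares

# Hypothetical parameter values, for illustration only.
variances = {
    "industry": 25.0,
    "parent": 26.0,
    "segment": 100.0,
    "year": 1.0,
    "error": 100.0,
}
shares = theoretical_shares(variances, cov_industry_parent=12.5)
print(shares)
```

Because each of the two correlated effects receives one of the two covariance terms, the shares (including the error share) add up to one.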
We construct our simulation data in the following way. We want to generate effects for
500 corporate parents operating in 250 industries over 4 years, with each corporate
parent operating in two industries, resulting in 1,000 business segments. First we draw
250 industry effects and 250 average corporate parent effects within an industry from a
bivariate normal distribution with mean zero and standard deviation 5 for both effects,
and a fixed correlation between them. To examine the performance of the methods
when used on data with different correlation structures we allow the correlation in the
data generating process to vary from 0 to 0.9.
Next we generate 2 individual corporate parent effects for each industry by adding a
normally distributed error term with mean zero and variance one to the average
corporate parent effects. We now have 500 corporate parents, each operating in a
primary industry. To assign a secondary industry to each corporate parent while
maintaining the correlation structure between industry and corporate parent effects, we
generate a hypothetical secondary industry effect for each corporate parent by adding an
error term drawn randomly from a standard normal distribution to their primary industry
effect. We then match this hypothetical secondary industry effect to the closest existing
industry effect different from the corporate parent’s primary industry effect. The data
now consists of 1,000 observations of corporate parent profitability in 250 industries,
making up 1,000 business segments.
We next generate the business segment effects by drawing these from a normal
distribution with mean 0 and variance 100. The business segment effects are thus drawn
independently of the industry and corporate parent effects. To construct the year
effects we create an additional 3 copies of the dataset and assign a year effect drawn
randomly from a normal distribution with mean 0 and variance 1 to each of the 4 copies
of the data. The final step is to construct the profitability measure as the sum of
industry, corporate parent, business segment and year effects, plus a normally
distributed error term with mean zero and variance 100. We run the data generating
process 100 times for each value of the correlation between industry and corporate
parent effects ranging from 0 to 0.9 in steps of 0.1, making 1,000 runs of the simulated
data generating process in total.
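The data generating steps described above can be sketched in Python with NumPy as follows. This is one reading of the description, not the code used for the paper: the function name and the returned dictionary are assumptions, the overall mean is set to zero, and the nearest-effect matching is implemented with a simple distance matrix.

```python
import numpy as np

def simulate_panel(rho, rng, n_industries=250, parents_per_industry=2, n_years=4):
    """One simulated panel of business-segment profitability (illustrative)."""
    sd = 5.0
    cov = [[sd**2, rho * sd**2], [rho * sd**2, sd**2]]
    draws = rng.multivariate_normal([0.0, 0.0], cov, size=n_industries)
    industry_eff, avg_parent_eff = draws[:, 0], draws[:, 1]

    # Individual corporate parents: average effect plus N(0, 1) noise.
    primary = np.repeat(np.arange(n_industries), parents_per_industry)
    n_parents = primary.size
    parent_eff = avg_parent_eff[primary] + rng.normal(0.0, 1.0, n_parents)

    # Secondary industry: closest industry effect to a noisy copy of the
    # primary industry effect, excluding the primary industry itself.
    target = industry_eff[primary] + rng.normal(0.0, 1.0, n_parents)
    dist = np.abs(target[:, None] - industry_eff[None, :])
    dist[np.arange(n_parents), primary] = np.inf
    secondary = dist.argmin(axis=1)

    # Two business segments per parent, with independent segment effects.
    seg_parent = np.tile(np.arange(n_parents), 2)
    seg_industry = np.concatenate([primary, secondary])
    n_segments = seg_parent.size
    seg_eff = rng.normal(0.0, 10.0, n_segments)     # variance 100

    # One copy of the data per year, each with its own year effect.
    year_eff = rng.normal(0.0, 1.0, n_years)
    year = np.repeat(np.arange(n_years), n_segments)
    segment = np.tile(np.arange(n_segments), n_years)
    error = rng.normal(0.0, 10.0, year.size)        # variance 100

    r = (industry_eff[seg_industry[segment]] + parent_eff[seg_parent[segment]]
         + seg_eff[segment] + year_eff[year] + error)
    return {"r": r, "year": year, "segment": segment,
            "parent": seg_parent[segment], "industry": seg_industry[segment]}

data = simulate_panel(rho=0.5, rng=np.random.default_rng(0))
print(data["r"].shape)
```

Running the function over correlations from 0 to 0.9 in steps of 0.1, with 100 replications each, would reproduce the layout of 1,000 runs described in the text.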
Figure 1
Industry and Corporate Parent estimated shares of explanatory power
Given our data generating process, we can calculate the components of the variance
decomposition equation and substitute them into it: $\sigma^2_\alpha = 25$, $\sigma^2_\beta \approx 26$
(variance of the average corporate parent effect plus variance of the error term used to
generate individual corporate parent effects, plus some small distortion due to the matching
process used to select the second industry), $\sigma^2_\phi = 100$, $\sigma^2_\gamma = 1$ and $\sigma^2_\varepsilon = 100$.
The simulation results can be seen in Figure 1. The Shapley value approach estimate of
proportion of total variance attributable to each effect appears to be consistently close to
the theoretical value for every effect. This is not true for the other methods.
3. Data and sampling
The Amadeus database, compiled by Bureau van Dijk, reports balance sheet
and additional data for about 14 million firms in forty European countries.6 This
database provides information on several profitability measures, though for
comparability with previous studies we limit our assessment to return on assets
(ROA). This variable is defined as operating profits, including extraordinary charges,
over total assets, the latter including both tangible and intangible assets. We have
restricted the initial sample to those firms that provide the full information needed to compute
ROA, industrial activity and the number of employees for the period 2001-2006
(balanced panel).7 Even though it is possible to include information on more
recent years, this comes at the cost of losing information for some countries. This six-year
period is slightly shorter than that of Roquebert et al. (1996), who used a seven-year period, but
longer than that of Rumelt (1991), who used a four-year series (1974-1977), and similar to
McGahan and Porter (1997 and 2002), in which the average time series for each
economic unit was 5.7 years, in an unbalanced panel covering a 14-year period. Industries are
defined at the four-digit NACE level and the country is defined as the country in which the firm
is located and reports. We next describe the procedure followed to construct the
business groups and the sampling method implemented to run the
regression analysis with this huge database.
3.1 The construction of business groups
In the previous section we mentioned some characteristics of the Amadeus
database in contrast to the Compustat and FTC databases. As explained there, our
statistical unit is the firm. We have information about firms’ subsidiaries (other
firms), if they exist, and the corresponding share of ownership. Using that information,
we define a business group as the set of companies that have ownership links. A
link is considered when the percentage of ownership in a subsidiary controlled by the
main firm is greater than 50%.8 The main firm can have one or more subsidiaries, and it
can also be controlled (i.e., its shares can be owned) by another firm. In this way, the
business group is defined as the network of paired firms that surpass the 50%
threshold. This is similar to the definition of Korean business groups in Chang and Hong
(2002), though the chaebol definition uses a 30% threshold.9
6
The Amadeus database currently covers all European Union countries, Belarus, Bosnia-Herzegovina,
Croatia, Iceland, Liechtenstein, Macedonia, Moldova, Montenegro, Norway, Russian Federation, Serbia,
Switzerland and Ukraine.
7
Information on the number of employees is required because we control for size classes in the
sampling procedure, as we describe next.
Given that the on-line version of the Amadeus database only offers a snapshot of the
last available information on ownership structure (the last wave), we have used seven
waves (one a year) to measure the shape of the network more precisely. Starting
from the firms covered in these waves, 887,443 distinct links (pairs of firms) surpassing the 50% threshold were
identified in the period 2002-2006.10 Later, we include information on the industrial
code of each firm (either main or subsidiary), defined by the 4-digit NACE rev 1.1
classification. In doing so, we should keep in mind that we lose some links in which the
subsidiary is not included in the Amadeus database. This is most evident in the case of non-European firms. We also drop some industries that amplify the size of the network but
consist of purely instrumental companies. In particular, we drop firms in financial
industries (divisions 65 to 67), two particular business-service industries linked to
financial services (7415 and 7487) and non-service sectors.11 The final number of links
is then 450,782, with a total of 628,055 firms. Of all firms, 28.7% were only
main firms, 66.1% were only subsidiaries and the remaining 5.1% were
simultaneously main firms (they have at least one subsidiary) and subsidiaries (they have one
main firm).
8
In some cases there is no information about the percentage of ownership. Those links have not been
used to construct the business groups.
9
Chang and Hong (2002) find that Korean business groups (chaebols) have a strong effect in explaining the
variance of profits. The economic relevance of these conglomerates (40 percent of Korea’s total output in
1996) is peculiar. Our analysis of a wide set of European countries allows us to include a heterogeneous
set of institutional settings, though the Korean case has no close parallel in any European country.
10
This represents 42.1% of the 2,107,422 pairs initially observed. An additional condition is imposed in the
process: the ownership link larger than 50% should be observed for at least two years. If that condition is
not imposed, the number of links increases to 1.4 million.
11
This is similar to the exclusion of depository institutions in papers using the Compustat database.
Table 1
Business group size (number of observed links)

Group size (links)    Number of observed links    Percentage
1                     113,626                     25.21
2                     59,964                      13.3
3                     37,131                      8.24
4                     26,524                      5.88
5                     20,275                      4.5
6                     15,390                      3.41
7                     11,998                      2.66
8                     10,648                      2.36
9                     9,153                       2.03
10 or more            146,073                     32.00
Total                 450,782                     100
An algorithm is implemented to identify the business groups, defined as networks of
connected links. The number of identified networks is 179,089. Table 1 shows the size
distribution of business groups (i.e., networks) according to the number of links. As can
be observed, the most frequent type of business group is very simple: it consists of
only one main firm and one subsidiary. This occurs in 25.2% of the observed networks. In
almost half of the observed networks (specifically, 46.63% of cases) the network
consists of three or fewer links. The distribution of larger networks is fairly smooth
and the largest network has 1,096 links. The shape of a business group depends on the
specific structure of the links in the network. Figure 2 shows the shape of two business
groups. The first corresponds to a randomly chosen network with 10 links, and the
second (right panel) corresponds to the largest business group.
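The paper does not spell out the algorithm used to identify the networks. One common way to group connected ownership links is a union-find (disjoint-set) pass over the links, sketched below in Python. The function names, the tuple layout (main firm, subsidiary, ownership share in percent) and the example links are hypothetical.

```python
from collections import Counter

def find(parent, x):
    """Return the representative of x's group, compressing the path on the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def business_groups(links, threshold=50.0):
    """Group firms connected by ownership links above the threshold.

    Returns a mapping firm -> group representative and a Counter with the
    number of links in each group.
    """
    kept = [(main, sub) for main, sub, share in links if share > threshold]
    parent = {}
    for main, sub in kept:
        parent.setdefault(main, main)
        parent.setdefault(sub, sub)
        root_main, root_sub = find(parent, main), find(parent, sub)
        if root_main != root_sub:
            parent[root_main] = root_sub          # merge the two groups
    group_of = {firm: find(parent, firm) for firm in parent}
    links_per_group = Counter(group_of[main] for main, _ in kept)
    return group_of, links_per_group

# Hypothetical links: (main firm, subsidiary, ownership share in %).
links = [
    ("A", "B", 100.0),
    ("A", "C", 60.0),
    ("C", "D", 51.0),
    ("E", "F", 75.0),
    ("F", "G", 30.0),   # below the 50% threshold, so it is ignored
]
group_of, links_per_group = business_groups(links)
print(sorted(links_per_group.values()))
```

In this example firms A, B, C and D form one group with three links, E and F form a second group with one link, and G is left out because its only link falls below the threshold.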
Figure 2
Business network shape
[Left panel: network with 10 links. Right panel: network with 1,096 links (largest network).]
Additionally, the majority of links (83.5%) occur between two firms that are located in
the same country. Of course, that is more likely in a very small group, with only one
link. In that case, 92.0% of business groups have the subsidiary in the same country
as the main firm. From the perspective of the groups as a whole, 66.1%
of all groups have all their firms in the same country. The number of firms in different
countries is, as expected, increasing with business group size. The biggest group (the
group with 1,096 links) has 712 links in which the main firm and the subsidiary are located
in different countries. Figure 3 displays the number of main and subsidiary firms in all
links across countries. On average, there are 2.5 links per business group. The average
size is larger (6.2) for multinational groups (i.e., those business groups whose firms
are located in more than one country).
Figure 3
Country of the main and subsidiary firm (links)
[Bar chart of the number of links by country (GB, NL, SE, NO, IT, FI, PT, CH, IE, PL, RU, Other) for two series, Main and Subsidiary; vertical axis from 0 to 140,000.]
Finally, we should mention that even though business groups can be identified as
networks of firms, they do not capture strategic networks. These “represent an attempt to
achieve shared goals through collective efforts by multiple participants, each of which
also have their own strategic interests that are not necessarily always aligned” (Wincent
et al., 2010, p. 599). Of course, the analysis of strategic networks belongs to a related literature
that requires a different approach, usually focusing on a small set of networks in an
industry defined and analyzed with precision.
3.2 The sampling procedure
The huge size of the Amadeus database makes it computationally infeasible to use all
available data in a regression analysis if individual firm effects are included. Therefore,
a sampling procedure is implemented to extract samples of firms that can be used in the
empirical analysis. There is an additional reason that supports this sampling procedure.
Some recent papers have used the Amadeus database because no other database currently
provides homogeneous micro-level information for different European countries.12
However, Amadeus has not been designed to fulfill representativeness criteria across
countries, industries or size classes. This raises the question of whether the results
can be generalized to the whole population of firms or whether, on the contrary, they should
be interpreted in the context of the specific characteristics of the firms included.13 In
particular, it is well known that the distribution of firms included in that database is
biased towards larger average sizes. However, to our knowledge, none of the previous
papers that have used the Amadeus database has addressed this issue.
12
The only exception is the Community Innovation Survey. The European Statistical Office provides
restricted access to this survey on its premises in Luxembourg. However, that survey is designed to
measure and assess innovation activities and does not contain any information about performance
variables.
To build representative samples, we have to reproduce the population of firms in each
stratum, defined by the intersection of country, industry and size class. That
population is defined using the information contained in the Structural Business
Statistics (SBS), compiled by the Statistical Office of the European Commission
(Eurostat) and freely available on its web server. The SBS provides information on the
number of firms in each EU-27 country and Norway by industry (NACE
Revision 1.1) and size class (<10, 10-19, 20-49, 50-249, 250 and more employees).
Therefore, using the SBS restricts the study to European Union countries plus
Norway.
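One way to read the stratification step is as proportional allocation: each stratum receives a share of the sample equal to its share of the population. The sketch below assumes this allocation rule, which the text does not spell out; the function name `stratified_sample`, the firm identifiers and the population counts are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(firms, population_counts, sample_size, seed=0):
    """Draw firms so that each stratum's sample share tracks its population share.

    firms: list of (firm_id, stratum) pairs available in the database.
    population_counts: dict stratum -> number of firms in the population
                       (the role played by the SBS in the text).
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for firm_id, stratum in firms:
        by_stratum[stratum].append(firm_id)
    total = sum(population_counts.values())
    sample = []
    for stratum, count in population_counts.items():
        target = round(sample_size * count / total)
        available = by_stratum.get(stratum, [])
        # A stratum cannot contribute more firms than the database holds for it.
        sample.extend(rng.sample(available, min(target, len(available))))
    return sample

# Hypothetical database that over-represents large firms (50 small, 50 large),
# while the population is 90% small firms.
small = ("ES", "15", "<10")
large = ("ES", "15", "250+")
firms = [(f"s{i}", small) for i in range(50)] + [(f"l{i}", large) for i in range(50)]
sample = stratified_sample(firms, {small: 900, large: 100}, sample_size=20)
n_small = sum(1 for f in sample if f.startswith("s"))
```

Although the hypothetical database holds equal numbers of small and large firms, the sample of 20 contains 18 small firms and 2 large ones, mirroring the population counts.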
The sampling procedure is implemented to obtain balanced samples of firms
belonging to business groups and firms that do not belong to a business group. In the
first case, we drop all those business groups for which only one firm has been included
in the sample, because in those cases we could not distinguish business group
effects from firm effects. We should note that the construction of the business groups, as
defined in the previous subsection, is done before implementing this procedure. This
allows us to know that two firms belong to the same business group even though their link
runs through a firm that is not included in the initial sample. For example, a German
firm could have a subsidiary in Russia, which in turn has a subsidiary in Sweden. Even though
Russian firms are not included in the sample, both the German and Swedish firms are
considered as belonging to the same business group.
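The German, Russian and Swedish example can be sketched by treating business groups as connected components of the ownership network, built on all links before sampling. This is an illustrative union-find implementation under that assumption; the firm names and function names are hypothetical.

```python
from collections import defaultdict

def find(parent, x):
    """Return the representative of x's group, compressing the path on the way."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def business_groups(links):
    """Map each firm to a group label, treating ownership links as undirected edges."""
    parent = {}
    for main, subsidiary in links:
        root_main, root_sub = find(parent, main), find(parent, subsidiary)
        parent[root_main] = root_sub
    return {firm: find(parent, firm) for firm in parent}

def sampled_groups(links, sampled_firms):
    """Keep only the groups that have at least two firms in the sample."""
    group_of = business_groups(links)  # built on ALL links, before sampling
    members = defaultdict(list)
    for firm in sampled_firms:
        if firm in group_of:
            members[group_of[firm]].append(firm)
    return [sorted(m) for m in members.values() if len(m) >= 2]

links = [("DE_firm", "RU_firm"), ("RU_firm", "SE_firm"), ("FR_firm", "IT_firm")]
result = sampled_groups(links, ["DE_firm", "SE_firm", "FR_firm"])
```

The German and Swedish firms end up in the same group even though the Russian intermediary is not sampled, while the French firm is dropped because it is the only sampled member of its group.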
13
Of course, we could run the analysis without worrying about that issue. The results would then be
conditional on the specific sample covered by Amadeus, in the same way that papers using
Compustat provide results for the specific set of firms covered in that database (i.e., publicly traded
companies).
4. Results (very preliminary)
As explained in Section 2.2, in contrast to nested ANOVA procedures, the
Shapley Value method allows us to obtain a more accurate estimate of each set of
effects. It does so by calculating all possible increases in adjusted R2 due to the
inclusion of each specific effect, assigning a weight to each of these marginal increases,
and adding up these weighted results to produce an overall estimated contribution of the
effect to explaining the variance in firm profitability. In other words, it requires
estimating all possible routes from the null to the full model. Additionally, as explained in
Section 2.1, any model that includes firm and year effects
achieves the same explanatory power as the full model (equation 1) or any restricted
model that includes both sets of effects. This has two consequences for the method. On the
one hand, some of the regressions are not required because the
explanatory power cannot be improved, so the marginal contribution is equal to zero.
For example, once firm effects are included, adding industry,
business-group or country effects will not increase the adjusted R2. On the other hand,
the fact that business-group, industry and country effects are collinear by design with firm
effects means that it is not easy to infer the relative contribution of each one, because the
Shapley Value estimate of such an effect is downward biased.
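The weighting of marginal increases described in this paragraph can be sketched as follows. This is a generic Shapley computation, not the authors' code: the function `shapley_values` is illustrative, and the adjusted R2 figures in `R2` are hypothetical numbers chosen for the example, not results from the paper.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Shapley value of each player for a coalition value function value(frozenset)."""
    n = len(players)
    result = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            # Share of entry orders in which exactly `size` players precede p.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        result[p] = total
    return result

# Hypothetical adjusted R2 of every sub-model with three sets of effects.
R2 = {
    frozenset(): 0.0,
    frozenset({"year"}): 0.01,
    frozenset({"industry"}): 0.05,
    frozenset({"group"}): 0.12,
    frozenset({"year", "industry"}): 0.06,
    frozenset({"year", "group"}): 0.13,
    frozenset({"industry", "group"}): 0.15,
    frozenset({"year", "industry", "group"}): 0.16,
}

values = shapley_values(["year", "industry", "group"], R2.__getitem__)
```

By construction the individual contributions add up to the adjusted R2 of the full model (0.16 in this toy case), which is why the rows of a Shapley decomposition sum to a total.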
For this last reason we implement two versions of the Shapley Value method. The first
one is a direct application of the procedure explained in Section 2.2, which we call the
uncorrected SV. The second one is a modified version, which we call the corrected SV.
In this case a two-step procedure is introduced whenever firm effects enter the
game. For example, assume that we want to know the increase in adjusted R2 of (firm
and country) effects with respect to (country) effects. Then, we first run a regression
with only industry and group as explanatory variables. We denote the adjusted R2 of this
first regression by $R^2_{1F}$, so that $(1 - R^2_{1F})$ is the unexplained part. We then use the
residuals of that first regression as the dependent variable in a second regression that
includes firm and country effects as explanatory variables (actually, in this case only
firm effects are required). The adjusted R2 of this second step is $R^2_{2F}$. Then,
$R^2_F = (1 - R^2_{1F}) \times R^2_{2F}$. The intuitive idea is that the firm effect explains a portion
(identified by $R^2_{2F}$) of the part of the model that is not explained by the other variables with respect to which
firm effects are collinear (identified by $1 - R^2_{1F}$). The marginal effect of firm in this case
would be $R^2_F - R^2_C$.
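The two-step calculation can be sketched with ordinary least squares on dummy variables. This is a minimal illustration under stated assumptions: the helper names (`dummies`, `fit`, `corrected_firm_r2`) are hypothetical, the panel is simulated rather than drawn from Amadeus, and the first step uses only industry effects for brevity, whereas the text's example uses industry and group.

```python
import numpy as np

def dummies(labels):
    """One indicator column per distinct label."""
    labels = np.asarray(labels)
    return (labels[:, None] == np.unique(labels)[None, :]).astype(float)

def fit(y, X):
    """OLS of y on a constant and X; return (adjusted R2, residuals)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    k = np.linalg.matrix_rank(X)  # independent regressors, constant included
    r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1.0 - (1.0 - r2) * (len(y) - 1) / (len(y) - k), resid

def corrected_firm_r2(y, other_effects, firm):
    """(1 - adj. R2 of step 1) times adj. R2 of firm effects on step-1 residuals."""
    r2_first, resid = fit(y, np.column_stack([dummies(e) for e in other_effects]))
    r2_second, _ = fit(resid, dummies(firm))
    return (1.0 - r2_first) * r2_second

# Hypothetical panel: 6 firms, 5 years, 2 industries (simulated data).
rng = np.random.default_rng(0)
firm = np.repeat(np.arange(6), 5)
industry = firm // 3
y = 1.0 * industry + 0.5 * (firm % 3) + rng.normal(0.0, 0.1, size=firm.size)

r2_firm = corrected_firm_r2(y, [industry], firm)
```

Because the firm dummies are only asked to explain what the first step left unexplained, the resulting figure cannot absorb the variance already attributed to the effects with which firm effects are collinear.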
Both approaches are initially implemented using 100 samples of 5000 firms, obtained
according to the sampling procedure explained in Section 3.2.
We should mention that perfect multicollinearity between firm effects and each of
the other types of effects (except the year effects) would not exist if firms could move
across countries, industries or business groups during the observed period. All three
possibilities are excluded in the analysis, though for different reasons. Firstly, a firm
cannot change country because in that case it would be, de facto, a different firm.
Secondly, even though a firm can have activities in more than one industry, only its
main activity throughout the whole observed period has been taken into account.
Hence, no transient industry effects (as in Rumelt, 1991) have been considered.
Finally, business group effects have been calculated using the maximum network of
ownership links throughout the whole period (see Section 3.2). As a result, either a
firm belongs to the same network in all years considered or it is not integrated in any
network. These two possibilities are similar to diversified/non-diversified corporations
in the context of the McGahan and Porter (1997, 2002) papers.
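The source of the perfect multicollinearity can be checked numerically: when no firm changes country, every country dummy is a sum of firm dummies, so adding country dummies does not raise the rank of the design matrix. The panel below is hypothetical and the helper `dummies` is illustrative.

```python
import numpy as np

def dummies(labels):
    """One indicator column per distinct label."""
    labels = np.asarray(labels)
    return (labels[:, None] == np.unique(labels)[None, :]).astype(float)

# Hypothetical panel: 4 firms observed for 3 years; each firm stays in one country.
firm = np.repeat(["a", "b", "c", "d"], 3)
country = np.repeat(["ES", "ES", "GB", "GB"], 3)

F = dummies(firm)
rank_firm = np.linalg.matrix_rank(F)
rank_both = np.linalg.matrix_rank(np.column_stack([F, dummies(country)]))

# Counterfactual: firm "a" is recorded in another country in its last year.
moved = country.copy()
moved[2] = "GB"
rank_moved = np.linalg.matrix_rank(np.column_stack([F, dummies(moved)]))
```

With time-invariant country membership the rank stays at 4, so country effects add nothing once firm effects are in the model; in the counterfactual with one mover the rank rises to 5 and the two sets of effects could be separated.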
Table 2 shows the results obtained using three levels of aggregation for industry effects:
two, three and four digits of the NACE Rev. 1.1 classification. This gives an
initial view of the sensitivity of the results to the level of industrial aggregation.
Table 2
Contribution to explanatory power by type of effects
(simple average of Shapley Values for 100 samples)

            2-digit industries         3-digit industries         4-digit industries
            Uncorrected   Corrected    Uncorrected   Corrected    Uncorrected   Corrected
Country     0.00590       0.01206      0.00585       0.01197      0.00576       0.01179
Industry    0.01204       0.04433      0.01872       0.05707      0.02842       0.07581
Group       0.06013       0.13192      0.05994       0.13159      0.06007       0.13191
Year        0.00070       0.00070      0.00070       0.00071      0.00070       0.00071
Firm        0.40175       0.31129      0.39531       0.29675      0.38557       0.27476
Total       0.48052       0.50031      0.48052       0.49809      0.48052       0.49498

The majority of the empirical literature with micro-data analyzes the manufacturing
sector, usually because more datasets are available for it. That is not the case in
this study: the results shown above combine all sectors, with the exception of
agricultural activities. However, McGahan and Porter (1997) pointed to possible
differences between manufacturing and other sectors. Their claim is based on a (visual)
comparison of the COV estimates for business, and the COV approach excludes the possibility of any
statistical test. In McGahan and Porter (2002) they run the procedure again for manufacturers
(with the three possibilities for dealing with serial correlation). The observed differences are
partially due to the larger serial correlation observed for manufacturers, which suggests
a higher rate of persistence. Table 3 shows the results obtained for manufacturing and
services industries. To have samples of similar size to those used in Table 2, two new
sets of 100 samples of 5000 firms (again, with half of the firms belonging to business groups) are
used, one for manufacturing industries and the other for services.
Table 3
Contribution to explanatory power by type of effects:
manufacturing vs services
(simple average of Shapley Values for 100 samples)

            Manufacturing              Services
            Uncorrected   Corrected    Uncorrected   Corrected
Country     0.0054        0.0111       0.0068        0.0138
Industry    0.0217        0.0644       0.0249        0.0676
Group       0.0529        0.1170       0.0662        0.1435
Year        0.0010        0.0010       0.0016        0.0016
Firm        0.4276        0.3332       0.3997        0.2887
Total       0.5086        0.5267       0.4991        0.5153

Finally, Roquebert et al (1996) and McGahan and Porter (2002) find that the effect of
diversified corporations decreases with the degree of diversification.
“Diversified” means, in the context of the Compustat database, corporations with more
business units. That is parallel to the number of firms in each group as defined here.
Table 4 reports the results by size of business group.
Table 4
Contribution to explanatory power by type of effects: size of business groups
(simple average of Shapley Values for 100 samples)

            All sizes                  Business groups with at least two links
            Uncorrected   Corrected    Uncorrected   Corrected
Country     0.00576       0.01179      0.0054        0.0111
Industry    0.02842       0.07581      0.0311        0.0799
Group       0.06007       0.13191      0.0590        0.1291
Year        0.00070       0.00071      0.0017        0.0017
Firm        0.38557       0.27476      0.3844        0.2729
Total       0.48052       0.49498      0.4816        0.4947
5. Conclusions (very preliminary)
This paper analyzes the relative contribution of several sets of characteristics to
explaining the heterogeneity in firm-specific accounting profits. It uses a very large database
of European firms in all non-agricultural industries (Amadeus) and analyzes the period
2002-2006. The paper makes two main contributions. On the one hand, it proposes a
different approach, the Shapley Value method, which deals with multicollinearity between
explanatory variables. Such multicollinearity implies that the precise sequence in which each variable enters a
more complete model can result in a different assessment of their relative importance.
For example, McGahan and Porter (2002) point out that the relative importance of
corporate effects is about 9%, because that is the increase in the adjusted R2 of a model
with Year & Industry & Corporate effects over a model that only includes Year &
Industry effects. However, industry effects already capture part of the corporate effects
that are later added. If, for example, the order changes and Corporate effects are
included just after Year effects, comparing a model that includes both effects with a model with
only Year effects would suggest that the increase in explanatory power due to Corporate
effects is about 14%. What is the correct answer? Obviously, insofar as the
explanatory variables are correlated with each other, neither of them is the correct value.
The Shapley Value deals with that issue by calculating the contribution of each variable
across all possible models, that is, across all possible combinations of predictors.
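To make the order dependence concrete, consider, purely for illustration, a game with only three sets of effects: Year (Y), Industry (I) and Corporate (C). Let v(S) denote the adjusted R2 of the model that includes the sets in S; this notation is an assumption of the sketch. The Shapley Value of corporate effects then averages the marginal contributions over all entry orders:

```latex
\phi_C = \tfrac{1}{3}\,\bigl[v(C) - v(\emptyset)\bigr]
       + \tfrac{1}{6}\,\bigl[v(Y,C) - v(Y)\bigr]
       + \tfrac{1}{6}\,\bigl[v(I,C) - v(I)\bigr]
       + \tfrac{1}{3}\,\bigl[v(Y,I,C) - v(Y,I)\bigr]
```

The two figures discussed in this paragraph correspond to single terms of this sum: v(Y,I,C) - v(Y,I) is about 0.09 and v(Y,C) - v(Y) is about 0.14. Neither term is the Shapley Value on its own, since the remaining two differences are also needed.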
The (very) preliminary results suggest that business group effects are relevant. These
effects seem to decrease with the size of the firm network, and they are
larger than industry effects, even when industry is defined at four digits. By contrast,
differences across countries have little relevance in explaining the observed heterogeneity
in profitability. Year effects are negligible in all cases while, as expected, the relative
importance of firm effects is smaller when the corrected version of the Shapley Value
method is implemented. Finally, some differences between manufacturing and services
seem to emerge. In particular, idiosyncratic firm effects are smaller in services, while
the relative contribution of the remaining effects, in particular business group effects, is
larger.
References:
Bowman EH, Helfat CE. 2001. Does corporate strategy matter? Strategic Management Journal
22(1): 1–23.
Chang SJ, Singh H. 2000. Corporate and industry effects on business unit competitive position.
Strategic Management Journal 21(7): 739–752.
Lipovetsky S, Conklin M. 2001. Analysis of regression in game theory approach. Applied
Stochastic Models in Business and Industry 17: 319–330.
McGahan AM, Porter ME. 1997. How much does industry matter, really? Strategic
Management Journal, Summer Special Issue 18: 15–30.
McGahan AM, Porter ME. 2002. What do we know about variance in accounting profitability?
Management Science 48: 834–851.
Misangyi VF, Elms H, Greckhamer T, Lepine JA. 2006. A new perspective on a fundamental
debate: a multilevel approach to industry, corporate, and business unit effects. Strategic
Management Journal 27(6): 571–590.
Roquebert JA, Phillips RL, Westfall PA. 1996. Markets vs. management: what ‘drives’
profitability? Strategic Management Journal 17(8): 653–664.
Rumelt RP. 1991. How much does industry matter? Strategic Management Journal 12(3):
167–185.
Schmalensee R. 1985. Do markets differ much? American Economic Review 75: 341–351.
Wincent J, Anokhin S, Örtqvist D, Autio E. 2010. Quality meets structure: generalized
reciprocity and firm-level advantage in strategic networks. Journal of Management
Studies 47(4): 597–624.
Appendix:
Table A1
Descriptive statistics
(average values across 100 samples*)

                              Average   Minimum   Maximum
Countries                     25        25        25
Industries
  2 digits                    44        44        44
  3 digits                    192       183       200
  4 digits                    415       400       435
Business groups (networks)    1072      1052      1089
Firms in networks             2504      2495      2505
Firms per network             2.11      2         14
Industries per network
  2 digits                    1.59      1         6
  3 digits                    1.73      1         8
  4 digits                    1.80      1         9
Countries per network         1.20      1         7

*Samples used in Table 2. Average values of samples used in Tables 3 and 4 are very
similar.
Figure A1
Distribution of ROA (full sample)
[Density plot. Horizontal axis: roa (-2 to 2). Vertical axis: Density (0 to 2.5).]
Figure A2
Distribution of relative contribution to Shapley Value
(average contribution across 100 firms, 2 digits, corrected SV)
[Density plots of the corrected Shapley Values for each set of effects. Panels: Country, Industry, Group, Year, Firm.]

            Std.dev      Min          Max
Country     0.0029127    0.0171007    0.0305791
Industry    0.002143     0.0409424    0.0495361
Group       0.003249     0.0064913    0.0221965
Year        0.000299     0.0000813    0.0015417
Firm        0.0099978    0.3674617    0.4151278