...

Tractable and Consistent Random Graph Models September 27, 2014 Arun G. Chandrasekhar

by user

on
Category: Documents
13

views

Report

Comments

Transcript

Tractable and Consistent Random Graph Models September 27, 2014 Arun G. Chandrasekhar
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Tractable and Consistent Random
Graph Models
Arun G. Chandrasekhar‡
‡
?
Matthew O. Jackson?
Stanford, Dept. of Economics
Stanford, Dept. of Economics, Santa Fe Institute, CIFAR
September 27, 2014
Motivation
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Network formation models:
Risk-sharing relationships typically function when embedded
in grouped relationships?
Do cross race/ethnicity/caste friendships typically occur in
private?
Social learning, etc.
Core motivations for models:
1. Can test economic hypotheses from theory
2. Counterfactuals/policy
Desiderata
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Estimable network formation models
Conclusion
Link dependencies
More likely to link to friends of friends?
Do relationships need to be embedded in others to function?
(e.g., need for support, cliques, etc.?)
Link to others who are well-connected?
Good statistical properties (i.e., consistency)
Random utility foundations
Difficulties
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Typical dataset:
Observe one draw g and covariates x = (x1 , ..., xn )
Extremes:
Extreme 1: Full independence
Each decision is independent =⇒
n
2
independent obs.
Extreme 2: Full dependence
e.g., Nodes 1, 2 linked ⇐⇒ all nodes linked.
0
26
87
93
1
2
4
5
28
6
Introduction
7
3
Simple example
16
8
9
10
SERGMs
13
12
11
14
SUGMs
15
Illustrations
18
33
17
34
19
Conclusion
35
37
20
36
38
39
22
40
24
23
41
25
27
42
21
43
29
45
31
30
98
44
32
46
97
48
47
49
52
53
75
62
50
60
54
51
61
67
55
63
66
56
64
57
72
65
58
84
68
59
71
86
70
69
73
85
88
74
89
76
90
91
77
92
78
94
79
95
80
96
81
82
83
0
26
87
93
1
2
4
5
28
6
Introduction
7
3
Simple example
16
8
9
10
SERGMs
13
12
11
14
SUGMs
15
Illustrations
18
33
17
34
19
Conclusion
35
37
20
36
38
39
22
40
24
23
41
25
27
42
21
43
29
45
31
30
98
44
32
46
97
48
47
49
52
53
75
62
50
60
54
51
61
67
55
63
66
56
64
57
72
65
58
84
68
59
71
86
70
69
73
85
88
74
89
76
90
91
77
92
78
94
79
95
80
96
81
82
83
Framework
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Triangular array of random graphs: {gn : n ∈ N} drawn
Conclusion
g ∼ Pnβn (·)
Want to consistently estimate β n as n → ∞.
Without any restrictions on preferences and action space
ui (g, X; β n )
impossible to proceed.
Literature
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Most models used are not consistent with one draw g
ERGM literature (see Shalizi and Rinaldo ‘13); Christakis,
Fowler, Imbens and Kalyanaraman (‘10); Mele (‘13); Koenig
(‘13); Badev (‘14); Sheng (‘14)
Exponential Random Graphs
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Pβ (g) =
P exp(β·S(g)) 0
g 0 exp(β·S(g ))
S(g) a vector of sufficient statistics, usually subgraph counts
e.g., # links, # triangles, # k-stars, ...
Literature
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Most models used are not consistent with one draw g
ERGM literature (see Shalizi and Rinaldo ‘13); Christakis,
Fowler, Imbens and Kalyanaraman (‘10); Mele (‘13); Koenig
(‘13); Badev (‘14); Sheng (‘14)
Infeasible or difficult estimation
ERGM literature; Christakis, Fowler, Imbens and
Kalyanaraman (‘10); Mele (‘13); Koenig (‘13); Badev (‘14);
Sheng (‘14)
Literature
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Most models used are not consistent with one draw g
ERGM literature (see Shalizi and Rinaldo ‘13); Christakis,
Fowler, Imbens and Kalyanaraman (‘10); Mele (‘13); Koenig
(‘13); Badev (‘14); Sheng (‘14)
Infeasible or difficult estimation
ERGM literature; Christakis, Fowler, Imbens and
Kalyanaraman (‘10); Mele (‘13); Koenig (‘13); Badev (‘14);
Sheng (‘14)
Techniques that apply for dense graphs
Bickel and Chen (‘09)
Chatterjee, Diaconis and Sly (‘11); Graham (‘14)
Chatterjee, Diaconis and Sly (2011)
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Unknown β := (β1 , ..., βn ).
pij =
exp(βi + βj )
.
1 + exp(βi + βj )
Show with probability 1 − O(n−2 ),
!
r
log n
b
max βi − βi ≤ O
.
i
n
Literature
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Most models used are not consistent with one draw g
ERGM literature (see Shalizi and Rinaldo ‘13); Christakis,
Fowler, Imbens and Kalyanaraman (‘10); Mele (‘13); Koenig
(‘13); Badev (‘14); Sheng (‘14)
Infeasible or difficult estimation
ERGM literature; Christakis, Fowler, Imbens and
Kalyanaraman (‘10); Mele (‘13); Koenig (‘13); Badev (‘14);
Sheng (‘14)
Techniques that apply for dense graphs
Bickel and Chen (‘09)
Chatterjee, Diaconis and Sly (‘11); Graham (‘14)
Literature
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Most models used are not consistent with one draw g
ERGM literature (see Shalizi and Rinaldo ‘13); Christakis,
Fowler, Imbens and Kalyanaraman (‘10); Mele (‘13); Koenig
(‘13); Badev (‘14); Sheng (‘14)
Infeasible or difficult estimation
ERGM literature; Christakis, Fowler, Imbens and
Kalyanaraman (‘10); Mele (‘13); Koenig (‘13); Badev (‘14);
Sheng (‘14)
Techniques that apply for dense graphs
Bickel and Chen (‘09)
Chatterjee, Diaconis and Sly (‘11); Graham (‘14)
Techniques that apply for near block-diagonal graphs
Random Geometric Graph Approach
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Random Geometric Graph Approach
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
=⇒
Random Geometric Graph Approach
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
=⇒
Random Geometric Graph Approach
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
=⇒
=⇒
Random Geometric Graph Approach
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Representation of n × n adjacency matrix
Contributions of this paper
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Develop new models that:
extend the workhorse model in the literature
emphasize subgraphs as objects of interest; resulting network
is a projection
under certain assumptions, have estimators that are
consistent and asymptotically normally distributed
are feasible to estimate
generate graphs consistent with data
have microfoundations
Summary of Approach
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Summary of Approach
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
=⇒
Microfoundation Examples
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
1. Dynamic (myopic) revision
2. Mutual consent
3. Strategic search
Microfoundation Examples
Dynamic Revision
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
A. Payoffs: Additively separable in subgraphs
X
v(g` , X` ; β` ).
ui (g, X; β) =
g` ,i∈g`
Microfoundation Examples
Dynamic Revision
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
A. Payoffs: Additively separable in subgraphs
X
v(g` , X` ; β` ).
ui (g, X; β) =
g` ,i∈g`
B. Stability concept: Pairwise stability with transfers
ij ∈ g implies that ui (g) + uj (g) ≥ ui (g − ij) + uj (g − ij), and
ij ∈
/ g implies that ui (g) + uj (g) ≥ ui (g + ij) + uj (g + ij).
Microfoundation Examples
Dynamic Revision
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
A. Payoffs: Additively separable in subgraphs
X
v(g` , X` ; β` ).
ui (g, X; β) =
g` ,i∈g`
B. Stability concept: Pairwise stability with transfers
ij ∈ g implies that ui (g) + uj (g) ≥ ui (g − ij) + uj (g − ij), and
ij ∈
/ g implies that ui (g) + uj (g) ≥ ui (g + ij) + uj (g + ij).
Remark: A+B =⇒ existence of a potential function: f (·).
Microfoundation Examples
Dynamic Revision
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
A. Payoffs: Additively separable in subgraphs
X
v(g` , X` ; β` ).
ui (g, X; β) =
g` ,i∈g`
B. Stability concept: Pairwise stability with transfers
ij ∈ g implies that ui (g) + uj (g) ≥ ui (g − ij) + uj (g − ij), and
ij ∈
/ g implies that ui (g) + uj (g) ≥ ui (g + ij) + uj (g + ij).
Remark: A+B =⇒ existence of a potential function: f (·).
C. Meeting process and shocks: ∀t, at most one pair meets;
every ij have some positive probability of being selected.
Microfoundation Examples
Dynamic Revision
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
A. Payoffs: Additively separable in subgraphs
X
v(g` , X` ; β` ).
ui (g, X; β) =
g` ,i∈g`
B. Stability concept: Pairwise stability with transfers
ij ∈ g implies that ui (g) + uj (g) ≥ ui (g − ij) + uj (g − ij), and
ij ∈
/ g implies that ui (g) + uj (g) ≥ ui (g + ij) + uj (g + ij).
Remark: A+B =⇒ existence of a potential function: f (·).
C. Meeting process and shocks: ∀t, at most one pair meets;
every ij have some positive probability of being selected.
Result [Butts ‘09, Mele ‘13, This paper]: Invariant distribution is
exp(f (g; β))
0
g 0 exp(f (g ; β))
g ∼ Pβ (g) := P
Microfoundations
Static Mutual Consent
Introduction
Simple example
A. Payoffs: Payoffs by subgraph
SERGMs
SUGMs
Illustrations
Conclusion
ui (g` , X` ; β` ) = β` hi (g` , X` ) − i,`
Microfoundations
Static Mutual Consent
Introduction
Simple example
A. Payoffs: Payoffs by subgraph
SERGMs
SUGMs
ui (g` , X` ; β` ) = β` hi (g` , X` ) − i,`
Illustrations
Conclusion
B. Stability concept: Mutual consent
g` ⇐⇒ ui (g` , X` ; β) ≥∗ 0, ∀i ∈ g` .
Microfoundations
Static Mutual Consent
Introduction
Simple example
A. Payoffs: Payoffs by subgraph
SERGMs
SUGMs
ui (g` , X` ; β` ) = β` hi (g` , X` ) − i,`
Illustrations
Conclusion
B. Stability concept: Mutual consent
g` ⇐⇒ ui (g` , X` ; β) ≥∗ 0, ∀i ∈ g` .
C. Meeting process and shocks: Subgroups of size m` with
with probability π` and shocks i,` ∼ F` (·):
Y
p` (X` ; β) = π` ·
F` (β` h).
i∈g`
Microfoundations
Static Mutual Consent
Introduction
Simple example
A. Payoffs: Payoffs by subgraph
SERGMs
SUGMs
ui (g` , X` ; β` ) = β` hi (g` , X` ) − i,`
Illustrations
Conclusion
B. Stability concept: Mutual consent
g` ⇐⇒ ui (g` , X` ; β) ≥∗ 0, ∀i ∈ g` .
C. Meeting process and shocks: Subgroups of size m` with
with probability π` and shocks i,` ∼ F` (·):
Y
p` (X` ; β) = π` ·
F` (β` h).
i∈g`
Result [This paper]: Under some assumptions
β consistently estimable
distrubtion is g ∼ Pβ (g) :=
P exp(f (g;β))
0
g 0 exp(f (g ;β))
Microfoundations
Introduction
Static Search Intensity
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Agents put in (costly) search effort to form cliques for a payoff
Whether a given clique forms is stochastic, depending on
efforts of all involved
Resulting graph reflects subgraphs (cliques) of various sizes
occuring with various frequencies
Outline
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
(i) Simple example
(ii) Subgraph Generated Models
(iii) Illustrations
ERGMs
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Workhorse: Exponential Random Graph Models
Example: studied extensively by Strauss 86, Wasserman and
Pattison 96, Park and Newman 04, Butts 06, Chatterjee and
Diaconis 11, Mele 14,...
e.g., probability of a graph depends on
how many links present
how many triangles present - e.g., Coleman’s closure, support
in Jackson et al., etc.
ERGMs: An Example
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Want probability of graph g to depend on
θI #isolates(g) + θL #links(g) + θT #triangles(g).
ERGMs: An Example
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Want probability of graph g to depend on
θI #isolates(g) + θL #links(g) + θT #triangles(g).
Set
Pθ (g) ∝ exp (θI #isolates(g) + θL #links(g) + θT #triangles(g))
ERGMs
Introduction
Simple example
SERGMs
SUGMs
More generally
Illustrations
Conclusion
exp (θ1 S1 (g) + ... + θk Sk (g))
0
0
g 0 ∈G n exp (θ1 S1 (g ) + ... + θk Sk (g ))
Pθ (g) = P
where:
g is a graph
Sl (g) is a “statistic” - often the number of copies of some
subgraph within the parent graph
ERGMs
Introduction
Simple example
SERGMs
SUGMs
More generally
Illustrations
Conclusion
exp (θ1 S1 (g) + ... + θk Sk (g))
0
0
g 0 ∈G n exp (θ1 S1 (g ) + ... + θk Sk (g ))
Pθ (g) = P
where:
g is a graph
Sl (g) is a “statistic” - often the number of copies of some
subgraph within the parent graph
MCMC techniques for estimation have led to these becoming
standard (Snijders 02, Handcock 03, ...)
Estimating ERGMs
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Bhamidi, Bresler and Sly (2008), Chatterjee and Diaconis (2011)
Conclusion
For “large classes” (dense enough) ERGMs...
MCMC: (Gibbs sampling)
Mixing time is less than exponential only if networks have
asymptotically approximately independent links (ER-like)
So when ERGMs are interesting, then MCMC estimation will
be exponentially slow
Simulation evidence of instability even in the sparse case...
Simple example
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Subgroups of nodes sometimes meet
Make structures at random (e.g., cliques)
Here: triangles, links, isolates
Example: Isolates, Links and Triangles
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
exp (θI SI (g) + θL SL (g) + θT ST (g))
.
0
0
0
g 0 exp (θI SI (g ) + θL SL (g ) + θT ST (g ))
Pθ (g) = P
Example: Isolates, Links and Triangles
Introduction
Simple example
SERGMs
SUGMs
Illustrations
exp (θI SI (g) + θL SL (g) + θT ST (g))
.
0
0
0
g 0 exp (θI SI (g ) + θL SL (g ) + θT ST (g ))
Pθ (g) = P
Conclusion
ST (g): numer of triangles.
SL (g): number of edges.
SI (g): number of isolated nodes.
Example: Isolates, Links and Triangles
Introduction
Simple example
SERGMs
SUGMs
exp (θI SI (g) + θL SL (g) + θT ST (g))
.
0
0
0
g 0 exp (θI SI (g ) + θL SL (g ) + θT ST (g ))
Pθ (g) = P
Illustrations
Conclusion
ST (g): numer of triangles.
SL (g): number of edges.
SI (g): number of isolated nodes.
n = 50 nodes
Example: Isolates, Links and Triangles
Introduction
Simple example
SERGMs
SUGMs
exp (θI SI (g) + θL SL (g) + θT ST (g))
.
0
0
0
g 0 exp (θI SI (g ) + θL SL (g ) + θT ST (g ))
Pθ (g) = P
Illustrations
Conclusion
ST (g): numer of triangles.
SL (g): number of edges.
SI (g): number of isolated nodes.
n = 50 nodes
20 isolates, 45 links, 10 triangles on avg.
Example: Isolates, Links and Triangles
Introduction
Simple example
SERGMs
SUGMs
exp (θI SI (g) + θL SL (g) + θT ST (g))
.
0
0
0
g 0 exp (θI SI (g ) + θL SL (g ) + θT ST (g ))
Pθ (g) = P
Illustrations
Conclusion
ST (g): numer of triangles.
SL (g): number of edges.
SI (g): number of isolated nodes.
n = 50 nodes
20 isolates, 45 links, 10 triangles on avg.
Corresponds to
pI = 0.33, pL = 0.042, pT = 0.0014
Isolates: θbI
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Isolates: usually not awful (sometimes)
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Links: θbL
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Links: usually too dense
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Triangles: θbT
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Triangles: too much closure or empty
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Take-away for this example
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Conclusions:
Even fixed value of S leads to unstable estimates of θ,
with unreasonable chains of graphs being drawn,
and unreasonable inference suggested.
Take-away for this example
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Conclusions:
Even fixed value of S leads to unstable estimates of θ,
with unreasonable chains of graphs being drawn,
and unreasonable inference suggested.
But the model was so easy to describe...
Idea Here
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Many g’s but fewer possible statistics (values for S)
Many networks lead to the same statistics
Probabilities depend on statistics
Networks with same statistics are equally likely
Can collapse all equivalent networks to sufficient statistics
What did we do?
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Exponential random graph model
exp (β · S(g))
0
g 0 ∈G n exp (β · S(g ))
Pβ (g) = P
with N (s) = |{g : S(g) = s}|.
What did we do?
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Rewrite denominator as sum over equivalence classes
exp (β · S(g))
0
0
s0 ∈S n NS (s ) exp (β · s )
Pβ (g) = P
with N (s) = |{g : S(g) = s}|.
What did we do?
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Rewrite denominator as sum over equivalence classes
exp (β · S(g))
0
0
s0 ∈S n NS (s ) exp (β · s )
Pβ (g) = P
with N (s) = |{g : S(g) = s}|.
What did we do?
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Sum LHS and numerator over equivalence classes
NS (s) · exp (β · s)
0
0
s0 ∈S n NS (s ) exp (β · s )
Pβ (s) = P
with N (s) = |{g : S(g) = s}|.
Statistical form
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Sum LHS and numerator over equivalence classes
NS (s) · exp (β · s)
0
0
s0 ∈S n NS (s ) exp (β · s )
Pβ (s) = P
with N (s) = |{g : S(g) = s}|.
SERGMs
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
NS (s) · exp (β · s)
0
0
s0 ∈S n NS (s ) exp (β · s )
Pβ (s) = P
Nothing special about NS (s) - just the counting measure.
But typically impossible to compute...
SERGMs
Introduction
Simple example
SERGMs
SUGMs
Illustrations
NS (s) · exp (β · s)
0
0
s0 ∈S n NS (s ) exp (β · s )
Conclusion
Pβ (s) = P
Nothing special about NS (s) - just the counting measure.
But typically impossible to compute...
KS (s) · exp (β · s)
0
0
s0 ∈S n KS (s ) exp (β · s )
Pβ (s) = P
for other KS (·) 6= NS (·).
SERGMs
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Pβ (s) = P
KS (s) · exp (β · s)
KS (s0 ) exp (β · s0 )
s0 ∈S n
SERGMs
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Pβ (s) = P
KS (s) · exp (β · s)
KS (s0 ) exp (β · s0 )
s0 ∈S n
Natural micro-foundations for other K(·) 6= N (·).
K(·) 6= N (·) may recover desirable statistical properties.
SERGMs
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
KS (s) · exp (β · s)
0
0
s0 ∈An KS (s ) exp (β · s )
Pβ (s) = P
S can encode many things:
Links, cliques, k-stars, other subgraphs, friends in common
per link, multigraphs, etc.
Can do preference based models
Allows for node characteristics
Outline
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
(i) Simple example
(ii) Subgraph Generated Models
(iii) Illustrations
Economic Networks
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
What do many economic networks look like?
(i) Interdependency between links:
more than random (conditional on covariates) chance of one’s
friends being friends themselves
(ii) Mostly empty.
SUGMs
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Subgraph Generation Models
Subnetworks generated, network is a by-product/projection
People form links, triangles; some folk are anti-social
(isolates), etc...
SUGM Idea
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Nature forms subgraphs of each type w/ some probability
May intersect, may overlap
Observe resulting graph, infer probabilities
Nodes
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Triangles Form
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Links Form
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Desired Network
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Model
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
subgraphs: G = (G1 , ..., Gk )
Model
Introduction
Simple example
SERGMs
subgraphs: G = (G1 , ..., Gk )
SUGMs
Illustrations
Conclusion
def. nicely ordered:
G` 6⊂ G`0 , `0 > `
g`0 ∈ G0` ⇐⇒ g`0 6⊂ g` , g` ∈ G` , `0 > `
Model
Introduction
Simple example
SERGMs
subgraphs: G = (G1 , ..., Gk )
SUGMs
Illustrations
Conclusion
def. nicely ordered:
G` 6⊂ G`0 , `0 > `
g`0 ∈ G0` ⇐⇒ g`0 6⊂ g` , g` ∈ G` , `0 > `
e.g., G1 = triangles and G2 = {links 6⊂ triangles}
Model
Introduction
Simple example
SERGMs
subgraphs: G = (G1 , ..., Gk )
SUGMs
Illustrations
Conclusion
def. nicely ordered:
G` 6⊂ G`0 , `0 > `
g`0 ∈ G0` ⇐⇒ g`0 6⊂ g` , g` ∈ G` , `0 > `
e.g., G1 = triangles and G2 = {links 6⊂ triangles}
parameters: pn = (pn1 , ..., pnk )
k
or w/ covariates pn (x; γ) = (pn
` (x` ; γ` ))`=1
Model
Introduction
Simple example
SERGMs
subgraphs: G = (G1 , ..., Gk )
SUGMs
Illustrations
Conclusion
def. nicely ordered:
G` 6⊂ G`0 , `0 > `
g`0 ∈ G0` ⇐⇒ g`0 6⊂ g` , g` ∈ G` , `0 > `
e.g., G1 = triangles and G2 = {links 6⊂ triangles}
parameters: pn = (pn1 , ..., pnk )
k
or w/ covariates pn (x; γ) = (pn
` (x` ; γ` ))`=1
formation:
Type 1 subgraphs form with probability pn
1.
Type 2 subgraphs form with probability pn
2.
...
Type ` subgraphs form with probability pn
`.
...
Problem: Incidental Generation
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
def. generating class: for every G` there are classes J of
subgraphs that can generate g` ∈ G` .
e.g., {Triangles, Links}. Triangles have 4 generating classes:
(T, T, T ); (T, L, L); (T, T, L); (L, L, L).
def. relatively sparse: a sequence of models w/ nicely
ordered subgraphs and parameters is relatively sparse if the
expected share of incidentally generated subgraphs to truly
generated subgraph vanishes
Q
n
j∈J Ep [S`j (g)]
→0
M
n J Epn [S` (g)]
Relative Sparsity Reasonable?
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Example {Triangles, Links}
(T,T,T):
n3 p3T
= p2T n3 → 0 =⇒ pT = o(n−3/2 )
pT
√
Remark 1: a given node can be in n2 pT = o( n) triangles.
Relative Sparsity Reasonable?
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Example {Triangles, Links}
(T,T,T):
n3 p3T
= p2T n3 → 0 =⇒ pT = o(n−3/2 )
pT
√
Remark 1: a given node can be in n2 pT = o( n) triangles.
(L,L,L):
p3L
→ 0 =⇒ pL = o(n−1/2 )
pT
√
Remark 2: a given node can be in (n − 1)pL = o( n) links.
Relative Sparsity Reasonable?
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Example: Leider, Mobius, Rosenblatt, Do (‘09)
Harvard social network: n ≈ 4000
Avg number of links per person: 7
Avg % of one’s friends that are friends themselves: 8%
Observe that pT = 3/n2 , pL = 2/n yields
Avg number of links per person: 8
Avg % of one’s friends that are friends themselves: 10.71%
SUGMs behave well
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Theorem: Consider a nicely ordered and relatively sparse SUGM
with statistics on m1 , ..., mk nodes.
k
Letting Dn = diag {p` nm` }`=1 :
0
0
Dn1/2 (b
pn1 , ..., pbnk ) − (pn1 , ..., pnk )
√ N 0, Ik 2 .
SUGMs behave well
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Theorem: Consider a nicely ordered and relatively sparse SUGM
with statistics on m1 , ..., mk nodes.
k
Letting Dn = diag {p` nm` }`=1 :
0
0
Dn1/2 (b
pn1 , ..., pbnk ) − (pn1 , ..., pnk )
√ N 0, Ik 2 .
Covariates:
Trivial to add discrete covariates.
Subgraphs tagged by covariates. [curse of dimensionality]
Straightforward to add continuous covariates: pn` (x` ; γ` ).
√
cf. Hjort and Pollard (‘93), sub- nm` rates due to sparsity.
Relation: SERGM/SUGM
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Q. Can we view SUGMs as SERGMs?
Relation: SERGM/SUGM
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Q. Can we view SUGMs as SERGMs?
Conclusion
Ans. Yes, and it motivates specific K(·)’s.
Relation: SERGM/SUGM
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Q. Can we view SUGMs as SERGMs?
Conclusion
Ans. Yes, and it motivates specific K(·)’s.
Theorem: Suppose that the probability that subgraph of type `
exp β`
forms is given by p` = 1+exp
β` . This form of SUGM can be
represented in a SERGM form:
e exp Se · β
K n (S)
e =P
,
Pnβ (S)
n 0
0
s0 K (s ) exp (s · β)
where K`n (s` ) =
n
S`
s`
and K n (s) =
Q
`
K`n (s` ).
Easy to implement
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Example: Researcher has graph g and covariates Xi for all i
Easy to implement
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Example: Researcher has graph g and covariates Xi for all i
(i) Pick a nicely ordered set of statistics (e.g., 4cliques, triangles,
and links)
Easy to implement
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Example: Researcher has graph g and covariates Xi for all i
(i) Pick a nicely ordered set of statistics (e.g., 4cliques, triangles,
and links)
(ii) Run a regression on all* 4-tuples of whether it is a 4clique,
conditional on covariates
P
0
P(ijkl ∈ K4 |Xijkl ) = Λ(Xijkl
γ4 ) =⇒ γ
b4 −→ γ4 .
Easy to implement
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Example: Researcher has graph g and covariates Xi for all i
(i) Pick a nicely ordered set of statistics (e.g., 4cliques, triangles,
and links)
(ii) Run a regression on all* 4-tuples of whether it is a 4clique,
conditional on covariates
P
0
P(ijkl ∈ K4 |Xijkl ) = Λ(Xijkl
γ4 ) =⇒ γ
b4 −→ γ4 .
(iii) Remove links used in (ii).
Easy to implement
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Example: Researcher has graph g and covariates Xi for all i
(i) Pick a nicely ordered set of statistics (e.g., 4cliques, triangles,
and links)
(ii) Run a regression on all* 4-tuples of whether it is a 4clique,
conditional on covariates
P
0
P(ijkl ∈ K4 |Xijkl ) = Λ(Xijkl
γ4 ) =⇒ γ
b4 −→ γ4 .
(iii) Remove links used in (ii).
(iv) Run a regression on all* triples of whether it is a triangle,
conditional on covariates.
P
0
P(ijk ∈ K3 |Xijk ) = Λ(Xijk
γ3 ) =⇒ γ
b3 −→ γ3 .
Easy to implement
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Example: Researcher has graph g and covariates Xi for all i
(i) Pick a nicely ordered set of statistics (e.g., 4cliques, triangles,
and links)
(ii) Run a regression on all* 4-tuples of whether it is a 4clique,
conditional on covariates
P
0
P(ijkl ∈ K4 |Xijkl ) = Λ(Xijkl
γ4 ) =⇒ γ
b4 −→ γ4 .
(iii) Remove links used in (ii).
(iv) Run a regression on all* triples of whether it is a triangle,
conditional on covariates.
P
0
P(ijk ∈ K3 |Xijk ) = Λ(Xijk
γ3 ) =⇒ γ
b3 −→ γ3 .
(v) Remove links used in (iv).
Easy to implement
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Example: Researcher has graph g and covariates Xi for all i
(i) Pick a nicely ordered set of statistics (e.g., 4cliques, triangles,
and links)
(ii) Run a regression on all* 4-tuples of whether it is a 4clique,
conditional on covariates
P
0
P(ijkl ∈ K4 |Xijkl ) = Λ(Xijkl
γ4 ) =⇒ γ
b4 −→ γ4 .
(iii) Remove links used in (ii).
(iv) Run a regression on all* triples of whether it is a triangle,
conditional on covariates.
P
0
P(ijk ∈ K3 |Xijk ) = Λ(Xijk
γ3 ) =⇒ γ
b3 −→ γ3 .
(v) Remove links used in (iv).
(vi) Run a regression on all* pairs of whether it is an edge,
conditional on covariates.
P
0
P(ij ∈ K2 |Xij ) = Λ(Xij
γ2 ) =⇒ γ
b2 −→ γ2 .
Outline
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
(i) Simple example
(ii) Subgraph Generated Models
(iii) Illustrations
Illustration
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
1. Why need SUGMs/SERGMs and not simply dyadic with rich
covariates?
2. Social pressure and caste example
Need for SUGMs/SERGMs
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Examine data from 75 Indian villages from Banerjee,
Chandrasekhar, Duflo, Jackson ‘13
Estimate a model and then use it to generate networks
How well do the model-recreated networks match real
networks on non-modeled characteristics
Need for SUGMs/SERGMs
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Step 1: Estimate models
Dyadic: estimate pL (covariates)
Geographic distance, caste, amenities (housing material,
roofing material, number of beds), etc.
SUGM: estimate pclose
, pfLar , pclose
, pfTar
L
T
Step 2: Randomly generate networks
Dyadic: generate links given pL (covariates)
, pfTar
SUGM: generate subgraphs w/ pclose
, pfLar , pclose
L
T
Recreate Networks
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Data
Data Data
Link-based
Link-based
Link-based
model
model
Link-based
model
SUGM
SUGM
model
with
SUGM
Link-based
with
links
links
with
SUGM
SUGM
model
links
SUGM
with
with
SUGM
links
with
SUG
iso
Data
Data
with
with
covariates
covariates
with covariates
with covariates
andand
triangles
triangles
with
and covariates
triangles
and
links
triangles
links
andand
links
trian
an
t
[1] [1] [1]
[1] [2] [2] [2]
[1]
[2] [3] [3] [3]
[2]
[3] [4] [4]
Unsupported
of
mber
Unsupported
of
Number
Unsupported
Unsupported
Number
Links of Links
Unsupported
160.8
160.8
Links160.8 160.8236.2
236.2 236.2
160.8 236.2161.2
161.2 161.2
236.2 161.2161.8
161.
Models
to
are
fitLinks
toofLinks
Triangles
of
mber
Triangles
of
Number
Triangles
of Triangles
Number of Triangles 39.2
39.2 39.2
39.2 3.13.1 39.2
3.1
3.1 39.7
39.7 39.7
3.1
39.7 39.5
39.5
ifferent
of
ombinations
gree
erage
Degree
Degree
AverageofDegree
Average Degree
2.3243
2.32432.3243 2.3243
2.3260
2.32602.3260
2.3243 2.3260
2.5916
2.59162.5916
2.3260 2.5916
2.5219
2.521
hese
statistics.
of
mber
solates
Isolates
of
Number
Isolates
of Isolates
Number of Isolates 54.9722
54.9722
54.9722 54.9722
25.7222
25.7222
25.7222
54.9722 25.7222
31.4444
31.4444
31.4444
25.7222 31.4444
65.9167
65.91
erage
Clustering
stering
Clustering
Average Clustering
Average Clustering 0.0895
0.08950.0895 0.0895
0.0105
0.01050.0105
0.0895 0.0105
0.1268
0.12680.1268
0.0105 0.1268
0.0829
0.082
Giant
in
ction
Giant
Component
in
Fraction
Giant
Component
Component
in Giant
Fraction
Component
in Giant Component
0.7061
0.70610.7061 0.7061
0.8315
0.83150.8315
0.7061 0.8315
0.7982
0.79820.7982
0.8315 0.7982
0.6718
0.671
None
odels
of
the
models
envalue
st
alue
First fit
Eigenvalue
First Eigenvalue 5.5446
5.54465.5446 5.5446
3.8578
3.85783.8578
5.5446 3.8578
4.6762
4.67624.6762
3.8578 4.6762
5.3025
5.302
to
reEigenvalue
directly
to
ny of Gap
these Gap Spectral Gap
ectral
pGap
Spectral
0.9550
0.95500.9550 0.9550
0.3354
0.33540.3354
0.9550 0.3354
0.6684
0.66840.6684
0.3354 0.6684
1.0617
1.061
tatistics.
Eigenvalue
cond
nvalue
Eigenvalue
Second
of Stochastized
of Stochastized
Eigenvalue
of Stochastized
Second
Matrix
ofMatrix
Eigenvalue
Stochastized
Matrix of
0.9573
Matrix
Stochastized
0.95730.9573Matrix
0.9573
0.9632
0.96320.9632
0.9573 0.9632
0.9559
0.95590.9559
0.9632 0.9559
0.9069
0.906
erage
Path
h Length
Length
Path
Average
Length
Path Average
Length Path Length4.6921
4.69214.6921 4.6921
5.6565
5.65655.6565
4.6921 5.6565
5.1215
5.12155.1215
5.6565 5.1215
4.1180
4.118
Notes:
nts
erage
[1]
average
the
presents
Column
value
average
value
of
the[1]
various
of
value
average
presents
various
ofnetwork
various
value
network
the average
ofcharacteristics
network
various
characteristics
value
characteristics
network
ofacross
various
across
characteristics
theacross
network
the
36 36
villages.
the
villages.
characteristics
across
36 Columns
villages.
the
Columns
36 Columns
[2],
villages.
across
[2],
[3] the
[3]
and
Columns
[2],
and
36
[4]
[3]
villages.
[4]
present
and
[2],
present
[4]
[3]
Columns
simulation
present
and
simulation
[4][2],
simulat
presen
results
[3]
res
te
meters
imulation
rst
ersparameters
estimate
of aofgiven
awe
given
parameters
first
of
model
a model
given
estimate
forof
model
for
a agiven
parameters
given
a given
for
village
model
a village
given
ofand
for
a village
given
and
then
a given
then
model
randomly
and
village
randomly
then
for randomly
draw
aand
given
draw
then
a graph
village
arandomly
draw
graph
from
and
a from
graph
the
draw
then
the
model
from
randomly
amodel
graph
the
with
model
with
from
the
draw
the
estimated
the
with
a estimated
graph
model
the estimated
from
parameters.
with
parameters.
the
the model
estimate
paramet
WewW
r
each
eimulations
es
or
villages
for
each
ofeach
the
offor
villages
the
for
ofeach
the
each
models
models
offor
ofthe
the
and
each
models
and
villages
average
ofaverage
the
and
models
for
across
average
each
across
the
and
ofacross
the
simulations,
the
average
simulations,
models
theacross
simulations,
and
andaverage
the
and
thesimulations,
the
entries
and
entries
across
the
report
entries
report
the
and
these
simulations,
the
these
report
averaged
entries
averaged
these
report
and
across
averaged
the
across
these
entries
theacross
averaged
the
villages.
report
villages.
thethese
across
villages.
avera
the v
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Illustration: Social Pressure
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Cross-caste relationships more likely to occur :
“in private” with no friends in common?
or with the same frequency in embedded relationships?
B B C A A Rela%vely less likely? Social Pressure
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
Need more consent to form triad than dyad.
Need to account for these preferences, otherwise mechanically find
less desired dyad
less desired triad
>
.
more desired dyad
more desired triad
Social Pressure
Introduction
Simple example
SERGMs
SUGMs
Illustrations
preferences:
Conclusion
ui (j) = αL + βL 1{caste(i) 6= caste(j)} − ij,L
ui (jk) = αT + βT (1 − 1{caste(i) = caste(j) = caste(k)}) − ijk,T
frequencies: qL (x), qT (x) x ∈ {Same(1), Dif f (0)}
question:
qL (0)
qT (0)
>
qL (1)
qT (1)
Introduction
Simple example
SERGMs
question:
qT (0)
qL (0)
>
qL (1)
qT (1)
SUGMs
Illustrations
Conclusion
issues:
heterogenous meeting probabilities: πL (x), πT (x)
a link requires 2 consents; a triangle 3 consents
we observe g and therefore:
pL (x) = πL (x) · qL (x)2 and pT (x) = πT (x) · qT (x)3 .
Introduction
we observe g and therefore:
Simple example
SERGMs
SUGMs
pL (x) = πL (x) · qL (x)2 and pT (x) = πT (x) · qT (x)3 .
Illustrations
Conclusion
assume: people spend f > 1/2 share of time mixing with own
caste; 1 − f with other caste
Introduction
we observe g and therefore:
Simple example
SERGMs
SUGMs
pL (x) = πL (x) · qL (x)2 and pT (x) = πT (x) · qT (x)3 .
Illustrations
Conclusion
assume: people spend f > 1/2 share of time mixing with own
caste; 1 − f with other caste, implying
πT (0)
πL (0)
>
.
πT (1)
πL (1)
Introduction
we observe g and therefore:
Simple example
SERGMs
SUGMs
pL (x) = πL (x) · qL (x)2 and pT (x) = πT (x) · qT (x)3 .
Illustrations
Conclusion
assume: people spend f > 1/2 share of time mixing with own
caste; 1 − f with other caste, implying
πT (0)
πL (0)
>
.
πT (1)
πL (1)
Then, a sufficient condition for
pT (0)
<
pT (1)
qL (0)
qL (1)
>
pL (0)
pL (1)
qT (0)
qT (1)
3/2
.
is
1.2
Introduction
Simple example
SERGMs
1
SUGMs
Illustrations
Conclusion
pT(different)/pT(same)
0.8
0.6
0.4
0.2
0
−0.2
−0.2
0
0.2
0.4
0.6
(pL(different)/pL(same))3/2
0.8
1
1.2
1.2
Introduction
Below Median
Above Median
Simple example
SERGMs
1
SUGMs
Illustrations
Conclusion
(pT(different)/pT(same))
0.8
0.6
0.4
0.2
0
−0.2
−0.2
0
0.2
0.4
0.6
(pL(different)/pL(same))3/2
0.8
1
1.2
Social pressure conclusion
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
People show significantly stronger preference for cross-caste
relationships when links are in isolation
Suggestive that it is especially true with more symmetrically
sized castes
Conclusion
Introduction
Simple example
SERGMs
SUGMs
Illustrations
Conclusion
ERGMs face estimation challenges
Computational difficulties
Poor asymptotic properties
Work with subgraphs directly to simplify
Consistency, fast estimation
Natural (though limited) microfoundations
Reflects patterns in empirical data
Applications
Recreate networks, how relationship types are sensitive to
context, easy to extend to multigraphs
Fly UP