Transformed Linear Regression vs. Copula Regression
Rahul A. Parsa, Drake University
Paul G. Ferrara, PhD, FSA, CERA, Homesite Insurance
Stuart Klugman
Outline of Talk
—  Copula Regression, OLS, GLM
—  Alternative Methodology
—  Examples
Notation
—  Y – dependent variable
—  X_1, X_2, …, X_k – independent variables
—  Assumption: Y is related to the X's in some functional form,
E[Y | X_1 = x_1, …, X_k = x_k] = f(x_1, x_2, …, x_k)
OLS Regression
Y is linearly related to the X's.
OLS model:
Y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + … + β_k X_{ki} + ε_i
OLS
Multivariate Normal Distribution
Assume Y, X_1, X_2, …, X_k jointly follow a multivariate normal distribution. Then the conditional distribution of Y | X is normal, with mean and variance given by
E(Y | X = x) = μ_Y + Σ_{YX} Σ_{XX}^{-1} (x − μ_X)
Var(Y | X = x) = Σ_{YY} − Σ_{YX} Σ_{XX}^{-1} Σ_{XY}
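These two formulas can be evaluated directly with linear algebra; a minimal numpy sketch, where the mean vector and covariance matrix of (Y, X1, X2) are made-up illustrative values:

```python
import numpy as np

# Illustrative joint mean and covariance of (Y, X1, X2); the numbers are made up
mu = np.array([10.0, 3.0, 1.0])
Sigma = np.array([[4.0, 1.2, 0.8],
                  [1.2, 2.0, 0.5],
                  [0.8, 0.5, 1.0]])

S_YX = Sigma[0, 1:]                 # Sigma_YX
S_XX = Sigma[1:, 1:]                # Sigma_XX
beta = np.linalg.solve(S_XX, S_YX)  # Sigma_XX^{-1} Sigma_XY

x = np.array([4.0, 0.5])            # observed values of (X1, X2)
cond_mean = mu[0] + beta @ (x - mu[1:])   # E(Y | X = x)
cond_var = Sigma[0, 0] - S_YX @ beta      # Var(Y | X = x), constant in x
print(cond_mean, cond_var)
```

Note that the conditional variance does not depend on the observed x, which is the homoskedasticity baked into OLS under joint normality.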
GLM
—  Y belongs to an exponential family of distributions,
E(Y | X = x) = g^{-1}(β_0 + β_1 x_1 + … + β_k x_k)
—  g is called the link function
—  The x's are not random
—  Y | x belongs to the exponential family
—  The conditional variance is no longer constant
—  Parameters are estimated by MLE using numerical methods
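The slides leave the family unspecified; as one concrete instance, a Poisson GLM with log link can be fit by iteratively re-weighted least squares, a standard numerical scheme for GLM MLE. The simulated data and coefficient values below are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])     # design matrix with intercept
beta_true = np.array([0.5, 0.3])
y = rng.poisson(np.exp(X @ beta_true))   # Poisson response, log link

# IRLS: repeatedly solve a weighted least-squares problem
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)                # fitted means
    z = X @ beta + (y - mu) / mu         # working response
    W = mu                               # working weights for the log link
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
print(beta)  # close to beta_true for large n
```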
Copula Regression
—  Y can have any distribution
—  Each Xi can have any distribution
—  The joint distribution is described by a Copula
—  Estimate Y by E(Y|X=x) – conditional mean
MVN Copula
—  The CDF of the MVN copula is
F(x_1, x_2, …, x_n) = G(Φ^{-1}[F_1(x_1)], …, Φ^{-1}[F_n(x_n)])
—  where G is the multivariate normal CDF with zero means, unit variances, and correlation matrix R.
—  The density of the MVN copula is
f(x_1, x_2, …, x_n) = f_1(x_1) f_2(x_2) ⋯ f_n(x_n) |R|^{-1/2} exp{ −v^T (R^{-1} − I) v / 2 }
—  where v is the vector with i-th element v_i = Φ^{-1}[F_i(x_i)].
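The density formula can be checked numerically: with standard normal marginals, the MVN copula density must reduce to the ordinary multivariate normal density. A sketch (the correlation matrix is an arbitrary example):

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

R = np.array([[1.0, 0.6],
              [0.6, 1.0]])

def mvn_copula_density(x, pdfs, cdfs, R):
    # v_i = Phi^{-1}[F_i(x_i)]
    v = norm.ppf([cdf(xi) for cdf, xi in zip(cdfs, x)])
    quad = v @ (np.linalg.inv(R) - np.eye(len(x))) @ v
    marg = np.prod([pdf(xi) for pdf, xi in zip(pdfs, x)])
    return marg * np.linalg.det(R) ** -0.5 * np.exp(-0.5 * quad)

# with standard normal marginals, v_i = x_i and the joint density is MVN
x = np.array([0.4, -1.2])
val = mvn_copula_density(x, [norm.pdf, norm.pdf], [norm.cdf, norm.cdf], R)
ref = multivariate_normal(mean=[0.0, 0.0], cov=R).pdf(x)
print(val, ref)  # the two agree
```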
Conditional Distribution in MVN Copula
—  The conditional distribution of x_n given x_1, …, x_{n−1} is
f(x_n | x_1, …, x_{n−1}) = f(x_n) (1 − r^T R_{n−1}^{-1} r)^{-1/2} exp{ −0.5 [ {Φ^{-1}[F(x_n)] − r^T R_{n−1}^{-1} v_{n−1}}² / (1 − r^T R_{n−1}^{-1} r) − {Φ^{-1}[F(x_n)]}² ] }
—  where v_{n−1} = (v_1, …, v_{n−1})^T and R is partitioned as
R = [ R_{n−1}  r ;  r^T  1 ]
Alternative Method
—  Convert Y, X_1, X_2, …, X_k to standard normal random variables using
U = Φ^{-1}(F_Y(y))
V_i = Φ^{-1}(F_{X_i}(x_i))
—  Note: U and the V's jointly follow a multivariate normal distribution if Y and the X's follow an MVN copula.
Alternative Method
—  Regress U on the V's.
—  Obtain Û.
—  Convert Û to Ŷ using
Ŷ_A = (F_Y^{-1} ∘ Φ)(Û)
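A minimal sketch of the three steps, assuming the marginals are known. The Pareto/gamma marginals tied together by a bivariate normal copula are illustrative choices (scipy's `lomax` is the Pareto parameterization with F(y) = 1 − (θ/(y+θ))^α):

```python
import numpy as np
from scipy.stats import norm, gamma, lomax

rng = np.random.default_rng(1)
n = 1000
# illustrative data: (X, Y) tied together by a bivariate normal copula
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=n)
x = gamma(a=2, scale=4).ppf(norm.cdf(z[:, 0]))
y = lomax(c=3, scale=8).ppf(norm.cdf(z[:, 1]))

# step 1: transform to standard normal scores
V = norm.ppf(gamma(a=2, scale=4).cdf(x))
U = norm.ppf(lomax(c=3, scale=8).cdf(y))

# step 2: ordinary least squares of U on V
A = np.column_stack([np.ones(n), V])
b = np.linalg.lstsq(A, U, rcond=None)[0]
U_hat = A @ b

# step 3: back-transform U-hat to the Y scale
y_hat = lomax(c=3, scale=8).ppf(norm.cdf(U_hat))
```

With known marginals all three steps reduce to a CDF/quantile lookup plus one OLS fit, which is why the method is feasible in a spreadsheet.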
Alternative Method
Advantages of this method:
—  Easy to implement – can be done in Excel
—  Easy to understand
—  Transformations are well understood in regression
Difference in Approaches
—  Let Ŷ_C be the copula estimate.
—  Let Ŷ_A be the alternative-method estimate.
—  Question: what is the difference between these two estimates?
Jensen’s Inequality
Ŷ_A = (F_Y^{-1} ∘ Φ)(E(U | V_1, V_2, …, V_k))
Ŷ_C = E(Y | X_1, X_2, …, X_k)
Jensen’s inequality:
E[(F_Y^{-1} ∘ Φ)(U) | V] ≥ (F_Y^{-1} ∘ Φ)(E(U | V))
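The inequality can be seen numerically: draw U from its (normal) conditional distribution and compare the average of the transformed draws with the transform of the average. The Pareto marginal and the conditional mean/sd below are illustrative choices:

```python
import numpy as np
from scipy.stats import norm, lomax

rng = np.random.default_rng(2)
g = lambda u: lomax(c=3, scale=8).ppf(norm.cdf(u))  # (F_Y^{-1} ∘ Φ)

# U | V is normal; the mean 0.5 and sd 0.8 are arbitrary illustrative values
u = rng.normal(loc=0.5, scale=0.8, size=200_000)
lhs = g(u).mean()   # Monte Carlo estimate of E[(F_Y^{-1} ∘ Φ)(U) | V]
rhs = g(u.mean())   # (F_Y^{-1} ∘ Φ)(E(U | V))
print(lhs, rhs)     # lhs exceeds rhs, as Jensen's inequality predicts
```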
Jensen’s Inequality
—  We considered the case of two variables.
—  We show (see the handout) that
E[(F_Y^{-1} ∘ Φ)(U) | V] = E(Y | X)
Convexity Problem
—  Show that y = F^{-1}(Φ(x)) is a convex function, as required for Jensen’s inequality to hold (see the handout for the proof).
—  That is,
d²/dx² F^{-1}(Φ(x)) ≥ 0
—  or, equivalently,
f²(y)/φ²(x) − f′(y)/φ′(x) ≥ 0
—  where y = F^{-1}(Φ(x))
Examples
—  F ~ Pareto distribution:
f(y) = α θ^α / (y + θ)^{α+1}
f′(y) = −f(y) (α + 1)/(y + θ)
φ′(x) = −x φ(x)
—  Convexity condition:
f(y)/φ(x) ≥ (α + 1) / (x (y + θ))
Graph - Pareto
[Figure: graph of y = F^{-1}(φ(x)) for the Pareto case; y ranges from 0 to 1000 as x runs from −6 to 6.]
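The convexity claim for the Pareto case can be probed with a finite-difference second derivative of y = F^{-1}(Φ(x)); the Pareto(3, 8) marginal here is illustrative (scipy's `lomax` is the matching parameterization):

```python
import numpy as np
from scipy.stats import norm, lomax

# y = F^{-1}(Phi(x)) for an illustrative Pareto(3, 8) marginal
q = lambda t: lomax(c=3, scale=8).ppf(norm.cdf(t))

xs = np.linspace(-5, 5, 201)
h = xs[1] - xs[0]
# central finite-difference second derivative on the grid
second = (q(xs[2:]) - 2 * q(xs[1:-1]) + q(xs[:-2])) / h ** 2
print(second.min())  # nonnegative: the curve is convex on this grid
```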
Example
—  F ~ Gamma:
f(y) = 1/(Γ(α) θ^α) · y^{α−1} e^{−y/θ}
f′(y) = f(y) [ (α − 1)/y − 1/θ ]
—  Convexity condition:
f(y)/φ(x) ≥ −[ (α − 1)/y − 1/θ ] / x
Graph - Gamma
[Figure: graph of y = F^{-1}(φ(x)) for the gamma case; y ranges from 0 to 2500 as x runs from −6 to 6.]
Example 1
—  Data were simulated.
—  Y ~ Pareto(3, 8) and X ~ Gamma(2, 4).
—  2000 observations were generated.
—  MLEs:

              Alpha       Theta
Y ~ Pareto    2.849075    7.48509
X ~ Gamma     1.906755    4.234371

—  Error:

SSE           Copula      OLS          Transformed
              40,508.92   42,844.31    45,337.45
Example 1
[Figure: Y, Cop-Yhat, and Yhat-Method 3 plotted against X; Y ranges from 0 to 80, X from 0 to 45.]
Example 2
—  Taken from the copula regression paper (Example 1).
—  Dependent variable: X3 – Gamma.
—  Though X2 was simulated from a Pareto distribution, the parameter estimates did not converge, so a gamma model was fit.

Variables     X1 – Pareto     X2 – Pareto      X3 – Gamma
Parameters    3, 100          4, 300           3, 100
MLE           3.44, 161.11    1.04, 112.003    3.77, 85.93

—  Error:

SSE           Copula      OLS          Transformed
              590,000.5   637,172.8    597,552.6
Example From Copula Paper
[Figure: Y, Yhat-Cop, and Yhat-Trans plotted against X; Y ranges from 0 to 800, X from 0 to 450.]
Example – Copula Paper
[Figure: Y, Yhat-Cop, and Yhat-Trans plotted against X; Y ranges from 0 to 800, X from 0 to 500.]