MATH 3795
Lecture 6. Sensitivity of the Solution of a Linear System
Dmitriy Leykekhman
Fall 2008

Goals
- Understand how the solution of Ax = b changes when A or b change.
- Condition number of a matrix (with respect to inversion).
- Vector and matrix norms.
Linear Systems
- Given A ∈ R^{n×n} and b ∈ R^n we are interested in the solution x ∈ R^n of
      Ax = b.
- Suppose that instead of A and b we are given A + ΔA and b + Δb, where ΔA ∈ R^{n×n} and Δb ∈ R^n. How do these perturbations in the data change the solution of the linear system?
- First we need to understand how to measure the size of vectors and of matrices. This leads to vector norms and matrix norms.
Vector Norms
Definition
A (vector) norm on R^n is a function
    ‖·‖ : R^n → R,   x ↦ ‖x‖,
which for all x, y ∈ R^n and α ∈ R satisfies
1. ‖x‖ ≥ 0, and ‖x‖ = 0 ⇔ x = 0,
2. ‖αx‖ = |α| ‖x‖,
3. ‖x + y‖ ≤ ‖x‖ + ‖y‖   (triangle inequality).
Vector Norms
The most frequently used norms on R^n are given by
    ‖x‖_2 = ( Σ_{i=1}^n x_i^2 )^{1/2},   2-norm.
MATLAB's built-in function: norm(x) or norm(x, 2).
More generally, for any p ∈ [1, ∞),
    ‖x‖_p = ( Σ_{i=1}^n |x_i|^p )^{1/p},   p-norm.
MATLAB's built-in function: norm(x, p). Finally,
    ‖x‖_∞ = max_{i=1,...,n} |x_i|,   ∞-norm.
MATLAB's built-in function: norm(x, inf).
Vector Norms
Example
Let x = (1, −2, 3, −4)^T. Then
    ‖x‖_1 = 1 + 2 + 3 + 4 = 10,
    ‖x‖_2 = √(1 + 4 + 9 + 16) = √30 ≈ 5.48,
    ‖x‖_∞ = max{1, 2, 3, 4} = 4.
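A quick check of this example with MATLAB's built-in norm function (a minimal sketch):

    % Norms of the example vector x = (1, -2, 3, -4)^T
    x = [1; -2; 3; -4];
    n1   = norm(x, 1);     % 1-norm:        10
    n2   = norm(x, 2);     % 2-norm:        sqrt(30) ~ 5.4772
    ninf = norm(x, inf);   % infinity-norm: 4
    fprintf('1-norm = %g, 2-norm = %g, inf-norm = %g\n', n1, n2, ninf);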
Vector Norms
The unit balls are defined by
    {x ∈ R^n : ‖x‖_p ≤ 1};
their boundaries look different for different p. One can show the following useful inequalities:
- ‖x‖_∞ ≤ ‖x‖_2 ≤ ‖x‖_1.
- Let ‖·‖ be any vector norm on R^n. Then
      ‖x + y‖ ≥ ‖x‖ − ‖y‖   for all x, y ∈ R^n.
- Cauchy-Schwarz inequality:
      |x^T y| ≤ ‖x‖_2 ‖y‖_2   for all x, y ∈ R^n.
Vector Norms
Theorem
Vector norms on R^n are equivalent, i.e. for every two vector norms ‖·‖_a and ‖·‖_b on R^n there exist constants c_ab, C_ab (depending on the vector norms ‖·‖_a and ‖·‖_b, but not on x) such that
    c_ab ‖x‖_b ≤ ‖x‖_a ≤ C_ab ‖x‖_b   for all x ∈ R^n.
In particular, for any x ∈ R^n we have the inequalities
    (1/√n) ‖x‖_1 ≤ ‖x‖_2 ≤ ‖x‖_1,
    ‖x‖_∞ ≤ ‖x‖_2 ≤ √n ‖x‖_∞,
    ‖x‖_∞ ≤ ‖x‖_1 ≤ n ‖x‖_∞.
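A quick numerical sanity check of these inequalities (a minimal sketch; the test vector is an arbitrary choice):

    % Check the equivalence inequalities for a random vector
    n = 7;
    x = randn(n, 1);
    n1 = norm(x, 1);  n2 = norm(x, 2);  ninf = norm(x, inf);
    [n1/sqrt(n) <= n2,  n2 <= n1]            % both 1 (true)
    [ninf <= n2,  n2 <= sqrt(n)*ninf]        % both 1 (true)
    [ninf <= n1,  n1 <= n*ninf]              % both 1 (true)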
Matrix Norms
Definition
A matrix norm on R^{m×n} is a function
    ‖·‖ : R^{m×n} → R,   A ↦ ‖A‖,
which for all A, B ∈ R^{m×n} and α ∈ R satisfies
1. ‖A‖ ≥ 0, and ‖A‖ = 0 ⇔ A = 0 (zero matrix),
2. ‖αA‖ = |α| ‖A‖,
3. ‖A + B‖ ≤ ‖A‖ + ‖B‖   (triangle inequality).
Warning: Matrix norms and vector norms are denoted by the same symbol ‖·‖. However, as we will see shortly, vector norms and matrix norms are computed very differently. Thus, before computing a norm we need to examine carefully whether it is applied to a vector or to a matrix. It should be clear from the context which norm, a vector norm or a matrix norm, is used.
Matrix Norms. First Approach.
- View a matrix A ∈ R^{m×n} as a vector in R^{mn}, by stacking the columns of the matrix into a long vector.
- Apply a vector norm to this vector of length mn.
- This gives matrix norms. For example, if we apply the 2-vector-norm, then
      ‖A‖_F = ( Σ_{i=1}^m Σ_{j=1}^n a_ij^2 )^{1/2}.
  This is called the Frobenius norm. (We will use ‖A‖_2 to denote a different matrix norm.) A minimal MATLAB check is sketched after this list.
- This approach is not very useful.
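The stacking idea can be checked directly in MATLAB, where A(:) stacks the columns of A (a small sketch using the example matrix that appears on a later slide):

    % Frobenius norm = 2-vector-norm of the stacked columns
    A = [1 3 -6; -2 4 2; 2 1 -1];
    fro_stacked = norm(A(:), 2);    % vector 2-norm of the stacked columns
    fro_builtin = norm(A, 'fro');   % MATLAB's Frobenius norm
    fprintf('%g  %g\n', fro_stacked, fro_builtin);   % both sqrt(76) ~ 8.7178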
Matrix Norms. Second Approach.
- We want to solve linear systems Ax = b: find a vector x such that if we multiply A by this vector (we apply A to this vector), then we obtain b.
- View a matrix A ∈ R^{m×n} as a linear mapping, which maps a vector x ∈ R^n into a vector Ax ∈ R^m:
      A : R^n → R^m,   x ↦ Ax.
- How do we define the size of a linear mapping?
- Compare the size of the image Ax ∈ R^m with the size of x. This leads us to look at
      sup_{x ≠ 0} ‖Ax‖ / ‖x‖.
  Here Ax ∈ R^m and x ∈ R^n are vectors and ‖·‖ are vector norms (in R^m and R^n, respectively).
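To get a feel for this quantity, one can compare the largest ratio ‖Ax‖/‖x‖ found by random sampling with MATLAB's built-in matrix 2-norm (a rough sketch; sampling only gives a lower bound, and the example matrix is an arbitrary choice):

    % Estimate sup ||Ax||_2 / ||x||_2 by random sampling
    A = [1 3 -6; -2 4 2; 2 1 -1];
    ratio = 0;
    for k = 1:10000
        x = randn(3, 1);
        ratio = max(ratio, norm(A*x, 2) / norm(x, 2));
    end
    fprintf('sampled ratio ~ %g,  norm(A,2) = %g\n', ratio, norm(A, 2));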
Matrix Norms
- Let p ∈ [1, ∞]. The following identities are valid:
      sup_{x ≠ 0} ‖Ax‖_p / ‖x‖_p = sup_{‖x‖_p = 1} ‖Ax‖_p = max_{x ≠ 0} ‖Ax‖_p / ‖x‖_p = max_{‖x‖_p = 1} ‖Ax‖_p.
- One can show that
      ‖A‖_p = max_{x ≠ 0} ‖Ax‖_p / ‖x‖_p          (1)
  defines a matrix norm. Note that on the left-hand side in (1) the symbol ‖·‖_p refers to the p-matrix-norm, while on the right-hand side in (1) the symbol ‖·‖_p refers to the p-vector-norm applied to the vectors Ax ∈ R^m and x ∈ R^n, respectively.
Matrix Norms
For the most commonly used matrix norms (1), with p = 1, p = 2, or p = ∞, there exist rather simple representations.
Let ‖·‖_p be the matrix norm defined in (1). Then
    ‖A‖_1 = max_{j=1,...,n} Σ_{i=1}^m |a_ij|   (maximum column norm);
    ‖A‖_∞ = max_{i=1,...,m} Σ_{j=1}^n |a_ij|   (maximum row norm);
    ‖A‖_2 = √( λ_max(A^T A) )                  (spectral norm),
where λ_max(A^T A) is the largest eigenvalue of A^T A.
Matrix Norms
Example
Let
    A = [  1   3  -6
          -2   4   2
           2   1  -1 ].
Then
    ‖A‖_1 = max{5, 8, 9} = 9,
    ‖A‖_∞ = max{10, 8, 4} = 10,
    ‖A‖_2 = √( max{3.07, 23.86, 49.06} ) ≈ 7.0045,
    ‖A‖_F = √76 ≈ 8.718.
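These values can be verified in MATLAB, both via the simple representations above and via the built-in norm function (a minimal sketch):

    A = [1 3 -6; -2 4 2; 2 1 -1];
    % Simple representations
    max(sum(abs(A), 1))        % max column sum:             ||A||_1   = 9
    max(sum(abs(A), 2))        % max row sum:                ||A||_inf = 10
    sqrt(max(eig(A' * A)))     % sqrt of largest eigenvalue: ||A||_2   ~ 7.0045
    % Built-in equivalents
    [norm(A, 1), norm(A, inf), norm(A, 2), norm(A, 'fro')]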
Matrix Norms
Two important inequalities.
Theorem
For any A ∈ R^{m×n}, B ∈ R^{n×k} and x ∈ R^n, the following inequalities hold:
    ‖Ax‖_p ≤ ‖A‖_p ‖x‖_p   (compatibility of matrix and vector norm)
and
    ‖AB‖_p ≤ ‖A‖_p ‖B‖_p   (submultiplicativity of matrix norms).
Note that for the identity matrix I,
    ‖I‖_p = max_{x ≠ 0} ‖Ix‖_p / ‖x‖_p = 1.
Compare this with the first approach, in which we view I as a vector of length n^2. For example, the Frobenius norm (2-vector-norm) gives
    ‖I‖_F = √n.
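A quick numerical illustration of the two inequalities and of the two norms of the identity (a sketch with arbitrary random matrices):

    A = randn(4, 3);  B = randn(3, 5);  x = randn(3, 1);
    norm(A*x, 2) <= norm(A, 2) * norm(x, 2)   % compatibility: returns 1 (true)
    norm(A*B, 2) <= norm(A, 2) * norm(B, 2)   % submultiplicativity: returns 1 (true)
    norm(eye(4), 2)                           % operator norm of I: 1
    norm(eye(4), 'fro')                       % Frobenius norm of I: sqrt(4) = 2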
Error Analysis
- Let
      Ax = b                                  (2)
  be the original system, where A ∈ R^{n×n} and b ∈ R^n.
- Let
      (A + ΔA) x̃ = b + Δb                    (3)
  be the perturbed system, where ΔA ∈ R^{n×n} and Δb ∈ R^n represent the perturbations in A and b, respectively.
- What is the error Δx = x̃ − x between the solution x of the exact linear system (2) and the solution x̃ of the perturbed linear system (3)?
- Use the representation
      x̃ = x + Δx.
Error Analysis. Perturbation in b only
The original linear system is
    Ax = b,
where A ∈ R^{n×n} and b ∈ R^n. The perturbed linear system is
    A(x + Δx) = b + Δb,
where Δb ∈ R^n represents the perturbation in b.
Subtracting, we get
    A Δx = Δb,   or   Δx = A^{-1} Δb.
Take norms:
    ‖Δx‖ = ‖A^{-1} Δb‖ ≤ ‖A^{-1}‖ ‖Δb‖.                        (4)
To estimate the relative error, note that Ax = b and as a result
    ‖b‖ = ‖Ax‖ ≤ ‖A‖ ‖x‖   ⇒   1/‖x‖ ≤ ‖A‖ / ‖b‖.              (5)
Combining (4) and (5) we get
    ‖Δx‖ / ‖x‖ ≤ ‖A‖ ‖A^{-1}‖ ‖Δb‖ / ‖b‖.                      (6)
Error Analysis. Perturbation in b only
Definition
The (p-) condition number κ_p(A) of a matrix A (with respect to inversion) is defined by
    κ_p(A) = ‖A‖_p ‖A^{-1}‖_p.
Set κ_p(A) = ∞ if A is not invertible. MATLAB's built-in function: cond(A).
If
    Ax = b   and   A(x + Δx) = b + Δb,
then the relative error between the solutions obeys
    ‖Δx‖ / ‖x‖ ≤ κ_p(A) ‖Δb‖ / ‖b‖.
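A small experiment illustrating this bound (a sketch; the matrix and the size of the perturbation are arbitrary choices):

    A = [1 3 -6; -2 4 2; 2 1 -1];
    x = [1; 1; 1];
    b = A * x;
    db = 1e-8 * randn(3, 1);        % small perturbation of the right-hand side
    dx = A \ (b + db) - x;          % resulting perturbation of the solution
    rel_err   = norm(dx) / norm(x);
    rel_bound = cond(A) * norm(db) / norm(b);
    fprintf('relative error %e <= bound %e\n', rel_err, rel_bound);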
Error Analysis. General Case.
- Let
      Ax = b                                  (7)
  be the original system, where A ∈ R^{n×n} and b ∈ R^n.
- Let
      (A + ΔA)(x + Δx) = b + Δb               (8)
  be the perturbed system, where ΔA ∈ R^{n×n} and Δb ∈ R^n represent the perturbations in A and b, respectively.
- If ‖A^{-1}‖_p ‖ΔA‖_p < 1, then
      ‖Δx‖_p / ‖x‖_p ≤ [ κ_p(A) / ( 1 − κ_p(A) ‖ΔA‖_p/‖A‖_p ) ] ( ‖ΔA‖_p/‖A‖_p + ‖Δb‖_p/‖b‖_p ).      (9)
If κ_p(A) is small, we say that the linear system is well conditioned. Otherwise, we say that the linear system is ill conditioned.
Error Analysis. Example. Hilbert Matrix
Example
The Hilbert matrix H ∈ R^{n×n} has entries
    h_ij = ∫_0^1 x^{i+j-2} dx = 1/(i+j-1).
For n = 4,
    H = [ 1    1/2  1/3  1/4
          1/2  1/3  1/4  1/5
          1/3  1/4  1/5  1/6
          1/4  1/5  1/6  1/7 ],

    H^{-1} = [   16   -120    240   -140
               -120   1200  -2700   1680
                240  -2700   6480  -4200
               -140   1680  -4200   2800 ].
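In MATLAB these matrices are available through the built-in functions hilb and invhilb (a minimal sketch):

    H    = hilb(4);       % 4x4 Hilbert matrix
    Hinv = invhilb(4);    % its exact (integer) inverse
    max(max(abs(H * Hinv - eye(4))))   % ~0 up to rounding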
Error Analysis. Example. Hilbert Matrix.
Example
We compute that the condition number of the Hilbert matrix grows very fast with n. For n = 4,
    ‖H‖_1 = ‖H‖_∞ = 25/12,   ‖H^{-1}‖_1 = ‖H^{-1}‖_∞ = 13620,   κ_1(H) = κ_∞(H) = 28375,
    ‖H‖_2 ≈ 1.5,   ‖H^{-1}‖_2 ≈ 1.03 · 10^4,   κ_2(H) ≈ 1.55 · 10^4.
Error Analysis. Example. Hilbert Matrix.
Example
We consider the linear systems
    Hx = b.
For given n we set x_ex = (1, . . . , 1)^T ∈ R^n and compute b = H x_ex. Then we compute the solution of the linear system Hx = b using the LU-decomposition and compute the relative error between the exact solution x_ex and the computed solution x.

     n    κ_∞(H)            ‖x_ex − x‖_∞ / ‖x_ex‖_∞
     4    2.837500e+004     2.958744e-013
     5    9.436560e+005     5.129452e-012
     6    2.907028e+007     5.096734e-011
     7    9.851949e+008     2.214796e-008
     8    3.387279e+010     1.973904e-007
     9    1.099651e+012     4.215144e-005
    10    3.535372e+013     5.382182e-004
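The table can be reproduced with a few lines of MATLAB (a sketch; the backslash operator performs the LU-based solve, and the exact error figures will vary slightly from machine to machine):

    for n = 4:10
        H    = hilb(n);
        xex  = ones(n, 1);
        b    = H * xex;
        x    = H \ b;                    % LU-based solve
        kappa  = cond(H, inf);           % infinity-norm condition number
        relerr = norm(xex - x, inf) / norm(xex, inf);
        fprintf('%2d   %e   %e\n', n, kappa, relerr);
    end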
Error Analysis.
- If we use finite precision arithmetic, then rounding causes errors in the input data. Using m-digit floating point arithmetic it holds that
      |x − fl(x)| / |x| ≤ 0.5 · 10^{−m+1}.
- Thus, if we solve the linear system in m-digit floating point arithmetic, then, as a rule of thumb, we may approximate the input errors due to rounding by
      ‖ΔA‖ / ‖A‖ ≈ 0.5 · 10^{−m+1},   ‖Δb‖ / ‖b‖ ≈ 0.5 · 10^{−m+1}.
- If the condition number of A is κ(A) = 10^α, then
      ‖Δx‖ / ‖x‖ ≤ [ 10^α / ( 1 − 10^{α−m+1} ) ] ( 0.5 · 10^{−m+1} + 0.5 · 10^{−m+1} ) ≈ 10^{α−m+1},
  provided 10^{α−m+1} < 1.
- Rule of thumb: If the linear system is solved in m-digit floating point arithmetic and if the condition number of A is of the order 10^α, then only m − α − 1 digits in the solution can be trusted.
Summary.
- If the condition number of a matrix A is large, then small errors in the data may lead to large errors in the solution.
- Rule of thumb: If the linear system is solved in m-digit floating point arithmetic and if the condition number of A is of the order 10^α, then only m − α − 1 digits in the solution can be trusted.