The fast Gauss transform with all the proofs. VIKAS C. RAYKAR

by user

on 15 сентября 2016

Category: Documents

>> Downloads: 2

views

Report

Comments

Description

Download The fast Gauss transform with all the proofs. VIKAS C. RAYKAR

Transcript

The fast Gauss transform with all the proofs. VIKAS C. RAYKAR

The fast Gauss transform with all the proofs.
VIKAS C. RAYKAR
University of Maryland, CollegePark
[email protected]
In this report we discuss some technical aspects of the fast Gauss transform (FGT). The FGT is
a special case of the single level FMM for the Gaussian potential. We give detailed derivation of
the correct error bounds which is rather terse in the original paper. We also briefly discuss about
the data structures to implement the space subdivision in higher dimensional spaces.
[ FGT ]: April 8, 2006
Contents
1 Fast Gauss Transform
1
2 Hermite polynomials
2
3 Multi-index notation
3
4 Taylor expansion of the Hermite function
4
5 Factorization [S- and R-expansion]
4
6 Fast Gauss Transform
6.1 Direct Evaluation . . . . . . . . . . . . . . . . . . . . .
6.2 S-expansion (Hermite) at the center of each source box
6.3 R-expansion (Taylor) at the center of each target box
6.4 S|R translation . . . . . . . . . . . . . . . . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
5
6
6
7
8
7 Algorithm
9
8 Data Structures
9
9 Appendices
12
9.1 Appendix 1 [Hermite series truncation error] . . . . . . . . . . . . . . 12
9.2 Appendix 2 [Taylor series truncation error] . . . . . . . . . . . . . . 13
9.3 Appendix 3 [S|R truncation error] . . . . . . . . . . . . . . . . . . . 14
1. FAST GAUSS TRANSFORM
Here we give a detailed analysis of the fast Gauss transform (FGT) based on the
techniques and the notation used in the FMM primer [Raykar 2005]. The FGT is
a special case of the single level FMM for the Gaussian potential. For each target
Fast Gauss Transform April 8, 2006
2
·
Vikas Chandrakant Raykar
point {yj ∈ Rd }j=1,...,M the discrete Gauss transform is defined as,
Φ(yj ) =
N
X
qi e−kyj −xi k
2
/h2
.
(1)
i=1
where {qi ∈ R+ }i=1,...,N are the source weights, {xi ∈ Rd }i=1,...,N are the source
points, i.e., the center of the Gaussians, and h ∈ R+ is the source scale or bandwidth.
Φ(yj ) is the total contribution at yj of N Gaussians centered at xi each with
bandwidth h. Each Gaussian is weighted by the term qi . The computational
complexity to evaluate the discrete Gauss transform at M target points is O(M N ).
The Fast Gauss Transform [Greengard and Strain 1991] is an approximation
algorithm that reduces the computational complexity to O(M + N ). The constant
depends on the desired precision. Given any ² > 0, it computes an approximation
Φ̂(yj ) to Φ(yj ) such that the maximum absolute error relative to the total weight
PN
Q = i=1 qi ,
"
#
|Φ̂(yj ) − Φ(yj )|
max
≤ ².
(2)
yj
Q
The algorithm is based on the single level FMM described above. The Gaussian is
factorized via Hermite and Taylor series. The error bound derived in the original
paper was shown to be incorrect and a new bound was derived in [Baxter and
Roussos 2002].
The space is subdivided into a number of boxes and the sources and targets are
assigned to different boxes. For all sources belonging to a particular box, the Sexpansion (Hermite series) coefficients are consolidated at the center of each source
box. For each target box the S-expansion (Hermite series) coefficients at each source
box are translated via the S|R translation operator as R-expansion (Taylor series)
coefficients at the center of the target box. Since the Gaussian decays very rapidly
only a few neighboring source boxes will have influence on the target box.
2. HERMITE POLYNOMIALS
The Hermite polynomial Hn (y) is defined by the Rodrigues formula
´
n ³
2 d
−y 2
Hn (y) = (−1)n ey
e
, y ∈ R.
dy n
(3)
Following are the first few Hermite polynomials.
H0 (y) = 1
H1 (y) = 2y
H2 (y) = 4y 2 − 2
H3 (y) = 8y 3 − 12y
(4)
The generating function for the Hermite polynomials is
2
e2yx−x =
FMM tutorial April 8, 2006
∞
X
xn
Hn (y).
n!
n=0
(5)
Writing TRs is a character building process!
·
3
2
Multiplying both sides of the preceding equation by e−y yields
2
e−(y−x) =
∞
X
xn
hn (y),
n!
n=0
(6)
where the Hermite functions hn (y) are defined as,
2
hn (y) = e−y Hn (y).
(7)
We need a shifted and the scaled version of this formula.
2
e−(y−x)
/h2
2
2
= e−[(y−x0 )−(x−x0 )] /h
µ
¶n
µ
¶
∞
X
1 x − x0
y − x0
=
hn
.
n!
h
h
n=0
(8)
This formula tells us how to evaluate the Gaussian field at the target y due to a
source at x, as a Hermite expansion centered at a nearby source x0 . This series
converges rapidly and for a given precision, only a certain number of p terms need
be retained.
Interchanging x and y we have the following expansion.
µ
¶n
µ
¶
∞
X
2
2
1 y − y0
x − y0
e−(y−x) /h =
hn
.
(9)
n!
h
h
n=0
This formula tells us how to evaluate the Gaussian field at the target y due to a
source at x, as a Taylor expansion centered at a nearby target y0 .
The following recurrence relation will be useful in computation of the Hermite
functions.
hn+1 (y) = 2yhn (y) − 2nhn−1 (y), y ∈ R.
(10)
The Cramer’s inequality [Hille 1926] for Hermite polynomials will be useful in deriving the error bounds.
√
2
|Hn (y)| ≤ K2n/2 n!ey /2 ,
(11)
where K < 1.09. Based on the Cramer’s inequality we have the following useful
bound for Hermite functions.
2
1
1
|hn (y)| ≤ K2n/2 √ e−y /2 .
(12)
n!
n!
[Baxter and Roussos 2002] use a slightly tighter version of this
2
1
1
|hn (y)| ≤ 2n/2 √ e−y /2 .
n!
n!
(13)
3. MULTI-INDEX NOTATION
In order to generalize the above relations to higher dimensions we use the multiindex notation. A multi-index α = (α1 , α2 . . . , αd ) is a d-tuple of nonnegative
integers. The length of the multi-index α is defined as |α| = α1 + α2 + . . . + αd . If
p is an integer, we say α ≥ p of αi ≥ p for 1 ≤ i ≤ d. The factorial of α is defined
as α! = α1 !α2 ! . . . αd !. For any multi-index α ∈ Nd and x = (x1 , x2 , . . . , xd ) ∈ Rd
αd
α
1 α2
the d-variate monomial xα is defined as xα = xα
1 x2 . . . xd . x is of degree n if
FMM tutorial April 8, 2006
4
·
Vikas Chandrakant Raykar
|α| = n. Let x, y ∈ Rd and v = x·y = x1 y1 +. . .+xd yd . Then using the multi-index
notation v n can be written as,
X n!
vn =
xα y α .
(14)
α!
|α|=n
The αth derivative with respect to x is defined as,
∂ αd
dα
∂ α1 ∂ α2
.
=
α1
α2 . . .
d
α
dx
∂x1 ∂x2
∂xα
d
(15)
The multidimensional Hermite function is defined as
2
hα (y) = e−kyk Hα (y) = hα1 (y1 )hα2 (y2 ) . . . hαd (yd ).
The Hermite expansion of the multivariate Gaussian can be written as,
µ
¶
X 1 µ x − x0 ¶α
y − x0
−(y−x)2 /h2
e
=
hα
.
α!
h
h
(16)
(17)
α≥0
The Taylor expansion of the multivariate Gaussian can be written as,
µ
¶µ
¶β
X 1
2
2
x − y0
y − y0
e−(y−x) /h =
hβ
.
β!
h
h
(18)
β≥0
4. TAYLOR EXPANSION OF THE HERMITE FUNCTION
For the translation operators we will also need the Taylor expansion of the Hermite
function hα (y) about an arbitrary point y0 ∈ Rd .
X (y − y0 )β
hα (y) =
Dβ hα (y0 ).
(19)
β!
β≥0
2
hα (y) = (−1)α Dα e−kyk .
(20)
Dβ hα (y) = (−1)β hα+β (y).
(21)
Hence,
hα (y) =
X (y − y0 )β
(−1)β hα+β (y).
β!
(22)
β≥0
5. FACTORIZATION [S- AND R-EXPANSION]
The potential or field at y due to a source at xi is given by
Φ(y, xi ) = e−ky−xi k
2
/h2
.
(23)
Note that Φ(y, xi ) is a regular potential. All the sources and targets are assumed
d
to lie in the
smaller boxes with sides
√ unit box B0 = [0, 1] . B0 is subdivided into √
of length 2rh, parallel to the axes, with a fixed r ≤ 1/ 2. The spatial domain
enclosed by the lth box is denoted as I1 (l) and the center of the box is denoted
by xlc . I1c (l) is the complement of I1 (l). Since the potential is regular we need not
worry about singular points.
FMM tutorial April 8, 2006
Writing TRs is a character building process!
·
5
For the lth box ∀xi ∈ I1 (l) and ∀y ∈ I1c (l) we need an S-expansion (far-field) of the
form
X
aα (xi , xlc )Sα (y − xlc ).
(24)
Φ(y, xi ) =
α≥0
The FGT uses the Hermite expansion of the Gaussian as the S-expansion.
µ
¶
X 1 µ xi − xl ¶α
y − xlc
c
Φ(y, xi ) =
hα
.
α!
h
h
(25)
α≥0
aα (xi , xlc )
1
=
α!
µ
xi − xlc
h
¶α
µ
, Sα (y −
xlc )
= hα
y − xlc
h
¶
.
(26)
For the lth box ∀xi ∈ I1c (l) and ∀y ∈ I1 (l) we need an R-expansion (local) of the
form
X
Φ(y, xi ) =
bβ (xi , xlc )Rβ (y − xlc ).
(27)
β≥0
The FGT uses the Taylor expansion of the Gaussian (which is obtained by interchanging y and xi in the Hermite expansion) as the R-expansion.
µ
¶µ
¶β
X 1
xi − xlc
y − xlc
Φ(y, xi ) =
hβ
.
(28)
β!
h
h
β≥0
bβ (xi , xlc ) =
1
hβ
β!
µ
xi − xlc
h
¶
µ
, Rβ (y − xlc ) =
y − xlc
h
¶β
.
(29)
6. FAST GAUSS TRANSFORM
Let the target point y belong to the nth box, i.e., y ∈ I1 (n). We need to evaluate
the total field at y due sources in all boxes. Since the Gaussian decays very rapidly
only a few boxes close to the target box can contribute more than Q² to the field
at y. If we include only (2n + 1)d nearest boxes, then the error
q due to ignoring
2
2
ln 1/²
all other boxes is bounded by Qe−2r n . This implies n ≥
2r 2 . Suppose we
want compute the field at all yj ∈ I1 (n) due to all sources xi ∈ I1 (l). There are
four different ways of doing this. Let there be NB sources in I1 (l) and MC targets
in I1 (n). Given the sources in one box and the targets in a neighboring box, the
computation is performed using one of the following four methods depending on
the number of sources and targets in these boxes: Direct evaluation is used if the
number of sources and targets are small. If the sources are clustered in a box then
they can be transformed into Hermite expansion about the center of the box. This
expansion is directly evaluated at each target in the target box if the number of
the targets is small. If the targets are clustered then the sources or their expansion
are converted to a local Taylor series which is then evaluated at each target in the
box. In practice a cutoff NF = O(pd−1 ) and ML = O(pd−1 ) is introduced.
—If NB < NF then the source box I1 (l) sends out NB Gaussians.
—If NB ≥ NF then the source box I1 (l) sends out a Hermite expansion.
FMM tutorial April 8, 2006
6
·
Vikas Chandrakant Raykar
—If MC < ML then the target box I1 (n) evaluates all the fields sent to it immediately.
—If MC ≥ ML then the target box I1 (n) transforms all fields sent to it into Taylor
series, accumulates the coefficients, and then evaluates the Taylor series.
6.1 Direct Evaluation
For the lth box let Φl (y) be the potential at y ∈ I1c (l) due to all sources xi ∈ I1 (l).
For all y ∈ I1 (n)
X
X
2
2
Φl (y) =
(30)
qi Φ(y, xi ) =
qi e−ky−xi k /h .
xi ∈I1 (l)
xi ∈I1 (l)
The computational cost is O(NB MC ).
6.2 S-expansion (Hermite) at the center of each source box
We form S-expansion (Hermite) of the potential about the source box center xlc .
For each box we consolidate the S-expansion coefficients due to all sources in that
box into a single rapidly converging Hermite expansion about the center of the box.
For all y ∈ I1 (n)
X
Φl (y) =
qi Φ(y, xi )
xi ∈I1 (l)
X
=

qi 

=
α≥0
=
X

aα (xi , xlc )Sα (y − xlc )
α≥0
xi ∈I1 (l)
X

X
X

qi aα (xi , xlc ) Sα (y − xlc )
xi ∈I1 (l)
Alα Sα (y
−
xlc )
=
α≥0
X
µ
Alα hα
α≥0
where
Alα
X
=
qi aα (xi , xlc )
xi ∈I1 (l)
1
=
α!
X
xi ∈I1 (l)
µ
qi
y − xlc
h
xi − xlc
h
¶
(31)
¶α
.
(32)
We retain only the first pd coefficients. The computational cost is O(pd NB ) +
O(pd MC ). The error due to truncation of the Hermite series is bounded by
¯
¯
¯
µ
¶¯
µ p ¶d−k
d−1 µ ¶
n ¯
X
¯X l
y
−
x
Ql
d
r
c ¯
p k
¯
√
EH (p) = ¯
(33)
Aα hα
≤
(1 − r )
¯
d
h
p!
¯α≥p
¯ (1 − r) k=0 k
P
where Ql = xi ∈I1 (l) qi and r < 1. See Appendix 1 for the derivation.
The total potential at y due to all sources is
µ
¶
XX
X
y − xlc
b l (y) =
b
Alα hα
Φ
.
(34)
Φ(y)
=
h
α<p
∀l
FMM tutorial April 8, 2006
∀l
Writing TRs is a character building process!
·
7
Since there are at most (2n + 1)d source boxes within range the total computational
complexity will be O(pd N ) + O((2n + 1)d pd M ). The error
µ p ¶d−k
d−1 µ ¶
X
b
|Φ(y)
− Φ(y)|
1
d
r
p k
√
≤
(1 − r )
d
Q
(1 − r)
k
p!
(35)
k=0
where r < 1.
6.3 R-expansion (Taylor) at the center of each target box
We form R-expansion (Taylor) of the potential about the target box center xnc . For
each source box I1 (l) we consolidate the R-expansion coefficients due to all sources
in that box into a single Taylor expansion about the center of the target box I1 (n).
For all y ∈ I1 (n)


X
X
X
Φl (y) =
qi Φ(y, xi ) =
qi 
bβ (xi , xnc )Rβ (y − xnc )
xi ∈I1 (l)
=
X
β≥0
=
X

β≥0
xi ∈I1 (l)


X
qi bβ (xi , xnc ) Rβ (y − xnc )
xi ∈I1 (l)
Bβln Rβ (y − xnc ) =
X
µ
Bβln
β≥0
β≥0
y − xnc
h
¶β
where,
Bβln =
X
qi bβ (xi , xnc ) =
xi ∈I1 (l)
1
β!
X
(36)
µ
qi hβ
xi ∈I1 (l)
xi − xnc
h
¶
.
(37)
d
We retain only the first p coefficients. The computational cost is O(pd NB ) +
O(pd MC ). The error due to truncation of the Taylor series is bounded by
¯
¯
¯
µ
¶ ¯
µ p ¶d−k
d−1 µ ¶
X
¯ X ln y − xnc β ¯
d
r
Ql
p k
¯
¯
√
ET (p) = ¯
(38)
Bβ
(1 − r )
≤
¯
d
h
p!
¯β≥p
¯ (1 − r) k=0 k
P
where Ql = xi ∈I1 (l) qi and r < 1. See Appendix 2 for the derivation.
The total potential at y ∈ I1 (n) due to all sources is
#µ
"
µ
¶
¶β
n β
X
XX
X X
y
−
x
y − xnc
c
l
ln
ln
b
b (y) =
Φ(y)
=
Φ
Bβ
=
Bβ
. (39)
h
h
∀l
∀l β<p
β<p
∀l
Since there are at most (2n + 1)d source boxes within range the total computational
complexity will be O((2n + 1)d pd N ) + O(pd M ). The error
µ p ¶d−k
d−1 µ ¶
X
b
1
|Φ(y)
− Φ(y)|
d
r
p k
√
≤
(1
−
r
)
Q
(1 − r)d
k
p!
(40)
k=0
where r < 1.
FMM tutorial April 8, 2006
8
·
Vikas Chandrakant Raykar
6.4 S|R translation
We accumulate all sources in a box via a truncated Hermite expansion and then
translate the S-expansion coefficients into R-expansion Taylor coefficients at the
center of the target box containing the evaluation point. Let y ∈ I1 (n), i.e., the
target point belongs to the nth box with center xnc . Now we want to write the
potential at y due to all sources in the lth box Φl (y) (expanded using an S expansion
around xlc ) as an R expansion around xnc .
¶β
µ
X
X
y − xnc
Φl (y) =
Cβln Rβ (y − xnc ) =
Cβln
(41)
h
β≥0
β≥0
P
where Cβln = α≥0 (S|R)βα (xnc − xlc )Alα and Alα are the S-expansion coefficients
around xlc . The S|R-translation operator can be derived using the Taylor expansion
of the Hermite function [See Section 4].
µ
¶ X
µ
¶β
µ n
¶
(−1)|β| y − xnc
y − xlc
xc − xlc
hα
=
hα+β
.
(42)
h
β!
h
h
β≥0
The S-expansion is given by
Φl (y) =
=
=
=

µ
¶
¶
µ n
|β|
n β
l
X
(−1)
y − xc
y−
xc − xc 
Alα hα
=
Alα 
hα+β
h
β!
h
h
α≥0
α≥0
β≥0


µ n
¶ µ
¶β
X (−1)|β| X
xc − xlc  y − xnc

Alα hα+β
β!
h
h
α≥0
β≥0


µ l
¶ µ
¶
n
n β
X (−1)|β| X
x
−
x
c
c  y − xc

Alα (−1)|α|+|β| hα+β
β!
h
h
α≥0
β≥0


µ l
¶ µ
¶β
X 1 X
xc − xnc  y − xnc

Alα (−1)|α| hα+β
β!
h
h
µ
X
X
β≥0
¶

X
α≥0
β≥0
=
xlc
µ
Cβln
y − xnc
h
where,
Cβln
¶β
(43)
µ l
¶
xc − xnc
1 X l
|α|
Aα (−1) hα+β
=
.
β!
h
(44)
α≥0
There are two sources of truncation. First we truncate the Hermite series coefficients
Alα and then we truncate the coefficients Cβln . The error due to truncation of both
the Taylor series and the Hermite series is bounded by

Ã √
!d−k 2
d−1 µ ¶ ³
X
√ p ´k ( 2r)p
d
Ql

 (45)
√
√
1 − ( 2r)
ET H (p) ≤ ET (p) +
p!
(1 − 2r)2d k=0 k
FMM tutorial April 8, 2006
Writing TRs is a character building process!
√
P
where Ql = xi ∈I1 (l) qi and r < 1/ 2. See Appendix 3 for the derivation.
The total potential at y ∈ I1 (n) due to all sources is
"
#µ
µ
¶
¶β
n β
X
XX
X X
y
−
x
y − xnc
c
l
ln
ln
b
b (y) =
bβ
bβ
Φ(y)
=
=
.
Φ
C
C
h
h
∀l
where,
∀l β<p
β<p
·
9
(46)
∀l
¶
µ l
1 X l
xc − xnc
ln
|α|
b
Cβ =
A (−1) hα+β
.
β! α<p α
h
(47)
The total computational complexity will be O((2n + 1)d dpd+1 Nbox ) + O(pd M ) +
O(pd N ).
7. ALGORITHM
A formal description of the algorithm and detailed analysis of the computational
cost can be seen in the original paper [Greengard and Strain 1991].
8. DATA STRUCTURES
The first step of the√
algorithm is the spatial subdivision of the unit hypercube into
d
d
Nside
boxes of side 2rh were r < 1/2. The boxes are numbered from 1 to Nside
in the order of increasing dimensionality d. Fig. 1 shows an example of spatial
ordering in two dimensions.
12
13
14
15
8
9
10
11
4
5
6
7
0
1
2
3
(a)
Fig. 1.
Example of spatial ordering in two dimensions.
We need the following operations to implement the FGT.
(1) Given x = (x1 , x2 , . . . , xd ) ∈ Rd find the corresponding box number to which
x belongs.


d
X
j−1 
BoxN umber =  bxj Nside cNside
.
(48)
j=1
(2) Given the box number find the center of the box. First write BoxN umber in
base Nside .
(BoxN umber)(10) = (ad ad−1 . . . a2 a1 )(Nside ) .
The j
th
coordinate of the box center is given by
√
1
BoxCenterj = ( 2rh)(aj + ).
2
(49)
(50)
FMM tutorial April 8, 2006
10
·
Vikas Chandrakant Raykar
(3) List the (2n + 1)d neighbors of a given box. First write BoxN umber − 1 in
base Nside .
(BoxN umber)(10) = (ad ad−1 . . . a2 a1 )(Nside ) .
(51)
Increment or decrement each aj as follows
bji = aj + i for i = 0, ±1, ±2, . . . , ±n.
(52)
The base Nside representation of the (2n + 1)d neighbors is as follows
(bdid . . . b2i2 b1i1 )(Nside ) .
(53)
d
where (id , . . . , i2 , i1 ) ∈ {0, ±1, ±2, . . . , ±n} . The box number is given by converting it into base 10 representation.
(bdid . . . b2i2 b1i1 )(Nside ) = (BoxN umber)(10)
(54)
REFERENCES
Baxter, B. J. C. and Roussos, G. 2002. A new error estimate of the fast gauss transform. SIAM
Journal of Scientific and Statistical Computing 24, 1, 257–259. 2, 3, 13
Greengard, L. and Strain, J. 1991. The fast Gauss transform. SIAM Journal of Scientific and
Statistical Computing 12, 1, 79–94. 2, 9, 13, 14
Hille, E. 1926. A class of reciprocal functions. The Annals of Mathematics, 2nd Ser. 27, 4 (Jun.),
427–464. 3
Raykar, V. C. 2005. A short primer on the fast multipole method. Available online on the
author’s website. 1
FMM tutorial April 8, 2006
Writing TRs is a character building process!
·
11
FMM tutorial April 8, 2006
12
·
Vikas Chandrakant Raykar
9. APPENDICES
9.1 Appendix 1 [Hermite series truncation error]
The error EH (p) due to truncating the series after pd terms is
¯
¯
¯X
¯ µ
µ
¶¯ X
¶¯
l ¯
¯
¯ l ¯¯
y
−
x
y − xlc ¯¯
c ¯
l
¯
¯
¯
¯
EH (p) = ¯
Aα hα
Aα ¯hα
¯≤
¯
h
h
¯α≥p
¯ α≥p
µ
¶¯
d ¯
l
X¯ ¯Y
¯
¯
¯hαj (y)j − (xc )j ¯ [(y)j is the j th coordinate of y.]
¯Alα ¯
=
¯
¯
h
j=1
α≥p
¯
¯
µ
¶ ¯ d ¯
µ
¶¯
l α¯ Y
X ¯¯ 1 X
¯
x
−
x
(y)j − (xlc )j ¯¯
i
c
¯
¯
¯
=
qi
¯ α!
¯
¯hαj
¯
h
h
¯ j=1
α≥p ¯
xi ∈I1 (l)
¯
¯
¯µ
¯
¶ d
µ
¶¯
X ¯¯ X
√
¯ r αY
1 ¯¯
(y)j − (xlc )j ¯¯
¯
¯
≤
qi ¯ √
hα j
[Since the box side is 2rh]
¯
¯
¯
α
!
h
2
j=1 j
α≥p ¯xi ∈I1 (l) ¯


¯
¶¯
µ
d
Y
X
X
¯
(y)j − (xlc )j ¯¯
αj −αj /2 1 ¯


r 2
≤
|qi |
hα
¯
αj ! ¯ j
h
α≥p
= Ql
j=1
xi ∈I1 (l)
X
d
Y
rαj 2−αj /2
α≥p j=1
¯
¶¯
µ
1 ¯¯
(y)j − (xlc )j ¯¯
h
α
¯ [Define Ql =
αj ! ¯ j
h
X
|qi |]
xi ∈I1 (l)
d
XY
l
2
2
1
p rαj 2−αj /2 2−αj /2 e−((y)j −(xc )j ) /2h [Cramer’s inequality]
α
!
j
α≥p j=1


d
d
d
XY
XY
XY
1
1
1
p rαj = Ql 
p r αj −
p r αj 
≤ Ql
α
!
α
!
αj !
j
j
α<p j=1
α≥p j=1
α≥0 j=1





d X
d X
 Y

Y
1
1
p r αj −
p rαj 
= Ql 




αj !
αj !
αj <p
j=1
j=1
αj ≥0

d 
d 
X 1



X
X
1
1


p r αj +
p r αj
p r αj 
= Ql 
−
α <p αj !



αj !
αj !
αj <p
α ≥p
j
≤ Ql
j
k 
d−k
X 1
1

p r αj  
p rαj 
= Ql
[Binomial theorem]
k
α
!
αj !
j
αj <p
αj ≥p
k=0
k 

d−k
d−1 µ ¶
X
d  X αj   r p X αj 
√
r
≤ Ql
r
k
p! α ≥0
α <p
k=0
d−1 µ ¶
X
d

X
j
j
¶d−k
rp 1
1−r
√
[Geometric series r < 1.]
1−r
p! 1 − r
k=0
µ p ¶d−k
d−1 µ ¶
X
FMM tutorial April
Ql 8, 2006
d
r
p k
√
=
(1 − r )
d
(1 − r)
k
p!
≤ Ql
d−1 µ
X
d
k
¶µ
k=0
¶ µ
p k
(55)
Writing TRs is a character building process!
·
13
The error bound in the original FGT paper [Greengard and Strain 1991] was shown
to be incorrect and a new bound was proposed by [Baxter and Roussos 2002]. This
is the new error estimate as in [Baxter and Roussos 2002] but derived in a slightly
different way.
9.2 Appendix 2 [Taylor series truncation error]
The error ET (p) due to truncating the series after pd terms is (derivation very
similar to EH (p))
ET (p) =
≤
=
≤
≤
¯
¯
¯
¶ ¯
µ
¯ X ln y − xnc β ¯
¯
¯
Bβ
¯
¯
h
¯β≥p
¯
¯µ
¶ ¯
X¯
¯ ¯¯ y − xnc β ¯¯
ln
¯Bβ ¯ ¯
¯
¯
¯
h
β≥p
¯
¯
¯
¶¯ ¯µ
¶ ¯
µ
n ¯¯
n β¯
X 1 ¯ X
x
−
x
y
−
x
¯
¯
i
c ¯
c
¯
q i hβ
¯ [Eq. 37.]
¯
¯
¯
¯
¯
β! ¯
h
h
¯
β≥p
xi ∈I1 (l)


½ ¯ µ
¶¯¾ ¯¯µ
¶ ¯
n ¯
n β¯
X
X
¯
1
x
−
x
y
−
x
¯
¯
i
c ¯ 
c
¯hβ

|qi |
¯
¯
¯
¯
¯
β! ¯
h
h
β≥p xi ∈I1 (l)


¶¯¾
½ ¯ µ
n ¯
X
X
¯
x
−
x
1
i
c ¯  β −β/2
¯hβ

r 2
|qi |
¯
β! ¯
h
β≥p
xi ∈I1 (l)


¯
µ
¶¯
d
Y
n
¯
¯
1
(x
)
−
(x
)
i
j
j
c
¯h β j
¯  rβ 2−β/2

=
|qi |
¯
¯

β
!
h
j
j=1
β≥p xi ∈I1 (l)



d
Y

X
X
1

p 2βj /2  rβ 2−β/2
≤
|qi |


βj !
j=1
β≥p x ∈I (l)

X
X
i
X
1
d
Y
1
p rβj
βj !
β≥p j=1
µ p ¶d−k
d−1 µ ¶
X
Ql
d
r
p k
√
≤
(1
−
r
)
(1 − r)d
k
p!
= Ql
(56)
k=0
FMM tutorial April 8, 2006
14
·
Vikas Chandrakant Raykar
9.3 Appendix 3 [S|R truncation error]
Using the expression for Alα , Cβln can be simplified as follows.
¶
µ l
1 X l
xc − xnc
Aα (−1)|α| hα+β
β!
h
α≥0


µ
¶α
µ l
¶
1 X 1 X
xi − xlc 
xc − xnc
|α|
qi
(−1) hα+β
=
β!
α!
h
h
α≥0
xi ∈I1 (l)


µ l
¶
n
X 1 µ xi − xl ¶α
1 X
x
−
x
c
c
c 
=
qi 
(−1)|α| hα+β
β!
α!
h
h
α≥0
xi ∈I1 (l)
¶
µ
X
1
xi − xnc
.
=
qi hβ
β!
h
Cβln =
(57)
xi ∈I1 (l)
Note that this is exactly the same as Bβln the R-expansion coefficients, which should
be the case. There is a error in the original paper [Greengard and Strain 1991] where
there is an extra (−1)|β| term.
The error ET H (p) due to truncating both the series after pd terms is
ET H (p) =
=
=
≤
=
¯
¯
¯
µ
¶ ¯
¯ X ln y − xnc β ¯
bβ
¯
¯
C
¯
¯
h
¯β≥p
¯
¯
¯
¯X ³
´ µ y − xn ¶β ¯¯
¯
c
bβln − Cβln )
¯
¯
Cβln + (C
¯
¯
h
¯
¯β≥p
¯
¯
¯
µ
¶
¶ ¯
µ
n β¯
¯ X ln y − xnc β X ln
y
−
x
c
bβ − Cβln )
¯
¯
+
(C
Cβ
¯
¯
h
h
¯
¯β≥p
β≥p
¯ ¯
¯
¯
¯
¶ ¯ ¯
¶ ¯
µ
µ
n β¯
¯ X ln y − xnc β ¯ ¯ X ln
y
−
x
c
b − C ln )
¯+¯
¯ [Triangle inequality.]
¯
Cβ
(C
β
β
¯ ¯
¯
¯
h
h
¯ ¯β≥p
¯
¯β≥p
¯
¯
¯
¶β ¯
µ
¯
¯ X ln
y − xnc
ln
b
¯
¯
ET (p) + ¯
(58)
( Cβ − Cβ )
¯
h
¯
¯β≥p
FMM tutorial April 8, 2006
Writing TRs is a character building process!
·
15
b ln can be simplified as follows
C
β
¶
µ l
1 X l
xc − xnc
ln
|α|
b
Cβ =
A (−1) hα+β
β! α<p α
h


µ
µ l
¶α
¶
1 X 1 X
xi − xlc 
xc − xnc
=
qi
(−1)|α| hα+β
β! α<p α!
h
h
xi ∈I1 (l)
"
¶#
µ l
X 1 µ xi − xl ¶α
1 X
xc − xnc
c
|α|
=
qi
(−1) hα+β
β!
α!
h
h
α<p
xi ∈I1 (l)



µ
¶
¶
µ l
l α
n
X
X
X
1
xc − xc 
 1 xi − xc
=
qi 
−
(−1)|α| hα+β
β!
α!
h
h
α≥0
α≥p
xi ∈I1 (l)


µ
¶
¶
µ l
l α
n
X
X
1
1 xi − xc
xc − xc 
= Cβln −
qi 
(−1)|α| hα+β
β!
α!
h
h
xi ∈I1 (l)
=
Cβln
+
bβln
(C
−
Cβln )
α≥p
(59)
FMM tutorial April 8, 2006
16
=
≤
≤
≤
·
Vikas Chandrakant Raykar
¯
¯
¯
¯X
¶ ¯ X¯
¶ ¯
µ
¯ ¯µ
n β¯
n β¯
¯
y
−
x
¯ b ln
¯
c
ln ¯ ¯ y − xc
ln
ln
b
¯
¯
(Cβ − Cβ )
¯(Cβ − Cβ )¯ ¯
¯
¯
¯≤
¯
¯
h
h
¯β≥p
¯ β≥p
¯

¯
µ l
¶ ¯ ¯¯µ
¶ ¯
n
n β¯
X 1 ¯¯ X
X 1 µ xi − xl ¶α
¯
x
−
x
¯
c
c
c ¯ ¯ y − xc
|α|
¯

qi
(−1) hα+β
¯
¯ ¯¯
¯
β! ¯¯
α!
h
h
h
¯
α≥p
β≥p
xi ∈I1 (l)
¯ ¯
¯

¯X
µ
¶α
¶¯ µ
¶β ¯¯
µ l
X
X 1
¯
1 xi − xlc
xc − xnc ¯¯ ¯¯ y − xnc
¯
|α|
¯

(−1) hα+β
|qi | ¯
¯
¯ ¯¯
¯
β!
α!
h
h
h
¯
¯
α≥p
β≥p
xi ∈I1 (l)

¯
¯
µ l
¶¯ µ
¶β ¯¯
X 1
X
X 1 ¯¯µ xi − xl ¶α ¯¯ ¯
xc − xnc ¯¯ ¯¯ y − xnc
¯
c
¯
¯
¯

hα+β
|qi |
¯
¯
¯
¯
¯
¯
¯
¯
β!
α!
h
h
h
α≥p
β≥p
xi ∈I1 (l)


¯
µ l
¶¯
n ¯
X 1
X
X 1
¯
x
−
x
c ¯ β −β/2
c

|qi |
rα 2−α/2 ¯¯hα+β
¯ r 2
β!
α!
h
β≥p
xi ∈I1 (l)
α≥p


¶¯
µ l
d ¯
Y
n
X
X 1
X 1
¯
¯
(x
)
−
(x
)
j
j
c
c
¯hα +β
¯  rβ 2−β/2

|qi |
≤
rα 2−α/2
¯ j j
¯

β!
α!
h
j=1
α≥p
β≥p
xi ∈I1 (l)



¯
d ¯q
Y
X
X 1
X 1
¯
l
n
2
2¯
¯ (αj + βj )!2αj /2 2βj /2 e−((xc )j −(xc )j ) /2h ¯  rβ 2−β/2

|qi |
≤
rα 2−α/2
¯
¯

β!
α!
j=1
α≥p
β≥p
xi ∈I1 (l)

 
d q
Y

XX 1 1
rα
≤ Ql
(αj + βj )!  rβ


α! β!

j=1
α≥p β≥p
≤
=
≤
≤
≤
≤
p
XX
(αj + βj )!
αj +βj
Ql
r
αj !βj !
α≥p β≥p j=1
s
s
d
XXY
1
(αj + βj )!
αj +βj
Ql
r
α
!β
!
αj !βj !
j
j
α≥p β≥p j=1
s
s
d
d
XXY
XXY
√ αj +βj
1
1 ³√ ´αj +βj
αj +βj
Ql
2
≤ Ql
2r
r
αj !βj !
αj !βj !
α≥p β≥p j=1
α≥p β≥p j=1


s
s
d
d
´
´
³
³
Y
X
XY
α
β
√
√
j
j
1
1


Ql
2r
2r
α
!
β
j
j!
α≥p j=1
β≥p j=1

s
Ã √
!d−k 
d
d−1 µ ¶ ³
XY
X
√ p ´k ( 2r)p
1 ³√ ´αj 
1
d

√
√
Ql
2r
1 − ( 2r)
αj !
p!
(1 − 2r)d k=0 k
α≥p j=1

Ã √
!d−k 2
d−1 µ ¶ ³
X
√ p ´k ( 2r)p
√
d
Ql

 [r < 1/ 2]
√
√
1 − ( 2r)
p!
(1 − 2r)2d k=0 k
d
Y
FMM tutorial April 8, 2006
(60)