An FPGA Implementation of the
Two-Dimensional Finite-Difference
Time-Domain (FDTD) Algorithm
Wang Chen
Panos Kosmas
Miriam Leeser
Carey Rappaport
Northeastern University
Boston, MA
FDTD Algorithm and Implementation
- Finite-Difference Time-Domain
  - Method for solving Maxwell's equations
  - Used for buried object detection
- Hardware Implementation
  - 3D to 2D model simplification
  - Data dependency analysis
  - Fixed-point quantization
Finite-Difference Time-Domain Method
- A direct time-domain solution of Maxwell's equations
- Accurate and flexible for solving electromagnetic problems
- Discretizes time and the electromagnetic space
FDTD Method (cont'd)
[Figure: the Yee cell, with electric field components (Ex, Ey, Ez) on the cell edges and magnetic field components (Hx, Hy, Hz) on the face centers at half-cell offsets such as (i, j+1/2, k+1/2). One FDTD update equation, derived via Taylor series expansion, couples each field component to the fields of adjacent cells along the X, Y, and Z axes.]
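The Yee-cell leapfrog update can be sketched for the 2D case used later in this deck (an out-of-plane Ey with in-plane Hx and Hz). This is a minimal illustration in normalized units; the grid size, time step, and Gaussian excitation are assumptions for the sketch, not the deck's actual parameters:

```python
import numpy as np

def fdtd_2d(nx=100, nz=100, steps=200):
    """Minimal 2D FDTD loop: Ey out of plane, Hx and Hz in plane.

    Normalized units (c = dx = dz = 1); dt = 0.5 satisfies the 2D
    Courant stability condition dt <= 1/sqrt(2).
    """
    dt = 0.5
    ey = np.zeros((nx, nz))        # E on integer grid points, time t = n
    hx = np.zeros((nx, nz - 1))    # H staggered half a cell, t = n + 0.5
    hz = np.zeros((nx - 1, nz))
    for n in range(steps):
        # H update (t = n + 0.5) from the E field of adjacent cells.
        hx += dt * (ey[:, 1:] - ey[:, :-1])    # dEy/dz
        hz -= dt * (ey[1:, :] - ey[:-1, :])    # -dEy/dx
        # E update (t = n + 1) from the freshly updated H fields.
        ey[1:-1, 1:-1] += dt * ((hx[1:-1, 1:] - hx[1:-1, :-1])
                                - (hz[1:, 1:-1] - hz[:-1, 1:-1]))
        # Gaussian excitation at the transmitting-antenna cell (illustrative).
        ey[nx // 2, nz // 2] += np.exp(-((n - 30) / 10.0) ** 2)
    return ey
```

Each iteration performs the leapfrog pair that the forward-model flowchart shows: E at integer time steps and H at half steps (the exterior boundary conditions are omitted in this sketch).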
FDTD Applications
- Antenna design
- Discrete scattering studies
- Medical studies
  - The effect of cell-phone electromagnetic waves on the human brain
  - Breast cancer detection using electromagnetic antennas
Buried Object Detection Forward Model
[Figure: flowchart of the forward model alongside the buried-object model space. The model space (X, Y, Z axes) contains a transmitting antenna, a receiving antenna, and a buried object (mine). The loop: Initialization → Excitation → Calculate E Field (t = n) → Exterior Boundary Conditions → Calculate H Field (t = n + 0.5) → Time over? If yes, End; if no, n = n + 1 and go to the next time step.]
FDTD Simulated Model Space
[Figure: the simulated model space.]
FDTD Simulated Model Space (cont'd)
[Figure: the simulated model space, continued.]
Related Work
- Software acceleration of FDTD
  - Parallel computers do not provide significant speedup
- FPGA implementations of FDTD
  - 1D FDTD on hardware: the architecture is too simple
  - Full 3D FDTD on hardware developed at UDel
    - Design is slower than software: it uses a complex floating-point representation, with no parallelism or pipelining
- Our 2D FDTD hardware implementation
  - 24x speedup compared to a 3.0 GHz PC
  - Fixed-point representation
  - Expandable structure
3D to 2D Model Simplification
[Figure: the 3D forward model (left) simplifies to the 2D model (right); both contain the transmitting antenna, receiving antenna, and buried mine.]
Initialization (both models):
- Initialize parameters of the model space and time step
- Build parameters of the soil and buried object
- Load all the EM space data into memory
3D model, per time step:
- t = n: update the Exs, Eys, and Ezs fields
- Exterior boundary conditions: boundaries of EYX, EZX, EZY, EXY, EXZ, and EYZ
- t = n + 0.5: update the Hxs, Hys, and Hzs fields
2D model, per time step:
- t = n: update the Eys field
- Exterior boundary conditions: boundaries of EYX and EYZ
- t = n + 0.5: update the Hxs and Hzs fields
Loop: Time over? If yes, End; if no, n = n + 1 and go to the next time step.
Exterior Boundary Conditions
- Mur-type absorbing boundary condition
- 3D model space: 6 faces and 12 edges
- 2D model space: 4 edges
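A first-order Mur-type absorbing update for one edge cell can be sketched as follows. This is the textbook first-order Mur condition, not necessarily the exact form used in this design; the function name and normalized arguments are illustrative:

```python
def mur_first_order(e1_new, e1_old, e0_old, c_dt_over_dx):
    """First-order Mur ABC for the edge cell E[0], given its inward neighbor E[1].

    E0^{n+1} = E1^n + ((c*dt - dx) / (c*dt + dx)) * (E1^{n+1} - E0^n)
    """
    k = (c_dt_over_dx - 1.0) / (c_dt_over_dx + 1.0)
    return e1_old + k * (e1_new - e0_old)
```

In the 2D model this kind of update is applied along each of the 4 edges after the interior field update; in 3D the 6 faces and 12 edges each need their own variant, which is why the boundary logic shrinks so much in the 2D simplification.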
Data Dependency Analysis
[Figure: the 2D update loop annotated with its data dependencies. The M x N cell model space is processed in sequence, buffering 2 rows of field data at a time (buffers A and B), across consecutive time steps T-3, T-2, T-1, T. Per time step: calculate the Hzs and Hxs fields from the memory space for electric field data, then the Eys field from the memory space for magnetic field data, apply the exterior boundary conditions (boundaries of EYX and EYZ), and advance n = n + 1 until time is over.]
Hardware Acceleration
- Smart memory interface
- Parallelism
- Pipelining
- Quantized fixed-point representation
  - Less area in the datapath, allowing more parallelism
  - Careful error analysis to ensure accurate results

Fixed-point format: S AA . BBBBBBBBBBBBBBBBBBBBBBBBBB
(sign and two integer bits, bit positions 2 … 0, 3 bits total; 26 fractional bits, bit positions -1 … -26)
Fixed-point Quantization
[Figure: average relative error (%) versus bit-width (24, 25, 26, 27, 28 bits) for five signals: the electric field value at R1 and R2, the magnetic field value at R1 and R2, and the source data; the errors range up to about 2%.]
Design Flow
[Figure: design flow.]
Firebird FPGA Board from Annapolis
- A Xilinx VIRTEX-E XCV2000E with 2.5 million system gates
- Processing clock up to 150 MHz; FDTD runs at 70 MHz
- Five independent memory banks (4 x 64-bit, 1 x 32-bit), 288 MB in total
- 6.6 GB/s of memory bandwidth
- 3 GB/s of I/O bandwidth

Utilization of the Xilinx XCV2000E FPGA chip:

                   Slices   BlockRAM
Number available   19200    160
Number used        8837     86
Percentage used    46%      54%
FDTD on Firebird Board
[Figure: system block diagram. The PC host holds the simulated electromagnetic space in memory and exchanges data with the Firebird board over the PCI bus. On the FPGA, the design comprises a memory interface, an electric field pipeline module, a magnetic field pipeline module, and a boundary conditions module, connected to the on-board memories.]
Memory Interface
[Figure: memory interface detail. EYS, HXS, and HZS field words stream from the on-board memories through input BlockRAMs on the FPGA chip, interleaved in groups of four (0-3) across banks A-D. The pipelined electric field module, magnetic field module, and boundary module of the design consume them, and the results return through output BlockRAMs to the on-board memories.]
Pipelining and Parallelism
[Figure: the ten-stage pipelined datapath (stages 0-9). The early stages read the field values (Hxs, Hzs, Eys from buffers A, B, C) and the coefficients DTin_1 and DTin_2; the middle stages perform the multiplications in parallel; the final stages perform the additions and subtractions and write the updated Eys, Hxs, and Hzs values back. The three field updates proceed through the pipeline in parallel.]
Data Flow
[Figure: data flow through the design. The electric field, magnetic field, and source-data on-board memories feed BlockRAMs via the memory interface module. The Eys pipeline, the boundary pipeline, the source adder, and the Hxs and Hzs pipelines compute the updates, and the results return through BlockRAMs and the memory interface to the electric field and magnetic field on-board memories.]
Results and Performance
Performance results (model space 100 x 100 cells, 200 time steps):
A. Software, floating-point: Fortran code on a 440 MHz Sun workstation, ~25 s
B. Software, fixed-point: C code on a 3.0 GHz PC, ~3.375 s
C. Hardware: design running at 70 MHz, ~0.145 s
[Figure: bar chart of execution time (seconds) for A, B, and C.]
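The speedup figures quoted elsewhere in the deck follow directly from these timings, a quick check:

```python
# Timings from the performance chart above (seconds).
t_sun, t_pc, t_fpga = 25.0, 3.375, 0.145

speedup_vs_pc = t_pc / t_fpga    # ~23.3x, rounded to 24x in the deck
speedup_vs_sun = t_sun / t_fpga  # ~172x vs. the Sun workstation
```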
Conclusions
- The FPGA implementation of FDTD exhibits significant speedup over software: 24 times faster than a 3 GHz PC
- With a larger FPGA, more parallelism becomes available, and hence more speedup
- The current design is easily extendible to handle multiple types of materials and 3D space
Future Work
- Upgrade the current design to handle multiple types of materials
- Upgrade to a 3D model space
  - Add three more field-updating algorithms, with the same structure as the original three
  - Upgrade the boundary-condition updating algorithm
  - Redesign the memory interface
- Apply the FDTD hardware to other applications