WP2 - Data Management
L.M.Barone
Università di Roma & INFN
L.M.Barone – INFN Rome
Commissione Nazionale I
13 September 2000
WP Goals
“...to permit the secure access of massive amounts of data...to move and
replicate data at high speed from one site to another and to manage the
synchronisation of remote data copies” (from the DataGrid Technical Annex)
Keywords
• Automation
• Caching
• Generic Interface
• MetaData
• Data Mover
• Replica Manager
• Security
People
Site    Name          FTE
Bari    L.Silvestris  0.3
        G.Zito        0.5 (0.3)
Pisa    S.Arezzini    0.3 (0.3)
        A.Controzzi   0.5
        F.Donno       0.2 (0.2)
        F.Schifano    0.2
Roma1   L.M.Barone    0.3 (0.3)
        A.Lonardo     0.3
        A.Michelotti  0.3
        G.Organtini   0.2
        D.Rossetti    0.2 (0.2)
Deliverables
• Requirements for Data Location Broker (5/2001)
• Definition of a metadata syntax (7/2001)
• Replica Management at file level (12/2001)
An Example
• Ideas for a Replica Manager:
  – Management of production in a distributed environment:
    • Data are produced at many sites
    • Data are collected at a single reference site
    • Data are analyzed at many sites
    • Data are sometimes moved, sometimes accessed over the network
• A case study with Objectivity/DB
  – can be extended to any kind of file
Cloning federations
[Diagram "Clone FD". Labels: CERN FD (CERN Boot), RC1 FD (RC1 Boot), RC2 FD (RC2 Boot); databases DB1, DB2, DB3, DBn, DB_a, DB_b.]
Productions
[Diagram. Labels: CERN FD (CERN Boot), RC1 FD (RC1 Boot), RC2 FD (RC2 Boot); GDMP (shown four times); databases DB1, DB2, DB3, DBn, DBn+1, DBn+m, DBn+m+1, DBn+m+k.]
Analysis
[Diagram. Labels: CERN FD (CERN Boot), RC1 FD (RC1 Boot), RC2 FD (RC2 Boot); databases DB1, DB2, DB3, DBn, plus DBn+1, DBn+m, DBn+m+1 and DBn+m+k, each of the latter four shown twice.]
Logical vs Physical Datasets
Dataset: H 2
pccms1.bo.infn.it::/data1/Hmm1.hits.DB
Hmm.1.hits.DB
id=12345
Hmm.2.hits.DB
id=12346
shift23.cern.ch::/db45/Hmm1.hits.DB
pccms1.bo.infn.it::/data1/Hmm2.hits.DB
shift23.cern.ch::/db45/Hmm2.hits.DB
pccms3.pd.infn.it::/data3/Hmm2.hits.DB
Hmm.3.hits.DB
id=12347
Dataset: H 2e
shift23.cern.ch::/db45/Hmm3.hits.DB
pccms5.roma1.infn.it::/data/Hee1.hits.DB
Hee.1.hits.DB
id=5678
Hee.2.hits.DB
id=5679
shift49.cern.ch::/db123/Hee1.hits.DB
pccms5.roma1.infn.it::/data/Hee2.hits.DB
shift49.cern.ch::/db123/Hee2.hits.DB
pccms5.roma1.infn.it::/data/Hee3.hits.DB
Hee.3.hits.DB
L.M.Barone – INFN Rome
id=5680
shift49.cern.ch::/db123/Hee3.hits.DB
Commissione Nazionale I
13 Settembre 2000
Logical vs Physical Datasets
• Each dataset is composed of one or more databases
  – datasets are managed by application software
• Each DB is uniquely identified by a DBid
  – assigning a DBid creates the logical-db
• The physical-db is the file
  – zero, one or more instances
• The GIS manages the link between a dataset, its logical-dbs and its physical-dbs
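The relation between a dataset, its logical-dbs and its physical-dbs can be sketched as a small data model. This is only an illustration under stated assumptions, not the GIS design: the class names `PhysicalDb`, `LogicalDb` and `Dataset` and the helper `locations` are hypothetical, while the DBids, hosts and paths are taken from the example slide.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PhysicalDb:
    """One file instance of a logical database (hypothetical name)."""
    host: str
    path: str

    def __str__(self) -> str:
        # Same "host::path" notation used on the slides.
        return f"{self.host}::{self.path}"


@dataclass
class LogicalDb:
    """A logical database, uniquely identified by its DBid."""
    dbid: int
    name: str
    # Zero, one or more physical instances.
    replicas: list[PhysicalDb] = field(default_factory=list)


@dataclass
class Dataset:
    """A dataset groups one or more logical databases."""
    name: str
    dbs: list[LogicalDb] = field(default_factory=list)


def locations(dataset: Dataset) -> dict[int, list[str]]:
    """Map each DBid in the dataset to its physical locations."""
    return {db.dbid: [str(r) for r in db.replicas] for db in dataset.dbs}


# Two of the logical-dbs from the example slide.
hmm = Dataset("Hmm", [
    LogicalDb(12345, "Hmm.1.hits.DB", [
        PhysicalDb("pccms1.bo.infn.it", "/data1/Hmm1.hits.DB"),
        PhysicalDb("shift23.cern.ch", "/db45/Hmm1.hits.DB"),
    ]),
    LogicalDb(12347, "Hmm.3.hits.DB", [
        PhysicalDb("shift23.cern.ch", "/db45/Hmm3.hits.DB"),
    ]),
])
```

Calling `locations(hmm)` returns, for each DBid, the list of `host::path` strings where a copy of that database can be found.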
Database creation
[Diagram. Labels: hosts shift.cern.ch (CERN FD) and pc.rc1.net (RC1 Prod, RC1 Ref); databases DB1, DB2, DB3, DB4, DB5, with DB4 and DB5 each shown twice.]

DBid  File    Locations
0001  DB1.DB  shift.cern.ch::/shift/data
0002  DB2.DB  shift.cern.ch::/shift/data
0003  DB3.DB  shift.cern.ch::/shift/data
0004  DB4.DB  pc.rc1.net::/pc/data, shift.cern.ch::/shift/data
0005  DB5.DB  pc.rc1.net::/pc/data, shift.cern.ch::/shift/data
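The creation step on this slide (a new database receives a DBid and its first physical location is recorded) might look like the following minimal sketch. The `Catalog` class and its methods are hypothetical names introduced for illustration, and sequential DBid allocation is an assumption; the slides do not say how DBids are actually assigned.

```python
import itertools


class Catalog:
    """Toy catalog (hypothetical): DBid -> (file name, list of 'host::dir')."""

    def __init__(self) -> None:
        # Assumption: DBids are handed out sequentially, starting at 1.
        self._next_id = itertools.count(1)
        self.entries: dict[int, tuple[str, list[str]]] = {}

    def create_db(self, name: str, host: str, directory: str) -> int:
        """Create a logical-db (assign a DBid) and record where its file is."""
        dbid = next(self._next_id)
        self.entries[dbid] = (name, [f"{host}::{directory}"])
        return dbid

    def rows(self) -> list[str]:
        """Render the catalog as text rows, one per DBid."""
        return [
            f"{dbid:04d} {name} {', '.join(locs)}"
            for dbid, (name, locs) in sorted(self.entries.items())
        ]


catalog = Catalog()
for n in (1, 2, 3):
    # DB1..DB3 are created at the CERN host.
    catalog.create_db(f"DB{n}.DB", "shift.cern.ch", "/shift/data")
# DB4 is created at the regional-centre production host.
db4 = catalog.create_db("DB4.DB", "pc.rc1.net", "/pc/data")
```

In this sketch a database created at a regional centre is known to the catalog only at that centre until a copy is registered elsewhere.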
Replica Management
[Diagram. Labels: hosts shift.cern.ch (CERN FD), pc1.bo.infn.it (BO Ref) and pc1.pd.infn.it (PD Ref); databases DB1, DB2, DB3, with DB1 and DB2 each shown twice.]

DBid  File    Locations
0001  DB1.DB  shift.cern.ch::/shift/data, pc1.bo.infn.it::/data
0002  DB2.DB  shift.cern.ch::/shift/data, pc1.bo.infn.it::/data
0003  DB3.DB  shift.cern.ch::/shift/data
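A replica manager working on such a catalog needs at least two operations: finding which databases a site is missing, and recording a new copy once it has been transferred. The sketch below is illustrative only. The function names `missing_at` and `register_replica` are hypothetical, the initial catalog state is an assumed snapshot, and the file transfer itself (the job of a tool such as GDMP) is deliberately left out.

```python
# Assumed snapshot: DBid -> list of "host::directory" locations.
catalog: dict[int, list[str]] = {
    1: ["shift.cern.ch::/shift/data", "pc1.bo.infn.it::/data"],
    2: ["shift.cern.ch::/shift/data", "pc1.bo.infn.it::/data"],
    3: ["shift.cern.ch::/shift/data"],
}


def missing_at(cat: dict[int, list[str]], host: str) -> list[int]:
    """Return the DBids that have no physical copy on the given host."""
    return sorted(
        dbid
        for dbid, locs in cat.items()
        if all(loc.split("::", 1)[0] != host for loc in locs)
    )


def register_replica(cat: dict[int, list[str]], dbid: int, location: str) -> None:
    """Record a new physical copy of an existing logical-db.

    Only the catalog is updated here; copying the file is out of scope.
    """
    if dbid not in cat:
        raise KeyError(f"unknown DBid {dbid}")
    if location not in cat[dbid]:
        cat[dbid].append(location)


todo_bo = missing_at(catalog, "pc1.bo.infn.it")   # databases Bologna lacks
todo_pd = missing_at(catalog, "pc1.pd.infn.it")   # Padova has no copies yet
for dbid in todo_bo:
    register_replica(catalog, dbid, "pc1.bo.infn.it::/data")
```

Keeping the "what is missing" query separate from the "record a copy" update lets the same catalog serve both production, where new files are pushed, and analysis, where a site pulls only what it needs.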
Example Summary
• Basic functionalities of a Replica Manager
for production will be tested by end of 2000
on CMS production (GDMP)
• Next comes an Information Server to allow
easy synchronization of federations and
optimized data access during analysis
• The same functionalities shown for
Objectivity/DB may/should be
implemented for other kinds of files
Conclusions
• Data Management Tools are needed to
cope with the complexity of new-generation
experiments (not only LHC)
• The GRID projects (INFN and EU) are
already providing solutions to real-life
problems
• Milestones and objectives are well defined
(meeting them will not be trivial)