ES
Experiment Support
AliEn v2-20 and beyond
Workshop dei Tier-2 italiani di ALICE
A. Abramyan, S. Bagnasco, L. Betev, D. Goyal,
A. Grigoras, C. Grigoras, M. Litmaath,
N. Manukyan, M. Martinez, J. Porter,
P. Saiz, S. Sankar, S. Schreiner
CERN IT Department
CH-1211 Geneva 23
Switzerland
www.cern.ch/it
Content
• What is AliEn
• New features in v2.20
– TaskQueue
– Catalogue
– Service communication
• What is next?
• Summary
19 Dec 2012
Pablo Saiz
Workshop dei Tier-2 italiani di ALICE
AliEn
• All components to create a GRID
• File Catalogue
– UNIX-like file system
– Mapping to physical files
– Metadata information
– SE discovery
• Transfer Model
– With different plugins
• TaskQueue
– Job Agent & pull model
– Automatic installation of software packages
– Simulation, reconstruction, analysis...
• Developed by ALICE
– Used by several communities
AliEn File Catalogue
• Global Unique name space
– Mapping from LFN to PFN
• UNIX-like file system interface
• Powerful metadata catalogue
• Automatic SE selection
• Integrated quota system
• Multiple storage protocols: xrootd, torrent, srm, file
• Collections of files
• Physical file archival
• Roles and users
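The LFN-to-PFN mapping can be pictured with a toy in-memory catalogue. This is only a sketch: the class name, the two-step LFN → GUID → PFN layout, the SE names and the URLs are made up for illustration and are not AliEn's actual schema.

```python
import uuid


class ToyCatalogue:
    """Hypothetical in-memory stand-in for a file catalogue."""

    def __init__(self):
        self.lfn_to_guid = {}   # logical file name -> GUID
        self.guid_to_pfns = {}  # GUID -> list of (storage element, physical URL)

    def register(self, lfn, se, pfn):
        """Register one replica of `lfn` on storage element `se`."""
        guid = self.lfn_to_guid.setdefault(lfn, str(uuid.uuid4()))
        self.guid_to_pfns.setdefault(guid, []).append((se, pfn))
        return guid

    def whereis(self, lfn):
        """Return every (SE, PFN) replica known for a logical file name."""
        return list(self.guid_to_pfns[self.lfn_to_guid[lfn]])


cat = ToyCatalogue()
lfn = "/alice/sim/run1/file.root"  # illustrative LFN
cat.register(lfn, "SE_A", "root://se-a.example.org//01/123/abc")
cat.register(lfn, "SE_B", "root://se-b.example.org//07/456/abc")
replicas = cat.whereis(lfn)  # two replicas behind one logical name
```

One logical name resolves to several physical replicas, which is what lets a client pick a storage element without the user ever seeing a PFN.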
Job execution
[Figure: job execution diagram. Central services: TaskQueue, Job Manager, Job Broker, File Catalogue (LFN, GUID, metadata). Sites A, B and C each show: CE, CREAM CE, JA (Job Agent), xrootd, MonALISA.]
New in v2.20
TaskQueue database layout
• Single DB
• Innodb tables
– Row locking
– Foreign keys
– Transactions
• not used…
• Lookup tables
• 2 JDLs per job
• JDL fields mapped to columns
• Link to full graph
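A minimal sketch of such a layout, using SQLite from the Python standard library in place of MySQL/InnoDB. All table and column names are hypothetical, and the slide does not say what the two JDLs per job are; the sketch assumes an original and a processed one.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
db.executescript("""
CREATE TABLE queue_user (            -- lookup table: user name stored once
    user_id INTEGER PRIMARY KEY,
    name    TEXT UNIQUE
);
CREATE TABLE queue (
    job_id   INTEGER PRIMARY KEY,
    user_id  INTEGER NOT NULL REFERENCES queue_user(user_id),
    status   TEXT NOT NULL,
    ttl      INTEGER,                -- JDL field mapped to a column
    packages TEXT                    -- JDL field mapped to a column
);
CREATE TABLE queue_jdl (             -- assumption: original + processed JDL
    job_id        INTEGER PRIMARY KEY REFERENCES queue(job_id),
    original_jdl  TEXT,
    processed_jdl TEXT
);
""")

with db:  # one transaction: all three rows are written, or none
    db.execute("INSERT INTO queue_user VALUES (1, 'some_user')")
    db.execute("INSERT INTO queue VALUES (100, 1, 'INSERTING', 3600, 'AliRoot')")
    db.execute(
        "INSERT INTO queue_jdl VALUES (100, ?, ?)",
        ('Executable = "sim.sh";', 'Executable = "sim.sh"; TTL = 3600;'),
    )

# A job that points at a non-existent user is rejected by the foreign key.
try:
    with db:
        db.execute("INSERT INTO queue VALUES (101, 99, 'INSERTING', 3600, 'AliRoot')")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```

Keeping frequently matched JDL fields in their own columns means they can be filtered in SQL without parsing the JDL text.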
Brokering
• Avoid ClassAd matching
– Fewer fields to parse
• Match in a single SQL statement.
• Four attempts at matching:
– With packages already installed
– With any packages
– With remote data and packages already installed
– With remote data, any packages
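The four attempts can be sketched as a cascade in which each attempt is one SQL statement. This is an illustration only, run against SQLite: the `waiting` table, its columns and the reading of the first two attempts as "data local to the site" are assumptions, not the actual AliEn broker.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE waiting (job_id INTEGER PRIMARY KEY, package TEXT, data_site TEXT)")
db.executemany("INSERT INTO waiting VALUES (?, ?, ?)", [
    (1, "AliRoot-v5", "SiteB"),
    (2, "AliRoot-v4", "SiteA"),
    (3, "AliRoot-v5", "SiteA"),
])
db.commit()


def match_job(site, installed):
    """Return (job_id, attempt number) for the first attempt that matches, else (None, None)."""
    marks = ",".join("?" * len(installed))
    pkg_ok = f"package IN ({marks})"
    attempts = [
        (f"data_site = ? AND {pkg_ok}", [site, *installed]),  # 1: local data, packages installed
        ("data_site = ?", [site]),                            # 2: local data, any packages
        (pkg_ok, list(installed)),                            # 3: remote data, packages installed
        ("1 = 1", []),                                        # 4: remote data, any packages
    ]
    for number, (where, params) in enumerate(attempts, start=1):
        row = db.execute(
            f"SELECT job_id FROM waiting WHERE {where} ORDER BY job_id LIMIT 1", params
        ).fetchone()
        if row:
            db.execute("DELETE FROM waiting WHERE job_id = ?", row)  # job is taken
            return row[0], number
    return None, None


first = match_job("SiteA", ["AliRoot-v5"])   # job 3 on attempt 1
second = match_job("SiteA", ["AliRoot-v5"])  # job 2 on attempt 2
third = match_job("SiteA", ["AliRoot-v5"])   # job 1 on attempt 3
```

Ordering the attempts from most to least restrictive means an agent only installs packages or reads remote data when no cheaper job is waiting.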
File brokering
• Current schema: submit 4 jobs
• Broker per file: submit 3 empty subjobs
– When a job starts, analyze as much as possible
– If nothing left, just exit
[Figure: Files 1 to 5 spread over Sites A, B and C. Labels in the figure: "File1, 2, 4, 5" and "File 3".]
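The per-file idea can be sketched as follows: subjobs are submitted empty, and each one claims whatever unprocessed files it finds locally when it starts. The replica layout below is hypothetical (the figure's exact placement is not recoverable from the transcript), chosen so that the outcome matches the "File1, 2, 4, 5" and "File 3" labels.

```python
# Hypothetical replica layout: file -> sites holding a copy.
replicas = {
    "File1": {"SiteA", "SiteB"},
    "File2": {"SiteA"},
    "File3": {"SiteB", "SiteC"},
    "File4": {"SiteA", "SiteC"},
    "File5": {"SiteA"},
}
pending = set(replicas)  # files nobody has analyzed yet


def start_subjob(site):
    """Claim every pending file with a replica at `site`; an empty list means 'just exit'."""
    claimed = sorted(f for f in pending if site in replicas[f])
    pending.difference_update(claimed)
    return claimed


first = start_subjob("SiteA")   # takes File1, File2, File4, File5
second = start_subjob("SiteB")  # takes File3
third = start_subjob("SiteC")   # nothing left, exits
```

Because files are assigned when a subjob actually starts, the site that gets a slot first does most of the work and late subjobs exit at once.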
More TaskQueue
• MaxWaitingTime: maximum time that a job can stay in ‘WAITING’
– If the time is exceeded, the job ends up in error
– New state: ERROR_EW (Expired Waiting)
• Retrial:
– Number of times that a single job can be resubmitted
– Resubmission done by central services
• Reusing JobId in resubmission
• Direct removal of KILLED jobs
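A minimal sketch of how the waiting-time expiry and the retry counter might interact. The `Job` fields, the function names and the rule that any job in an error state may be resubmitted are assumptions for illustration; only the WAITING and ERROR_EW state names come from the slide.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    status: str
    waiting_since: float     # seconds, same clock as `now`
    max_waiting_time: float  # seconds the job may spend in WAITING
    retries_left: int = 0


def expire_waiting(jobs, now):
    """Move jobs that waited too long to ERROR_EW; return their ids."""
    expired = []
    for job in jobs:
        if job.status == "WAITING" and now - job.waiting_since > job.max_waiting_time:
            job.status = "ERROR_EW"
            expired.append(job.job_id)
    return expired


def resubmit(job, now):
    """Central resubmission that keeps the same job id (assumed rule: any ERROR state)."""
    if job.status.startswith("ERROR") and job.retries_left > 0:
        job.retries_left -= 1
        job.status = "WAITING"
        job.waiting_since = now
        return True
    return False


jobs = [Job(1, "WAITING", 0.0, 100.0, retries_left=1), Job(2, "WAITING", 50.0, 100.0)]
expired = expire_waiting(jobs, now=120.0)      # only job 1 has waited more than 100 s
retried = resubmit(jobs[0], now=120.0)         # uses up the single retry
retried_again = resubmit(jobs[0], now=130.0)   # job is WAITING again, nothing to do
```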
Some results…
• DB time to insert a job and change its status 8 times:
Time to process all 250M ALICE jobs: 4.8 days
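Taking the two figures on the slide at face value, and assuming the 4.8 days cover one insertion plus 8 status changes for each of the 250M jobs, the implied database cost per job works out as below.

```python
jobs = 250_000_000
seconds = 4.8 * 24 * 3600            # 4.8 days expressed in seconds

per_job_ms = seconds / jobs * 1000   # about 1.66 ms of DB time per job
jobs_per_second = jobs / seconds     # about 600 jobs per second
```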
Service communication
• Replacing SOAP with JSON
– Less overhead (no XML encoding)
– Easier to interact with other clients
• Backward incompatible change
To be deployed in ALICE…
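To give a feel for the encoding overhead, the sketch below serializes the same call once as JSON and once as a bare SOAP 1.1 envelope. The method name, the parameters and the message layout are hypothetical; AliEn's actual wire format is not shown in the slides.

```python
import json
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def json_request(method, params):
    """Encode a call as a small JSON document (illustrative layout)."""
    return json.dumps({"method": method, "params": params}).encode("utf-8")


def soap_request(method, params):
    """Encode the same call as a minimal SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, method)
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


params = {"jobId": 100, "status": "RUNNING"}  # hypothetical call arguments
json_bytes = json_request("changeStatus", params)
soap_bytes = soap_request("changeStatus", params)
print(len(json_bytes), len(soap_bytes))       # the JSON message is the shorter one
```

Besides the size difference, the JSON form decodes directly into native dictionaries and lists in most languages, which is in line with the "easier to interact with other clients" point.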
SOAP vs JSON
• Apache web server
• 32 hosts for clients
– 16 cores
– 8000 calls per client
• Without SSL
Catalogue
• Innodb tables
– Row locking
– Transactions
– Foreign keys
To be deployed in ALICE…
What is next?
And for the next versions…
• Trust model
• File popularity
• Interactive jobs
• Correlate monitoring data
• Multi-core job agents
• Catalogue crawler
• Error classification
• Distributed brokering
File catalogue
• Removal of GUID
– Decrease size of the catalogue
– Storage on the sites based on LFN + timestamp
• Using file system instead of database
– Keep database for metadata, quotas, SE
• Improve handling of zip archives
– More than 80% of the LFNs are inside an archive
TaskQueue
• Compression of JDL
– And/or storing diffs
• Brokering alternatives:
– 2-level brokering
• JA asks CM, CM asks the CS in bulk
– Combining jobs with similar input
• And dispatch them together
• Multicore jobagent
– One agent per core or per machine?
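The two JDL storage ideas (compression, and/or storing diffs) can be tried out with the Python standard library. The JDL text below is only illustrative, and treating a subjob's JDL as a diff against its master job is an assumption about what "storing diffs" would mean.

```python
import difflib
import zlib

# Illustrative JDL; field values are made up.
master_jdl = (
    'Executable = "analysis.sh";\n'
    'Packages = {"AliRoot"};\n'
    'InputData = {"LF:/alice/data/file_001.root"};\n'
    'TTL = 3600;\n'
)
subjob_jdl = master_jdl.replace("file_001", "file_002")

# Option 1: compress each JDL before storing it.
compressed = zlib.compress(subjob_jdl.encode("utf-8"))
restored = zlib.decompress(compressed).decode("utf-8")

# Option 2: store only what differs from the master JDL.
diff = list(difflib.unified_diff(
    master_jdl.splitlines(keepends=True),
    subjob_jdl.splitlines(keepends=True),
    n=0,  # no context lines: keep the diff as small as possible
))
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
```

Here only the `InputData` line differs, so `changed` holds one removed and one added line, while the full JDL has four.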
Human grid
• Scotland: VO to VO
• USA: JA memory
• Germany: ORACLE
• Switzerland: Main dev.
• Armenia: XML model, File Popularity
• China: Trust Model
• Italy: CREAMCE
• India: File deletion
• South Korea: Quota system
• Chile: Trust model
Summary
• Parts of AliEn v2.20 already deployed for ALICE!
– Needs another intervention, with 48h downtime
– PANDA runs all the latest components
• TaskQueue speed improved drastically
– Insertion rate 40 times higher
– Resubmission 20 times faster
– Improved concurrency
• Plenty of areas to develop and contribute