Information about BaroneComm1

Published on October 29, 2007

Author: Goldie


WP2 - Data Management:
L.M.Barone, Università di Roma & INFN

WP Goals:
“permit the secure access of massive amounts of [data, to] move and replicate data at high speed from one site to another and to manage the synchronisation of remote data copies” (from the DataGrid Technical Annex)

Keywords:
Automation
Caching
Generic Interface
MetaData
Data Mover
Replica Manager
Security

People:
Site    Name           FTE
Bari    L.Silvestris   0.3
        G.Zito         0.5 (0.3)
Pisa    S.Arezzini     0.3 (0.3)
        A.Controzzi    0.5
        F.Donno        0.2 (0.2)
        F.Schifano     0.2
Roma1   L.M.Barone     0.3 (0.3)
        A.Lonardo      0.3
        A.Michelotti   0.3
        G.Organtini    0.2
        D.Rossetti     0.2 (0.2)

Deliverables:
Requirements for Data Location Broker   5/2001
Definition of a metadata syntax         7/2001
Replica Management at file level        12/2001

An Example:
Ideas for a Replica Manager. Management of production in a distributed environment:
Data are produced at many sites
Data are collected at a single reference site
Data are analyzed at many sites
Data are sometimes moved and sometimes accessed over the network
A case study with Objectivity/DB that can be extended to any kind of file

Cloning federations:
[diagram labels: Clone FD]

Productions:
[diagram labels: GDMP, shown four times]

Analysis:
[diagram labels: CERN FD with DB1, DB2, DB3 ... DBn; CERN Boot; RC1 FD; RC1 Boot; DBn+1 ... DBn+m; DBn+m+1 ... DBn+m+k]

Logical vs Physical Datasets:
Dataset: H → 2μ
Hmm.1.hits.DB   id=12345
Hmm.2.hits.DB   id=12346
Hmm.3.hits.DB   id=12347
Dataset: H → 2e
Hee.1.hits.DB   id=5678
Hee.2.hits.DB   id=5679
Hee.3.hits.DB   id=5680

Logical vs Physical Datasets:
Each dataset is composed of one or more databases
Datasets are managed by application software
Each DB is uniquely identified by a DBid
Assigning a DBid creates a logical-db
The physical-db is the file, which can have zero, one or more instances
The GIS manages the link between a dataset, its logical-dbs and its physical-dbs

Database creation:
[diagram labels: CERN FD with DB1, DB2, DB3, DB4; DBid tables listing 0001 DB1.DB, 0002 DB2.DB, 0003 DB3.DB, 0004 DB4.DB and 0005, where 0005 has no file in the first table and DB5.DB in the later ones]

Replica Management:
[diagram labels: CERN FD with DB1, DB2, DB3; DBid tables listing 0001 DB1.DB, 0002 DB2.DB, 0003 DB3.DB; BO Ref; PD Ref; DB1]

Example Summary:
Basic functionalities of a Replica Manager for production will be tested by the end of 2000 on CMS production (GDMP)
Next comes an Information Server to allow easy synchronization of federations and optimized data access during analysis
The same functionalities shown for Objectivity/DB may and should be implemented for other kinds of files

Conclusions:
Data Management Tools are needed to face the complexity of new-generation experiments (not only LHC)
The GRID projects (INFN and EU) are already providing solutions to real-life problems
Milestones and objectives are well defined (meeting them will not be trivial)
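
The logical vs physical dataset scheme described in the slides (a dataset made of databases, each database identified by a DBid, each DBid backed by zero, one or more file instances) can be illustrated with a small in-memory catalogue. This is only a sketch under stated assumptions: the class and method names (`Catalogue`, `LogicalDB`, `create_logical_db`, `add_replica`, `missing_at`), the dataset label `H2mu` and the file paths are hypothetical and are not taken from GDMP, Objectivity/DB or any DataGrid component.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LogicalDB:
    """A logical database: it exists as soon as a DBid is assigned."""
    dbid: int
    name: str
    # site name -> file path; empty until a physical instance is registered
    replicas: Dict[str, str] = field(default_factory=dict)


class Catalogue:
    """Hypothetical stand-in for the service linking datasets, DBids and files."""

    def __init__(self) -> None:
        self._next_dbid = 1
        self._dbs: Dict[int, LogicalDB] = {}
        self._datasets: Dict[str, List[int]] = {}

    def create_logical_db(self, dataset: str, name: str) -> int:
        # Assigning the DBid creates the logical database; no file exists yet.
        dbid = self._next_dbid
        self._next_dbid += 1
        self._dbs[dbid] = LogicalDB(dbid, name)
        self._datasets.setdefault(dataset, []).append(dbid)
        return dbid

    def add_replica(self, dbid: int, site: str, path: str) -> None:
        # Register one physical instance (a file) of a logical database.
        self._dbs[dbid].replicas[site] = path

    def replicas(self, dbid: int) -> Dict[str, str]:
        return dict(self._dbs[dbid].replicas)

    def missing_at(self, dataset: str, site: str) -> List[int]:
        # DBids of the dataset that have no physical instance at the site.
        return [d for d in self._datasets.get(dataset, [])
                if site not in self._dbs[d].replicas]


cat = Catalogue()
ids = [cat.create_logical_db("H2mu", f"Hmm.{i}.hits.DB") for i in (1, 2, 3)]
cat.add_replica(ids[0], "CERN", "/data/Hmm.1.hits.DB")  # hypothetical path
cat.add_replica(ids[0], "RC1", "/data/Hmm.1.hits.DB")   # hypothetical path
to_copy = cat.missing_at("H2mu", "RC1")
```

In this sketch `to_copy` lists the databases a replica manager would still have to transfer to site `RC1` before the dataset can be analyzed there locally.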
