Published on October 24, 2007
Slide 1: Data Grids for Next Generation Experiments
Harvey B. Newman, California Institute of Technology
ACAT2000, Fermilab, October 19, 2000
http://l3www.cern.ch/~newman/grids_acat2k.ppt

Physics and Technical Goals
- The extraction of small or subtle new “discovery” signals from large and potentially overwhelming backgrounds, or “precision” analysis of large samples
- Providing rapid access to event samples and subsets from massive data stores: from ~300 Terabytes in 2001 and Petabytes by ~2003, to ~10 Petabytes by 2006 and ~100 Petabytes by ~2010
- Providing analyzed results with rapid turnaround, by coordinating and managing the LIMITED computing, data handling and network resources effectively
- Enabling rapid access to the data and the collaboration, across an ensemble of networks of varying capability, using heterogeneous resources

Four LHC Experiments: The Petabyte to Exabyte Challenge
- ATLAS, CMS, ALICE, LHCb: Higgs + new particles; quark-gluon plasma; CP violation
- Data written to tape: ~25 Petabytes/year and up (CPU: 6 MSi95 and up)
- Total for the LHC experiments: 0.1 to 1 Exabyte (1 EB = 10^18 bytes) by ~2010 (~2020?)

LHC Vision: Data Grid Hierarchy (reconstructed from the hierarchy figure)
- Tier 0 (+1): Online System feeding the Offline Farm and CERN Computer Centre (> 30 TIPS) at ~100 MBytes/sec; raw detector output is ~PBytes/sec (1 bunch crossing: ~17 interactions per 25 nsec; 100 triggers per second; each event is ~1 MByte in size)
- Tier 1: national/regional centres (France, FNAL, Italy, UK), linked at ~0.6-2.5 Gbits/sec
- Tier 2: regional centres, linked at ~622 Mbits/sec
- Tier 3: institute servers (~0.25 TIPS) with physics data caches, linked at 100-1000 Mbits/sec
- Tier 4: physicists' workstations
- Physicists work on analysis “channels”; each institute has ~10 physicists working on one or more channels

Why Worldwide Computing?
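A quick sanity check of the scale implied by the numbers above (100 triggers per second, ~1 MByte per event). The ~10^7-second running year is a conventional accelerator figure, not stated on the slide:

```python
# Back-of-the-envelope check of the data rates quoted in the hierarchy slide:
# 100 triggers/second, ~1 MByte per event.

TRIGGER_RATE_HZ = 100          # events accepted per second
EVENT_SIZE_MB = 1.0            # ~1 MByte per event
SECONDS_PER_YEAR = 1e7         # assumed: a typical accelerator "live" year, ~10^7 s

rate_mb_per_s = TRIGGER_RATE_HZ * EVENT_SIZE_MB
yearly_pb = rate_mb_per_s * SECONDS_PER_YEAR / 1e9   # 1 PB = 10^9 MB

print(f"{rate_mb_per_s:.0f} MBytes/sec into the offline farm")
print(f"~{yearly_pb:.0f} PByte/year of raw triggered data per experiment")
```

The result matches the ~100 MBytes/sec link out of the online system, and about a Petabyte per year of raw triggered data per experiment, before simulation and derived data push the total toward the ~25 PB/year figure.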
Regional Center Concept: Advantages
- Managed, fair-shared access for physicists everywhere
- Maximize total funding resources while meeting the total computing and data handling needs
- Balance between proximity of datasets to appropriate resources, and proximity to the users
- Tier-N model: efficient use of the network, with higher throughput per flow (local > regional > national > international)
- Utilize all intellectual resources, in several time zones: CERN, national labs, universities, remote sites; involve physicists and students at their home institutions
- Greater flexibility to pursue different physics interests, priorities, and resource allocation strategies by region and/or by common interests (physics topics, subdetectors, ...)
- Manage the system's complexity by partitioning facility tasks, to manage and focus resources

SDSS Data Grid (in GriPhyN): A Shared Vision
Three main functions:
- Raw data processing on a Grid (FNAL): rapid turnaround with TBs of data; accessible storage of all image data
- Fast science analysis environment (JHU): combined data access + analysis of calibrated data; distributed I/O and processing layers, shared by the whole collaboration
- Public data access: SDSS data browsing for astronomers and students; a complex query engine for the public

US-CERN BW Requirements Projection (PRELIMINARY)
[#] Includes ~1.5 Gbps each for ATLAS and CMS, plus BaBar, Run2 and other
[*] D0 and CDF at Run2: needs presumed to be comparable to BaBar

Daily, Weekly, Monthly and Yearly Statistics on the 45 Mbps US-CERN Link
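The Tier-N locality preference quoted above (per-flow throughput: local > regional > national > international) amounts to a replica-selection rule. A minimal sketch, with hypothetical site names and an invented “scope” ranking:

```python
# Illustrative sketch (not from the talk): prefer the nearest replica,
# since expected per-flow throughput is higher the closer the copy is.
# Site names and the 'scope' labels are hypothetical.

SCOPE_RANK = {"local": 0, "regional": 1, "national": 2, "international": 3}

def pick_replica(replicas):
    """Choose the replica with the nearest scope."""
    return min(replicas, key=lambda r: SCOPE_RANK[r["scope"]])

replicas = [
    {"site": "CERN", "scope": "international"},
    {"site": "FNAL", "scope": "national"},
    {"site": "Caltech Tier2", "scope": "regional"},
]
print(pick_replica(replicas)["site"])   # -> Caltech Tier2
```

A real selector would weigh measured bandwidth and load as well, but the ordering principle is the same.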
Regional Center Architecture (I. Gaines)
- Support services: tape mass storage & disk servers; database servers; physics software development; R&D systems and testbeds; info servers; code servers; web servers; telepresence servers; training, consulting, help desk
- Production reconstruction (Raw/Sim -> ESD): scheduled and predictable, run by experiment/physics groups
- Production analysis (ESD -> AOD, AOD -> DPD): scheduled, run by physics groups
- Individual analysis (AOD -> DPD and plots): chaotic, run by individual physicists
- Connects to desktops, Tier 2 centres, local institutes, CERN, and tape stores

MONARC Architectures WG: Regional Centre Services Required
[*] See http://monarc.web.cern.ch/MONARC/docs/phase2report/Phase2Report.pdf
- All data and technical services required to do physics analysis [*]
- All physics objects, tags and calibration data, plus a significant fraction of the raw data
- Excellent network connectivity to CERN and to the region's users
- A fair share of post- and re-reconstruction processing
- Manpower to share in the development of common validation and production software
- Manpower to share in ongoing work on common (Grid and other) R&D projects
- Excellent support services for training, documentation and troubleshooting, at the Centre or at remote sites served by it
- Service to members of other regions
- Long-term commitment: staffing, hardware evolution, support

LHC Tier 2 Center in 2001
- OC-12 Tier 2 prototype (CMS), distributed Caltech/UCSD over CALREN (+NTON), with UC Davis, Riverside and UCLA as clients; University (UC) fund sharing
- 2 x 40 dual nodes (160 CPUs), rackmounted 2U; ~2 TB RAID array
- Multi-scheduler; GDMP testbed
- Startup by end of October 2000 (CMS HLT production)

Roles of Projects for HENP Distributed Analysis
- RD45, GIOD: networked object databases
- Clipper/GC: high-speed access to object or file data; FNAL/SAM for processing and analysis
- SLAC/OOFS: distributed file system + Objectivity interface
- NILE, Condor: fault-tolerant distributed computing
- MONARC: LHC computing models (architecture, simulation, strategy, politics)
- ALDAP: OO
Database Structures & Access Methods for Astrophysics and HENP Data
- PPDG: first distributed data services and Data Grid system prototype
- GriPhyN: production-scale Data Grids
- EU Data Grid

Grid Services Architecture [*]
- Applications: a rich set of HEP data-analysis related applications
- Application toolkits: remote visualization, remote computation, remote data, remote sensors, and remote collaboration toolkits
- Grid services: protocols, authentication, policy, resource management, instrumentation, discovery, etc.
- Grid fabric: data stores, networks, computers, display devices, ...; associated local services
[*] Adapted from Ian Foster: there are computing grids, access (collaborative) grids, data grids, ...

The Particle Physics Data Grid (PPDG)
- First-round goal: optimized cached read access to 10-100 GBytes drawn from a total data set of 0.1 to ~1 Petabyte
- Site-to-site data replication service at 100 MBytes/sec
- Participants: ANL, BNL, Caltech, FNAL, JLAB, LBNL, SDSC, SLAC, U.Wisc/CS
- Multi-site cached file access service; matchmaking and co-scheduling via SRB, Condor and Globus services, HRM, and NWS

PPDG WG1: Request Manager
- A CLIENT sends a logical request for a logical set of files to the REQUEST MANAGER
- The Request Manager consults a replica catalog and event-file index, and the Network Weather Service
- Physical file transfer requests are issued across the Grid to Disk Resource Managers (DRMs) with disk caches, and to a Hierarchical Resource Manager (HRM) in front of the tape system

Earth Grid System Prototype Inter-communication Diagram
- Sites: LLNL (disk, client, Request Manager); SDSC (GSI-pftpd, HPSS); LBNL (GSI-wuftpd, disk; HRM with disk on Clipper, HPSS); NCAR (GSI-wuftpd, disk); ANL (replica catalog, GIS with NWS)
- Inter-communication via GSI-ncftp, LDAP (script and C API), and CORBA

Grid Data Management Prototype (GDMP)
Distributed job execution and data handling
- Goals: transparency, performance, security, fault
tolerance, automation
- A job is submitted and executed locally or remotely (Site A, B or C)
- The job always writes its data locally
- Data is then replicated to remote sites

Slide 18: EU-Grid Project Work Packages

GriPhyN: PetaScale Virtual Data Grids
- Build the foundation for petascale virtual Data Grids
- Users: individual investigators, workgroups and production teams, working through interactive user tools
- Tools: virtual data tools; request planning & scheduling tools; request execution & management tools; transforms
- Services: resource management services; security and policy services; other Grid services
- Resources: distributed code, storage, computers and network, plus the raw data source

Data Grids: Better Global Resource Use and Faster Turnaround
- Build information and security infrastructures across several world regions: authentication, prioritization, resource allocation
- Coordinated use of computing, data handling and network resources through data caching, query estimation and co-scheduling
- Network and site “instrumentation”: performance tracking, monitoring, problem trapping and handling
- Robust transactions
- Agent-based: autonomous, adaptive, network-efficient, resilient
- Heuristic, adaptive load balancing, e.g. self-organizing neural nets (Legrand)
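The GDMP job/data flow a few slides above (a job always writes its output locally; the data is then replicated to remote sites) reduces to a small pattern. A sketch with invented site and file names, and in-memory sets standing in for real storage and transfer services:

```python
# Illustrative sketch of the GDMP write-local-then-replicate pattern.
# Sites and the 'stores' dict are hypothetical stand-ins for real storage.

stores = {"SiteA": set(), "SiteB": set(), "SiteC": set()}

def run_job(site, output_file, subscribers):
    """Execute a job at `site`: write output locally, then replicate."""
    stores[site].add(output_file)              # data is always written locally
    for remote in subscribers:
        if remote != site:
            stores[remote].add(output_file)    # then replicated to remote sites

run_job("SiteA", "higgs_cand_001.db", subscribers=["SiteB", "SiteC"])
print(sorted(s for s, files in stores.items() if "higgs_cand_001.db" in files))
# -> ['SiteA', 'SiteB', 'SiteC']
```

Decoupling the local write from replication is what buys the transparency and fault tolerance listed among the GDMP goals: the job never blocks on a remote site.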
GRIDs in 2000: Summary
- Grids will change the way we do science and engineering, from computation to large-scale data
- Key services and concepts have been identified, and development has started
- Major IT challenges remain: an opportunity & obligation for HEP/CS collaboration
- Transition of services and applications to production use is starting to occur
- In future, more sophisticated integrated services and toolsets (Inter- and IntraGrids+) could drive advances in many fields of science & engineering
- HENP, facing the need for petascale virtual data, is both an early adopter and a leading developer of Data Grid technology
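As a closing illustration, the PPDG WG1 Request Manager flow described earlier (logical request in, replica catalog and Network Weather Service consulted, physical transfer requests out to the DRMs/HRM) might look like the following. All names, files and bandwidth figures are invented for the sketch:

```python
# Sketch of the PPDG WG1 Request Manager: map each logical file name to the
# physical replica with the best forecast bandwidth, and emit the physical
# transfer requests. Catalog contents and weather numbers are made up.

def handle_request(logical_files, replica_catalog, network_weather):
    """Return (logical_file, chosen_site) transfer requests."""
    transfers = []
    for lfn in logical_files:
        sites = replica_catalog[lfn]                         # candidate replicas
        best = max(sites, key=lambda s: network_weather[s])  # forecast Mbits/sec
        transfers.append((lfn, best))
    return transfers

catalog = {"run42/events.db": ["SLAC", "LBNL"], "run42/tags.db": ["LBNL"]}
weather = {"SLAC": 12.0, "LBNL": 45.0}   # hypothetical NWS forecasts
print(handle_request(["run42/events.db", "run42/tags.db"], catalog, weather))
# -> [('run42/events.db', 'LBNL'), ('run42/tags.db', 'LBNL')]
```

The real Request Manager also distinguishes disk-resident from tape-resident files (DRM vs HRM) and manages cache space; this sketch keeps only the replica-choice step.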
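The Raw -> ESD -> AOD -> DPD chain in the Regional Center Architecture slide also explains why only the scheduled production passes touch the big data tiers. A sketch with per-event sizes that are typical MONARC-era assumptions, not figures quoted in the talk:

```python
# Illustrative only: successive data tiers shrink per-event size, so
# "chaotic" individual analysis can run on AOD/DPD while Raw/ESD stay
# in scheduled production. Sizes below are assumed, not from the slides.

TIER_SIZES_KB = {"Raw": 1000, "ESD": 100, "AOD": 10, "DPD": 1}  # assumed

def sample_size_gb(n_events, tier):
    """Storage needed for n_events at a given data tier, in GB."""
    return n_events * TIER_SIZES_KB[tier] / 1e6   # kB -> GB

# A 10^7-event sample shrinks from ~10 TB of Raw to ~100 GB of AOD.
for tier in ("Raw", "ESD", "AOD", "DPD"):
    print(f"{tier}: {sample_size_gb(1e7, tier):,.0f} GB")
```

Each step trades detail for a sample small enough for the next, less predictable class of users.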
ACAT2000 program entry: Innovative Software Algorithms & Tools, Session V, ISAT501: Worldwide Distributed Analysis and Data Grids for Next-Generation Physics Experiments.