
AOLVisit hbn092303

Published on March 25, 2008

Author: Renato

Source: authorstream.com

Slide 1: Caltech HEP: Next Generation Networks, Grids and Collaborative Systems for Global VOs
Harvey B. Newman, California Institute of Technology
AOL Visit to Caltech, September 23, 2003

Slide 2: The LHC at CERN
- First beams: April 2007; physics runs: from Summer 2007
- 27 km tunnel in Switzerland & France; pp collisions at √s = 14 TeV, L = 10^34 cm^-2 s^-1
- Experiments: ATLAS and CMS (pp, general purpose; HI), LHCb (B-physics), ALICE (HI), TOTEM
- CMS Design Reports: Computing Fall 2004; Physics Fall 2005

CMS: Higgs at LHC
- Higgs to two photons; Higgs to four muons (full CMS simulation)
- General-purpose pp detector; well adapted to lower initial luminosity
- Caltech work on the crystal ECAL for precise e and γ measurements; Higgs physics
- Precise all-silicon tracker: 223 m^2
- Excellent muon ID and precise momentum measurements (tracker + standalone muon)
- Caltech work on forward muon reconstruction & trigger, XDAQ for slice tests

LHC: Higgs Decay into 4 Muons (Tracker Only); 1000X LEP Data Rate
- 10^9 events/sec; selectivity: 1 in 10^13 (1 person in a thousand world populations)

The CMS Collaboration Is Progressing
- 2000+ physicists & engineers, 36 countries, 159 institutions
- Including: Armenia, Austria, Belarus, Belgium, Bulgaria, CERN, China, China (Taiwan), Croatia, Cyprus, Estonia, Finland, France, Georgia, Germany, Greece, Hungary, India, Italy, Korea, Pakistan, Poland, Portugal, Russia, Slovak Republic, Spain, Switzerland, Turkey, UK, Ukraine, USA, Uzbekistan
- NEW in US CMS: FIU, Yale; South America: UERJ (Brazil)

LHC Data Grid Hierarchy: Developed at Caltech
- Emerging vision: a richly structured, global dynamic system

Next Generation Networks and Grids for HEP Experiments
- Providing rapid access to event samples and analyzed physics results drawn from massive data stores: from Petabytes in 2003, ~100 Petabytes by 2007-8, to ~1 Exabyte by ~2013-15
- Providing analyzed results with rapid turnaround, by coordinating and managing large but LIMITED computing, data handling and NETWORK resources effectively
- Enabling rapid access to the data and the collaboration, across an ensemble of networks of varying capability
- Advanced integrated applications, such as Data Grids, rely on seamless operation of our LANs and WANs, with reliable, monitored, quantifiable high performance
- Worldwide analysis: data explored and analyzed by thousands of globally dispersed scientists, in hundreds of teams

2001 Transatlantic Net WG Bandwidth Requirements [*]
- [*] See http://gate.hep.anl.gov/lprice/TAN
- The 2001 LHC requirements outlook now looks very conservative in 2003

Production BW Growth of Int'l HENP Network Links (US-CERN Example)
- Rate of progress >> Moore's Law:
    9.6 kbps     Analog (1985)
    64-256 kbps  Digital (1989-1994)    [X 7-27]
    1.5 Mbps     Shared (1990-93; IBM)  [X 160]
    2-4 Mbps     (1996-1998)            [X 200-400]
    12-20 Mbps   (1999-2000)            [X 1.2k-2k]
    155-310 Mbps (2001-02)              [X 16k-32k]
    622 Mbps     (2002-03)              [X 65k]
    2.5 Gbps     (2003-04)              [X 250k]
    10 Gbps      (2005)                 [X 1M]
- A factor of ~1M over the period 1985-2005 (a factor of ~5k during 1995-2005)
- HENP has become a leading applications driver, and also a co-developer of global networks
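The "Rate of progress >> Moore's Law" claim above can be checked with simple arithmetic. The short Python sketch below is illustrative only (it is not from the original slides): it annualizes the 1985-2005 growth of the US-CERN link quoted above and compares it with a nominal Moore's-Law doubling every 18 months.

```python
# Illustrative arithmetic only: annualized growth of the US-CERN link
# (9.6 kbps in 1985 to 10 Gbps in 2005, per the slide) versus a nominal
# Moore's-Law doubling time of 18 months.

start_bps, end_bps = 9.6e3, 10e9   # 1985 analog modem -> 2005 10 Gbps link
years = 2005 - 1985

total_factor = end_bps / start_bps                 # ~1e6, matching the slide
annual_factor = total_factor ** (1.0 / years)      # compound growth per year
moore_annual = 2 ** (12.0 / 18.0)                  # doubling every 18 months

print(f"total growth factor : {total_factor:.2e}")
print(f"network growth/year : {annual_factor:.2f}x")
print(f"Moore's Law /year   : {moore_annual:.2f}x")
```

Under these figures the link grew by roughly 2x per year, against about 1.6x per year for an 18-month doubling time.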
HEP Is Learning How to Use Gbps Networks Fully: Factor of 25-100 Gain in Max. Sustained TCP Throughput in 15 Months, on Some US + Transatlantic Routes
- 9/01: 105 Mbps in 30 streams SLAC-IN2P3; 102 Mbps in 1 stream CIT-CERN
- 1/09/02: 190 Mbps for one stream shared on two 155 Mbps links
- 5/20/02: 450-600 Mbps SLAC-Manchester on OC12 with ~100 streams
- 6/1/02: 290 Mbps Chicago-CERN, one stream on OC12 (modified kernel)
- 9/02: 850, 1350, 1900 Mbps Chicago-CERN with 1, 2, 3 GbE streams on a 2.5G link
- 11/02 [LSR]: 930 Mbps in 1 stream California-CERN and California-AMS; FAST TCP: 9.4 Gbps in 10 flows California-Chicago
- 2/03 [LSR]: 2.38 Gbps in 1 stream California-Geneva (99% link utilization)
- 5/03 [LSR]: 0.94 Gbps IPv6 in 1 stream Chicago-Geneva
- Fall 2003 goal: 6-10 Gbps in 1 stream over 7-10,000 km (10G link); further LSRs

FAST TCP: Baltimore/Sunnyvale
- Measurements 11/02; standard packet size; 4000 km path; utilization averaged over > 1 hr:
    Flows:                 1     2     7     9    10
    Average utilization:  95%   92%   90%   90%   88%
- 8.6 Gbps; 21.6 TB in 6 hours
- RTT estimation with a fine-grain timer; delay monitoring in equilibrium
- Pacing to reduce burstiness; fast convergence to equilibrium; fair sharing; fast recovery
- (A simplified sketch of this style of delay-based window update appears after this group of slides.)

10GigE Data Transfer: Internet2 LSR
- On Feb. 27-28, a Terabyte of data was transferred in 3700 seconds by S. Ravot of Caltech, between the Level3 PoP in Sunnyvale near SLAC and CERN, through the TeraGrid router at StarLight, from memory to memory, as a single TCP/IP stream at an average rate of 2.38 Gbps (using large windows and 9 kB "jumbo frames")
- This beat the former record by a factor of ~2.5, and used the US-CERN link at 99% efficiency
- European Commission; 10GigE NIC

Slide 13: "Private Grids": Structured P2P Sub-Communities in Global HEP

HENP Major Links: Bandwidth Roadmap (Scenario) in Gbps
- Continuing the trend: ~1000 times bandwidth growth per decade
- We are rapidly learning to use multi-Gbps networks dynamically

HENP Lambda Grids: Fibers for Physics
- Problem: extract "small" data subsets of 1 to 100 Terabytes from 1 to 1000 Petabyte data stores
- Survivability of the HENP global Grid system, with hundreds of such transactions per day (circa 2007), requires that each transaction be completed in a relatively short time
- Example: take 800 seconds to complete the transaction. Then:
    Transaction Size (TB)    Net Throughput (Gbps)
    1                        10
    10                       100
    100                      1000  (capacity of fiber today)
- Summary: providing switching of 10 Gbps wavelengths within ~3-5 years, and Terabit switching within 5-8 years, would enable "Petascale Grids with Terabyte transactions", to fully realize the discovery potential of major HENP programs, as well as other data-intensive fields
- (The required-throughput arithmetic is worked out in the short sketch below.)
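The throughput column in the table above follows from simple arithmetic. The Python sketch below is illustrative only (not from the original slides): it computes the throughput needed to move a transaction of a given size in 800 seconds, and also the TCP window (bandwidth-delay product) a single stream needs to sustain the 2.38 Gbps record rate quoted earlier, assuming a typical ~180 ms US-CERN round-trip time (the RTT value is an assumption, not from the slides).

```python
# Illustrative arithmetic for the "Terabyte transactions" slide:
# required throughput for a fixed completion time, and the TCP window
# (bandwidth-delay product) a single stream needs to sustain that rate.

def required_gbps(size_tb: float, seconds: float = 800.0) -> float:
    """Throughput (Gbps) needed to move size_tb terabytes in `seconds`."""
    bits = size_tb * 1e12 * 8
    return bits / seconds / 1e9

def tcp_window_mbytes(rate_gbps: float, rtt_s: float = 0.180) -> float:
    """Bandwidth-delay product in megabytes (the window one stream must fill)."""
    return rate_gbps * 1e9 * rtt_s / 8 / 1e6

for tb in (1, 10, 100):
    print(f"{tb:4d} TB in 800 s -> {required_gbps(tb):7.1f} Gbps")

# A single 2.38 Gbps stream over an assumed ~180 ms US-CERN RTT needs a
# window of roughly this many MB (hence the "large windows" in the record):
print(f"window for 2.38 Gbps: ~{tcp_window_mbytes(2.38):.0f} MB")
```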
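The FAST TCP results above rest on a delay-based congestion-window update rather than loss-driven backoff. The sketch below is a minimal, simplified rendering of that style of update (the window is driven toward an equilibrium with roughly alpha packets queued in the path); it is not the production FAST TCP kernel code, and the alpha/gamma values and the toy path model are assumptions for illustration only.

```python
# Simplified, delay-based window update in the spirit of FAST TCP:
# the window is periodically moved toward an equilibrium where roughly
# `alpha` packets are queued along the path. Parameters and the path
# model below are illustrative assumptions, not the record-setting setup.

def fast_window_update(w, base_rtt, rtt, alpha=200.0, gamma=0.5):
    """One periodic update of the congestion window (in packets)."""
    target = (base_rtt / rtt) * w + alpha       # equilibrium-seeking term
    w_new = (1.0 - gamma) * w + gamma * target
    return min(2.0 * w, w_new)                  # at most double per update

# Toy path: 180 ms propagation delay and a pipe that holds ~40,000 packets;
# packets beyond the pipe size sit in a queue and inflate the measured RTT.
base_rtt, pipe_pkts = 0.180, 40000.0
w = 100.0
for _ in range(500):
    queue = max(0.0, w - pipe_pkts)
    rtt = base_rtt * (1.0 + queue / pipe_pkts)
    w = fast_window_update(w, base_rtt, rtt)

# Settles near pipe_pkts + alpha: the pipe is kept full with a small,
# controlled standing queue instead of oscillating on packet loss.
print(f"window after 500 updates: ~{w:.0f} packets")
```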
The Move to OGSA and then Managed Integration Systems
- Evolution over time, toward increased functionality and standardization:
    Custom solutions and app-specific services
    Globus Toolkit: de facto standards (GGF: GridFTP, GSI X.509, LDAP, FTP, ...)
    Open Grid Services Architecture (GGF: OGSI, ... + OASIS, W3C): Web services + ...; multiple implementations, including the Globus Toolkit
    ~Integrated systems: stateful; managed

Dynamic Distributed Services Architecture (DDSA)
- "Station server" service engines at sites host "dynamic services"
- Auto-discovering, collaborative servers interconnect dynamically, forming a robust fabric in which mobile agents travel with a payload of (analysis) tasks
- Service agents: goal-oriented, autonomous, adaptive
- Maintain state; automatic "event" notification
- Adaptable to Web services (OGSA); many platforms & working environments (also mobile)
- Caltech / UPB (Romania) / NUST (Pakistan) collaboration
- See http://monalisa.cacr.caltech.edu and http://diamonds.cacr.caltech.edu

MonALISA: A Globally Scalable Grid Monitoring System
- By I. Legrand (Caltech) et al.
- Monitors clusters and networks; agent-based, with dynamic information / resource-discovery mechanisms
- Implemented in Java/Jini; SNMP; WSDL/SOAP with UDDI
- Global system optimizations
- > 50 sites and growing; being deployed in Abilene through the Internet2 E2Epi
- MonALISA (Java) 3D interface
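MonALISA itself is implemented in Java/Jini, as noted above. Purely as a conceptual illustration of the DDSA "station server" / agent pattern (dynamic registration plus periodic publication of site metrics), here is a small Python sketch; all class, site, and metric names are hypothetical, and this is not the MonALISA API.

```python
# Conceptual sketch (hypothetical names, not the MonALISA/DDSA Java/Jini API):
# a "station server" registry that services join dynamically, plus agents
# that publish monitoring values for their site.

import time
from dataclasses import dataclass, field

@dataclass
class StationServer:
    """Toy registry: services register themselves and push metric updates."""
    services: dict = field(default_factory=dict)   # name -> last-seen timestamp
    metrics: dict = field(default_factory=dict)    # (site, metric) -> value

    def register(self, name: str) -> None:
        self.services[name] = time.time()

    def publish(self, site: str, metric: str, value: float) -> None:
        self.metrics[(site, metric)] = value

@dataclass
class MonitoringAgent:
    """Toy agent: announces itself to a station server, then reports metrics."""
    site: str
    server: StationServer

    def start(self) -> None:
        self.server.register(f"monitor@{self.site}")

    def report(self, metric: str, value: float) -> None:
        self.server.publish(self.site, metric, value)

# Usage: two sites report a link-utilisation value to a shared station server.
hub = StationServer()
for site in ("caltech", "cern"):
    agent = MonitoringAgent(site, hub)
    agent.start()
    agent.report("wan_util_pct", 42.0)   # placeholder value
print(hub.metrics)
```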
UltraLight Collaboration: http://ultralight.caltech.edu
- Caltech, UF, FIU, UMich, SLAC, FNAL, MIT/Haystack, CERN, UERJ (Rio), NLR, CENIC, UCAID, Translight, UKLight, Netherlight, UvA, UCLondon, KEK, Taiwan, Cisco, Level(3)
- First integrated packet-switched and circuit-switched hybrid experimental research network; leverages transoceanic R&D network partnerships
- NLR wave: 10 GbE (LAN-PHY) wave across the US; (G)MPLS managed
- Optical paths transatlantic; extensions to Japan, Taiwan, Brazil
- End-to-end monitoring; real-time tracking and optimization; dynamic bandwidth provisioning
- Agent-based services spanning all layers of the system, from the optical cross-connects to the applications

Grid Analysis Environment (GAE): R&D Led by Caltech HEP
- Building a GAE is the "acid test" for Grids, and is crucial for the LHC experiments
- Large, diverse, distributed community of users
- Support for hundreds to thousands of analysis tasks, shared among dozens of sites
- Widely varying task requirements and priorities; need for priority schemes, robust authentication and security
- Operation in a severely resource-limited and policy-constrained global system, dominated by collaboration policy and strategy for resource usage and priorities
- The GAE is where the physics gets done: where physicists learn to collaborate on analysis, across the country and across world regions

Grid Enabled Analysis: User View of a Collaborative Desktop
- Physics analysis requires varying levels of interactivity, from "instantaneous response" to "background" to "batch mode"
- Requires adapting the classical Grid "batch-oriented" view to a services-oriented view, with tasks monitored and tracked
- Use Web services, leveraging the wide availability of commodity tools and protocols, adaptable to a variety of platforms
- Implement the Clarens Web Services layer as mediator between authenticated clients and services, as part of the CAIGEE architecture
- Clarens presents a consistent analysis environment to users, based on WSDL/SOAP or XML-RPCs, with PKI-based authentication for security (a hypothetical client-side sketch appears after the U.S. CMS slide below)
- Architecture diagram: clients (PDA, ROOT, browser, IGUANA) connect through Clarens to external services: MonALISA, VO management, authentication, authorization, logging, key escrow, file access, shell, Storage Resource Broker, CMS ORCA/COBRA, cluster schedulers, ATLAS DIAL, GriPhyN VDT, MonALISA monitoring

VRVS on Windows
- VRVS (Version 3) meeting in 8 time zones
- 73 reflectors deployed worldwide; users in 83 countries

Caltech HEP Group: Conclusions
- Caltech has been a leading inventor/developer of systems for global VOs, spanning multiple technology generations:
    International wide-area networks since 1982; global role from 2000
    Collaborative systems (VRVS) since 1994
    Distributed databases since 1996
    The Data Grid Hierarchy and dynamic distributed systems since 1999
    Work on advanced network protocols from 2000
    A focus on the Grid-enabled Analysis Environment for data-intensive science since 2001
- Strong HEP/CACR/CS-EE partnership [Bunn, Low]
- Driven by the search for new physics at the TeV energy scale at the LHC: unprecedented challenges in access, processing, and analysis of Petabyte-to-Exabyte data, and policy-driven global resource sharing
- Broad applicability within and beyond science: managed, global systems for data-intensive and/or real-time applications
- AOL site team: many apparent synergies with the Caltech team in areas of interest, technical goals and development directions

Some Extra Slides Follow

U.S. CMS Is Progressing: 400+ Members, 38 Institutions
- New in 2002/3: FIU, Yale
- Caltech has led the US CMS Collaboration Board since 1998; 3rd term as Chair through 2004+
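The Grid Enabled Analysis slide above describes Clarens as a web-services layer speaking WSDL/SOAP or XML-RPC, with PKI-based (certificate) authentication. As a hypothetical illustration of what such a client-side call could look like, the sketch below makes an XML-RPC call over HTTPS while presenting a user certificate; the endpoint URL, remote method names, and file paths are placeholders, not the actual Clarens interface.

```python
# Hypothetical sketch of an XML-RPC call with certificate-based (PKI)
# authentication, in the spirit of a Clarens-style web-services layer.
# The endpoint, method names, and certificate paths are placeholders.

import ssl
import xmlrpc.client

# Present a user (grid) certificate to the server over HTTPS.
ctx = ssl.create_default_context(cafile="/path/to/ca-bundle.pem")
ctx.load_cert_chain(certfile="/path/to/usercert.pem",
                    keyfile="/path/to/userkey.pem")

server = xmlrpc.client.ServerProxy("https://analysis.example.org/clarens",
                                   context=ctx)

# Hypothetical remote methods: list matching datasets, then submit a job.
datasets = server.catalog.list("higgs_4mu")          # placeholder call
job_id = server.scheduler.submit({"dataset": datasets[0], "task": "analysis"})
print("submitted job:", job_id)
```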
Physics Potential of CMS: We Need to Be Ready on Day 1
- At L0 = 2x10^33 cm^-2 s^-1: 1 day ~ 60 pb^-1, 1 month ~ 2 fb^-1, 1 year ~ 20 fb^-1
- [Figure: discovery reach for M_H = 130 GeV with 3 months and 1 year of data]
- LHCC: the CMS detector is well optimized for LHC physics
- To fully exploit the physics potential of the LHC for discovery we will start with a "COMPLETE"* CMS detector; in particular a complete ECAL from the beginning, for the low-mass H → γγ channel

Caltech Role: Precision e/γ Physics with CMS
- H0 → γγ in the CMS precision ECAL
- Crystal quality in mass production; precision laser monitoring
- Study of calibration physics channels: inclusive J/ψ, Υ, W, Z
- Realistic H0 → γγ background studies: 2.5 M events
- Signal/background optimization: γ/jet separation; vertex reconstruction with associated tracks
- Photon reconstruction: pixels + ECAL + tracker; optimization of tracker layout
- Higher-level trigger on isolated γ
- ECAL design: crystal sizes cost-optimized for γ/jet separation

CMS SUSY Reach
- The LHC could establish the existence of SUSY, and study the masses and decays of SUSY particles
- The cosmologically interesting region of the SUSY space could be covered in the first weeks of LHC running
- The 1.5 to 2 TeV mass range for squarks and gluinos could be covered within one year at low luminosity

HCAL Barrels Done; Installing HCAL Endcap and Muon CSCs in SX5
- 36 muon CSCs successfully installed on YE-2,3; average rate 6/day (planned 4/day); cabling + commissioning
- HE-1 complete; HE+ will be mounted in Q4 2003

UltraLight: Proposed to the NSF/EIN Program (http://ultralight.caltech.edu)
- First "hybrid" packet-switched and circuit-switched optical network
- Trans-US wavelength riding on NLR: LA-SNV-CHI-JAX
- Leveraging advanced research & production networks: USLIC/DataTAG, SURFnet/NLlight, UKLight, Abilene, CA*net4
- Dark fiber to CIT, SLAC, FNAL, UMich; Florida Light Rail
- Intercontinental extensions: Rio de Janeiro, Tokyo, Taiwan
- Three flagship applications:
    HENP: TByte to PByte "block" data transfers at 1-10+ Gbps
    eVLBI: real-time data streams at 1 to several Gbps
    Radiation oncology: GByte image "bursts" delivered in ~1 second
- A traffic mix presenting a variety of network challenges
UltraLight: An Ultra-scale Optical Network Laboratory for Next Generation Science (http://ultralight.caltech.edu)
- Ultrascale protocols and MPLS: classes of service used to share the primary 10G wavelength efficiently
- Scheduled or sudden "overflow" demands handled by provisioning additional wavelengths: GE, N*GE, and eventually 10 GE
- Use path diversity, e.g. across the Atlantic and Canada
- Move to multiple 10G wavelengths (leveraged) by 2005-6
- Unique feature: agent-based, end-to-end monitored, dynamically provisioned mode of operation
- Agent services span all layers of the system, communicating application characteristics and requirements to the protocol stacks, MPLS class provisioning and the optical cross-connects; dynamic responses help manage traffic flow

History - One Large Research Site
- Current traffic to ~400 Mbps; projections: 0.5 to 24 Tbps by ~2012
- Much of the traffic: SLAC to IN2P3/RAL/INFN, via ESnet + France, and Abilene + CERN

VRVS Core Architecture
- VRVS combines the best of all standards and products in one unique architecture
- Multi-platform and multi-protocol architecture

MONARC/SONN: 3 Regional Centres Learning to Export Jobs (Day 9)
- Simulated centres: NUST (20 CPUs), CERN (30 CPUs), Caltech (25 CPUs), connected by links of 0.8-1.2 MB/s with 150-200 ms RTT
- By day 9, the mean job efficiencies <E> at the three sites reach ~0.73, 0.66 and 0.83
- Building the LHC Computing Model: focus on the new persistency; simulations for strategy and system-services development
- I. Legrand, F. van Lingen

GAE Collaboration Desktop Example
- Four-screen analysis desktop: 4 flat panels, 5120 x 1024, driven by a single server and a single graphics card
- Allows simultaneous work on:
    Traditional analysis tools (e.g. ROOT)
    Software development
    Event displays (e.g. IGUANA)
    MonALISA monitoring displays; other "Grid views" and job-progress views
    Persistent collaboration (e.g. VRVS; shared windows)
    Online event or detector monitoring
    Web browsing, email

GAE Workshop: Components and Services; GAE Task Lifecycle
- GAE components & services:
    VO authorization/management
    Software install/configuration tools
    Virtual Data System
    Data Service Catalog (metadata)
    Replica Management Service
    Data Mover/Delivery Service [NEW]
    Planners (abstract; concrete)
    Job Execution Service
    Data Collection Services - couples analysis selections/expressions to datasets/replicas
    Estimators
    Events; strategic error handling; adaptive optimization
- Grid-based analysis task's life:
    Authentication
    DATA SELECTION: query/dataset selection/??; session start; establish slave/server configuration
    Data placement; Resource Broker for resource assignment, or static configuration; availability/cost estimation
    Launch masters/slaves/Grid execution services
    ESTABLISH TASK: initiate & software specification/install
    Execute (with dynamic job control)
    Report status (logging/metadata/partial results)
    Task completion (cleanup, data merge/archive/catalog)
    Task end; task save
    LOOP to ESTABLISH TASK, or LOOP to DATA SELECTION
    (A toy walk-through of this lifecycle appears after the closing summary slide below.)

Grid Enabled Analysis Architecture
- Michael Thomas, July 2003

HENP Networks and Grids; UltraLight
- The network backbones and major links used by major HENP projects advanced rapidly in 2001-2: to the 2.5-10 G range in 15 months, much faster than Moore's Law, continuing a trend of a factor ~1000 improvement per decade
- Network costs continue to fall rapidly; the transition to a community-owned and operated infrastructure for research and education is beginning (NLR, USAWaves)
- HENP (the Caltech/DataTAG/SLAC/LANL team) is learning to use 1-10 Gbps networks effectively over long distances; unique Fall demos: up to 10 Gbps flows over 10k km
- A new HENP and DOE roadmap: Gbps to Tbps links in ~10 years
- UltraLight: a hybrid packet-switched and circuit-switched network: ultrascale protocols, MPLS and dynamic provisioning, sharing and augmenting NLR and international optical infrastructures
- May be a cost-effective model for future HENP and DOE networks
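As referenced in the task-lifecycle list above, the GAE task's life is essentially a small state machine with loop-backs. The Python sketch below is a hypothetical illustration of that flow only; the function and dataset names are placeholders, and this is not the actual GAE implementation.

```python
# Hypothetical illustration of the GAE task lifecycle: a thin driver that
# walks a task through the listed phases, looping back to DATA SELECTION
# for the next query. All names are placeholders for illustration.

from enum import Enum, auto

class Phase(Enum):
    AUTHENTICATE = auto()
    DATA_SELECTION = auto()
    ESTABLISH_TASK = auto()
    EXECUTE = auto()
    COMPLETE = auto()
    DONE = auto()

def run_task(queries) -> None:
    phase, done_queries = Phase.AUTHENTICATE, 0
    while phase is not Phase.DONE:
        if phase is Phase.AUTHENTICATE:
            print("authenticate user / VO credentials")
            phase = Phase.DATA_SELECTION
        elif phase is Phase.DATA_SELECTION:
            print("select dataset, place data, broker resources:",
                  queries[done_queries])
            phase = Phase.ESTABLISH_TASK
        elif phase is Phase.ESTABLISH_TASK:
            print("install/configure software, launch masters and slaves")
            phase = Phase.EXECUTE
        elif phase is Phase.EXECUTE:
            print("execute with dynamic job control; report status and partial results")
            phase = Phase.COMPLETE
        elif phase is Phase.COMPLETE:
            print("cleanup, merge/archive/catalog output, save task")
            done_queries += 1
            # Loop back to DATA SELECTION for the next query, or finish.
            phase = Phase.DATA_SELECTION if done_queries < len(queries) else Phase.DONE

run_task(["higgs_4mu_sample", "dy_background_sample"])
```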
