Published on June 18, 2007
Tactical Storage: Simple, Secure, and Semantic Access to Remote Data
Prof. Douglas Thain, University of Notre Dame
http://www.cse.nd.edu/~dthain

Plentiful Computing Power:
As of 25 April 2006...
  Condor Worldwide: 56,682 CPUs / ??? TB / 1758 sites
  TeraGrid: 15,328 CPUs / 220 TB / 6 sites
  Open Science Grid: 21,156 CPUs / 83 TB / 61 sites
  EGEE Grid: Lots???
http://www.cs.wisc.edu/condor/map

Complex Ecology of Storage:
[Diagram labels: shared disk, private disk, Independent Cluster Disks; protocols: HTTP, FTP, RFIO, gLite, SRB, SCP, RSYNC, ...]

Problems Accessing Data:
Large Burden on the User
  User may not be able/willing to state files in advance.
  Different services/protocols available at different sites.
  Programs not modified to take advantage of services.
Different access modes for different purposes.
  File transfer: preparing system for intended use.
  File system: access to data for running jobs.
Resources go unused.
  Disks on each node of a cluster.
  Unorganized resources in a department/lab.
  Would like to combine disks into larger structures.
A global file system can’t satisfy everyone!
  (Global means different things to different people.)
  Both a technical and social problem.

What’s the Problem?:
We often assume that the site administrator is responsible for making the site comfortable for the user. (Not possible on the grid!)
Rather, the user should be able to bring along a mechanism to access multiple independent (remote?) data sources.
Of course, we have to make it easy!

Tactical Storage Systems (TSS):
A TSS allows any node to serve as a file server or as a file system client.
All components can be deployed without special privileges, but with security.
Users can build up complex structures.
  Filesystems, databases, caches, ...
  Admins need not know/care about larger structures.
Two Independent Concepts:
  Resources: the raw storage to be used.
  Abstractions: the organization of storage.
[Diagram labels: App, ???, Parrot, seven independent file systems]

Key Properties:
Tactical Storage is Simple:
  Appears as an ordinary filesystem.
  Applies to unmodified applications and data without code changes, relinking, kernel modules, etc...
Tactical Storage is Secure:
  Authentication with standard GSI or Kerberos.
  Rich distributed access control system.
Tactical Storage is Semantic:
  Name data by meaning, not by location.
  Supports external name resolution mechanisms.

Access Control in File Servers:
Unix Security is not Sufficient
  No global user database possible/desirable.
  Mapping external credentials to Unix gets messy.
Instead, Make External Names First-Class
  Perform access control on remote, not local, names.
  Types: Globus, Kerberos, Unix, Hostname, Address
Each directory has an ACL:
  globus:/O=NotreDame/CN=DThain RWLA
  kerberos:email@example.com RWL
  hostname:*.cs.nd.edu RL
  address:192.168.1.* RWLA

Distributed Group ACLs:
[Diagram labels: seven file servers, each on a UNIX host with its own file system]

Semantic Data Access:
[Diagram labels: Appl, Parrot, find_db]
  /usr/local = /chirp/host5.nd.edu/software
  /tmp = /chirp/host9.nd.edu/scratch
  /data = /gsiftp/ftp.nd.edu/mydata
  /db = resolver:find_db

Remote Database Access:
[Diagram labels: script, Parrot, file server, file system, DB data, libdb.so, sim.exe, WAN, Simple FS]
HEP Simulation Needs Direct DB Access
  App linked against Objectivity DB.
  Objectivity accesses filesystem directly.
  How to distribute application securely?
Solution: Remote Root Mount via Parrot:
  parrot -M /=/chirp/fileserver/rootdir
  DB code can read/write/lock files directly.
[Diagram label: GSI Auth]
Credit: Sander Klous @ NIKHEF

Remote Application Loading:
[Diagram labels: appl, Parrot, HTTP server, file system, liba.so, libb.so, libc.so, proxy, proxy, HTTP]
Modular Simulation Needs Many Libraries
  Developed on workstations, then ported to grid.
  Selection of library depends on analysis technique.
  Constraint: Must use HTTP for file access.
Solution: Dynamic Link with TSS+HTTP:
  /home/cdfsoft -> /http/dcaf.fnal.gov/cdfsoft
  Select several MB from 60 GB of libraries.
Credit: Igor Sfiligoi @ Fermi National Lab

Technical Problem:
HTTP is not a filesystem! (No directories)
Advantages: Firewalls, caches, admins.
Solution: Turn the directories into files.
  Can be cached in ordinary proxies!
  Hierarchical SHA1 integrity check.
[Diagram labels: Appl, Parrot, HTTP Module, HTTP Server, make httpfs; directory tree: root, etc, home, bin, alice, cms, babar]

Logical Access to Bio Data:
Many databases of biological data in different formats around the world:
  Archives: Swiss-Prot, TrEMBL, NCBI, etc...
  Replicas: Public, Shared, Private, ???
Users and applications want to refer to data objects by logical name, not location!
  Access the nearest copy of the non-redundant protein database; don’t care where it is.
Solution: EGEE data management system maps logical names (LFNs) to physical names (SFNs).
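The mapping from logical names (LFNs) to physical names (SFNs) can be pictured as a catalog lookup followed by a choice among replicas. The following is a minimal Python sketch under stated assumptions, not the EGEE catalog interface: the catalog contents, the example.org host names, and the resolve helper are all hypothetical, and "first supported protocol" stands in for whatever notion of "nearest" a real resolver would use.

```python
# Hypothetical replica catalog: logical file name -> physical replicas.
# Host names and paths are invented for illustration only.
CATALOG = {
    "lfn:/bio/nr.data": [
        "rfio://se1.example.org/store/nr.data",
        "ftp://ftp.example.org/pub/nr.data",
        "chirp://chirp.example.org/bio/nr.data",
    ],
}


def resolve(lfn, supported=("chirp", "ftp", "http")):
    """Return one replica of `lfn`, trying the client's protocols in order."""
    replicas = CATALOG.get(lfn)
    if replicas is None:
        raise FileNotFoundError(lfn)
    for scheme in supported:
        for sfn in replicas:
            # Compare the URL scheme (the part before "://") to the protocol.
            if sfn.split("://", 1)[0] == scheme:
                return sfn
    raise LookupError("no replica of %s uses a supported protocol" % lfn)


print(resolve("lfn:/bio/nr.data"))
```

The application only ever names lfn:/bio/nr.data; which server and protocol actually deliver the bytes is decided at open time, which is the point of naming data by meaning rather than by location.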
Credit: Christophe Blanchet, Bioinformatics Center of Lyon, CNRS IBCP, France
http://gbio.ibcp.fr/cblanchet, Christophe.Blanchet@ibcp.fr

Logical Access to Bio Data:
[Diagram labels: BLAST, Parrot (RFIO, gLite, HTTP, FTP, Chirp modules), Chirp Server, FTP Server, gLite Server, EGEE File Location Service, three copies of nr.data]

Performance of Bio Apps on EGEE:
[Figure]

Expandable Filesystem for Experimental Data:
[Diagram label: file server]
Credit: John Poirier @ Notre Dame Astrophysics Dept.
Project GRAND http://www.nd.edu/~grand

Current Work:
Now that we can easily use any storage...
  Much easier to arrange data/jobs arbitrarily.
  Idea: combine cluster storage / cluster computing!
  Goal: keep jobs close to the data that they need.
PINS: Processing in Storage
Example: GEMS Distributed Databank
  Facility for creating, storing, and analyzing molecular dynamics data in a cluster.
  Goal: Be able to easily scale both CPU and storage capacity by adding commodity nodes.
Credit: Jesus Izaguirre and Aaron Striegel @ Notre Dame
[Diagram labels: seven file systems, meta-data database, F, F(D1), Compute F(D1), Query (Mol=='CH4') && (T>300K)]

More Open Problems:
Resource Management
  How to prevent overcommitment -> badput?
Security
  How to easily express complex policies for sharing and controlling combined CPU/disk?
Reliability
  How to deal with disconnection, erasure, rejection, unexpected performance, etc...
Garbage Collection
  What’s to prevent me from filling every disk everywhere with computations that I might need?
Debugging
  How do we dig the state relevant to a complex workflow out of numerous, noisy, distributed logs?
Conclusion:
Tactical storage allows end users to build large structures out of simple building blocks without getting stuck on the ugly details.

Acknowledgments:
Science Collaborators: Christophe Blanchet, Patrick Flynn, Sander Klous, Peter Kunszt, Erwin Laure, John Poirier, Igor Sfiligoi
CS Collaborators: Jesus Izaguirre, Aaron Striegel
CS Students: Paul Brenner, James Fitzgerald, Jeff Hemmes, Paul Madrid, Chris Moretti, Gerhard Niederwieser, Phil Snowberger, Justin Wozniak

For more information...:
Cooperative Computing Lab http://www.cse.nd.edu/~ccl
Cooperative Computing Tools http://www.cctools.org
Douglas Thain firstname.lastname@example.org http://www.cse.nd.edu/~dthain

Problem: Shared Namespace:
[Diagram label: file server]
  globus:/O=NotreDame/* RWLAX

Solution: Reservation (V) Right:
[Diagram label: file server]
  O=NotreDame/CN=* V(RWLA)
  mkdir only!
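
The per-directory ACLs and the reservation (V) right can be sketched in a few lines of Python. This is an illustrative sketch, not the file server's actual code. It assumes that subject patterns are matched with shell-style wildcards, that rights from all matching entries are combined, and that a subject holding V(RWLA) may only create a directory, whose fresh ACL then grants that subject the rights in parentheses. The helper names (parse_acl, rights_for, reserve) and the subject CN=Alice are hypothetical, and the globus:/ prefix on the reservation entry is added here for consistency with the other entries.

```python
import re
from fnmatch import fnmatchcase

# Rights string: plain letters, optionally followed by a reservation V(...).
RIGHTS_RE = re.compile(r"^([RWLAX]*)(?:V\(([RWLAX]*)\))?$")


def parse_acl(text):
    """Parse 'pattern rights' lines into (pattern, plain, reserved) tuples."""
    acl = []
    for line in text.strip().splitlines():
        pattern, rights = line.strip().rsplit(None, 1)
        match = RIGHTS_RE.match(rights)
        if match is None:
            raise ValueError("bad rights string: " + rights)
        acl.append((pattern, match.group(1), match.group(2)))
    return acl


def rights_for(acl, subject):
    """Union of the plain rights of every entry whose pattern matches."""
    granted = set()
    for pattern, plain, _reserved in acl:
        if fnmatchcase(subject, pattern):
            granted.update(plain)
    return granted


def reserve(acl, subject):
    """ACL line for a directory created under a V right, or None if denied."""
    for pattern, _plain, reserved in acl:
        if reserved is not None and fnmatchcase(subject, pattern):
            return "%s %s" % (subject, reserved)
    return None


shared = parse_acl("""
globus:/O=NotreDame/CN=DThain RWLA
hostname:*.cs.nd.edu RL
address:192.168.1.* RWLA
""")
print(sorted(rights_for(shared, "hostname:laptop.cs.nd.edu")))

top = parse_acl("globus:/O=NotreDame/CN=* V(RWLA)")
print(rights_for(top, "globus:/O=NotreDame/CN=Alice"))
print(reserve(top, "globus:/O=NotreDame/CN=Alice"))
```

Under these assumptions, a Notre Dame user has no direct rights in the top-level directory but can reserve a private subdirectory there, which avoids handing everyone RWLAX on a shared namespace.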