EnabGrid

Information about EnabGrid

Published on February 7, 2008

Author: Ulisse

Source: authorstream.com

Enabling Grid Computer for HEP:
James Cunha Werner, jamwer@hep.man.ac.uk
Babar Team at University of Manchester
Resources: www.hep.man.ac.uk/u/jamwer

Human resource strategy:
* Jobs with 5 events instead of millions.

Resources Strategy:

Grid Test Bed:

Slide 5:

Slide 6:
Software: 850 packages.
Tau Datasets: range between 60 files 1GB and 150 files 1GB
Total 4,000 GB ~ 10,000 files

Analysis Submission to Grid:
Single command: ./easygrid dataset_name
Performs handler management and submission.
Software based on a state machine:
  Verify skimdata available: if not available, perform BbkDatasetTCL to generate skimData. Each file will be a job.
  Verify if there are handlers pending.
  If not, script generation (gera.c) with edg-job-submit and ClassAds, and script execution. Nest for submission policy and optimisation.
  If yes, verify job status.
  When all jobs have ended, recover results in user folder.
(Prototype)

Generation and submission:
    [jamwer@bfb babar]$ ./easygrid SP-1005-Tau11-R14
    Invalid configuration filename: /opt/edg/etc/vomses
    Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
    Enter GRID pass phrase for this identity:
    Creating temporary proxy ......................................................... Done
    Creating proxy .................................................... Done
    Searching pre selected skimdata.
    Searching previous handlers.
    Handlers not found.
    Submiting to GRID .
    Wait end of process...
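The state machine outlined in the Analysis Submission to Grid slide can be read as one pass per invocation of the command. The following is a minimal Python sketch of that reading, not the easygrid code itself: the names `State`, `run_once`, `generate_skim` and `status_of` are hypothetical, the real tool generates scripts with gera.c and calls edg-job-submit, and here every external step is replaced by a recorded action so the control flow can be followed in isolation.

```python
from enum import Enum, auto


class State(Enum):
    # Hypothetical state names; the slides only describe the steps in prose.
    CHECK_SKIMDATA = auto()
    CHECK_HANDLERS = auto()
    SUBMIT = auto()
    CHECK_STATUS = auto()
    RECOVER = auto()
    STOP = auto()


def run_once(skim_files, generate_skim, handlers, status_of):
    """Run one invocation and return the list of actions it would take.

    skim_files:    skim data file names already available (may be empty)
    generate_skim: callable returning a list of file names (stands in for
                   the skim data generation step named in the slide)
    handlers:      job handles saved by a previous invocation (may be empty)
    status_of:     callable mapping a handle to a status string
    """
    actions = []
    files = list(skim_files)
    state = State.CHECK_SKIMDATA
    while state is not State.STOP:
        if state is State.CHECK_SKIMDATA:
            if not files:
                files = list(generate_skim())
                actions.append("generate skim data")
            state = State.CHECK_HANDLERS
        elif state is State.CHECK_HANDLERS:
            # Pending handlers mean jobs were already submitted earlier.
            state = State.CHECK_STATUS if handlers else State.SUBMIT
        elif state is State.SUBMIT:
            # Each skim data file becomes one job.
            actions.append(f"submit {len(files)} jobs")
            state = State.STOP
        elif state is State.CHECK_STATUS:
            pending = [h for h in handlers if status_of(h) != "Done"]
            if pending:
                actions.append(f"{len(pending)} jobs pending, try again later")
                state = State.STOP
            else:
                state = State.RECOVER
        elif state is State.RECOVER:
            actions.append("recover results")
            state = State.STOP
    return actions
```

Calling `run_once` repeatedly with the handles saved from the first call mirrors running the same command again later: the first call submits, later calls report pending jobs, and the call made after every job is done recovers the results.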

Job Status:
    [jamwer@bfb babar]$ ./easygrid SP-1005-Tau11-R14
    Invalid configuration filename: /opt/edg/etc/vomses
    Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
    Enter GRID pass phrase for this identity:
    Creating temporary proxy ............................ Done
    Creating proxy ............................... Done
    Searching pre selected skimdata.
    Searching previous handlers.
    Checking if jobs finished.
    ### Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/foRHhWyeDBnbqA9JkDADLg
    Current Status: Scheduled
    https://lcgrb01.gridpp.rl.ac.uk:9000/foRHhWyeDBnbqA9JkDADLg still pendent.
    ### Handle -> https://lxn1188.cern.ch:9000/8DdK3xruxtevNpei3zZbaA
    Current Status: Scheduled
    https://lxn1188.cern.ch:9000/8DdK3xruxtevNpei3zZbaA still pendent.
    4 jobs did not finished ! Try again later.

Job Status and recovery:
    [jamwer@bfb babar]$ ./easygrid SP-1005-Tau11-R14
    Invalid configuration filename: /opt/edg/etc/vomses
    Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
    Enter GRID pass phrase for this identity:
    Creating temporary proxy .......................................... Done
    Creating proxy ........................................................... Done
    Searching pre selected skimdata.
    Searching previous handlers.
    Checking if jobs finished.
    ### Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/foRHhWyeDBnbqA9JkDADLg
    Current Status: Done
    Exit code: 0
    ### Handle -> https://lxn1188.cern.ch:9000/8DdK3xruxtevNpei3zZbaA
    Current Status: Done
    Exit code: 0
    0 jobs did not finished ! Try again later.
    All jobs done. Recovering results in your folder.
    Results in the following folders:
    /home/jamwer/grid_sub/babar/jamwer_foRHhWyeDBnbqA9JkDADLg
    /home/jamwer/grid_sub/babar/jamwer_8DdK3xruxtevNpei3zZbaA

Monte Carlo Submission to Grid:
Single command: ./mcgrid JobName num_copies
Performs handler management and submission.
Software based on a state machine:
  Verify if there are handlers pending.
  If not, script generation (geramc.c) with edg-job-submit and ClassAds for each copy, and script execution. Nest for submission policy and optimisation.
  If yes, verify job status.
  When all jobs have ended, recover results in user folder.
(Prototype)

MC Submission:
    [jamwer@bfb mcgrid1]$ ./mcgrid MCteste 3
    Invalid configuration filename: /opt/edg/etc/vomses
    Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
    Enter GRID pass phrase for this identity:
    Creating temporary proxy ................................. Done
    Creating proxy ....................................................... Done
    Searching previous handlers.
    Handlers not found.
    Submiting to GRID .
    Wait end of process...

Job Status:
    [jamwer@bfb mcgrid1]$ ./mcgrid MCteste 3
    Invalid configuration filename: /opt/edg/etc/vomses
    Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
    Enter GRID pass phrase for this identity:
    Creating temporary proxy ........................................ Done
    Creating proxy ....................................... Done
    Searching previous handlers.
    Checking if jobs finished.
    ### Handle -> https://lxn1188.cern.ch:9000/9WzceoIMEQoTK24a-UvOmw
    Current Status: Scheduled
    https://lxn1188.cern.ch:9000/9WzceoIMEQoTK24a-UvOmw still pendent.
    ### Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/c4iCB8vioozaGteI9hybIg
    Current Status: Ready
    https://lcgrb01.gridpp.rl.ac.uk:9000/c4iCB8vioozaGteI9hybIg still pendent.
    ### Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/L5BD1OE--eckTm5RXkp2nA
    Current Status: Ready
    https://lcgrb01.gridpp.rl.ac.uk:9000/L5BD1OE--eckTm5RXkp2nA still pendent.
    3 jobs did not finished ! Try again later.

Job status and recovery:
    [jamwer@bfb mcgrid1]$ ./mcgrid MCteste 3
    Invalid configuration filename: /opt/edg/etc/vomses
    Your identity: /C=UK/O=eScience/OU=Manchester/L=HEP/CN=james werner
    Enter GRID pass phrase for this identity:
    Creating temporary proxy .................................................. Done
    Creating proxy .................................................... Done
    Searching previous handlers.
    Checking if jobs finished.
    ### Handle -> https://lxn1188.cern.ch:9000/9WzceoIMEQoTK24a-UvOmw
    Current Status: Done
    Exit code: 0
    ### Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/c4iCB8vioozaGteI9hybIg
    Current Status: Done
    Exit code: 0
    0 jobs did not finished ! Try again later.
    All jobs done. Recovering results in your folder.
    Results in the following folders:
    /home/jamwer/grid_sub/mcgrid1/jamwer_9WzceoIMEQoTK24a-UvOmw
    /home/jamwer/grid_sub/mcgrid1/jamwer_c4iCB8vioozaGteI9hybIg
    /home/jamwer/grid_sub/mcgrid1/jamwer_L5BD1OE--eckTm5RXkp2nA

Testing Submission Script:
Load Range: Worker load x #Files
  16 x 60 files = 960 jobs pending
  16 x 150 files = 2400 jobs pending
Test with Submission script:
  * sslv3 alert handshake failure
  ** Please wait job enter the “Done” status. This never happens!
Resource Broker not reliable or robust. Sometimes failure 3 days a week, or takes hours to submit/dispatch to CE (empty!).

Pending Infrastructure => Course of action:
Babar Software Know How is not available at Manchester => Web Page & Network skills.
Quality Assurance => We are OK! from benchmark (E x P)
Real Application to perform complete cycle, acquire know how, and grid proof-of-concept is missing => Partnership with physicists
CERN does NOT recognise Babar Community => Let's reduce their priority!
RB at Manchester => 60MB binaries and policies freedom.
SE/RC at Manchester => policies and submission jobs freedom.
Mass storage (10TB) for Babar purposes => CAP!
UI in the AFS => wide access to Manchester farms.
Apprenticeship at RAL and later at SLAC – production and experiment => improve where others fail
Configuration for optimal job performance/submission at Tier 2 (1 CE x 50 WN?) Performance dCache with Babar Software? Why 10TB if Liverpool bought 80TB? Electricity bill? => analyse procedures to improve QoS and better Site Configuration
Update (software and data) and operational policies => operational standards to achieve high QoS

Aimed Hardware Architecture:
(Redundant RB with alternate access)

Aimed Software Architecture:

Production Job Submission Package:
Operational policies/integration with RB (application level).
Recovery of aborted status.
Resources optimisation.
Integration with RC (application level) for replicas policies development.
Interactive data visualisation (Useful?)
Integration with GridSite (Data visualisation, analysis, performance monitor, and submission)
Professional version.

Integrate LCG2 and Job Submission with Babar/CM2 at University of Manchester for Tau Physics modelling, analysis and MC generation.
We aim to be soon…
The largest site in UK.
Leader in grid computing and HEP
Summary

Conclusion:
Babar CM2 is running at Manchester!
LCG2 Grid is running with real world experiment!
Babar submission prototype to Grid is running!
LCG is not LHC software only! It is Babar’s.
We are doing today what will take you years to achieve.
Let's work together!
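
The job-status listings shown in the Job Status slides follow a regular pattern: a "### Handle ->" entry followed by a "Current Status:" entry. As an illustration only, the Python sketch below extracts handle and status pairs from such a listing and reports which jobs are not yet done. The function names `parse_status` and `pending_handles` are hypothetical and are not part of easygrid or mcgrid; the sample text is copied from the mcgrid status output above.

```python
import re

# Matches one "### Handle -> <url>" entry and the status word that follows it.
HANDLE_RE = re.compile(r"### Handle -> (\S+)\s+Current Status:\s+(\w+)")


def parse_status(listing):
    """Return a dict mapping each job handle in the listing to its status."""
    return dict(HANDLE_RE.findall(listing))


def pending_handles(statuses):
    """Return the handles whose status is anything other than 'Done'."""
    return [handle for handle, status in statuses.items() if status != "Done"]


sample = """\
### Handle -> https://lxn1188.cern.ch:9000/9WzceoIMEQoTK24a-UvOmw
Current Status: Scheduled
### Handle -> https://lcgrb01.gridpp.rl.ac.uk:9000/c4iCB8vioozaGteI9hybIg
Current Status: Ready
"""

statuses = parse_status(sample)
for handle in pending_handles(statuses):
    print(handle, "is", statuses[handle])
```

A wrapper of this kind would only be useful for monitoring; deciding when to recover results remains the job of the submission tool.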
