Vouk Sdm Ahm


Published on October 19, 2008

Author: guest3bd2a12

Source: slideshare.net

Implementing Scientific Process Automation - from Art to Commodity

Mladen A. Vouk and Terence Critchlow

Overview

From Art to Commodity

Component-based System Engineering

Kepler vs. CCA vs ???

Domain Specific

Virtualization and Service-based System Engineering

Team (the artists?)

[Diagram labels: Development and Support; (Active) End-Users; Art; Commodity; Scientist; W/F Dev; Support; Developer; IT; Science]

Ilkay Altinas

Zhengang Cheng

Terence Critchlow

Bertram Ludaescher

Brent Marinello

Pierre Mouallem

Steve Parker

Elliot Peele

Mladen A. Vouk

Anthony Wilson

Others …(including the R&D of the Kepler and Ptolemy community, and W/F developers)

John Blondin

Doug Swesty

Scott Klasky

… .

Numerous Kepler and Ptolemy users and apps

Cyberinfrastructure

“Cyberinfrastructure makes applications dramatically easier to [use,] develop and deploy, thus expanding the feasible scope of applications possible within budget and organizational constraints, and shifting the [educator’s,] scientist’s and engineer’s effort away from information technology (development) and concentrating it on [knowledge transfer, and] scientific and engineering research. Cyberinfrastructure also increases efficiency, quality, and reliability by capturing commonalities among application needs, and facilitates the efficient sharing of equipment and services.” (from the Appendix of the Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure, Jan 2003)

Rising Expectations

Delivery of Service with Minimization of Information Technology (IT) Overhead

Move away from the specific resources (e.g., h/w, s/w, net, storage, and app), to the ability to achieve one’s basic mission (learning, teaching, research, outreach, administration, etc.)

Utility-like (appliance-like) on-demand access to needed IT. Use of IT-based solutions moves from a fixed location (such as a specific lab) and fixed resources (e.g., a particular operating system) to a scientist's (mobile) personal access device (e.g., a laptop, a PDA, or a cell phone) and service-based delivery.

Business model behind IT virtualization & services needs to conform to the mission of the institution, as well as realistic resource and/or personnel constraints.


Component-Based System Engineering*

* Building Reliable Component-Based Software Systems, Ivica Crnkovic and Magnus Larsson (editors), Artech House Publishers, ISBN 1-58053-327-2, http://www.idt.mdh.se/cbse-book/

Composition of systems (e.g., practical workflows) from (existing) components

Systems as assemblies of components

Development of components as reusable units

Facilitation of the maintenance and evolution of systems by customizing and replacing their components

Methods and tools for support of different aspects of component-based approach

Open – source and process issues, organizational and management issues, coupling, domain, technologies (e.g., component models), component composition issues, tools…

Advantages

Business: Shorter time-to-market, lower development and maintenance costs

Technical: Increased understandability of (complex) systems; Increased usability, interoperability, flexibility, adaptability, dependability…

Strategic: Increasing (software) market share

Scope: Web- and internet-based applications, Desktop and office applications, Mathematical and other libraries, Graphical tools, GUI-based applications, etc.

Practical: De-facto standards, e.g., MS COM, .NET, Sun EJB, J2EE, CORBA Component Model…

Focus on functionality

Some Issues

Standards

Non-functional requirements

Performance including timing,

Resource management

Dependability, fault-tolerance

Domain specific requirements?

Skill level needed and IT overhead

Changing expectations and scalability …

Coupling and synchronization (synchronous vs. asynchronous processing, loose vs. tight coupling, parallelization, data movement, bottlenecks …)

Provisioning / Security challenges

Marketing hype/over-expectations

Customer confusion / skepticism

Service quality/control issues

Demonstrating benefits/ROI

Software Licensing

Vendor Factor

Other …

What is it? (*)

[Diagram labels: c1, c2, Middleware, Run-time system, framework, Component Model]

The basis is the Component

Components can be assembled according to the rules specified by the component model

Components are assembled through their interfaces

A Component Composition is the process of assembling components to form an assembly, a larger component or an application

Components perform in the context of a component framework

All parts conform to the component model

A component technology is a concrete implementation of a component model

[Diagram labels: Component Framework, Platform, Components, Repository, Supporting Tool]

(*) From the talk entitled “Component-based development challenges in building reliable systems”, by Crnkovic at IIS 2005, Sept 2005.

Component

A unit of composition

Contractually specified interfaces

Explicit context dependencies (only?).

Can be deployed independently

Subject to composition by third party.

Conforms to a component model that defines interaction and composition standards

Composed without modification according to a composition standard.

Specified as a) black-box (signal/response), b) gray-box (access to internal states), c) white-box (access to all internals), or d) display-box (see internals, but cannot touch internals).
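
The properties listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Sorter`, `BuiltinSorter`, `InsertionSorter` and `Deduplicator` are hypothetical and do not come from the talk or from any component technology. The interface is written as a `typing.Protocol`, two black-box implementations honor it, and a third party composes an assembly from either one without modifying it.

```python
from typing import Protocol


class Sorter(Protocol):
    """Contractually specified interface: the only thing a client may rely on."""

    def sort(self, items: list[int]) -> list[int]: ...


class BuiltinSorter:
    """Black-box component backed by Python's built-in sort."""

    def sort(self, items: list[int]) -> list[int]:
        return sorted(items)


class InsertionSorter:
    """Alternative implementation, substitutable because it honors the same contract."""

    def sort(self, items: list[int]) -> list[int]:
        result: list[int] = []
        for item in items:
            pos = len(result)
            # Walk left until the element before pos is not larger than item.
            while pos > 0 and result[pos - 1] > item:
                pos -= 1
            result.insert(pos, item)
        return result


class Deduplicator:
    """An assembly: composed from a Sorter through its interface, unmodified."""

    def __init__(self, sorter: Sorter) -> None:
        self.sorter = sorter  # explicit context dependency

    def unique(self, items: list[int]) -> list[int]:
        ordered = self.sorter.sort(items)
        return [x for i, x in enumerate(ordered) if i == 0 or x != ordered[i - 1]]
```

Because `Deduplicator` depends only on the interface, swapping `BuiltinSorter` for `InsertionSorter` changes nothing for its callers.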

Principles

Reusability (docs, re-use process, architecture, framework, V&V, …)

Substitutability (alternative implementations, functional equivalence, equivalence on other issues, very precise interfaces and specs, run-time replacement mechanism, V&V, …)

Extensibility (extending system component pool, increasing capabilities of individual components – extensible architecture, resource and new functionality discovery, V&V, …)

Composability (functional, extra-functional, reasoning about compositions, V&V, …)

Scientific Workflow Automation (e.g., Astrophysics, v-Desk)

In conjunction with John Blondin, NC State University

Automate data acquisition, transfer and visualization of a large-scale simulation at ORNL

[Diagram labels: Input Data; Highly Parallel Compute; Output ~500x500 files; Aggregate to ~500 files (< 10+GB each); HPSS archive; Data Depot; Logistic Network L-Bone; Local Mass Storage (14+TB); Aggregate to one file (~1 TB each); Viz Wall; Viz Client; Local 44 Proc. Data Cluster - data sits on local nodes for weeks; Viz Software]


CCA and Kepler

CCA is probably more suitable for tightly coupled applications, especially when all the components are ready within a machine or a local cluster. CCA focuses on high performance parallel and distributed processing.

Kepler is probably more suitable for loosely coupled and diverse components. The components can reside on different and widely separated machines. It works well for services and data that reside with their owners and are exposed as services. Kepler focuses on process orchestration and control. It is also more conducive to IP protection.

CCA Component Architecture Model

CCA components interact with each other and with a specific CCA framework implementation through standard CCA interfaces. Each component defines its inputs and outputs in Scientific IDL; these definitions are deposited in, and can be retrieved from, a repository by using the CCA Repository API. In addition, these definitions serve as input to a proxy generator, which generates component stubs: the component-specific parts of GPorts (white box in the picture). The components can also use framework services directly through the CCA Framework Services Interface. The CCA Configuration API ensures that the components can collaborate with different builders associated with different frameworks.

Chasm - an F90 interoperability library from Los Alamos.

Babel/SIDL - an object-oriented language interoperability interface definition language.

CCA Specification - the Common Component Architecture Specification for high performance components.

Ccaffeine - a CCA framework compliant with the CCA specification.

Ccaffeine GUI - a graphical user interface that works with Ccaffeine.

Kepler

Option A: CCA-Aware Actor

[Diagram labels: Kepler Director; CCA-aware actor; three CCA components]

Actor interacts with CCA components. Kepler director only needs to pass relevant parameters.

If such an actor is not available, each CCA component may require a customized stub.

Modification only in Kepler space.

Maintains tight coupling among CCA components, e.g., for performance.

Option B: CCA Component Service

[Diagram labels: Kepler Director; Service actor; three CCA components, each with its own Service]

CCA component exposes an interface (service) that can be directly orchestrated by Kepler.

Requires extra work on all CCA components. This might not be possible on all machines.

Option C: CCA-Aware Service Proxy

[Diagram labels: Kepler Director; Service actor; CCA-Aware Service Proxy; three CCA components]

Kepler and CCA bridged through a Service Proxy. It translates service requests into interactions.

Advantage:

Decouples CCA and Kepler

Service Proxy can be very flexible

No Special CCA execution module is required on the client.

Disadvantage:

Single point of vulnerability, as all Kepler users depend on the proxy. Extra overhead?
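
The proxy idea in Option C can be sketched as follows. This is a toy illustration, not the CCA or Kepler API: `ServiceProxy`, `register`, `request` and `solve_stub` are hypothetical names, and a plain Python callable stands in for the interaction with a group of CCA components. The sketch shows the decoupling (the client only knows service names) and why the proxy is a single point of vulnerability (every request passes through it).

```python
from typing import Callable, Dict


class ServiceProxy:
    """Hypothetical proxy that maps service names to component invocations."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], dict]] = {}

    def register(self, name: str, handler: Callable[[dict], dict]) -> None:
        # In a real deployment the handler would drive CCA components;
        # here it is any Python callable standing in for that interaction.
        self._handlers[name] = handler

    def request(self, name: str, payload: dict) -> dict:
        handler = self._handlers.get(name)
        if handler is None:
            return {"status": "error", "reason": f"unknown service: {name}"}
        try:
            return {"status": "ok", "result": handler(payload)}
        except Exception as exc:  # report component failures to the caller
            return {"status": "error", "reason": str(exc)}


def solve_stub(payload: dict) -> dict:
    """Stand-in for a tightly coupled group of components behind the proxy."""
    return {"sum": sum(payload["values"])}


proxy = ServiceProxy()
proxy.register("solve", solve_stub)
reply = proxy.request("solve", {"values": [1, 2, 3]})
```

The client side needs no CCA execution module, only the ability to send a named request and read the reply.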

Q&A

Q: I understand that I need to wrap CCA components into actors in order to use Kepler. But what new capabilities will I obtain by doing that, compared to using CCA components and the Ccaffeine framework? In fact, what I like about CCA is that one can use components written in different languages. How do I do this using Kepler?

A: By wrapping CCA components, Kepler will be able to use them. This allows CCA components and diverse Kepler services to work together, which broadens the application area.

Kepler is written in Java. It is used primarily for service orchestration, not execution. Each service runs in its own environment. Programs written in other languages can be enacted through the actor for that language, for example, a Perl actor to run Perl scripts.
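
The split between orchestration and execution described in this answer can be illustrated outside Kepler. The following is a minimal sketch, assuming a hypothetical helper `run_external` (it is not a Kepler actor or API): the orchestrator starts a program in its own process and only collects the result, so the program may be written in any language.

```python
import subprocess
import sys
from typing import List


def run_external(command: List[str], timeout_s: float = 30.0) -> str:
    """Run a program in its own process and return its standard output."""
    completed = subprocess.run(
        command,
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout


# The Python interpreter stands in for any language runtime (e.g. perl).
output = run_external([sys.executable, "-c", "print(6 * 7)"])
```

Replacing the command list with a different interpreter and script path would run a script in that language, provided the interpreter is installed on the machine.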

Q&A (2)

Q: How do I distinguish more loosely coupled applications from more tightly coupled (to choose between CCA and Kepler)? Both use ports (interfaces), so how can I quantify an interface (loosely vs tightly coupled)?

A: Tight coupling usually means synchronous and intense communication and control, for example, frequent exchange of timing-sensitive data among the components. CCA is more suitable in these cases.

Loosely coupled services are usually more focused on control and on distributed services and data, often with services from different service providers. They may not be suitable for intense data flows unless those are virtualized. Kepler is suitable for this area.


Domain Specific

Klasky FSP W/F

Swesty TSI W/F

Coleman W/F

ChemInfo W/F

SCIRun

Blondin TSI W/F

Other …

Fusion Simulation Project Workflow Pilot

In conjunction with Scott Klasky, CESP, PPPL, ORNL

Automation of simulation, transfer and analytics of FSP

TSI Workflow I

In conjunction with Doug Swesty and Eric Myra, Stony Brook

Automate the transfer of large-scale simulation data between NERSC and Stony Brook

Promoter Identification Workflow

In conjunction with Matt Coleman, LLNL

Automate the analysis of gene expression data using a combination of web services and local analysis programs

ChemInformatics Workflow

In conjunction with Resurgence Project

Automate the management and submission of jobs

SCIRun and Kepler Dataflow Integration

Incorporate SCIRun computation and visualization with the SPA workflow engine

Scientific Workflow Automation (e.g., Astrophysics, v-Desk)

In conjunction with John Blondin, NC State University

Automate data acquisition, transfer and visualization of a large-scale simulation at ORNL

[Diagram labels: Input Data; Highly Parallel Compute; Output ~500x500 files; Aggregate to ~500 files (< 10+GB each); HPSS archive; Data Depot; Logistic Network L-Bone; Local Mass Storage (14+TB); Aggregate to one file (~1 TB each); Viz Wall; Viz Client; Local 44 Proc. Data Cluster - data sits on local nodes for weeks; Viz Software]

Workflow - Abstraction Model

[Diagram labels: SendData; Merge & Backup; To VizWall; Parallel Computation; RecvData; Parallel Visualization; Data Mover Channel (e.g. LORS, BCC, SABUL, FC over SONET); Split & Viz; Web or Client GUI; Web Services; Head Node Services (x2); Mass Storage; Fiber C. or Local NFS; Model, Merge, Backup, Move, Split, Viz; Construct, Orchestrate, Monitor/Steer, Change, Stop/Start; Control]

Astrophysics Workflow (using Ptolemy II framework)

Blondin Workflow V2

[Diagram labels: Submit Job; Merge; Transfer; Slicing/Dicing; Viz via Ensight; Browser Display; Supercomputer Cray (Phoenix @ ORNL); Local Cluster (Orbitty @ NCSU); User Laptop; Web; DB]

Basic

A number of processing steps

Range of data transfer rates

Parallelism

Speedup?

Implementation (e.g., Scripts? App?)

Ease of use (e.g., LORS)

Tracking (e.g., DB, Web, provenance)

Fault-Tolerance

Distributed (for most part)

Runtime Data Collection

Each run of the workflow is associated with a runid. The info is organized around it.
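
One way to organize runtime information around a run identifier is sketched below. It is an assumption-laden illustration, not the schema used in the talk: the table name `events`, its columns, and the helpers `open_store`, `new_run`, `record` and `history` are all hypothetical, and an in-memory SQLite database stands in for whatever store the real system uses.

```python
import sqlite3
import time
import uuid


def open_store() -> sqlite3.Connection:
    """Create an in-memory store; a real deployment would use a file or server."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (runid TEXT, step TEXT, status TEXT, ts REAL)"
    )
    return conn


def new_run() -> str:
    """Generate a unique run identifier."""
    return uuid.uuid4().hex


def record(conn: sqlite3.Connection, runid: str, step: str, status: str) -> None:
    """Attach one event to a run."""
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?)", (runid, step, status, time.time())
    )


def history(conn: sqlite3.Connection, runid: str) -> list:
    """Return (step, status) pairs for one run in insertion order."""
    rows = conn.execute(
        "SELECT step, status FROM events WHERE runid = ? ORDER BY rowid", (runid,)
    )
    return rows.fetchall()


conn = open_store()
run_a, run_b = new_run(), new_run()
record(conn, run_a, "submit", "done")
record(conn, run_a, "transfer", "failed")
record(conn, run_b, "submit", "done")
```

Querying by run identifier then gives the full trail of a single execution, which is the basis for tracking and provenance.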

Workflow Screenshot (Swesty)

Screenshot: Log

Screenshot: Running

Generic Actors (Swesty W/F)

This workflow uses a small number of actors repeatedly to deliver a complex behavior

160 actor instances

18 different types of actors

~100 Expression actor instances

13 boolean switch actor instances

13 array manipulation actor instances

8 ssh actor instances

< 30 instances of other actors (ex: sleep, file I/O, etc)

Work in Progress

Validation of input data

llsubmit script and config files need to be consistent

Distributed infrastructure

Start this from a web page

Monitor and control the workflow from a remote site

Incorporation of data analysis

A small workflow driven by the specific simulation results and how Doug wants to visualize the data

Notes

A complex workflow that is typical of many scientific workflows

Follows the run-simulation, move-data, analyze-data paradigm

Much can be done with a set of generic actors:

Expression actor

Ssh2Exec actor

BooleanSwitch & Array actors

Configuration parameters allow simple adaptation to different environments

Notes (Blondin)

W/F support system needs to be

Flexible - everything changes often! If such tools are to be used by application scientists, they need to be easy to reconfigure.

Detachable - for one reason or another, one component may not work (network not usable, disks full on local end), so one would like to fire up individual parts of the workflow as needed.

Fault-tolerant - ideally, the software itself can recognize some faults and correct them (e.g., re-attempt a file upload).
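
The re-attempt behavior mentioned under fault tolerance can be sketched as a small retry wrapper. This is a generic illustration, not code from the workflows in the talk: `with_retries` and `flaky_upload` are hypothetical names, the upload is simulated, and treating `OSError` as the recoverable fault is an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    action: Callable[[], T],
    attempts: int = 3,
    delay_s: float = 0.0,
) -> T:
    """Call action until it succeeds or attempts are exhausted."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except OSError:
            if attempt == attempts:
                raise  # out of attempts: surface the last failure
            time.sleep(delay_s * attempt)  # linear back-off between attempts
    raise AssertionError("unreachable")


calls = {"count": 0}


def flaky_upload() -> str:
    """Simulated upload that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError("simulated network failure")
    return "uploaded"


result = with_retries(flaky_upload, attempts=5)
```

Faults that a retry cannot fix (for example, a full disk) still propagate to the caller, which is where the "detachable" requirement above comes in.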

Key Issue

It is very important to distinguish between a custom-made workflow solution and a more canonical set of operations, methods, and solutions that can be composed into a scientific workflow.

Complexity, skill level needed to implement, usability, maintainability, “standardization”

e.g., sort, uniq, grep, ftp, ssh on unix boxes

SAS (that can do sorting), home-made sort,

LORS, SABUL, bbcp (free, but not standard), etc.


Make everything into a Service

Network-based and On-demand

Complements an access/communication unit of choice

Utility-like: ubiquitous, reliable, available, maintainable

Nearly Device-independent

May be application level or smaller granularity (e.g., functions, web-service, grid-service)

Virtualization

[Diagram labels: Services; Middleware; Hardware; Applications; Provisioning; Operating Systems]

To effectively deliver on-demand computing services that are maintainable, scalable, and customizable, it is essential that the tiers of virtualization are separated but can be coupled.

Scientific Workflow Automation (e.g., Astrophysics, v-RP)

In conjunction with John Blondin, NC State University

Automate data acquisition, transfer and visualization of a large-scale simulation at ORNL

[Diagram labels: Input Data; Highly Parallel Computer; Output files ~500x500; Aggregate to ~500 files (< 10+GB each); HPSS archive; Data Depot; Logistic Network L-Bone; Local Mass Storage (14+TB); Aggregate to one file (~1 TB each); Viz Wall; Viz Client; Local 44 Proc. Data Cluster - data sits on local nodes for weeks; Viz Software; VCL-based Workflow Orchestration, State Tracking, Provenance]

A Resource “Utility Wall”

V-Desks

V-Flow (resource collections)

1-1-1 to N-M-K …



vcl.ncsu.edu

Ongoing: Tipping pt., Usability, Availability, Pedagogy …

Currently scaling to 8000+ users - N-M-K

Advantages (2)

Easy-to-use remote access from one's own desktop or mobile computer in homes, offices, or the local coffee house, bringing the "lab" to you

Full access to a dedicated computing resource (some scheduling choices include monitored root or administrator access). This access is the same as or greater than what is possible in a physical computing laboratory.

Vendor-standard remote access protocols and client software. Eliminates the need for specialized customization of one's own computer and eases updates and maintenance

Platform agnostic (Macs, Win, Linux, …)

Extensible to any remotely-accessible desktop systems in specialized campus labs. Departments can bring their lab to their students.

Protection of Intellectual Property

Data Provenance and tracking

Fault-Tolerance

Higher Security

Issues (1)

Communication Coupling (loose, tight, v. tight, code-level) and Granularity (fine, medium?, coarse)

Communication Methods (e.g., ssh tunnels, xmlrpc, snmp, web/grid services, etc.) – e.g., apparently poor support for Cray

Storage issues (e.g., p-netcdf support, bandwidth)

Direct and Indirect Data Flows (functionality, throughput, delays, other QoS parameters)

End-to-end performance

Level of abstraction

Workflow description language(s) and exchange issues – interoperability

“Standard” scientific computing “W/F functions”

Issues (2)

Problem is currently similar to old-time punched-card job submissions (long turn-around time, can be expensive due to front end computational resource I/O bottleneck) - need up front verification and validation – things will change

Back-end bottleneck due to hierarchical storage issues (e.g., retrieval from HPSS)

Long term workflow state preservation - needed

Recovery (transfers, other failures) – more needed

Tracking data and files, provenance

Who maintains equipment, storage, data, scripts, workflow elements? Elegant solutions may not be good solutions from the perspective of autonomy.

EXTREMELY IMPORTANT!!! – We are trying to get out of the business of totally custom-made solutions.

 
