Distributed-ness: Distributed computing & the clouds

Published on January 9, 2009

Author: rcoup

Source: slideshare.net

Description

Discussion on distributed apps and the cloud resources available to support them. Some discussion on the XMPP/Jabber based messaging system we use at Koordinates. Part of the seminar series for the Wellington Summer of Code programme.

Distributed-ness Robert Coup, Koordinates http://rob.coup.net.nz/ robert.coup@koordinates.com

Me

What is it? “Distributed computing deals with hardware and software systems containing more than one processing element or storage element, concurrent processes, or multiple programs, running under a loosely or tightly controlled regime.” - Wikipedia http://en.wikipedia.org/wiki/Distributed_computing

What is it? Application architecture Independent components Dynamic resourcing

What is it? Distributed computing is not scaling Distributed computing can help you scale There are easier ways to scale short-term

Easier ways

Distributed problems Break up my big problem into small chunks which can be worked on in parallel or asynchronously.
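A minimal sketch of the chunking idea, assuming a toy problem (summing squares) and a local thread pool standing in for separate machines. The function names and chunk size are illustrative, not from the talk.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(numbers):
    """Hypothetical unit of work: sum the squares in one chunk."""
    return sum(n * n for n in numbers)


def solve_in_parallel(items, chunk_size=1000, workers=4):
    """Farm each chunk out to a worker, then combine the partial answers."""
    chunks = chunk(items, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)
```

The chunks share no state, so the same shape carries over when the workers are processes or machines rather than threads.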

Distributed applications Single application with a bunch of components Inter-dependency Components “load-up” differently

catinthehat.biz “great hats for your cats”

Little catinthehat.biz Load balancer Cache Media App Storage DB Worker

Bigger catinthehat.biz Load balancer Cache Media App App App + Cache Media DB DB Worker Storage + Worker +

Talking

Talking Components of a distributed app need to talk But should have minimal knowledge of each other Just like in code modules! “Decoupling”

Messaging Point to point: Needs configuration Web services

Messaging Queues: Publish-subscribe Amazon SQS Lots of others http://aws.amazon.com/sqs/
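Publish-subscribe can be sketched in a few lines. This is an in-process stand-in, assuming a hypothetical Broker class; it is not the Amazon SQS API, only the shape of the pattern: publishers and subscribers know a topic name, not each other.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory broker; a real one would sit on the network."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        """Register a callable to receive every message on `topic`."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver `message` to each subscriber; return how many received it."""
        callbacks = self._subscribers[topic]
        for callback in callbacks:
            callback(message)
        return len(callbacks)


# Usage: the publisher never references the subscriber directly.
broker = Broker()
inbox = []
broker.subscribe("hat-orders", inbox.append)
broker.publish("hat-orders", {"design": "fez", "size": "small"})
```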

Messaging Peer-to-peer Jabber / XMPP Persistent connections Presence

Jabber at Koordinates Brainz manages the work Kerrow does the work

Jabber @ Koordinates Data imports have 20-25 inter-related tasks “Package” defines the dependencies and input data

Task Packages

Task Packages Kerrows & Brainzs connect via XMPP Brainz publishes tasks via PubSub Kerrow negotiates for tasks, then does them Brainz notified on task completion/error If Kerrows go offline, tasks are re-assigned
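The flow above can be sketched without any XMPP at all. The class below is a guess at the bookkeeping a coordinator like Brainz would need, not the Koordinates implementation; every method name and the package format (task name mapped to its dependencies) is an assumption.

```python
class Brainz:
    """Toy coordinator: offers tasks whose dependencies are complete."""

    def __init__(self, package):
        # package maps task name -> iterable of task names it depends on
        self.package = {task: set(deps) for task, deps in package.items()}
        self.done = set()
        self.assigned = {}  # task name -> worker id

    def available_tasks(self):
        """Tasks that are unclaimed, unfinished, and have all dependencies done."""
        return sorted(
            task for task, deps in self.package.items()
            if task not in self.done
            and task not in self.assigned
            and deps <= self.done
        )

    def claim(self, worker, task):
        """A worker negotiates for a task; only one claim can succeed."""
        if task not in self.available_tasks():
            return False
        self.assigned[task] = worker
        return True

    def complete(self, worker, task):
        """Worker reports completion, which may unlock dependent tasks."""
        if self.assigned.get(task) != worker:
            raise ValueError("task is not assigned to this worker")
        del self.assigned[task]
        self.done.add(task)

    def worker_offline(self, worker):
        """Release a vanished worker's tasks so others can claim them."""
        lost = sorted(t for t, w in self.assigned.items() if w == worker)
        for task in lost:
            del self.assigned[task]
        return lost
```

In the real system, presence on the XMPP connection would be what triggers the equivalent of worker_offline.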

Bots Via IM, we can connect to Brainz/Kerrow Ask for status, cancel, new tasks, … And it can message us: errors, info

Live Status Keep a live eye on what's going on Danga apps have terminal consoles (telnet) Otherwise you’re debugging via logs

Bigger catinthehat.biz Load balancer Cache Media App App App + Cache Media DB DB Worker Storage Worker + +

Storage Dump files Get them back Reliably Quickly In bulk Backups

Amazon S3 - Simple Storage Service Unlimited storage Cheap! US$0.15 / GB / month US$0.10 / GB in & US$0.17 / GB out http://aws.amazon.com/s3/
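A quick worked estimate using the prices quoted on the slide (2009 figures, so treat them as historical). The usage numbers in the example are made up for illustration.

```python
# Prices as quoted on the slide, in US dollars.
STORAGE_PER_GB_MONTH = 0.15
TRANSFER_IN_PER_GB = 0.10
TRANSFER_OUT_PER_GB = 0.17


def monthly_s3_cost(stored_gb, uploaded_gb, downloaded_gb):
    """Estimate one month's bill from storage plus transfer in and out."""
    total = (
        stored_gb * STORAGE_PER_GB_MONTH
        + uploaded_gb * TRANSFER_IN_PER_GB
        + downloaded_gb * TRANSFER_OUT_PER_GB
    )
    return round(total, 2)


# Hypothetical month: 100 GB stored, 10 GB uploaded, 50 GB downloaded.
# 100 * 0.15 + 10 * 0.10 + 50 * 0.17 = 15.00 + 1.00 + 8.50 = 24.50
estimate = monthly_s3_cost(100, 10, 50)
```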

Amazon S3 Not a hard disk or filesystem Data is organised into namespaces (buckets) hatdesigns.catinthehat.biz Within that: key-value pairs Access via HTTP Authentication / access-control Open source version - mogilefs http://www.danga.com/mogilefs/

Amazon S3 - downsides Eventual consistency 99.99% reliable = 1/10K requests fail Will return errors
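Since some requests will fail, callers usually wrap storage operations in a retry. A minimal sketch, assuming a hypothetical TransientError raised by whatever client is in use; the attempt count and delays are arbitrary.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a failed storage request."""


def with_retries(operation, attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call `operation`, retrying on TransientError with exponential backoff.

    The final failure is re-raised so the caller still sees the error.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Passing sleep as a parameter keeps the helper easy to exercise without real delays.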

Amazon S3 - uses Uses for catinthehat.biz? Customer photos of hats on cats Customer hat designs Story videos Manufacturing design files Backups

Bigger catinthehat.biz Load balancer Cache Media App App App + Cache Media DB DB Worker Storage Worker + S3 +

Compute power Supply & demand Supply costs Demand is hard to manage

Amazon EC2 - Elastic Compute Cloud Virtual servers on demand From US$0.10 - US$0.80 / hour Linux & Windows, 1-8 cores, 1.7-15GB memory, 160GB-1.7TB local storage, 32/64bit Permanent storage from US$0.10 / GB / month http://aws.amazon.com/ec2/

Amazon EC2 Turn capacity on & off at will Ideal for batch processing Ideal for dynamic loads

Amazon EC2 Not cheapest - US$70+/month for static server Instances can be terminated at any time! Organise configuration - Puppet, RightScale, Scalr Need an app that is architected to handle it http://slicehost.com/ http://puppet.reductivelabs.com/ http://www.rightscale.com/ http://code.google.com/p/scalr/
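The "US$70+/month" figure follows from the cheapest hourly rate on the earlier slide. A small sketch of the arithmetic, assuming a 30-day month; the batch scenario is invented for comparison.

```python
def monthly_cost(hourly_rate, hours_per_day, days=30):
    """Cost in US dollars of running one instance for part of each day."""
    return round(hourly_rate * hours_per_day * days, 2)


# Smallest instance left on around the clock: 0.10 * 24 * 30 = 72.00
always_on = monthly_cost(0.10, 24)

# Largest instance used for a hypothetical 2-hour nightly batch:
# 0.80 * 2 * 30 = 48.00
nightly_batch = monthly_cost(0.80, 2)
```

This is why the slides pitch EC2 at batch and bursty loads rather than at a single static server.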

Amazon EC2 - uses Uses for catinthehat.biz? converting customer designs creating story videos application servers

Bigger catinthehat.biz Load balancer Cache Media App App App + EC2 Cache Media DB DB Worker Storage Worker + EC2 S3 +

Google AppEngine Auto-scaling web applications Google hosts and runs Access to BigTable, Image/Email/Cache/HTTP APIs Restricted Python environment Free to get started http://code.google.com/appengine/

Google AppEngine Still in beta, no way of buying “extra” capacity No offline/background processing Time limits on requests No file storage Datastore isn’t SQL Lock-in

Google AppEngine Uses at catinthehat.biz? Facebook application? Prototypes?

MapReduce Map Phase: Take a problem. Chop it up into chunks. Distribute chunks to lots of workers to do. Reduce Phase: Combine all the answers to the chunks to get the real answer. http://en.wikipedia.org/wiki/MapReduce

MapReduce Small atomic chunks of work Run across acres of machines on masses of data Easy to write (although problems need to “fit”) Can be chained together Open source versions - Hadoop, others http://en.wikipedia.org/wiki/MapReduce http://hadoop.apache.org/

MapReduce Use at catinthehat.biz? Find most popular non-English words in user stories:

    def map(document):
        for word in document:
            if not isEnglishWord(word):
                yield (word, 1)

    def reduce(word, partialCounts):
        return sum(partialCounts)
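The slide's two functions leave out the step a framework such as Hadoop does in between: grouping the emitted pairs by key. Below is a single-process sketch that fills in that step. The word list, the renamed functions (to avoid shadowing Python's built-in map), and the whitespace tokenising are all assumptions for illustration.

```python
from collections import defaultdict

# Tiny stand-in dictionary; a real check would use a proper word list.
ENGLISH_WORDS = {"the", "cat", "wore", "a", "hat", "my"}


def is_english_word(word):
    return word in ENGLISH_WORDS


def map_document(document):
    """Map phase: emit (word, 1) for every non-English word in one story."""
    for word in document.lower().split():
        if not is_english_word(word):
            yield (word, 1)


def reduce_counts(word, partial_counts):
    """Reduce phase: combine all the partial counts for one word."""
    return sum(partial_counts)


def run_mapreduce(documents):
    """Run map, group the pairs by word, then reduce each group."""
    grouped = defaultdict(list)
    for document in documents:
        for word, count in map_document(document):
            grouped[word].append(count)
    return {word: reduce_counts(word, counts) for word, counts in grouped.items()}


stories = ["the cat wore a chapeau", "my chapeau my sombrero"]
counts = run_mapreduce(stories)
most_popular = max(counts, key=counts.get)
```

On a real cluster the map calls and the per-word reduce calls would run on different machines; only the grouping needs to see everything.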

So De-couple application components Figure out a messaging strategy Monitor your apps live Vertical scaling is cheaper short-term

So On demand storage (S3) & compute power (EC2) Google App Engine for simple apps Lots of tools available

“If you never did, you should. These things are fun, and fun is good.” - Dr. Seuss
