Initial presentation of swift (for montreal user group)


Published on March 18, 2014

Author: MarcosGarcia10


Swift A quick introduction March 2014

Index ● What is object storage ● A quick look at Amazon S3 ● Swift use cases ● History & Architecture ● Swift features ● The API ● Demo using Cyberduck

What is object storage ● HTTP-accessible storage of objects (files) in buckets (folders) ● Like FTP or WebDAV ● Adds access control and metadata ● Everything is a URL ● Cheap and hassle-free ○ Notion of unlimited capacity ○ No fragmentation or integrity checks to worry about ○ No locks or concurrency problems ○ Support for partial reads or writes

What is object storage ● Designed for cloud-era requirements ○ Secure ○ Reliable ○ Scalable ○ Fast ○ Inexpensive ○ Simple

Quick look at Amazon S3

Some Amazon S3 use cases ● Content storage and distribution ○ Serve static files or whole websites from S3 directly ● Better scalability for the web server tier ○ Reduces ‘data gravity’, low I/O in the server, all HTTP ● Storage for data analysis ● Fine-grained access control to buckets ● Backup, archiving and disaster recovery ○ although Amazon Glacier is a cheaper option ● ... but it’s not a Content Distribution Network ○ doesn’t optimize routing for lowest latency ○ is not optimized for content streaming ○ that’s why Amazon CloudFront exists

The cost of Amazon S3 ● Main reason to use S3: price ● Example: 1 TB stored, 100 GB modified per month ○ Storage cost: $85 / month ○ Data Transfer (Upload): $0 ○ Data Transfer (Download): $12, at $0.12/GB ● A cheaper option: reduced redundancy (99.99% durability instead of 99.999999999%) ○ Storage cost: $68 ● Even cheaper, but just for backups (very limited functionality): Glacier ○ Storage cost: $10
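The arithmetic behind the example above can be reproduced with a short sketch. It assumes 1 TB = 1000 GB, a flat storage price of $0.085 per GB per month (the figure quoted later in the deck) and 100 GB downloaded per month; the function name is illustrative only, and real S3 pricing is tiered.

```python
def monthly_s3_cost(stored_gb, download_gb,
                    storage_price=0.085, download_price=0.12):
    """Return (storage, download, total) in dollars per month.

    Prices are the flat per-GB figures quoted in the slides; uploads are free.
    """
    storage = stored_gb * storage_price
    download = download_gb * download_price
    return round(storage, 2), round(download, 2), round(storage + download, 2)


# 1 TB stored (assumed to be 1000 GB), 100 GB downloaded per month
storage, download, total = monthly_s3_cost(stored_gb=1000, download_gb=100)
print(f"storage=${storage} download=${download} total=${total}")
```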

Swift use cases What it is ● Object storage system ● Massively scalable ● Runs on commodity hardware ● An S3-like solution What it is NOT ● Hard drive or filesystem ● NFS / SMB share ● Block storage ● Any SAN/NAS/DAS ● Not even a CDN

Swift use cases ● Multi-tenancy ○ Ideal for public or private clouds ○ Different URLs, groups of users, access codes, fine-grained privileges ● Backups ○ Write once, read never (long-term archiving). ○ Disaster recovery. ● Web content ○ Write many, read many. ○ File-sharing websites (temporary access). ○ Static websites or media-focused blogs (e.g. imgur). ● Large objects ○ Medical/scientific images. ○ Store your fancy images from the moon (e.g. NASA). ○ Store your VM images from the cloud.

History ● Rackspace Cloud Files V1. ○ Distributed storage. ○ Centralized metadata. ○ PostgreSQL DB. ● 2009: Rackspace Cloud Files V2 (Swift). ○ Full redesign and rewrite. Open source. ○ API compatible with Amazon S3. ○ Worked closely with ops. ○ Distributed storage and metadata. ○ Logical placement, based on an algorithm.

Swift architecture ● Highly available, distributed, eventually consistent object storage, using commodity servers ● Eventually consistent: a write is acknowledged without waiting for confirmation of full replication ○ In terms of the CAP theorem, Swift chose: ■ availability and partition tolerance ■ and dropped consistency. ● 3 rings to replicate ○ Accounts ○ Containers ○ Objects

Swift architecture [Diagram: proxy nodes and storage nodes, connected through the ring] ● Multiple components, usually on 2 types of nodes ○ Proxy servers: run the swift-proxy-server processes, which proxy requests to the appropriate storage nodes. They also host the TempAuth service as WSGI middleware. ○ Storage servers: run the swift-account-server, swift-container-server, and swift-object-server processes, which control storage of the account databases, the container databases, and the actual stored objects.

Swift architecture ● Proxy tier ○ Handles incoming requests ○ Scales horizontally

Swift architecture ● The Ring ○ Maps data (accounts, containers, objects) to storage servers [Diagram: example with 3 replicas]

Swift architecture ● Storage zones ○ Isolate physical failures

Swift architecture ● Quorum writes ○ The proxy looks up the storage nodes in the ring and acknowledges the write once the 2nd replica is OK, without waiting for the 3rd
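The quorum rule can be illustrated with a simplified simulation. This is a sketch, not the proxy server's actual code: the function names are hypothetical, and replica responses are modelled as a list of booleans in arrival order.

```python
def quorum_size(replicas):
    """Majority of the replicas: 2 when there are 3 replicas."""
    return replicas // 2 + 1


def proxy_put(replica_results):
    """Simulate a proxy waiting for storage-node responses.

    replica_results: list of booleans (True = replica written), in the
    order the responses arrive. Returns (acknowledged, responses_seen).
    """
    needed = quorum_size(len(replica_results))
    successes = 0
    for seen, ok in enumerate(replica_results, start=1):
        if ok:
            successes += 1
        if successes >= needed:
            # Quorum reached: answer the client without waiting for the rest.
            return True, seen
    return False, len(replica_results)


print(proxy_put([True, True, False]))   # acknowledged after 2 responses
print(proxy_put([True, False, False]))  # no quorum, write fails
```

The remaining replica is brought up to date later by the replication process, which is what makes the system eventually consistent.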

Swift architecture ● Single-disk reads ○ After the ring lookup, the object is read from a single replica

Swift architecture ● Replication ○ Runs continuously as a background process and also checks integrity

Swift features ● ACL ○ Free-form, implemented by the auth system middleware ● Healthcheck ○ Simple healthcheck page for load balancers ● Ratelimit ○ Rate-limits requests ● Staticweb ○ Serves index.html from containers ● TempURL ○ Temporary URL generation for objects ● FormPost ○ Translates a browser form post into a regular Swift object PUT ● Domain Remap ○ Pretty URLs with domain-based container names
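As an illustration of the TempURL feature, the sketch below signs an object path with HMAC-SHA1 over the method, expiry time and path, which is the scheme the TempURL middleware documented at the time. The key, path and expiry values are made up, and a given cluster may require a different digest, so treat this as an assumption to check against your deployment.

```python
import hmac
from hashlib import sha1


def temp_url(method, path, expires, key):
    """Build a temporary URL path for an object.

    method:  HTTP method the URL will allow, e.g. "GET"
    path:    object path such as /v1/AUTH_account/container/object
    expires: Unix timestamp after which the URL stops working
    key:     the secret set as the account's Temp-URL key (assumed value)
    """
    hmac_body = f"{method}\n{expires}\n{path}"
    sig = hmac.new(key.encode(), hmac_body.encode(), sha1).hexdigest()
    return f"{path}?temp_url_sig={sig}&temp_url_expires={expires}"


# Hypothetical account, container, object and key
url = temp_url("GET", "/v1/AUTH_demo/photos/moon.jpg", 1395187200, "mysecret")
print(url)
```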

Swift features ● Bulk Operations ○ Multiple DELETEs or uploads, or even tar.(b|g)z upload ● Account Quotas ○ Give operators the ability to limit accounts or set them read-only ● Container Quotas ○ Allow users to restrict a public container (e.g. with FormPost) ● Large Objects (upload > 5 GB) ○ Split into segments on upload, downloaded as a single assembled object; supports files of virtually unlimited size ● CORS ○ Upload directly from the browser to Swift via JavaScript ● Versioning ○ Allows versioning of all objects in a container ● Swift3 ○ S3-compatible API, but this middleware has been pulled out of the Swift tree
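A rough sketch of how a large upload could be planned as segments plus a manifest. The container name, prefix and zero-padded segment naming are hypothetical choices for illustration; only the X-Object-Manifest header (container/prefix) comes from Swift's dynamic large object support.

```python
def plan_segments(total_size, segment_size,
                  container="photos_segments", prefix="moon.tiff/"):
    """Split an upload of total_size bytes into fixed-size segments.

    Returns a list of (segment_name, offset, length) tuples and the header
    to set on the zero-byte manifest object that ties the segments together.
    """
    segments = []
    offset = 0
    index = 0
    while offset < total_size:
        length = min(segment_size, total_size - offset)
        # Zero-padded names keep the segments in order when listed.
        segments.append((f"{prefix}{index:08d}", offset, length))
        offset += length
        index += 1
    manifest_header = {"X-Object-Manifest": f"{container}/{prefix}"}
    return segments, manifest_header


segments, header = plan_segments(total_size=12, segment_size=5)
for name, offset, length in segments:
    print(name, offset, length)
print(header)
```

Each segment would be uploaded as an ordinary object; a GET on the manifest object then streams the segments back in name order.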

The API ● Bindings for different languages: Python, Ruby, Java… ● Multiple client tools: python-swiftclient, jclouds, fog

The API ● Swift CLI: ○ delete, download, list, post, stat, upload, capabilities ○ post: updates meta information for the account, container, or object ● Examples of metadata (HTTP headers) ○ X-Account-Access-Control (for ACL) ○ X-Account-Sysmeta-Global-Write-Ratelimit (for ratelimit) ○ X-Object-Manifest (for dynamic large objects) ○ X-Versions-Location (for object versioning) ○ X-Container-Sync-* (used internally for container synchronisation) ○ X-Delete-At and X-Delete-After (for object expiration) ○ X-Container-Meta-Access-Control-* (for CORS) ● Other ○ crossdomain.xml (for cross-domain policies)
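Since everything in the API is an HTTP request with headers, a request can be described with nothing more than a method, a URL and a header dictionary. The sketch below prepares (but does not send) an object PUT that expires after a given time; the storage URL, token and names are placeholders, and the helper function is hypothetical.

```python
from urllib.parse import quote


def expiring_object_put(storage_url, token, container, name, ttl_seconds):
    """Describe a PUT that stores an object and lets Swift expire it later.

    Returns (method, url, headers); sending it is left to any HTTP client.
    """
    url = f"{storage_url}/{quote(container)}/{quote(name)}"
    headers = {
        "X-Auth-Token": token,                 # token obtained from the auth system
        "X-Delete-After": str(ttl_seconds),    # seconds until the object expires
    }
    return "PUT", url, headers


# Placeholder endpoint and token, for illustration only
method, url, headers = expiring_object_put(
    "http://swift.example.com:8080/v1/AUTH_demo", "secret-token",
    "backups", "daily dump.tar.gz", 86400)
print(method, url)
print(headers)
```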

Demo using Cyberduck Connection templates here:

Thank you


Proxy Servers ● Swift’s public face ○ The entry point, and it has to do a lot of work too ● Determine the appropriate storage nodes ○ By using a logical map (the ring) ● Coordinate responses ○ Ensure at least two replicas have been successfully written to disk before confirming to the client

The ring ● Used by proxies and replication processes. ● Maps requests to storage nodes ● Availability zones ○ Ensure the replicas of your objects are placed as far apart as possible ● Regions ○ Support for global clusters, multi-region replication ● Scale-out without affecting most entities ○ Only a fraction of the data needs to be moved around ○ Still, it’s better to use the weighting system ● It is up to you how to synchronise the ring files across nodes

The ring ● Example: ○ Partition power of 3 (2^3 = 8 partitions) ○ The first 3 binary digits of the object’s MD5 hash are its ring coordinates [Diagram: MD5 hash mapped onto the ring]
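The partition lookup can be sketched as follows: hash the object path with MD5 and keep only the top bits, as many as the partition power. This is a simplified illustration; a real cluster also mixes a per-cluster secret prefix/suffix into the hash, and the replica-to-device table below is entirely made up.

```python
import struct
from hashlib import md5

PART_POWER = 3  # 2**3 = 8 partitions, as in the example above

# Hypothetical assignment table: one row per replica, one column per
# partition, each cell a device id (4 devices, 3 replicas).
REPLICA2PART2DEV = [
    [0, 1, 2, 3, 0, 1, 2, 3],
    [1, 2, 3, 0, 1, 2, 3, 0],
    [2, 3, 0, 1, 2, 3, 0, 1],
]


def partition_for(path, part_power=PART_POWER):
    """Top `part_power` bits of the MD5 hash of the object path."""
    digest = md5(path.encode("utf-8")).digest()
    top32 = struct.unpack_from(">I", digest)[0]  # first 4 bytes, big-endian
    return top32 >> (32 - part_power)


def devices_for(path):
    """Return the partition and the device holding each of its replicas."""
    part = partition_for(path)
    return part, [row[part] for row in REPLICA2PART2DEV]


part, devices = devices_for("/AUTH_demo/photos/moon.jpg")
print("partition", part, "-> devices", devices)
```

Because the mapping depends only on the hash and the table, any proxy or replicator holding the same ring file computes the same placement without consulting a central metadata service.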

Account / Container Servers ● Stored in SQLite databases ● Simple schema ○ Table for listing ○ Table for metadata ○ Stats information ● Scaling ○ Under high concurrency, SQLite causes a lot of I/O wait; this is when you use ‘ratelimit’
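To make the "listing table plus metadata plus stats" idea concrete, here is a deliberately simplified, hypothetical schema built with Python's sqlite3 module. The table and column names are illustrative assumptions and do not reproduce Swift's actual container database layout.

```python
import sqlite3

# In-memory database standing in for one container's SQLite file
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object (          -- listing table: one row per object
        name       TEXT PRIMARY KEY,
        created_at TEXT,
        size       INTEGER,
        etag       TEXT,
        deleted    INTEGER DEFAULT 0
    );
    CREATE TABLE container_meta (  -- metadata table: free-form key/value
        key   TEXT PRIMARY KEY,
        value TEXT
    );
""")

conn.executemany(
    "INSERT INTO object (name, created_at, size, etag) VALUES (?, ?, ?, ?)",
    [("moon.jpg", "1394000000.00000", 2048, "abc"),
     ("earth.jpg", "1394000100.00000", 1024, "def")],
)
conn.execute("INSERT INTO container_meta VALUES ('X-Container-Meta-Owner', 'demo')")

# Container listing, sorted by name
listing = [row[0] for row in conn.execute(
    "SELECT name FROM object WHERE deleted = 0 ORDER BY name")]

# Stats information derived from the listing table
object_count, bytes_used = conn.execute(
    "SELECT COUNT(*), COALESCE(SUM(size), 0) FROM object WHERE deleted = 0"
).fetchone()

print(listing, object_count, bytes_used)
```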

Object Servers ● Use the filesystem to store files ○ The file (object) is dumped on disk ‘as is’ ● Use ‘xattrs’ to store metadata ○ On ext4, xfs ● Files named by timestamps ○ Last write always wins ○ A deletion is treated as a version of the file, marked by a tombstone object ● Directory structure ○ /mount/data_dir/partition/hash_suffix/hash/object.ts
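The "last write wins" rule with tombstones can be sketched as a small resolver over the files found in one object's hash directory. It assumes files are named `<timestamp>.data` for content and `<timestamp>.ts` for tombstones; the function name and sample timestamps are illustrative.

```python
def resolve(filenames):
    """Pick the winning file among timestamp-named versions of one object.

    Returns the newest .data filename, or None if the directory is empty
    or the newest entry is a tombstone (.ts), meaning the object is deleted.
    """
    if not filenames:
        return None
    # The part before the extension is the write timestamp.
    newest = max(filenames, key=lambda f: float(f.rsplit(".", 1)[0]))
    return None if newest.endswith(".ts") else newest


# An overwrite: the later write wins
print(resolve(["1394000000.00000.data", "1394000100.00000.data"]))

# A delete after the write: the tombstone wins, the object is gone
print(resolve(["1394000000.00000.data", "1394000200.00000.ts"]))
```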

Replication ● N-factor, configurable; 3 by default ● Asynchronous, peer-to-peer replicator process ○ Traverses the local filesystem to detect changes ○ Performs operations concurrently, balancing load across physical disks ● Push model ○ Records and files are generally only copied from local to remote replicas ○ It’s the duty of a node holding data to ensure its data gets to where it belongs ○ Replica placement is handled by the ring

Replication ● DB replication ○ Hash comparison of DB files ○ Replicates the whole database file using rsync; a new unique id is assigned ● Object replication ○ Uses rsync for transport ○ Syncs only subsets of directories ○ Hash-based ○ Bound by the number of uncached directories it has to traverse
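The hash-based, push-only comparison used for object replication can be sketched like this: each node keeps one hash per suffix directory of a partition, and only the suffixes whose hashes differ on the remote side are pushed. The function name and sample hashes are hypothetical, and the actual transfer (rsync) is left out.

```python
def suffixes_to_sync(local_hashes, remote_hashes):
    """Return the suffix directories the local node should push.

    local_hashes / remote_hashes: dicts mapping a suffix directory name to
    a hash of its contents. Push model: only data held locally is
    considered; suffixes that exist only remotely are the remote node's job.
    """
    return sorted(
        suffix for suffix, digest in local_hashes.items()
        if remote_hashes.get(suffix) != digest
    )


local = {"a83": "1111", "f07": "2222", "c4d": "3333"}
remote = {"a83": "1111", "f07": "9999", "e10": "4444"}

# "f07" differs and "c4d" is missing remotely; "a83" is already in sync
print(suffixes_to_sync(local, remote))
```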

Middleware ● Standard WSGI ○ Pipeline composed of a succession of middleware, ending with one application. The last one, ● Usually provided by the proxy ○ But it can be provided by other server roles ● Auth is pluggable via middleware ○ swauth ○ keystone
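A WSGI pipeline is just callables wrapping callables. The sketch below wraps a stand-in application with a healthcheck middleware, in the spirit of the Healthcheck feature listed earlier; the class, the stand-in app and the `call` helper are illustrative and are not Swift's own implementation.

```python
class HealthcheckMiddleware:
    """Answer /healthcheck directly; pass everything else down the pipeline."""

    def __init__(self, app):
        self.app = app  # next element in the pipeline

    def __call__(self, environ, start_response):
        if environ.get("PATH_INFO") == "/healthcheck":
            start_response("200 OK", [("Content-Type", "text/plain")])
            return [b"OK"]
        return self.app(environ, start_response)


def proxy_app(environ, start_response):
    """Stand-in for the final application at the end of the pipeline."""
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not Found"]


# Pipeline: healthcheck middleware -> application
pipeline = HealthcheckMiddleware(proxy_app)


def call(app, path):
    """Tiny helper to invoke a WSGI app without a server."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], body


print(call(pipeline, "/healthcheck"))
print(call(pipeline, "/v1/AUTH_demo"))
```

Auth systems such as swauth or keystone plug in the same way: each is one more wrapper placed ahead of the application in the pipeline.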

Swift cost estimation ● Amazon S3 price from the initial slides: $0.085 per GB per month ● ROI after 5-6 months

Swift cost estimation ● Amazon S3 price from the initial slides: $0.085 per GB per month ● ROI after barely 9 months ○ Monthly S3 cost for 145 TB = $10,600 ($8.5k if reduced redundancy) ○ Monthly S3 cost for 1.3 PB = $82,600 ($66k if reduced redundancy)

Connecting to Swift (I)
1. (Example using a account)
2. Download your file
3. Source it (e.g. source ...)
4. Enter your password
5. Run “keystone catalog” to validate the keystone public URL
6. Retrieve the object-store public URL (e.g. 8080/v1/AUTH_17698de747ea403283730999605716c9 )
7. Use the swift CLI to validate (e.g. swift list)
8. In Cyberduck, set up a connection of type ‘Openstack Swift (Keystone HTTP)’, with tenant:username (e.g. marcos.garcia:marcos.garcia) and password, server and port 5000

Connecting to Swift (II)

Connecting to Swift (III)

Connecting to Swift (IV)
