Super Sizing YouTube with Python


Published on January 28, 2008

Author: didip



by Mike Solomon.

See more scalability tales at:

Super-sizing YouTube with Python
Mike Solomon

Welcome
- this is about scaling a web application
- there are a lot of things left out - mostly mistakes and implementation details
- this may generate more questions than it answers
- my goal is to give you ideas for solving your own problems

Architecture
- this is the core of scalability
- systems change over time, so will your architecture
- impossible to predict the optimal approach
- start simple
- aim for local maxima
- python enables flexibility

YouTube's Early Days
- web boxes do everything: servlets, images, thumbnails, search
- shoehorn everything into Apache, MySQL
- very simple
- this survives longer than you'd think

[Diagram: Early Web Stack, circa January '06. Labels: hw load balancer; httpd; mod_python; servlets; templates; biz logic; db objects; search; thumbnails; db master; db replicas]

Early Key Factors in Engineering
- really small team
- we ♥ python
- logical separation in code
- discipline and honor - not linguistically enforced (don’t waste time writing code to restrict people)
- grown by systematically removing bottlenecks
- easy to know when something is a `win`

Running Without Tripping
- user demand can grow 50% in a day
- removing one bottleneck can immediately reveal another (usually more heinous)
- replace and migrate components as they become problems
- good (python) components make this easy
- obviously, pick your battles

Good Components (Hypothetical)
- minimize dependencies
- accept some latency
- localize failures - don’t let them spread
- you are only down if it looks like you are
- applies to both systems and software

Balance Machine Resources
- more efficient resource utilization via specialized deployment
- balance based on CPU, RAM, network and disk usage patterns
- overlay orthogonal loads
- disjoint tasks running on the same physical hardware

Migratory Patterns of the Norwegian Blue
- move from mod_python to mod_fastcgi
- move thumbnails to their own machines
- make search a remote service running on separate machines
- run transcoder processes on video servers
- do more with the same hardware

Serenity Now
- Can you spot where we turned on transcoding processes?
[Graph not reproduced]

SQL Shenanigans
- if you have a relational database, it will be abused
- difficult to track the true source
- series of object proxies for DB-API
- enable logging
- encode a portion of call stack as a query comment (more about this later)
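The proxy idea on this slide can be sketched roughly as follows. This is not YouTube's code: `TaggingCursor`, `load_video_title`, and the `video` table are hypothetical names, and sqlite3 stands in for MySQL so the example runs on its own. The wrapper logs each statement and appends the nearest callers as a SQL comment, so a slow query seen on the database side can be traced back to the code that issued it.

```python
import sqlite3
import traceback


class TaggingCursor:
    """Hypothetical DB-API cursor proxy: logs queries and tags them with caller info."""

    def __init__(self, cursor, log, depth=2):
        self._cursor = cursor
        self._log = log
        self._depth = depth

    def execute(self, sql, params=()):
        # Drop the last frame (this method) and keep the nearest callers.
        frames = traceback.extract_stack()[:-1][-self._depth:]
        origin = " < ".join(f"{f.name}:{f.lineno}" for f in reversed(frames))
        # Strip "*/" so the caller info cannot close the comment early.
        tagged = f"{sql} /* {origin.replace('*/', '')} */"
        self._log.append(tagged)
        self._cursor.execute(tagged, params)
        return self

    def __getattr__(self, name):
        # Delegate everything else (fetchone, fetchall, ...) to the real cursor.
        return getattr(self._cursor, name)


def load_video_title(cursor, video_id):
    cursor.execute("SELECT title FROM video WHERE id = ?", (video_id,))
    return cursor.fetchone()[0]


query_log = []
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE video (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO video VALUES (1, 'demo')")
cursor = TaggingCursor(conn.cursor(), query_log)
title = load_video_title(cursor, 1)
```

After this runs, `query_log[0]` holds the original statement followed by a comment naming `load_video_title` and its caller.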

Object Caching
- take pressure off of relational db
- can save additional resources if your objects require significant computation to set up
- memcached makes a good home for this
- need a good client to make this into a truly useful service: pools and better failure handling
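A minimal sketch of the read path this slide implies, assuming a cache client that exposes only `get` and `set`. `DictCache` is an in-process stand-in for a memcached client, and `load_user_from_db` and `get_user` are hypothetical helpers. A cache error is treated as a miss, which is one simple reading of "better failure handling": a cache outage slows requests down instead of failing them.

```python
class DictCache:
    """In-process stand-in for a memcached client; only get/set are modeled."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class BrokenCache:
    """Simulates an unreachable cache server."""

    def get(self, key):
        raise ConnectionError("cache down")

    def set(self, key, value):
        raise ConnectionError("cache down")


db_reads = {"count": 0}


def load_user_from_db(user_id):
    # Hypothetical expensive load; counts calls so the effect is visible.
    db_reads["count"] += 1
    return {"id": user_id, "username": f"user{user_id}"}


def get_user(cache, user_id):
    key = f"user:{user_id}"
    try:
        user = cache.get(key)
    except ConnectionError:
        user = None  # a cache failure is treated as a miss
    if user is None:
        user = load_user_from_db(user_id)
        try:
            cache.set(key, user)
        except ConnectionError:
            pass  # the request still succeeds without the cache
    return user


cache = DictCache()
first = get_user(cache, 7)
second = get_user(cache, 7)         # served from the cache
fallback = get_user(BrokenCache(), 7)  # falls back to the database
```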

Software Optimization
- fast vs fast enough
- strive for machine efficiency - don't obsess
- be scientific - collect data and understand it
- can yield some surprising results
- don't assume code optimization techniques from another language are relevant
- just like carpentry, measure twice, cut once
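One way to "collect data" in Python is the standard-library profiler. The sketch below is illustrative only: `slow_digest` and `handle_request` are made-up functions standing in for real request handling, and the point is the measurement workflow, not the workload.

```python
import cProfile
import io
import pstats


def slow_digest(data):
    # Deliberately naive byte-by-byte loop, standing in for a pure-Python hot spot.
    total = 0
    for byte in data:
        total = (total * 31 + byte) % 65521
    return total


def handle_request(payload):
    return slow_digest(payload), len(payload)


profiler = cProfile.Profile()
profiler.enable()
for _ in range(200):
    handle_request(b"x" * 2000)
profiler.disable()

# Render the top entries by cumulative time into a string.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
profile_text = report.getvalue()
```

Printing `profile_text` shows where the time went; in this toy case nearly all of it is attributed to `slow_digest`, which is the kind of evidence that justifies rewriting one function rather than guessing.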

Python Optimization
- pure python HMAC was 40% of web cpu
  - write a few lines of C
- threaded comments fiasco
  - overly complex algorithm to compute the display object tree
  - simplify query, simplify algorithm

Python Optimization
- psyco - specializing compiler for Python
- 'hot' functions are psyco-ized
- there is a 'context switch' penalty, so you need to experiment to see if it helps
- previous threaded comments algorithm: -closure +psyco = 400% boost

Reasonable Efficiency
- pruned all the obvious leaf services
- dynamic web requests are one `service`
- web service is easy to scale, so it stresses out other resources - probably a DB
- DBs are hard(er) to scale
- tricks of escalating cleverness
- eventually, no cards left to play

Scaling MySQL
- pretty much have to go horizontal
- choose your partition plan carefully
- understand your data access patterns
  - what queries do you run most often?
  - do you have joins?
  - do you need transactional consistency? why?
  - does an 'entity' emerge?

Partition By Entity
- entities are 'transactional'
- allow joins across properties of an entity
- entities are migratory
- cross entity is more complicated
  - weaken guarantees to make it easier
  - minimize activity by design
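A rough sketch of what "entities are migratory" can look like in code, assuming a directory that maps an entity id to the shard holding all of that entity's rows. `LookupService`, its modulo default, and the shard names are assumptions made for illustration; the slides do not describe the real lookup scheme.

```python
class LookupService:
    """Hypothetical directory mapping an entity id to the shard that holds it."""

    def __init__(self, shard_names):
        self._shards = list(shard_names)
        self._overrides = {}

    def shard_for(self, entity_id):
        # Explicit placements win; otherwise fall back to a simple modulo default.
        if entity_id in self._overrides:
            return self._overrides[entity_id]
        return self._shards[entity_id % len(self._shards)]

    def migrate(self, entity_id, shard_name):
        # Moving an entity is a directory update once its rows have been copied.
        self._overrides[entity_id] = shard_name


lookup = LookupService(["shard0", "shard1"])
before = lookup.shard_for(7)
lookup.migrate(7, "shard2")
after = lookup.shard_for(7)
untouched = lookup.shard_for(8)
```

Because every row of an entity lives on one shard, joins and transactions within the entity stay local; only cross-entity work has to deal with more than one database.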

EMD, a TLA
- not an ORM!
  - connection and transaction management
  - lookup service
  - query factory
  - minimalist table abstraction
- ORM can be (is?) evil
- make common behaviors simple, while leaving some transparency to the actual database

Seismic Retrofit
- apply this fundamental change to a large and growing site
- make it relatively painless with python: multiple inheritance, decorators, AST plugins for validation and testing

Resulting API
- all the scale-aware code nicely opaque to application developers
- base use cases are painless
    User.select_by_username(db_context, username)
    Video.select_by_id(db_context, video_id)
    Video.select_by_user_id(db_context, user_id)
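To illustrate how a call like `Video.select_by_user_id(db_context, user_id)` can hide the partitioning, here is a minimal sketch. Only the method name and argument order come from the slide; `DbContext`, `connection_for_user`, `Video.insert`, the table layout, and the modulo routing are all assumptions, and in-memory sqlite3 databases stand in for MySQL shards.

```python
import sqlite3


class DbContext:
    """Hypothetical context: owns one connection per shard and routes by user id."""

    def __init__(self, shard_count):
        self.shards = [sqlite3.connect(":memory:") for _ in range(shard_count)]
        for conn in self.shards:
            conn.execute("CREATE TABLE video (id INTEGER, user_id INTEGER, title TEXT)")

    def connection_for_user(self, user_id):
        return self.shards[user_id % len(self.shards)]


class Video:
    def __init__(self, video_id, user_id, title):
        self.id, self.user_id, self.title = video_id, user_id, title

    @classmethod
    def insert(cls, db_context, video_id, user_id, title):
        # Hypothetical helper, present only so the example has data to read.
        conn = db_context.connection_for_user(user_id)
        conn.execute("INSERT INTO video VALUES (?, ?, ?)", (video_id, user_id, title))

    @classmethod
    def select_by_user_id(cls, db_context, user_id):
        # The caller never names a shard; routing happens behind the call.
        conn = db_context.connection_for_user(user_id)
        rows = conn.execute(
            "SELECT id, user_id, title FROM video WHERE user_id = ? ORDER BY id",
            (user_id,),
        )
        return [cls(*row) for row in rows]


db_context = DbContext(shard_count=2)
Video.insert(db_context, 10, 1, "first")
Video.insert(db_context, 11, 1, "second")
Video.insert(db_context, 12, 2, "other user")
titles = [v.title for v in Video.select_by_user_id(db_context, 1)]
```

Application code only ever passes `db_context` along, so the partition plan can change without touching the callers.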

Bulk Entity Migration
- hijack mysql replication to partition on the fly while the live site is running
- all DML gets tagged with an entity id
- read master binlog and selectively replay it into a set of new mini-masters
- update lookup service to point to new resources
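The selective replay step can be sketched in miniature as below. Real MySQL binlog parsing is not modeled: the "binlog" is a plain list of SQL strings, the `/* entity_id=N */` tag format is an assumption, and the mini-masters are lists that collect the statements routed to them.

```python
import re

# Assumed tag format; the slides only say DML is tagged with an entity id.
ENTITY_TAG = re.compile(r"/\* entity_id=(\d+) \*/")


def replay(binlog, targets, route):
    """Append each tagged statement to the mini-master chosen by route(entity_id)."""
    for statement in binlog:
        match = ENTITY_TAG.search(statement)
        if match is None:
            continue  # untagged statements are skipped in this sketch
        entity_id = int(match.group(1))
        targets[route(entity_id)].append(statement)


binlog = [
    "INSERT INTO video (id, user_id) VALUES (10, 1) /* entity_id=1 */",
    "UPDATE user SET name = 'b' WHERE id = 2 /* entity_id=2 */",
    "DELETE FROM video WHERE id = 10 /* entity_id=1 */",
]
mini_masters = {"mini0": [], "mini1": []}
replay(binlog, mini_masters, lambda entity_id: f"mini{entity_id % 2}")
```

Statements for one entity arrive at their mini-master in the original order, which is what lets the new databases catch up while the site keeps writing to the old master.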

Recurring Themes
- the elegance of simplicity
- take reliable open software and customize it
  - `pythonic veneer`
- DIY - filing a ticket for a bugfix doesn’t give me a warm feeling - take matters into your own hands


Session: Super-sizing YouTube, Mike Solomon (YouTube). O'Reilly Open Source Convention 2007, July 23-27, 2007. Track: Python. Date: Thursday, July 26. Time: 10:45am - 11:30am. Location: Portland 256.