Published on January 28, 2008
Super-sizing YouTube with Python
Mike Solomon
firstname.lastname@example.org
Welcome
  this is about scaling a web application
  there are a lot of things left out - mostly mistakes and implementation details
  this may generate more questions than it answers
  my goal is to give you ideas for solving your own problems
Architecture
  this is the core of scalability
  systems change over time, so will your architecture
  impossible to predict the optimal approach
  start simple
  aim for local maxima
  python enables flexibility
YouTube's Early Days
  web boxes do everything
  servlets, images, thumbnails, search
  shoehorn everything into Apache, MySQL
  very simple
  this survives longer than you'd think
Early Web Stack (circa January '06)
  [diagram: hw load balancer; httpd with mod_python (servlets, templates, biz logic, db objects, search, thumbnails); db master; db replicas]
Early Key Factors in Engineering
  really small team
  we ♥ python
  logical separation in code
  discipline and honor - not linguistically enforced (don't waste time writing code to restrict people)*
  grown by systematically removing bottlenecks
  easy to know when something is a `win`
Running Without Tripping
  user demand can grow 50% in a day
  removing one bottleneck can immediately reveal another (usually more heinous)
  replace and migrate components as they become problems
  good (python) components make this easy
  obviously, pick your battles
Good Components (Hypothetical)
  minimize dependencies*
  accept some latency
  localize failures - don't let them spread
  you are only down if it looks like you are
  applies to both systems and software
Balance Machine Resources
  more efficient resource utilization via specialized deployment
  balance based on CPU, RAM, network and disk usage patterns
  overlay orthogonal loads
  disjoint tasks running on the same physical hardware
Migratory Patterns of the Norwegian Blue
  move from mod_python to mod_fastcgi
  move thumbnails to their own machines
  move search to a remote service running on separate machines
  run transcoder processes on video servers
  do more with the same hardware
Serenity Now
  Can you spot where we turned on transcoding processes?
SQL Shenanigans
  if you have a relational database, it will be abused
  difficult to track the true source
  series of object proxies for DB-API
  enable logging
  encode a portion of call stack as a query comment*
  (more about this later)
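
The proxy idea on this slide can be sketched as follows. This is a minimal illustration, not the code YouTube ran: `TaggingCursor`, `caller_comment`, the comment format and the `videos` table are all assumptions made up for the example, and `sqlite3` stands in for any DB-API driver.

```python
import sqlite3
import traceback


def caller_comment(depth=2, skip=2):
    """Return a SQL comment naming the innermost application frames."""
    # extract_stack() lists frames oldest-first; drop the last `skip` frames
    # (this helper and the proxy method) so only the callers remain.
    frames = traceback.extract_stack()[:-skip][-depth:]
    trail = " < ".join(f"{f.name}:{f.lineno}" for f in reversed(frames))
    # Strip any comment terminator so the tag cannot end the comment early.
    return "/* " + trail.replace("*/", "") + " */"


class TaggingCursor:
    """Hypothetical proxy around a DB-API cursor that tags and logs queries."""

    def __init__(self, cursor, log):
        self._cursor = cursor
        self._log = log

    def execute(self, sql, params=()):
        tagged = f"{sql} {caller_comment()}"
        self._log.append(tagged)
        return self._cursor.execute(tagged, params)

    def __getattr__(self, name):
        # Everything else (fetchone, fetchall, ...) passes straight through.
        return getattr(self._cursor, name)


def load_video_title(cursor, video_id):
    cursor.execute("SELECT title FROM videos WHERE id = ?", (video_id,))
    return cursor.fetchone()[0]


query_log = []
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE videos (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO videos VALUES (1, 'first upload')")
cursor = TaggingCursor(conn.cursor(), query_log)
title = load_video_title(cursor, 1)
```

Because the comment travels with the statement, a query that shows up in a database-side log can be traced back to the function that issued it.
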
Object Caching
  take pressure off of relational db
  can save additional resources if your objects require significant computation to set up
  memcached makes a good home for this
  need good client to make this into a truly useful service ‡
  pools and better failure handling
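
A rough sketch of the read path this implies, where a cache failure costs latency rather than an error page. `InMemoryCache` is a stand-in written for this example, not a real memcached client, and `cached_fetch`, `FlakyCacheError` and the key format are likewise assumptions.

```python
class FlakyCacheError(Exception):
    """Raised by the stand-in client when a cache node is unreachable."""


class InMemoryCache:
    """Stand-in for a memcached client; not a real client library."""

    def __init__(self):
        self._data = {}
        self.available = True

    def get(self, key):
        if not self.available:
            raise FlakyCacheError(key)
        return self._data.get(key)

    def set(self, key, value):
        if not self.available:
            raise FlakyCacheError(key)
        self._data[key] = value


def cached_fetch(cache, key, build):
    """Return the cached object, rebuilding it when the cache misses or fails."""
    try:
        value = cache.get(key)
    except FlakyCacheError:
        # A dead cache node should slow the request down, not break it.
        return build()
    if value is None:
        value = build()
        try:
            cache.set(key, value)
        except FlakyCacheError:
            pass  # the object is still usable even if it could not be stored
    return value


build_calls = []


def build_profile():
    build_calls.append(1)  # stands in for expensive queries and computation
    return {"username": "alice", "video_count": 3}


cache = InMemoryCache()
first = cached_fetch(cache, "user:alice", build_profile)
second = cached_fetch(cache, "user:alice", build_profile)  # served from cache
cache.available = False
third = cached_fetch(cache, "user:alice", build_profile)   # cache down: rebuilt
```
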
Software Optimization
  fast vs fast enough
  strive for machine efficiency - don't obsess
  be scientific - collect data and understand it
  can yield some surprising results
  don't assume code optimization techniques from another language are relevant
  just like carpentry, measure twice cut once
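
One way to "collect data" before optimizing is the standard library profiler. The sketch below profiles a made-up request handler; `render_a`, `render_b` and `handle_request` are hypothetical names, and the point is only to show where the numbers come from, not which variant wins.

```python
import cProfile
import io
import pstats


def render_a(n):
    # Builds the string by repeated concatenation.
    out = ""
    for i in range(n):
        out += str(i)
    return out


def render_b(n):
    # Builds the same string with a single join.
    return "".join(str(i) for i in range(n))


def handle_request():
    return render_a(20000), render_b(20000)


profiler = cProfile.Profile()
profiler.enable()
result_a, result_b = handle_request()
profiler.disable()

# Write the ten most expensive entries (by cumulative time) into a string.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
profile_text = report.getvalue()
```

Printing `profile_text` shows call counts and timings per function, which is the evidence to look at before rewriting anything.
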
Python Optimization
  pure python HMAC was 40% of web cpu
  write a few lines of C
  threaded comments fiasco
  overly complex algorithm to compute the display object tree
  simplify query, simplify algorithm
Python Optimization
  psyco - specializing compiler for Python
  'hot' functions are psyco-ized
  there is a 'context switch' penalty so you need to experiment to see if it helps
  previous threaded comments algorithm -closure +psyco = 400% boost
Reasonable Efficiency
  pruned all the obvious leaf services
  dynamic web requests are one `service`
  web service is easy to scale, so it stresses out other resources - probably a DB
  DBs are hard(er) to scale
  tricks of escalating cleverness‡
  eventually, no cards left to play
Scaling MySQL
  pretty much have to go horizontal
  choose your partition plan carefully
  understand your data access patterns
  what queries do you run most often?
  do you have joins?
  do you need transactional consistency? why?
  does an 'entity' emerge?
Partition By Entity
  entities are 'transactional'
  allow joins across properties of an entity
  entities are migratory
  cross entity is more complicated
  weaken guarantees to make it easier
  minimize activity by design
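
A toy model of entity partitioning, assuming the user is the entity: all of an entity's rows live on one shard, and a lookup table says which one, so the entity can be moved later. `LookupService`, the shard names and the modulo default placement are assumptions for illustration; dictionaries stand in for databases.

```python
class LookupService:
    """Hypothetical directory mapping an entity id to the shard that owns it."""

    def __init__(self, shard_names):
        self._shard_names = list(shard_names)
        self._overrides = {}

    def shard_for(self, entity_id):
        # Explicit assignments win; otherwise fall back to a default placement.
        if entity_id in self._overrides:
            return self._overrides[entity_id]
        return self._shard_names[entity_id % len(self._shard_names)]

    def move(self, entity_id, shard_name):
        """Record that an entity now lives on another shard."""
        self._overrides[entity_id] = shard_name


# Each shard maps user_id -> list of that user's videos.
shards = {"shard_a": {}, "shard_b": {}}
lookup = LookupService(["shard_a", "shard_b"])


def save_video(user_id, video):
    shards[lookup.shard_for(user_id)].setdefault(user_id, []).append(video)


def videos_for(user_id):
    # Everything about one entity is on one shard, so no cross-shard join.
    return shards[lookup.shard_for(user_id)].get(user_id, [])


def migrate(user_id, target):
    """Copy the entity's rows to the target shard, then repoint the lookup."""
    source = lookup.shard_for(user_id)
    shards[target][user_id] = shards[source].pop(user_id, [])
    lookup.move(user_id, target)


save_video(7, "cat.flv")
save_video(7, "dog.flv")
home_before = lookup.shard_for(7)
migrate(7, "shard_a")
home_after = lookup.shard_for(7)
```
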
EMD, a TLA not an ORM!
  connection and transaction management
  lookup service
  query factory
  minimalist table abstraction
  ORM can be (is?) evil
  make common behaviors simple, while leaving some transparency to the actual database
Seismic Retrofit
  apply this fundamental change to a large and growing site
  make it relatively painless with python
  multiple inheritance
  decorators
  AST
  plugins for validation and testing
Resulting API
  all the scale-aware code nicely opaque to application developers
  base use cases are painless
  User.select_by_username(db_context, username)
  Video.select_by_id(db_context, video_id)
  Video.select_by_user_id(db_context, user_id)
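
The slide gives only the call signatures. The sketch below shows one way such calls could sit on top of a minimalist table abstraction; `DbContext`, `Table`, `connection_for`, the schema and the dict-shaped rows are all assumptions, and `sqlite3` stands in for MySQL.

```python
import sqlite3


class DbContext:
    """Hypothetical context; a real one would pick a shard via a lookup service."""

    def __init__(self, connection):
        self._connection = connection

    def connection_for(self, table, key):
        # Single database in this sketch, so the routing inputs are ignored.
        return self._connection


class Table:
    """Minimal table abstraction: class attributes describe the table."""

    table = None
    columns = ()

    @classmethod
    def _select(cls, db_context, column, value):
        sql = f"SELECT {', '.join(cls.columns)} FROM {cls.table} WHERE {column} = ?"
        rows = db_context.connection_for(cls.table, value).execute(sql, (value,))
        return [dict(zip(cls.columns, row)) for row in rows]


class Video(Table):
    table = "videos"
    columns = ("id", "user_id", "title")

    @classmethod
    def select_by_id(cls, db_context, video_id):
        rows = cls._select(db_context, "id", video_id)
        return rows[0] if rows else None

    @classmethod
    def select_by_user_id(cls, db_context, user_id):
        return cls._select(db_context, "user_id", user_id)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE videos (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO videos VALUES (?, ?, ?)",
    [(1, 10, "cat"), (2, 10, "dog"), (3, 11, "fish")],
)
db_context = DbContext(conn)

one_video = Video.select_by_id(db_context, 2)
user_videos = Video.select_by_user_id(db_context, 10)
```

The application code never names a host or shard; that decision stays behind `db_context`, which is what keeps the scale-aware code out of sight.
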
Bulk Entity Migration
  hijack mysql replication to partition on the fly while the live site is running
  all DML gets tagged with an entity id
  read master binlog and selectively replay it into a set of new mini-masters
  update lookup service to point to new resources
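
The selective replay step can be pictured as a filter over tagged statements. The slides do not say how the entity id is attached, so the `/* entity_id=N */` comment format here is an assumption, as are `split_log`, the mini-master names and the list of strings standing in for a parsed binlog.

```python
import re

# Assumed tag format: a trailing SQL comment carrying the entity id.
ENTITY_TAG = re.compile(r"/\*\s*entity_id=(\d+)\s*\*/")


def entity_of(statement):
    """Return the entity id a statement is tagged with, or None if untagged."""
    match = ENTITY_TAG.search(statement)
    return int(match.group(1)) if match else None


def split_log(statements, target_for):
    """Group tagged statements, in log order, by the mini-master that replays them."""
    plan = {}
    for statement in statements:
        entity_id = entity_of(statement)
        if entity_id is None:
            continue  # untagged statements are skipped in this sketch
        plan.setdefault(target_for(entity_id), []).append(statement)
    return plan


binlog = [
    "INSERT INTO videos (user_id, title) VALUES (1, 'a') /* entity_id=1 */",
    "UPDATE users SET name = 'bo' WHERE id = 2 /* entity_id=2 */",
    "DELETE FROM videos WHERE user_id = 1 /* entity_id=1 */",
    "FLUSH TABLES",
]

plan = split_log(binlog, lambda entity_id: f"mini_master_{entity_id % 2}")
```

Statement order is preserved within each target, which matters because replaying an entity's writes out of order would leave its mini-master in a different state than the original master.
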
Recurring Themes
  the elegance of simplicity
  take reliable open software and customize it
  `pythonic veneer`
  DIY - filing a ticket for a bugfix doesn't give me a warm feeling - take matters into your own hands*
Session: Super-sizing YouTube
  Mike Solomon, YouTube
  Track: Python
  Date: Thursday, July 26
  Time: 10:45am - 11:30am
  Location: Portland 256