Scaling Early


Published on October 27, 2007

Author: vishnu

Source: slideshare.net

Description

by Mark Maunder

Scaling an early stage startup by Mark Maunder <mark@feedjit.com>

Why do performance and scaling quickly matter?

Slow performance could cost you 20% of your revenue according to Google.

Any reduction in hosting costs goes directly to your bottom line as profit or can accelerate growth.

In a viral business, slow performance can damage your viral growth.

My first missteps

Misconfiguration. Web server and DB configured to grab too much RAM.

As traffic builds, the server swaps and slows down drastically.

Easy to fix – just a quick config change on web server and/or DB.
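The config change above comes down to simple arithmetic: how many web server workers fit in RAM once the DB and the OS have their share. A minimal sketch in Python, where every figure (2 GB box, 512 MB for the DB, 256 MB reserve, 40 MB per worker) is a hypothetical assumption and not a number from the talk:

```python
def max_workers(total_ram_mb, db_ram_mb, os_reserve_mb, per_worker_mb):
    """Largest worker count that fits in RAM without pushing the box into swap."""
    available = total_ram_mb - db_ram_mb - os_reserve_mb
    if available <= 0 or per_worker_mb <= 0:
        return 0
    return available // per_worker_mb


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    print(max_workers(total_ram_mb=2048, db_ram_mb=512,
                      os_reserve_mb=256, per_worker_mb=40))
```

The result would be the ceiling for the web server's process or thread limit; anything above it risks the swapping described above.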

Traffic at this stage

2 Widgets per second

10 HTTP requests per second.

1 Widget = 1 Pageview

We serve as many pages as our users do, combined.

Keepalive – Good for clients, bad for servers.

As http requests increased to 10 per second, I ran out of server threads to handle connections.

Keepalive was on and Keepalive Timeout was set to 300.

Turned Keepalive off.
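Why a 300 second timeout exhausts threads can be estimated with Little's law (threads in use = arrival rate × time each one is held). A rough sketch in Python, assuming one new connection per widget load and a 50 ms service time; both of those are illustrative assumptions, only the 300 second timeout and the request rates come from the slides:

```python
def threads_in_use(arrivals_per_sec, hold_seconds):
    """Little's law: average number of busy threads at steady state."""
    return arrivals_per_sec * hold_seconds


if __name__ == "__main__":
    # Keepalive on: each connection can pin a thread for the full idle timeout.
    print(threads_in_use(arrivals_per_sec=2, hold_seconds=300))
    # Keepalive off: a thread is held only while a request is served
    # (0.05 s is an assumed service time).
    print(threads_in_use(arrivals_per_sec=10, hold_seconds=0.05))
```

Under these assumptions, keepalive pins hundreds of threads in the worst case, while the same traffic without keepalive needs less than one thread on average.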

Traffic at this stage

4 Widgets per second

20 HTTP requests per second

Cache as much DB data as possible

I used Perl’s Cache::FileCache to cache either DB data or rendered HTML on disk.

MemCacheD, developed for LiveJournal, caches across servers.

YMMV – How dynamic is your data?
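The idea of caching DB results or rendered HTML on disk can be sketched as follows. This is a Python approximation and not Cache::FileCache itself; the class name `FileCache`, the helper `cached`, and the TTL handling are all assumptions made for illustration:

```python
import hashlib
import os
import pickle
import tempfile
import time


class FileCache:
    """Tiny on-disk cache: one file per key, each with an expiry time."""

    def __init__(self, directory, ttl_seconds):
        self.directory = directory
        self.ttl_seconds = ttl_seconds
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        # Hash the key so any string is safe to use as a file name.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest)

    def get(self, key):
        try:
            with open(self._path(key), "rb") as fh:
                expires_at, value = pickle.load(fh)
        except (FileNotFoundError, EOFError, pickle.UnpicklingError):
            return None
        if time.time() >= expires_at:
            return None  # stale entry counts as a miss
        return value

    def set(self, key, value):
        # Write to a temp file, then rename, so readers never see a partial entry.
        fd, tmp = tempfile.mkstemp(dir=self.directory)
        with os.fdopen(fd, "wb") as fh:
            pickle.dump((time.time() + self.ttl_seconds, value), fh)
        os.replace(tmp, self._path(key))


def cached(cache, key, compute):
    """Return the cached value, or compute and store it on a miss."""
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value)
    return value
```

A call such as `cached(cache, "widget:42", render_widget)` would run the hypothetical `render_widget` only on a miss. The TTL is the knob that answers "how dynamic is your data?": the more often the data changes, the shorter it has to be.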

MySQL not fast enough

High number of writes & deletes on a large single table caused severe slowness.

Writes blow away the query cache.

MySQL doesn’t support a large number of small tables (over 10,000).

MySQL is memory hungry if you want to cache large indexes.

I maxed out at about 200 concurrent read/write queries per second with over 1 million records (and that’s not large enough).

Perl’s Tie::File to the early rescue

Tie::File is a very simple flat-file API.

Lots of files/tables.

Faster – 500 to 1000 concurrent read/writes per second.

Prepending requires reading and rewriting the whole file.
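The prepend cost is a property of flat files in general, not of Tie::File specifically. A small Python sketch (the function names are hypothetical) contrasting the two operations:

```python
def append_line(path, line):
    """Appending touches only the end of the file."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")


def prepend_line(path, line):
    """Prepending has to read the whole file and write it all back.

    Returns the number of characters rewritten, to make the cost visible.
    """
    try:
        with open(path, "r", encoding="utf-8") as fh:
            existing = fh.read()
    except FileNotFoundError:
        existing = ""
    content = line + "\n" + existing
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(content)
    return len(content)
```

The work done by `prepend_line` grows with the size of the file, which is why newest-first data stored this way gets slower as it accumulates.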

BerkeleyDB is very very fast!

I’m also experimenting with BerkeleyDB for some small intensive tasks.

Data from Oracle, which owns BDB: just over 90,000 transactional writes per second.

Over 1 million non-transactional writes per second in memory.

Oracle’s machine: Linux on an AMD Athlon™ 64 processor 3200+ at 1GHz system with 1GB of RAM. 7200RPM Drive with 8MB cache RAM.

Source: http://www.oracle.com/technology/products/berkeley-db/pdf/berkeley-db-perf.pdf

Traffic at this stage

7 Widgets per second

35 HTTP requests per second

Created a separate image and CSS server

Enabled Keepalive on the Image server to be nice to clients.

Static content requires very little memory per thread/process.

Kept Keepalive off on the App server to reduce memory.

Added benefit of higher browser concurrency with 2 hostnames.

Source: http://www.die.net/musings/page_load_time/

Now using Home Grown Fixed Length Records

A lot like ISAM or MyISAM

Fixed length records mean we seek directly to the data. No more file slurping.

Sequential records mean sequential reads which are fast.

Still using file level locking.

Benchmarked at 20,000+ concurrent reads/writes/deletes.
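Seeking directly to a fixed-length record can be sketched as below. This is a Python illustration and not the author's implementation; the record layout (a 4-byte timestamp plus a 32-byte padded text field) is a made-up assumption, and the file-level locking mentioned above is left out for brevity:

```python
import struct

# Hypothetical layout: little-endian uint32 timestamp + 32-byte NUL-padded text.
RECORD = struct.Struct("<I32s")


def write_record(fh, index, timestamp, text):
    """Overwrite record number `index` in place."""
    fh.seek(index * RECORD.size)  # offset is computed, never searched for
    # struct pads short text with NUL bytes and truncates text over 32 bytes.
    fh.write(RECORD.pack(timestamp, text.encode("utf-8")))


def read_record(fh, index):
    """Read record number `index`, or return None past the end of the file."""
    fh.seek(index * RECORD.size)
    data = fh.read(RECORD.size)
    if len(data) < RECORD.size:
        return None
    timestamp, raw = RECORD.unpack(data)
    return timestamp, raw.rstrip(b"\x00").decode("utf-8")
```

Because every record is `RECORD.size` bytes, record N always starts at byte `N * RECORD.size`, so a read is one seek and one fixed-size read regardless of how large the file is. The functions expect a file opened in binary read/write mode (for example `open(path, "r+b")`).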

Traffic at this stage

12 Widgets per second

50 to 60 HTTP requests per second

Load average spiking to 12 or more about 3 times per day for unknown reason.

Blocking Content Thieves

Content thieves were aggressively crawling our site on pages that are CPU intensive.

Robots.txt is irrelevant.

Reverse DNS lookup with ‘dig -x’

Firewall the &^%$@’s with ‘iptables’
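The workflow above (spot the heavy hitters, look up who they are, firewall them) could be scripted roughly like this. It is a sketch under assumptions: the log format (client IP as the first space-separated field), the threshold, and the function names are hypothetical, and the rule is only built as an argument list, not executed:

```python
import ipaddress
import socket
from collections import Counter


def aggressive_clients(log_lines, threshold):
    """Return client IPs that appear in more than `threshold` log lines."""
    counts = Counter(line.split(" ", 1)[0] for line in log_lines if line.strip())
    return sorted(ip for ip, n in counts.items() if n > threshold)


def reverse_dns(ip):
    """Rough equivalent of `dig -x`: host name for an IP, or None if unresolved."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def drop_rule(ip):
    """Build (but do not run) an iptables command that drops traffic from `ip`."""
    validated = str(ipaddress.ip_address(ip))  # raises ValueError on bad input
    return ["iptables", "-A", "INPUT", "-s", validated, "-j", "DROP"]
```

Checking `reverse_dns` before blocking helps avoid firewalling a search engine you do care about. The list from `drop_rule` could be passed to `subprocess.run` by someone with root privileges.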

Moved to httpd.prefork

Httpd.worker consumes more memory than prefork because worker doesn’t share memory.

Tuning the number of Perl interpreters vs number of threads didn’t improve things.

Prefork with no keepalive on the app server uses less RAM and works well – for Mod_Perl.

The amazing Linux Filesystem Cache

Linux uses spare memory to cache files on disk.

Lots of spare memory == Much faster I/O.

Prefork freed lots of memory. 1.3 Gigs out of 2 Gigs is used as cache.

I’ve noticed a roughly 20% performance increase since using it.
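How much RAM the page cache is using can be read from `/proc/meminfo` on Linux. A minimal Python sketch; the function names are hypothetical, and the parser only assumes the usual `Name:   value kB` line format:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def cache_fraction(meminfo):
    """Share of total memory currently used as file cache."""
    return meminfo["Cached"] / meminfo["MemTotal"]


if __name__ == "__main__":
    with open("/proc/meminfo", encoding="ascii") as fh:
        info = parse_meminfo(fh.read())
    print(f"{cache_fraction(info):.0%} of RAM is file cache")
```

On a box like the one described above (1.3 GB cached out of 2 GB), this would report roughly 65%.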

Tools

httperf for benchmarking your server

Websitepulse.com for perf monitoring.

Summary

Make content as static as possible.

Cache as much of your dynamic content as possible.

Separate serving app requests and serving static content.

Don’t underestimate the speed of lightweight file access APIs.

Only serve real users and search engines you care about.
