Feed Burner Scalability

Published on January 28, 2008

Author: didip

Source: slideshare.net

Description

See more scalability tales at:
http://rapd.wordpress.com

FeedBurner: Scalable Web Applications using MySQL and Java
Joe Kottke, Director of Network Operations

What is FeedBurner?
• Market-leading feed management provider
• 170,000 bloggers, podcasters and commercial publishers including Reuters, USA TODAY, Newsweek, Ars Technica, BoingBoing…
• 11 million subscribers in 190 countries
• Web-based services help publishers expand their reach online, attract subscribers and make money from their content
• The largest advertising network for feeds
© 2006 FeedBurner

Scaling history
• July 2004
  – 300Kbps, 5,600 feeds
  – 3 app servers, 3 web servers, 2 DB servers
• April 2005
  – 5Mbps, 47,700 feeds
  – My first MySQL Users Conference
  – 6 app servers, 6 web servers (same machines)
• September 2005
  – 20Mbps, 109,200 feeds
• Currently
  – 115 Mbps, 270,000 feeds, 100 million hits per day

Scalability Problem 1: Plain old reliability
• August 2004
• 3 web servers, 3 app servers, 2 DB servers. Round Robin DNS
• Single-server failure, seen by 1/3 of all users

Solution: Load Balancers, Monitoring
• Health Check pages
  – Round trip all the way back to the database
  – Same page monitored by load balancers and monitoring
• Monitoring
  – Cacti (http://www.cacti.net/)
  – Nagios (http://www.nagios.org)

Health Check
    UserComponent uc = UserComponentFactory.getUserComponent();
    User user = uc.getUser("monitor-user");
    // If first load, mark as down.
    // Let FeedServlet mark things as up in init method. load-on-startup
    String healthcheck = (String) application.getAttribute("healthcheck");
    if (healthcheck == null || healthcheck.length() < 1) {
        healthcheck = new String("DOWN");
        application.setAttribute("healthcheck", healthcheck);
    }
    // We return null in case of problem, or if user doesn’t exist
    if (user == null) {
        healthcheck = new String("DOWN");
        application.setAttribute("healthcheck", healthcheck);
    }
    System.out.print(healthcheck);
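The health-check idea can also be sketched outside a servlet container. This is a minimal sketch under stated assumptions, not FeedBurner's implementation: `UserLookup`, `HealthCheck`, `markUp()` and the `"UP"` status string are hypothetical names (the slide only shows `"DOWN"`), and the database round trip is replaced by an interface.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for the database-backed user lookup on the slide. */
interface UserLookup {
    /** Returns the user name, or null if the lookup fails or the user is missing. */
    String findUser(String name);
}

/** Holds the status string that a load balancer or monitor would poll. */
class HealthCheck {
    // Start as DOWN; something else (the servlet init on the slide) marks it up.
    private final AtomicReference<String> status = new AtomicReference<>("DOWN");
    private final UserLookup lookup;

    HealthCheck(UserLookup lookup) {
        this.lookup = lookup;
    }

    /** Called once startup has finished; "UP" is an assumed status string. */
    void markUp() {
        status.set("UP");
    }

    /** Round-trips to the data store; a failed lookup flips the status to DOWN. */
    String check() {
        String user;
        try {
            user = lookup.findUser("monitor-user");
        } catch (RuntimeException e) {
            user = null; // treat exceptions like a failed lookup
        }
        if (user == null) {
            status.set("DOWN");
        }
        return status.get();
    }
}
```

As on the slide, a failed lookup leaves the status at DOWN until something explicitly marks the server up again, so a flapping database does not put the server back into rotation by itself.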

[Figure: Cacti]

Start/Stop scripts
    #!/bin/bash
    # Source the environment
    . ${HOME}/fb.env
    # Start TOMCAT
    cd ${FB_APPHOME}
    # Remove stale temp files
    find ~/rsspp/catalina/temp/ -type f -exec rm -rf {} \;
    # Remove the work directory
    #rm -rf ~/rsspp/catalina/work/*
    ${CATALINA_HOME}/bin/startup.sh

Start/Stop scripts
    #!/bin/bash
    FB_APPHOME=/opt/fb/fb-app
    JAVA_HOME=/usr
    CATALINA_HOME=/opt/tomcat
    CATALINA_BASE=${FB_APPHOME}/catalina
    CATALINA_OPTS="-Xmx768m -Xms768m -Dnetworkaddress.cache.ttl=0"
    WEBROOT=/opt/fb/webroot
    export JAVA_HOME CATALINA_HOME CATALINA_BASE CATALINA_OPTS WEBROOT

Scalability Problem 2: Stats recording/mgmt
• Every hit is recorded
• Certain hits mean more than others
• Flight recorder
• Any table management locks
• Inserts slow way down (90GB table)

Solution: Executor Pool
• Executor Pool
  – Doug Lea’s concurrency library
  – Use a PooledExecutor so stats inserts happen in a separate thread
  – Spring bean definition:
    <bean id="StatsExecutor" class="EDU.oswego.cs.dl.util.concurrent.PooledExecutor">
      <constructor-arg>
        <bean class="EDU.oswego.cs.dl.util.concurrent.LinkedQueue"/>
      </constructor-arg>
      <property name="minimumPoolSize" value="10" />
      <property name="keepAliveTime" value="5000" />
    </bean>
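The same pattern can be sketched with `java.util.concurrent`, the standard-library successor to Doug Lea's `util.concurrent` package. This is an illustrative sketch, not the FeedBurner code: `StatsRecorder` and its in-memory `recorded` list are hypothetical stand-ins for the real stats insert, and only the pool size (10) and keep-alive (5000 ms) come from the bean definition.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Records hits off the request thread; the "insert" here is an in-memory list. */
class StatsRecorder {
    // 10 worker threads and a 5000 ms keep-alive mirror the bean definition.
    // With an unbounded queue the pool never grows past its core size.
    private final ExecutorService executor = new ThreadPoolExecutor(
            10, 10, 5000, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());

    // Stand-in for the stats table; a real implementation would issue an INSERT.
    final List<String> recorded = new CopyOnWriteArrayList<>();

    /** Queues the write and returns immediately, so the request is not blocked. */
    void recordHit(String feedId) {
        executor.execute(() -> recorded.add(feedId));
    }

    /** Stops accepting work and waits for queued writes; false on timeout or interrupt. */
    boolean shutdown(long timeoutMillis) {
        executor.shutdown();
        try {
            return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The unbounded queue keeps request threads from ever blocking, at the cost of unbounded memory growth if inserts fall far behind; a bounded queue with a rejection policy is the usual alternative.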

Solution: Lazy rollup
• Only today’s detailed stats need to go against real-time table
• Roll up previous days into sparse summary tables on-demand
• First access for stats for a day is slow, subsequent requests are fast
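The on-demand roll-up can be sketched with in-memory maps standing in for the detailed and summary tables. All names here (`LazyRollup`, `recordHit`, `hitsFor`, `rollupsPerformed`) are hypothetical, and the sketch assumes the requested day is already closed, so its summary never needs to be recomputed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** On-demand roll-up of per-day hit counts; maps stand in for database tables. */
class LazyRollup {
    // Stand-in for the detailed real-time table: one entry per hit, keyed by day.
    private final Map<String, List<String>> detailedHits = new HashMap<>();
    // Stand-in for the sparse summary tables: day -> (feed -> hit count).
    private final Map<String, Map<String, Integer>> summaries = new HashMap<>();
    // Counts how many times the expensive scan actually ran.
    int rollupsPerformed = 0;

    void recordHit(String day, String feedId) {
        detailedHits.computeIfAbsent(day, d -> new ArrayList<>()).add(feedId);
    }

    /** First call for a day scans the detail rows; later calls read the summary. */
    int hitsFor(String day, String feedId) {
        Map<String, Integer> summary = summaries.get(day);
        if (summary == null) {
            summary = new HashMap<>();
            for (String feed : detailedHits.getOrDefault(day, new ArrayList<>())) {
                summary.merge(feed, 1, Integer::sum);
            }
            summaries.put(day, summary);
            rollupsPerformed++;
        }
        return summary.getOrDefault(feedId, 0);
    }
}
```

The first `hitsFor` call for a day pays for the full scan; every later call for that day, for any feed, is a map lookup.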

Scalability Problem 3: Primary DB overload
• Mostly used master DB server for everything
• Read vs. Read/Write load didn’t matter in the beginning
• Slow inserts would block reads, when using MyISAM

Solution: Balance read and read/write load
• Looked at workload
  – Found where we could break up read vs. read/write
  – Created Spring ExtendedDaoObjects
  – Tomcat-managed DataSources
• Balanced master vs. slave load (Duh)
  – Slave becomes perfect place for snapshot backups
• Watch for replication problems
  – Merge table problems (later)
  – Slow queries slow down replication

[Figure: Example: Cacti graph of MySQL handlers]

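ExtendedDaoObject
• Application code extends this class and uses getHibernateTemplate() or getReadOnlyHibernateTemplate() depending upon requirements
• Similar class for JDBC
    // Imports assume Hibernate 2.1 and Spring's Hibernate 2 support packages
    import net.sf.hibernate.SessionFactory;
    import org.springframework.orm.hibernate.HibernateTemplate;
    import org.springframework.orm.hibernate.support.HibernateDaoSupport;

    public class ExtendedHibernateDaoSupport extends HibernateDaoSupport {
        private HibernateTemplate readOnlyHibernateTemplate;

        // Injected with the session factory that points at the read-only (slave) database
        public void setReadOnlySessionFactory(SessionFactory sessionFactory) {
            this.readOnlyHibernateTemplate = new HibernateTemplate(sessionFactory);
            readOnlyHibernateTemplate.setFlushMode(HibernateTemplate.FLUSH_NEVER);
        }

        // Falls back to the read/write template when no read-only factory is configured
        protected HibernateTemplate getReadOnlyHibernateTemplate() {
            return (readOnlyHibernateTemplate == null) ? getHibernateTemplate() : readOnlyHibernateTemplate;
        }
    }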

Scalability Problem 4: Total DB overload
• Everything slowing down
• Using DB as cache
• Database is the ‘shared’ part of all app servers
• Ran into table size limit defaults on MyISAM (4GB). We were lazy.
  – Had to use Merge tables as a bridge to newer larger tables

Solution: Stop using the database
• Where possible :)
• Multi-level caching
  – Local VM caching (EHCache, memory only)
  – Memcached (http://www.danga.com/memcached/)
  – And finally, database.
• Memcached
  – Fault-tolerant, but client handles that.
  – Shared nothing
  – Data is transient, can be recreated
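The lookup order described above (local VM cache, then shared cache, then database) can be sketched as a read-through chain. This is a simplified illustration: plain maps stand in for EHCache and memcached, a `Function` stands in for the database query, and `TieredCache` and `databaseReads` are hypothetical names. It is not thread-safe and has no expiry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Read-through lookup: local map, then a shared cache, then the database. */
class TieredCache {
    // Stands in for an in-VM cache that is private to one app server.
    private final Map<String, String> local = new HashMap<>();
    // Stands in for a cache shared by all app servers.
    private final Map<String, String> shared;
    // Stands in for a database query; returns null when the row does not exist.
    private final Function<String, String> database;
    // Counts how often this instance had to fall through to the database.
    int databaseReads = 0;

    TieredCache(Map<String, String> shared, Function<String, String> database) {
        this.shared = shared;
        this.database = database;
    }

    String get(String key) {
        String value = local.get(key);
        if (value != null) {
            return value;
        }
        value = shared.get(key);
        if (value == null) {
            databaseReads++;
            value = database.apply(key);
            if (value == null) {
                return null; // nothing to cache
            }
            shared.put(key, value);
        }
        local.put(key, value);
        return value;
    }
}
```

Because the cached data is transient and can be recreated, losing either cache level only costs one extra trip to the next level down.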

Scalability Problem 5: Lazy initialization
• Our stats get rolled up on demand
  – Popular feeds slowed down the whole system
• FeedCount chicklet calculation
  – Every feed gets its circulation calculated at the same time
  – Contention on the table

Solution: BATCH PROCESSING
• For FeedCount, we staggered the calculation
  – Still would run into contention
  – Stats stuff again slowed down at 1AM Chicago time.
• We now process the rolled-up data every night
  – Delay showing the previous circulation in the FeedCount until roll-up is done.
• Still wasn’t enough
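One simple way to stagger per-feed work is to give every feed a stable slot within a cycle. The slides do not say how the staggering was done, so this is only a hypothetical sketch: `StaggeredSchedule`, `slotFor` and `dueIn` are made-up names, and the modulo assignment is an assumption. As the slide notes, staggering alone still ran into contention.

```java
import java.util.ArrayList;
import java.util.List;

/** Spreads per-feed recalculation across fixed slots instead of one burst. */
class StaggeredSchedule {
    private final int slots;

    StaggeredSchedule(int slots) {
        if (slots <= 0) {
            throw new IllegalArgumentException("slots must be positive");
        }
        this.slots = slots;
    }

    /** Stable slot for a feed, so each feed is recalculated once per cycle. */
    int slotFor(long feedId) {
        // floorMod keeps the result in [0, slots) even for negative ids.
        return (int) Math.floorMod(feedId, (long) slots);
    }

    /** Feeds whose calculation is due in the given slot. */
    List<Long> dueIn(int slot, List<Long> feedIds) {
        List<Long> due = new ArrayList<>();
        for (long id : feedIds) {
            if (slotFor(id) == slot) {
                due.add(id);
            }
        }
        return due;
    }
}
```

With 24 slots and an hourly trigger, for example, each run would touch roughly 1/24 of the feeds rather than all of them at once.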

Scalability Problem 6: Stats writes, again
• Too much writing to master DB
• More and more data stored with each feed
• More stats tracking
  – Ad Stats
  – Item Stats
  – Circulation Stats

Solution: Merge Tables
• After the nightly rollup, we truncate the subtable from 2 days ago
• Gotcha with truncating a subtable:
  – FLUSH TABLES; TRUNCATE TABLE ad_stats0;
  – Could succeed on master, but fail on slave
• The right way to truncate a subtable:
  – ALTER TABLE ad_stats TYPE=MERGE UNION=(ad_stats1,ad_stats2);
  – TRUNCATE TABLE ad_stats0;
  – ALTER TABLE ad_stats TYPE=MERGE UNION=(ad_stats0,ad_stats1,ad_stats2);
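The three-step sequence above (take the subtable out of the union, truncate it, put it back) is easy to generate for any subtable. The helper below is a hypothetical sketch: `MergeTableMaintenance` and `truncateSubtable` are made-up names, it only builds the statement strings in the slide's MySQL 4.1 syntax, and it assumes table names are trusted constants rather than user input.

```java
import java.util.ArrayList;
import java.util.List;

/** Builds the statements for safely truncating one subtable of a MERGE table. */
class MergeTableMaintenance {

    /** Returns: shrink the union, truncate the subtable, restore the full union. */
    static List<String> truncateSubtable(String mergeTable, List<String> subtables, String target) {
        if (!subtables.contains(target)) {
            throw new IllegalArgumentException(target + " is not part of " + mergeTable);
        }
        List<String> remaining = new ArrayList<>(subtables);
        remaining.remove(target);

        List<String> statements = new ArrayList<>();
        statements.add(alterUnion(mergeTable, remaining));
        statements.add("TRUNCATE TABLE " + target);
        statements.add(alterUnion(mergeTable, subtables));
        return statements;
    }

    private static String alterUnion(String mergeTable, List<String> subtables) {
        return "ALTER TABLE " + mergeTable + " TYPE=MERGE UNION=(" + String.join(",", subtables) + ")";
    }
}
```

Calling `truncateSubtable("ad_stats", Arrays.asList("ad_stats0", "ad_stats1", "ad_stats2"), "ad_stats0")` yields the same three statements as the slide, minus the trailing semicolons.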

Solution: Horizontal Partitioning
• Constantly identifying hot spots in the database
  – Ad serving
  – Flare serving
  – Circulation (constant writes, occasional reads)
• Move hottest tables/queries off to own clusters
  – Hibernate and certain lazy patterns allow this
  – Keeps the driving tables from slowing down

Scalability Problem 7: Master DB Failure
• Still using just a primary and slave
• Master crash: Single point of failure
• No easy way to promote a slave to a master

Solution: No easy answer
• Still using auto_increment
  – Multi-master replication is out
• Tried DRBD + HeartBeat
  – Disk is replicated block-by-block
  – Hot primary, cold secondary
• Didn’t work as we hoped
  – Myisamchk takes too long after failure
  – I/O + CPU overhead
• InnoDB is supposedly better

Our multi-master solution
• Low-volume master cluster
  – Uses DRBD + HeartBeat
  – Works well under smaller load
  – Does mapping to feed data clusters
• Feed Data Cluster
  – Standard Master + Slave(s) structure
  – Can be added as needed
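The mapping role of the low-volume master cluster can be sketched as a small directory that resolves each feed to the data cluster holding its rows. The slides do not show the schema or lookup code, so everything here is hypothetical: `ClusterDirectory`, its methods, the cluster names and the JDBC URLs are illustrative only, and in-memory maps stand in for the mapping tables.

```java
import java.util.HashMap;
import java.util.Map;

/** Small directory mapping each feed to the data cluster that stores it. */
class ClusterDirectory {
    private final Map<String, String> masterUrlByCluster = new HashMap<>();
    private final Map<Long, String> clusterByFeed = new HashMap<>();

    /** Registers a feed data cluster; new clusters can be added as needed. */
    void addCluster(String cluster, String masterUrl) {
        masterUrlByCluster.put(cluster, masterUrl);
    }

    /** Pins a feed to an already registered cluster. */
    void assignFeed(long feedId, String cluster) {
        if (!masterUrlByCluster.containsKey(cluster)) {
            throw new IllegalArgumentException("unknown cluster: " + cluster);
        }
        clusterByFeed.put(feedId, cluster);
    }

    /** Returns the master URL for the feed's cluster, or null if the feed is unmapped. */
    String masterUrlFor(long feedId) {
        String cluster = clusterByFeed.get(feedId);
        return cluster == null ? null : masterUrlByCluster.get(cluster);
    }
}
```

Only this small directory needs the DRBD + HeartBeat protection; the feed data clusters it points at stay ordinary master/slave pairs.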

[Figure: Mapping / Marshalling Database Cluster]

Scalability Problem 8: Power Failure
• Chicago has ‘questionable’ infrastructure.
• Battery backup, generators can be problematic
• Colo techs have been known to hit the Big Red Switch
• Needed a disaster recovery/secondary site
  – Active/Active not possible for us. Yet.
  – Would have to keep fast connection to redundant site
  – Would require 100% of current hardware, but would lie quiet

Code Name: Panic App
• Product Name: Feed Insurance
• Elegant, simple solution
• Not Java (sorry)
• Perl-based feed fetcher
  – Downloads copies of feeds, saved as flat XML files
  – Synchronized out to local and remote servers
  – Special rules for click tracking, dynamic GIFs, etc

General guidelines
• Know your DB workload
  – Cacti really helps with this
• ‘EXPLAIN’ all of your queries
  – Helps keep crushing queries out of the system
• Cache everything that you can
• Profile your code
  – Usually only needed on hard-to-find leaks

Our settings / what we use
• Don’t always need the latest and greatest
  – Hibernate 2.1
  – Spring
  – DBCP
  – MySQL 4.1
  – Tomcat 5.0.x
• Let the container manage DataSources

JDBC
• Hibernate/iBatis/Name-Your-ORM-Here
  – Use ORM when appropriate
  – Watch the queries that your ORM generates
  – Don't be afraid to drop to JDBC
• Driver parameters we use:
    # For Internationalization of Ads, multi-byte characters in general
    useUnicode=true
    characterEncoding=UTF-8
    # Biggest performance bits
    cacheServerConfiguration=true
    useLocalSessionState=true
    # Some other settings that we've needed as things have evolved
    useServerPrepStmts=false
    jdbcCompliantTruncation=false
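These driver parameters are normally passed on the MySQL Connector/J connection URL as a query string. The helper below is a hypothetical sketch of assembling such a URL: `JdbcUrlBuilder`, `slideDefaults()` and the host and database names in the usage note are made up, while the six parameter names and values are the ones listed on the slide.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Assembles a MySQL JDBC URL from a host, a database name and driver parameters. */
class JdbcUrlBuilder {

    /** The six driver parameters from the slide, in the order they are listed. */
    static Map<String, String> slideDefaults() {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("useUnicode", "true");
        params.put("characterEncoding", "UTF-8");
        params.put("cacheServerConfiguration", "true");
        params.put("useLocalSessionState", "true");
        params.put("useServerPrepStmts", "false");
        params.put("jdbcCompliantTruncation", "false");
        return params;
    }

    /** Joins the parameters as key=value pairs separated by '&'. */
    static String build(String host, String database, Map<String, String> params) {
        String base = "jdbc:mysql://" + host + "/" + database;
        if (params.isEmpty()) {
            return base;
        }
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(entry.getKey() + "=" + entry.getValue());
        }
        return base + "?" + query;
    }
}
```

For example, `JdbcUrlBuilder.build("db.example.com", "feeds", JdbcUrlBuilder.slideDefaults())` produces a URL beginning `jdbc:mysql://db.example.com/feeds?useUnicode=true&characterEncoding=UTF-8`. When the URL is embedded in an XML file such as a Tomcat context definition, each `&` has to be written as `&amp;`.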

Thank You
Questions? joek@feedburner.com
