Elasticsearch at Automattic

Information about Elasticsearch at Automattic
Technology

Published on February 25, 2014

Author: GregBrown20

Source: slideshare.net

Description

Presentation from the Elasticsearch Denver Meetup.

Discusses scaling Elasticsearch for Related Posts across WordPress.com and some of the big changes needed to handle 23 million queries a day across 800 million documents.

Elasticsearch at Automattic
Tuesday, February 25, 14

Greg Ichneumon Brown
Data Wrangler at Automattic
http://gibrown.wordpress.com
@gregibrown
greg@automattic.com

1 Billion Monthly Uniques

Elasticsearch Deployments

Internal Search - 216 Internal Blogs - 750k docs [3 GB]
Support Documents - KNN Link Prediction - 1.7m docs [14 GB]
Polldaddy - Word Clouds/Freq Response - 39m docs [9 GB]
WordPress.com VIP Search
- KFF.org - 18m docs [99 MB]
- NY Post - 600k docs [2.3 GB]
WordPress.com - ~800m docs [4 TB]
- Related Posts - 23 mil reqs/day
- search.wordpress.com - 3 mil reqs/day

Overview of Related Posts
Our “10X Improvements”
- Indexing
- Querying
Our Open Issues

Related Posts
Search within just the one blog

WordPress.com Total Elasticsearch Operations

Operation          Ops/Day
Routed Queries     23 mil
Global Queries     2 mil
Docs Indexed       13 mil
Docs Updated       10 mil
Docs Deleted       2.5 mil
Delete By Query    250k
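To put the daily totals above in perspective, the short sketch below converts them to average per-second rates. It assumes the 250k figure belongs to Delete By Query and that load is spread evenly over the day, which real traffic will not be; peak rates are not given in the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Daily operation counts as listed on the slide.
OPS_PER_DAY = {
    "Routed Queries": 23_000_000,
    "Global Queries": 2_000_000,
    "Docs Indexed": 13_000_000,
    "Docs Updated": 10_000_000,
    "Docs Deleted": 2_500_000,
    "Delete By Query": 250_000,
}


def average_ops_per_second(ops_per_day: int) -> float:
    """Average rate, assuming perfectly even load across the day."""
    return ops_per_day / SECONDS_PER_DAY


for name, count in OPS_PER_DAY.items():
    print(f"{name}: {average_ops_per_second(count):.1f} ops/sec on average")
```

Routed queries alone come to roughly 266 per second on average.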

Global Cluster

DC1: 1 Master, 14 Data
DC2: 1 Master, 14 Data
DC3: 1 Master, 14 Data

Our Secret To Scaling
Routed Queries
All Posts for each Blog are on the same Shard

Global Index
7 Indices
10 mil Blogs per Index
25 Shards per Index
175 Shards Total
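A minimal sketch of what a routed query could look like from the client side. The `routing` URL parameter is part of the Elasticsearch search API; everything else here is an assumption for illustration: the `global-N` index names, the idea that blogs are assigned to indices by contiguous ID range, and the use of the blog ID as the routing value. The slides do not say how blogs are mapped to indices.

```python
from urllib.parse import urlencode

NUM_INDICES = 7
SHARDS_PER_INDEX = 25
BLOGS_PER_INDEX = 10_000_000


def index_for_blog(blog_id: int) -> str:
    """Hypothetical mapping: assign blogs to indices by contiguous ID range."""
    # Clamp so that very large IDs still land in the last index.
    index_number = min(blog_id // BLOGS_PER_INDEX, NUM_INDICES - 1)
    return f"global-{index_number}"


def routed_search_url(host: str, blog_id: int) -> str:
    """Build a search URL that is routed by blog ID.

    With routing set, Elasticsearch sends the search only to the shard
    that the routing value hashes to, instead of every shard in the index.
    """
    query = urlencode({"routing": blog_id})
    return f"{host}/{index_for_blog(blog_id)}/_search?{query}"


print(routed_search_url("http://localhost:9200", 12345678))
```

If documents are indexed with the same routing value, a Related Posts query for one blog touches one shard out of the 7 x 25 = 175 in the global index.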

20% Improvements
Don’t solve scaling problems

Indexing
Entangling Elasticsearch with Existing Systems

Bulk Indexing 1.0
44 Days to Index all Posts (estimated)

Bulk Indexing Problems
- Overhead: Spent too much time starting indexing jobs. WordPress.com has 500 mil MySQL tables.
- High DB Load: Corner cases. Blogs with 1+ mil followers.
- High DB Load: Indexing sequentially doesn’t spread the load.
- High DB Load: Heavy load on archive DBs.
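The point about sequential indexing not spreading the load can be illustrated with a small scheduling sketch. This is not taken from the talk: it simply assumes each blog is known to live on some database host (the `db_host` labels are hypothetical) and orders the jobs so that consecutive ones rotate across hosts.

```python
from collections import defaultdict
from itertools import zip_longest


def interleave_by_host(blogs):
    """Order blog IDs so consecutive indexing jobs rotate across DB hosts.

    blogs: iterable of (blog_id, db_host) pairs.
    """
    by_host = defaultdict(list)
    for blog_id, db_host in blogs:
        by_host[db_host].append(blog_id)

    ordered = []
    # Take one blog from each host per round until every host is drained.
    for batch in zip_longest(*by_host.values()):
        ordered.extend(blog_id for blog_id in batch if blog_id is not None)
    return ordered


jobs = [(1, "db-a"), (2, "db-a"), (3, "db-b"), (4, "db-b"), (5, "db-c")]
print(interleave_by_host(jobs))
```

Indexing in ID order would hit `db-a` twice in a row; the interleaved order touches each host once before returning to the first.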

Bulk Indexing Today (12.0?)
4 Days to Index all Posts (running right now)

Real Time Indexing
The Hardest Part!

Real Time Goals
1) Eventually Consistent
2) Minimize Bulk Re-indexing
3) Normally updated < 1 minute
Bulk reindexed 3 times in 5 months. One intentional, two during system upgrades.

Stuff Fails
1) Humans
2) Hardware
3) Elasticsearch (steady improvements)
Combinations of the above.

Hardware Problems
1) Detect and Track Down Servers
2) Prioritize Queries over Indexing
3) Throttle Indexing Jobs
   - any issues: block bulk changes to blogs
   - >10 min: block doc updates
   - >20 min: block all indexing
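The throttling tiers above can be sketched as a small policy function. The slide does not say what the 10 and 20 minute thresholds measure; this sketch assumes they are how long a problem has been going on, and the job-type names are made up for illustration.

```python
from typing import Optional, Set


def blocked_job_types(issue_minutes: Optional[float]) -> Set[str]:
    """Return which kinds of indexing work to block.

    issue_minutes: how long the cluster has had a known issue,
    or None when everything is healthy (an assumed interpretation).
    """
    if issue_minutes is None:
        return set()

    # Any issue at all: stop bulk changes to whole blogs.
    blocked = {"bulk_blog_changes"}
    if issue_minutes > 10:
        blocked.add("doc_updates")
    if issue_minutes > 20:
        blocked.add("all_indexing")
    return blocked


print(blocked_job_types(None))
print(blocked_job_types(12))
```

Keeping the policy in one pure function makes the escalation order easy to test separately from whatever health checks feed it.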

Real Time Failures
1) Auto Retry Failed Indexing Jobs
2) Indexing Queue for Failures
3) Scrolling Queries to Find Bad Docs
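The first two items above, retrying failed jobs and queuing whatever still fails, might look roughly like this. It is a sketch, not the WordPress.com job system: `index_fn` stands in for whatever call sends a document to Elasticsearch, and the in-memory deque stands in for a durable failure queue.

```python
from collections import deque
from typing import Callable, Deque


def index_with_retry(
    doc: dict,
    index_fn: Callable[[dict], None],
    failure_queue: Deque[dict],
    max_attempts: int = 3,
) -> bool:
    """Try to index a document a few times; queue it for later if all attempts fail."""
    for _ in range(max_attempts):
        try:
            index_fn(doc)
            return True
        except Exception:
            # A real job would log the error and back off before retrying.
            continue
    failure_queue.append(doc)
    return False


def always_down(doc: dict) -> None:
    raise ConnectionError("cluster unavailable")


failures: Deque[dict] = deque()
index_with_retry({"post_id": 1}, always_down, failures)
print(list(failures))
```

Documents left in the queue can be replayed once the cluster recovers, which is what keeps the index eventually consistent without a full bulk reindex.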

Cluster Restarts
Indexing across replicas is non-deterministic
Segments diverge
Slows Restart Time

Simplistic Example
[Diagram: docs indexed into Shard 1, with merges on the Primary and the Replica; legend: segments w/ identical checksums]
Only first segment is identical

After Bulk Index
Every segment is out of sync!

Our Bulk Indexing Procedure
1) Bulk Index All Docs
2) Optimize the index
3) Rolling Restart (sync segments)
4) Future restarts will be much faster.
- Play with recovery settings
- SSDs? => use noop Linux scheduling
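For step 1 of the procedure above, the Elasticsearch bulk API expects newline-delimited JSON: one action line followed by one source line per document. The sketch below builds such a body. The index name, document type, and the `post_id`/`blog_id` fields are hypothetical, and the `_type` and `_routing` metadata keys follow the 1.x-era API in use at the time of the talk (newer versions drop types and spell the key `routing`).

```python
import json


def bulk_body(index: str, doc_type: str, docs) -> str:
    """Build a newline-delimited bulk request body, routing each post by its blog."""
    lines = []
    for doc in docs:
        action = {
            "index": {
                "_index": index,
                "_type": doc_type,
                "_id": doc["post_id"],
                # Same routing value for every post of a blog keeps them on one shard.
                "_routing": str(doc["blog_id"]),
            }
        }
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The bulk API requires every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)


posts = [
    {"post_id": 10, "blog_id": 7, "title": "Hello"},
    {"post_id": 11, "blog_id": 7, "title": "World"},
]
print(bulk_body("global-0", "post", posts))
```

The resulting string would be sent as the body of a POST to the `_bulk` endpoint.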

Indexing
It’s all about handling Failures

Querying
Test and Iterate

Related Posts Query
Started with MoreLikeThis API. Did not scale well enough.

MLT API
1) Get Document
2) Analyze Document
3) Search for Similar Docs

MLT API vs MLT Query

                  MLT API                 MLT Query
Throughput        147 req/sec             1062 req/sec
CPU               40%                     30%
Median latency    306 ms                  49.5 ms
Notes             All processing by ES    Build query in PHP

Related Posts Relevancy
Great With Long Content

{
  "more_like_this": {
    "fields": ["mlt_content"],
    "like_text": "Scaling Elasticsearch Part 1: Overview ElasticSearch scaling Search We recently launched Related Posts across WordPress.com, so its time to pop the hood and take a look at what ended up in our engine... ",
    "percent_terms_to_match": 0.08,
    "boost_terms": 5,
    "analyzer": "en_analyzer"
  }
}
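Building the query on the client, as the MLT Query approach does, amounts to filling in the structure above with the post's own text. The talk does this in PHP; the sketch below shows the same idea in Python. The parameter values are the ones from the slide, and the function name is hypothetical. Note that `like_text` and `percent_terms_to_match` are the parameter names of the Elasticsearch version used at the time; later releases renamed them.

```python
import json


def related_posts_query(like_text: str, analyzer: str = "en_analyzer") -> dict:
    """Build a more_like_this query from the text of the post being viewed."""
    return {
        "more_like_this": {
            "fields": ["mlt_content"],
            "like_text": like_text,
            "percent_terms_to_match": 0.08,
            "boost_terms": 5,
            "analyzer": analyzer,
        }
    }


query = related_posts_query("Scaling Elasticsearch Part 1: Overview")
print(json.dumps({"query": query}, indent=2))
```

Because the client already has the post text, Elasticsearch no longer has to fetch and analyze the source document before searching, which is the work the MLT API did on every request.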

MLT Query Relevancy
Use match or multi_match for short content.
[Chart: Average Related Posts CTR]
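One way to act on this advice is to pick the query type from the length of the content. The sketch below is illustrative only: the 50-word cutoff and the `title`/`content` field names are assumptions, since the slides give no threshold.

```python
def query_for_content(text: str, min_mlt_words: int = 50) -> dict:
    """Use more_like_this for long content and multi_match for short content."""
    if len(text.split()) >= min_mlt_words:
        return {
            "more_like_this": {
                "fields": ["mlt_content"],
                "like_text": text,
            }
        }
    # Short content has too few terms for more_like_this to pick from.
    return {
        "multi_match": {
            "query": text,
            "fields": ["title", "content"],
        }
    }


print(query_for_content("Scaling Elasticsearch"))
```

The cutoff would need to be tuned against click-through data rather than chosen up front.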

Language Analyzers
arabic, armenian, basque, brazilian, bulgarian, catalan, chinese, czech, danish, dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, italian, japanese, korean, norwegian, persian, portuguese, romanian, russian, spanish, swedish, turkish, thai

Related Posts Relevancy
How Important is using the correct Language Analyzer?
Doubled Click Through Rate
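Choosing the analyzer per post could be as simple as a lookup from the blog's language code. This is a sketch under assumptions: the mapping covers only a handful of the languages listed, the language codes are assumed to be ISO-style codes such as `en` or `pt-BR`, and falling back to the `standard` analyzer is an illustrative choice, not something stated in the talk.

```python
# Partial, illustrative mapping from language code to a built-in analyzer name.
LANGUAGE_ANALYZERS = {
    "ar": "arabic",
    "de": "german",
    "en": "english",
    "es": "spanish",
    "fr": "french",
    "pt": "portuguese",
    "ru": "russian",
}


def analyzer_for(lang_code: str, fallback: str = "standard") -> str:
    """Pick an analyzer for a language code, ignoring any region suffix."""
    code = lang_code.lower().split("-")[0].split("_")[0]
    return LANGUAGE_ANALYZERS.get(code, fallback)


print(analyzer_for("en-US"), analyzer_for("pt_BR"), analyzer_for("xx"))
```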

Unfortunately
Increased Slow Queries (>1 second) by 10x
Still worth it.

Global Query Performance
search.wordpress.com

Parent-Child Filtering
Blog Doc
  public: true|false
Post Doc
  title: “...”
  content: “...”

has_parent Filter
Querying Across All Shards

                  With has_parent    Without has_parent
Throughput        7.6 req/sec        17.5 req/sec
CPU               75%                50%
Median latency    503 ms             207 ms

Requires more Indexing
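To make the comparison concrete, here are two query bodies of the kind being contrasted. Both are sketches: they use the 1.x-era `filtered` query syntax, the `blog` parent type and `public` field follow the Blog Doc / Post Doc layout in the slides, and the second variant assumes a `blog_public` flag copied onto every post, which is one possible alternative that the slides do not spell out.

```python
def global_query_with_has_parent(text: str) -> dict:
    """Search posts, keeping only those whose parent blog doc is public."""
    return {
        "query": {
            "filtered": {
                "query": {"match": {"content": text}},
                "filter": {
                    "has_parent": {
                        "parent_type": "blog",
                        "query": {"term": {"public": True}},
                    }
                },
            }
        }
    }


def global_query_denormalized(text: str) -> dict:
    """Search posts using a (hypothetical) public flag stored on each post."""
    return {
        "query": {
            "filtered": {
                "query": {"match": {"content": text}},
                "filter": {"term": {"blog_public": True}},
            }
        }
    }


print(global_query_with_has_parent("elasticsearch"))
print(global_query_denormalized("elasticsearch"))
```

The second form avoids the parent lookup at query time, at the cost of touching every post of a blog whenever its visibility changes.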

Indexing: Optimize to Handle Failures
Querying: Test and Iterate

Open Issues
Slow Queries (> 1 second)
Getting Better. Shards are too big.

Open Issues
What does it take to scale?
3x Data
5x Queries

Open Issues
Elasticsearch for Natural Language Processing?
At Scale. On Live Data.

http://gibrown.wordpress.com
@gregibrown
Feeling Inspired? http://automattic.com/work-with-us/data-wrangler/
