How Elasticsearch powers the Guardian's newsroom

Technology

Published on March 12, 2014

Author: tackers

Source: slideshare.net

Description

http://qconlondon.com/london-2014/presentation/How%20Elasticsearch%20Powers%20the%20Guardian's%20Newsroom:

theguardian.com is one of the world's most popular news websites, visited by over 80 million unique browsers every month. Yet in the past, their journalists and editors found it difficult to get meaningful, timely data on what people were reading.

In response to these issues, Graham and colleagues at the Guardian built "ophan", an in-house real-time analytics system based on Elasticsearch. By working closely with journalists and editors, they've focused on what they can action to provide a better experience for the Guardian's existing readers and enable more people to discover their unique content.

In this talk, Graham will dive into the details of ophan: the obstacles faced by the newsroom that prompted them to build the system, how it works for alerting, and how the tool has made the lives of the Guardian's readers and staffers better. While Graham explores this real-world use case, Shay will cover the technical underpinnings of ophan with a deep dive into the Elasticsearch features and functionality that power the ophan system.

Attendees will leave with a solid understanding of Elasticsearch's features and architecture, all gained through the lens of a real-world and hyperlocal use case.

How Elasticsearch powers the Guardian’s newsroom
graham tackley ■ @tackers
director of architecture, guardian news and media
shay banon ■ @kimchy
creator, co-founder and cto, elasticsearch

“created in 1936 ... to secure the financial and editorial independence of the Guardian in perpetuity”

our in-house real-time traffic tool

my desktop workstation production apaches something htmly ?

ssh $SERVER "nice tail -f /apache2/logs/guardian-access_log"

my desktop workstation 2 x production apaches publisher ssh “tail” zeromq x SEO dashboard

my desktop workstation x

Javascript in browser SNS SQS hidden pixel Dashboard Tracker

Javascript in browser Tracker SNS SQS hidden pixel SQS Dashboard Serf elasticsearch Dashboard

12 * m3.xlarge
in an autoscaling group (with manual scaling)
instance store (SSD)
https://github.com/guardian/status-app

{
  "dt": "2014-03-03T02:01:48.026Z",
  "url": "http://www.theguardian.com/film/2014/mar/03/oscars-2014-winners-list",
  "queryString": "",
  "host": "www.theguardian.com",
  "path": "/film/2014/mar/03/oscars-2014-winners-list",
  "section": "film",
  "platform": "r2",
  "userAgent": {
    "type": "Browser",
    "family": "Safari 5.1.9",
    "os": "OS X 10.6.8",
    "device": "Personal computer"
  },
  "documentReferrer": "http://www.theguardian.com/world",
  "browser": {
    "id": "gA6RUFLhWNQvWdt0rW4r78Fg",
    "isNew": false
  },
  "referringHost": "theguardian.com",
  "referringPath": "/world",
  "isContent": true,
  "contentPublicationDate": "2014-03-03",
  "countryCode": "US",
  "countryName": "United States",
  "location": {
    "lonlat": [-73.4409, 41.2094]
  }
}
⇠filter ⇠filter ⇠count per minute
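The annotations on this slide (filter, filter, count per minute) can be illustrated with a small in-memory Python sketch. This is not how ophan computes its numbers (the slides that follow push the work into Elasticsearch); the helper name `count_per_minute` and the sample events are made up for illustration, and only the field names `dt` and `path` come from the slide.

```python
from collections import Counter
from datetime import datetime


def count_per_minute(events, path):
    """Count events for one path, bucketed by the minute of their `dt` timestamp."""
    counts = Counter()
    for event in events:
        # The "filter" step: keep only events for the requested path.
        if event["path"] != path:
            continue
        # Parse the ISO-8601 timestamp, then truncate it to the minute.
        dt = datetime.strptime(event["dt"], "%Y-%m-%dT%H:%M:%S.%fZ")
        counts[dt.strftime("%Y-%m-%dT%H:%M")] += 1
    return dict(counts)


# Hypothetical sample events; only "dt" and "path" are needed here.
OSCARS = "/film/2014/mar/03/oscars-2014-winners-list"
events = [
    {"dt": "2014-03-03T02:01:48.026Z", "path": OSCARS},
    {"dt": "2014-03-03T02:01:59.500Z", "path": OSCARS},
    {"dt": "2014-03-03T02:02:03.000Z", "path": OSCARS},
    {"dt": "2014-03-03T02:02:10.000Z", "path": "/world"},
]

print(count_per_minute(events, OSCARS))
```

Doing this in application code means pulling every event back from storage, which is presumably why the talk moves on to expressing the same filter-and-bucket operation as a query.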

{
  "query": {
    "filtered": {
      "query": { "match_all": { } },
      "filter": {
        "term": { "path": "/film/2014/mar/03/oscars-2014-winners-list" }
      }
    }
  },
  …

  …
  "facets": {
    "Reddit": {
      "date_histogram": { "field": "dt", "interval": "1m" },
      "facet_filter": {
        "term": { "referringHost": "reddit.com" }
      }
    },
    "Facebook": {
      "date_histogram": { "field": "dt", "interval": "1m" },
      "facet_filter": {
        "term": { "referringHost": "facebook.com" }
      }
    },
    "Google": {
      "date_histogram": { "field": "dt", "interval": "1m" },
      "facet_filter": {
        "or": {
          "filters": [
            { "prefix": { "referringHost": "www.google." } },
            { "prefix": { "referringHost": "news.google." } }
          ]
        }
      }
    }
  }
}
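The three facets differ only in their `facet_filter`, so the request body lends itself to being generated. The Python sketch below builds the same `facets` structure as plain dictionaries; the helper names `term_facet` and `prefix_facet` are hypothetical and are not taken from ophan, which the slides suggest builds its queries through the Elasticsearch Java API instead.

```python
import json


def per_minute_facet(facet_filter):
    """A date_histogram facet over `dt` in 1-minute buckets, restricted by a filter."""
    return {
        "date_histogram": {"field": "dt", "interval": "1m"},
        "facet_filter": facet_filter,
    }


def term_facet(host):
    """Per-minute counts for an exact referring host (hypothetical helper)."""
    return per_minute_facet({"term": {"referringHost": host}})


def prefix_facet(prefixes):
    """Per-minute counts for referring hosts matching any prefix (hypothetical helper)."""
    clauses = [{"prefix": {"referringHost": p}} for p in prefixes]
    return per_minute_facet({"or": {"filters": clauses}})


facets = {
    "Reddit": term_facet("reddit.com"),
    "Facebook": term_facet("facebook.com"),
    "Google": prefix_facet(["www.google.", "news.google."]),
}

# Print the fragment that would sit alongside "query" in the request body.
print(json.dumps({"facets": facets}, indent=2))
```

Note that the facets API shown on the slide dates from 2014-era Elasticsearch; later releases replaced facets with aggregations, so this shape applies only to versions contemporary with the talk.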

/graph/breakdown?section=commentisfree

?section=commentisfree ophan.StandardFilters ophan.StandardFiltersToElasticsearch org.elasticsearch.index.query.FilterBuilder

{
  "query": {
    "filtered": {
      "query": { "match_all": { } },
      "filter": {
        "term": { "path": "/film/2014/mar/03/oscars-2014-winners-list" }
      }
    }
  },
  …

"filter": {
  "and": {
    "filters": [
      {
        "range": {
          "dt": {
            "from": "2014-03-03T00:00:00.000Z",
            "to": "2014-03-03T22:30:59.999Z",
            "include_lower": true,
            "include_upper": false
          }
        }
      },
      { "not": { "filter": { "term": { "countryCode": "GNM" } } } },
      { "not": { "filter": { "term": { "userAgent.type": "Robot" } } } },
      { "filter": { "terms": { "section": [ "commentisfree" ] } } }
    ]
  }
}
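The slides show a URL parameter such as `?section=commentisfree` being turned into an Elasticsearch filter via `ophan.StandardFilters`. The internals of those classes are not shown, so the Python sketch below is only a guess at the general shape of such a translation: the function name `filters_from_query_string` is invented, the time range is passed in explicitly, and the two `not` clauses are assumed to be applied to every request because they appear on the slide without a matching URL parameter.

```python
from urllib.parse import parse_qs


def filters_from_query_string(query_string, start, end):
    """Translate dashboard URL parameters into an `and` filter (hypothetical sketch)."""
    params = parse_qs(query_string)

    clauses = [
        # Restrict to the requested time window.
        {"range": {"dt": {"from": start, "to": end,
                          "include_lower": True, "include_upper": False}}},
        # Assumed defaults: exclude the "GNM" country code and robot user agents.
        {"not": {"filter": {"term": {"countryCode": "GNM"}}}},
        {"not": {"filter": {"term": {"userAgent.type": "Robot"}}}},
    ]

    # Optional section filter; the wrapping mirrors the structure on the slide.
    sections = params.get("section")
    if sections:
        clauses.append({"filter": {"terms": {"section": sections}}})

    return {"and": {"filters": clauses}}


es_filter = filters_from_query_string(
    "section=commentisfree",
    start="2014-03-03T00:00:00.000Z",
    end="2014-03-03T22:30:59.999Z",
)
print(es_filter["and"]["filters"][-1])
```

Keeping the URL-level filter description separate from its Elasticsearch rendering, as the class names on the slide suggest, would let every dashboard endpoint share the same default exclusions.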

thank you
graham tackley ■ @tackers
director of architecture, guardian news and media
shay banon ■ @kimchy
creator, co-founder and cto, elasticsearch
