SoundCloud - User & Partner Conference - AT Internet

Published on December 9, 2013

Author: AT-Internet

Source: slideshare.net

Description

Big Data with Amazon Redshift and ATI - AT Internet

Big Data with Amazon Redshift and ATI November, 27th 2013

HI, I’M OLE

SOUNDCLOUD IS THE WORLD’S LEADING AUDIO PLATFORM

Every minute, creators upload 12 hours of audio

reaching over 250m people every month

8% of the internet

PRESIDENT OBAMA, FOO FIGHTERS, SNOOP LION, MADONNA, SKRILLEX, MACKLEMORE, JOHN OLIVER (DAILY SHOW/BUGLE)

How's the sales funnel performing in Brazil and what's the split between products?

DATA DEMOCRATIZATION
• Avoid silos
• Remove unnecessary restrictions
• Provide simple tools
• Teach people how to use data

DATA DEMOCRATIZATION
In one sentence: Deliver the right information to the right person at the right time.

DATA ANALYSIS AND REPORTING 2010-2012
Components: PRODUCTION DB, ANALYTICS DB, AT Internet

DATA ANALYSIS AND REPORTING
Listens, Sounds, Users, Comments, Favorites, Shares, Reposts, Impressions, Clicks, Conversions, Suggestions, Downloads, Taggings, Uploads

DATA ANALYSIS AND REPORTING
Listens:
• timestamp
• duration
• sound
• owner
• listener
• API-key
• (location) country
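
The fields listed for a listen can be pictured as a single event record. The following is a minimal sketch only: the field names follow the slide, but every type, the millisecond unit, and the handling of anonymous listeners are assumptions, not SoundCloud's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ListenEvent:
    """One listen, using the fields named on the slide (types are assumed)."""
    timestamp: datetime          # when the listen happened
    duration_ms: int             # how long the sound was played (unit assumed)
    sound_id: int                # the sound that was played
    owner_id: int                # the user who owns the sound
    listener_id: Optional[int]   # None for anonymous listeners (assumption)
    api_key: str                 # identifies the client application
    country: Optional[str]       # derived from the location, e.g. "BR"


# Made-up sample values, for illustration only.
event = ListenEvent(
    timestamp=datetime(2013, 11, 27, 12, 0, tzinfo=timezone.utc),
    duration_ms=183_000,
    sound_id=1,
    owner_id=10,
    listener_id=None,
    api_key="demo-key",
    country="BR",
)
```

Freezing the dataclass keeps events immutable once recorded, which suits append-only event data.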

DATA ANALYSIS AND REPORTING
Additional metadata:
• location within sound
• context (location on site)
• segmentation
Listening creates >6000 events/s: BIG DATA
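
To put the quoted event rate in perspective, the short calculation below scales it to a day and a year. It assumes the rate is sustained around the clock, which the slide does not state, so the results are rough lower-bound estimates rather than reported figures.

```python
EVENTS_PER_SECOND = 6000          # lower bound quoted on the slide
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400

# Assumes a constant rate; real traffic will vary over the day.
events_per_day = EVENTS_PER_SECOND * SECONDS_PER_DAY
events_per_year = events_per_day * 365

print(f"{events_per_day:,} events/day")    # 518,400,000 events/day
print(f"{events_per_year:,} events/year")  # 189,216,000,000 events/year
```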

HADOOP TO THE RESCUE
• 2 data centers in AMS
• 200+ nodes

HADOOP TO THE RESCUE
• listen data
• listen metadata
• search data
• recommender data
• product testing data
• backend production data
• backend logs

HADOOP AND DATA DEMOCRATIZATION
• Data is siloed on Hadoop
• No data governance in place
• Technical hurdles for access
• Not real-time
• Slow access

AMAZON REDSHIFT
• Fast, fully managed data warehouse service
• Optimized for datasets of a petabyte or more
• Fast query and I/O performance
• Columnar storage technology

BI INFRASTRUCTURE 2013
Stages: Source Systems, Staging Area, Data Warehouse, Data Exploration
Source systems and load paths, as labeled on the diagram:
• Hadoop: Amazon EMR, Pig/Ruby scripts, COPY
• MySQL (production db): Pig/Ruby scripts
• AT Internet: ETL scripts
• External Systems: ETL scripts
Job execution powered by:

How's the sales funnel performing in Brazil and what's the split between products?

ATI Data Query
Create query:
1. Filter on funnel pages
2. Select metrics and dimensions
3. Add REST URL to ETL pipeline
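
The last step, registering a query's REST URL with the ETL pipeline, might look roughly like the sketch below. Everything specific here is hypothetical: the URL is a placeholder, the response is assumed to be a JSON list of rows, and `extract_queries` is an illustrative helper, not part of any AT Internet API.

```python
import json
from typing import Callable, Dict, List
from urllib.request import urlopen

# Hypothetical registry of saved queries; real URLs come from the query tool.
QUERY_URLS: Dict[str, str] = {
    "sales_funnel": "https://example.invalid/saved-query/sales-funnel",
}


def http_get(url: str) -> str:
    """Default fetcher: plain HTTP GET returning the response body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def extract_queries(
    urls: Dict[str, str],
    fetch: Callable[[str], str] = http_get,
) -> Dict[str, List[dict]]:
    """Fetch every registered query and parse each body as a JSON list of rows.

    The JSON-list response shape is an assumption made for this sketch.
    """
    return {name: json.loads(fetch(url)) for name, url in urls.items()}
```

Passing the fetcher as a parameter keeps the extraction step testable without network access: a test can hand in a function that returns canned JSON.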

DATA EXPLORATION
• Simple and fast access to data
• More time for “deep dives” into data
• Individualized reporting
• Allows interactivity between users
• Integrated with Redshift

DATA DEMOCRATIZATION
• Reports designed by end users
• Central repository for data analysis
• User interaction
• Data from one source only
• Scalable solution
• Data to the people!

QUESTIONS?

THANK YOU! P.S. WE’RE HIRING. SOUNDCLOUD.COM/JOBS

APPENDIX

IMPORT DATA FROM SOURCE SYSTEMS
First: Gather data from the source systems into S3 (full/daily imports)
Source systems: Hadoop, MySQL (production db), External Systems
MapReduce for:
- Listens
- Plays
- Impressions
- Affiliations
- ...

IMPORT DATA FROM SOURCE SYSTEMS
Second: Rebuild staging area tables for full imports
• Based on YAML configuration files for the staging area tables (tracks, users, client applications, ...)
• CREATE statements are generated
• DISTKEYs and SORTKEYs are re-created
• Full control over changes to the data model
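
Generating the CREATE statements from configuration could work along the lines of the sketch below. The config layout, table name, and column names are all made up for illustration, and a plain dict stands in for a parsed YAML file; only the `DISTKEY` and `SORTKEY` table attributes are standard Redshift syntax.

```python
from typing import List

# Stand-in for one parsed YAML config file (all names are made up).
TRACKS_CONFIG = {
    "table": "staging.tracks",
    "columns": [
        {"name": "id", "type": "BIGINT"},
        {"name": "user_id", "type": "BIGINT"},
        {"name": "created_at", "type": "TIMESTAMP"},
    ],
    "distkey": "user_id",
    "sortkeys": ["created_at"],
}


def create_statements(config: dict) -> List[str]:
    """Return DROP and CREATE statements that rebuild one staging table."""
    columns = ",\n".join(
        f"    {col['name']} {col['type']}" for col in config["columns"]
    )
    sortkeys = ", ".join(config["sortkeys"])
    create = (
        f"CREATE TABLE {config['table']} (\n{columns}\n)\n"
        f"DISTKEY ({config['distkey']})\n"
        f"SORTKEY ({sortkeys});"
    )
    return [f"DROP TABLE IF EXISTS {config['table']};", create]


for statement in create_statements(TRACKS_CONFIG):
    print(statement)
```

Keeping the keys in the config means a change to the data model is a config edit followed by a rebuild, rather than hand-written DDL.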

IMPORT DATA FROM SOURCE SYSTEMS
Third: Import the data from S3 to Redshift
• Full import: TRUNCATE & COPY
• Daily import: COPY
Staging Area tables: tracks, users, client applications, ...
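
The difference between the two import modes can be sketched as a small statement generator. The table names, S3 paths, and role ARN used with it are placeholders; authorizing with `IAM_ROLE` and loading gzipped, tab-delimited files are assumptions, since the slides do not say how the files were formatted or how access was granted.

```python
from typing import List


def import_statements(table: str, s3_path: str, iam_role: str,
                      full_import: bool) -> List[str]:
    """SQL for loading one staging table from S3.

    Full imports empty the table first; daily imports only append.
    """
    copy = (
        f"COPY {table} FROM '{s3_path}' "
        f"IAM_ROLE '{iam_role}' "
        "DELIMITER '\\t' GZIP;"  # file format is an assumption
    )
    return ([f"TRUNCATE {table};"] if full_import else []) + [copy]
```

For example, `import_statements("staging.tracks", "s3://example-bucket/tracks/", "arn:aws:iam::123456789012:role/example", full_import=True)` returns a TRUNCATE followed by a COPY, while `full_import=False` returns the COPY alone.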

ETL AND DW DATAMODEL
ETL scripts divided into layers:
- Layer 1: Staging -> DW (dimensions)
- Layer 2: Staging -> DW (fact tables - raw data)
- Layer 3: DW -> DW (aggregated fact tables)
- Layer 4: DW -> Reporting Data Cubes (reporting data)
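
As an illustration of what a Layer 3 step (DW -> DW aggregation) might do, the sketch below rolls a raw fact table up into a daily aggregate. SQLite stands in for Redshift so the example runs anywhere, and the table names, columns, and rows are all made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a local stand-in for Redshift
conn.executescript("""
    -- Raw fact table, as Layer 2 might leave it (made-up sample rows).
    CREATE TABLE fact_listens (listen_date TEXT, country TEXT, duration_s INTEGER);
    INSERT INTO fact_listens VALUES
        ('2013-11-26', 'BR', 120),
        ('2013-11-26', 'BR', 30),
        ('2013-11-26', 'DE', 200),
        ('2013-11-27', 'BR', 45);

    -- Layer 3: DW -> DW, one row per day and country.
    CREATE TABLE agg_listens_daily AS
    SELECT listen_date,
           country,
           COUNT(*)        AS listens,
           SUM(duration_s) AS total_duration_s
    FROM fact_listens
    GROUP BY listen_date, country;
""")

rows = conn.execute(
    "SELECT * FROM agg_listens_daily ORDER BY listen_date, country"
).fetchall()
for row in rows:
    print(row)
```

Reports that only need daily totals can then read the small aggregate table instead of scanning the raw facts.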

ETL AND DW DATAMODEL
Flow: Staging Area -> DataWarehouse -> Data Exploration
- ETL Layer 1 & 2: Data Cleaning, Data Transformation (Ruby/SQL scripts)
- ETL Layer 3: Data Aggregation (Ruby/SQL scripts)
- ETL Layer 4: Data Presentation (SQL)

JOB SCHEDULE AND EXECUTION
• Job-scheduling tool developed internally
• Set dependencies between jobs
• Execution on multiple machines
• Supports all the ETL layers
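
The internal scheduling tool is not described further, but dependency-aware execution is commonly built on a topological sort. The sketch below uses Python's standard `graphlib` module to group jobs into batches that could run in parallel on different machines; the job names and the dependency graph are made up, and this is not a description of SoundCloud's tool.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Dict, List, Set

# job -> jobs that must finish first (names are made up, one per ETL layer)
DEPENDENCIES: Dict[str, Set[str]] = {
    "dim_users": {"staging_users"},
    "fact_listens": {"staging_listens", "dim_users"},
    "agg_listens_daily": {"fact_listens"},
    "cube_listens": {"agg_listens_daily"},
}


def execution_batches(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group jobs into batches; jobs within one batch can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        batches.append(ready)
        sorter.done(*ready)
    return batches


for number, batch in enumerate(execution_batches(DEPENDENCIES), start=1):
    print(number, batch)
```

With this graph the two staging jobs form the first batch, and each later job waits for the layer it depends on.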

TIMELINE (Week 2 to Week 16)
Analysis Stage
• Gap Analysis
• Business Exploration (requirements interviews)
Milestone: Requirement Analysis
• Information Mapping Design
• Solution Design (Draft)
Milestone: End of Analysis Stage
Design & Build
• Define Infrastructure
• Design Data Model
Milestone: Infrastructure Ready!
• Build ETL
• Build Data Cubes
• Design Reports/Dashboards (Presentation Layer)
Milestone: BI 1.0 is built!
Test & Deploy
• System/Integration Tests
• User Acceptance
Milestone: BI 1.0 is tested!
• User Workshops
• BI 1.0 Evaluation
Milestone: BI 1.0 is ready to use!
