Finding and Using Big Data in your business


Published on February 26, 2014

Author: Simon Elliston Ball



Technology challenges and how to introduce big data tools to your organisation, a real use case based on Red Gate's Feature analytics, and some cultural tools

Finding (and using) Big Data in your Business • Simon Elliston Ball, Head of Big Data • @sireb • #findBigData

Now THAT's Big Data • A modern Ford kicks out 25GB per car, in a day. • Ad networks: over a billion event logs per day. • PayPal: 3 billion transactions a year • Climate Corporation: soil type record for every square meter in the USA • Facebook: 10PB a day

So you're probably not Facebook • Big Data takes many forms • Velocity • Variety • Volume • Value • Veracity

Feature usage at Red Gate • We are obsessed with UX • Knowing what our users do helps us make their lives better • Error reporting • Feature usage reporting • Conversations, surveys, sales: everything goes into making products better.

The default: SQL Server

The problem: SQL Server "I used to use FUR all the time! I can't use it anymore, it's too slow." - Michelle, Product Manager "I'm running a query right now... It started yesterday :(" - Ben, Product Manager "Hey, this database is taking up a few TBs, can we just delete it?" - Simon, DBA

DELETE IT!?!?!? • Thinning out old data • Archiving to cheaper storage (even tape) • Turning down collection

Big Data to the rescue • Cheap storage in Hadoop • Scale out, not scale up • Distributed computing required for speed • Occasional bursty workloads • Semi-structured

Hadoop • Created by Doug Cutting in 2005 as a backend for a search engine and crawler (Nutch) • Developed further at Yahoo • Based on Google's papers on the Google File System and MapReduce • Since grown into an ecosystem of tools • Now version 2.0


All grown up

Really complex • Lots of moving parts • Integrating into your network can be complex • Getting all the tools to play nice • Self build • Fixing up from a good starting point • Use a distro

Sandboxes • Quick Start • Great to learn

The menagerie

What we did • Test cloud • Virtualization is not Hadoop's friend. • Performance is not good • "Can we have 2TB on the SAN for /tmp?" Er. No. • "Borrowed" some old hardware, and got a small cluster running.

Putting data in • Sqoop • Cleaning • ORC

How to not kill SQL Server • To a DBA, Sqoop is a DDoS attack • Limit the number of mappers Sqoop uses • Import from a replica, or a backup
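A minimal sketch of the "limit the number of mappers" advice, assuming the import is launched from a Python wrapper. It only builds the command line and does not run it. The host name, database, table, and target directory are hypothetical placeholders, not Red Gate's actual setup; `--connect`, `--table`, `--target-dir`, `--num-mappers`, `--username` and `-P` are standard Sqoop import options.

```python
import shlex


def build_sqoop_import(jdbc_url, table, target_dir, num_mappers=2, username=None):
    """Return the argv list for a throttled `sqoop import`.

    Each mapper opens its own connection to the source database, so a low
    num_mappers keeps the number of parallel queries against SQL Server small.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),
    ]
    if username:
        # -P makes Sqoop prompt for the password instead of taking it on the command line.
        args += ["--username", username, "-P"]
    return args


# Hypothetical example: point the import at a read-only replica, not the primary.
cmd = build_sqoop_import(
    "jdbc:sqlserver://replica.example.internal:1433;databaseName=FeatureUsage",
    table="UsageEvents",
    target_dir="/data/raw/usage_events",
    num_mappers=2,
)
print(shlex.join(cmd))
```

Starting with a low mapper count and raising it only while the DBA is happy with the load is one way to keep the import from looking like an attack.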

Immediate value • The data was a lot smaller • Cheaper to store • Column formats • Compression: use LZO; bzip2 costs too much CPU, and gzip is bad for Hadoop because gzip files cannot be split across mappers.
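A rough sketch of why column layout and codec choice affect size and cost, using only the Python standard library. The records are synthetic and the feature names are made up. This is not ORC: it just writes the same values row by row and column by column, then compresses each. LZO is not in the standard library, so `zlib` (the DEFLATE algorithm that gzip uses) and `bz2` stand in. Timings will vary by machine.

```python
import bz2
import random
import time
import zlib

random.seed(42)

# Synthetic feature-usage events: (timestamp, user id, feature name). All values are invented.
FEATURES = ["compare", "deploy", "backup", "restore", "search"]
rows = [
    (1_393_000_000 + i, random.randint(1, 500), random.choice(FEATURES))
    for i in range(50_000)
]

# Row-oriented: one event per line.
row_bytes = "\n".join(f"{ts},{user},{feat}" for ts, user, feat in rows).encode()

# Column-oriented: all timestamps on one line, then all user ids, then all features.
col_bytes = "\n".join(
    ",".join(str(value) for value in column) for column in zip(*rows)
).encode()


def measure(name, compress, payload):
    """Compress payload once and return (name, compressed size in bytes, seconds taken)."""
    start = time.perf_counter()
    size = len(compress(payload))
    elapsed = time.perf_counter() - start
    return name, size, elapsed


results = [
    measure("row + zlib", zlib.compress, row_bytes),
    measure("column + zlib", zlib.compress, col_bytes),
    measure("column + bz2", bz2.compress, col_bytes),
]

print(f"raw row-oriented bytes: {len(row_bytes)}")
for name, size, elapsed in results:
    print(f"{name:14} {size:>9} bytes  {elapsed * 1000:.1f} ms")
```

Comparing the size and time columns on your own data is a cheap experiment to run before committing to a storage format.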

Give it back! Queries and ETL • Hive. Reuse your SQL • Pig. New, but worth learning • MapReduce? (Optional. Warning: may contain Java. Or snakes)

Give it back to the business • Summary report in Excel • Batch jobs • Pump back into SQL for slicing and dicing • Give us MORE!

Give it back! The platform • To the cloud! • Reuse all our existing queries and workflow • On demand compute • Takes time to lift the initial data set into cloud storage, but incremental updates are fast

Demo HDInsight

Thinking like a data scientist • Plan your experiments • Precision is subjective. • Show the error bars • Use whatever tool works • Embrace uncertainty
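One way to "show the error bars", sketched with the standard library. The daily counts are hypothetical, and the interval uses a normal approximation (z = 1.96 for roughly 95%), which is an assumption; with a sample this small a t-distribution would give a somewhat wider interval.

```python
import math
import statistics


def mean_with_error_bars(samples, z=1.96):
    """Return (mean, low, high), where low and high are mean -/+ z standard errors."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate spread")
    mean = statistics.mean(samples)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, mean - z * sem, mean + z * sem


# Hypothetical data: how many times a feature was used on each of eight days.
daily_uses = [12, 15, 9, 14, 11, 13, 10, 16]
mean, low, high = mean_with_error_bars(daily_uses)
print(f"mean daily uses: {mean:.1f} (approx. 95% interval {low:.1f} to {high:.1f})")
```

Reporting the interval alongside the mean makes it clear how much of a difference between two features could just be noise.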

Know your business

Think strategically • Business buy-in • Show quick wins • What is your analysis for? • What will it deliver to the business?

Break down the requirements • Prioritize • Go for the top value pieces • Perfect fit for Agile methodologies

Communication • Talk to everyone you can • Before • After • During • Organizational knowledge • Keep a log

Communication • Conversations • Coffee machine • Formal talks

So what's next? • Denormalize • Democratize • Machine learning for alerts • Marketing • Sales

And of course new tools • We want to talk to you...

Questions • Simon Elliston Ball • @sireb • #findBigData
