Advanced Analytics with Cloudera's Enterprise Data Hub


Published on February 20, 2014

Author: cloudera

Source: slideshare.net

Description

Have you run into one or more of the following barriers or limitations with your existing data warehousing architecture:

> Increasingly high data storage and/or processing costs?
> Silos of data sources?
> Complexity of management and security?
> Lack of analytics agility?

Rethink Analytics: EDH for Advanced Analytics
Josh Wills, Director of Data Science
Sandy Lii, Senior Manager, Solutions Marketing

Agenda
• Market Background
• Challenges and Limitations
• EDH for Advanced Analytics
• Case Studies
• How to Get Started

Market Background

From BI to Advanced Analytics
[Chart: axes labeled Time and Data Size; a spectrum running from Facts to Interpretations. Questions plotted on it: "What happened? When? And where?", "How and why did it happen?", "What will happen?", "How can we do better?"]

Advanced Analytics that Saves Us Money
• Customer churn analysis model
• Integrated customer support and services
• Fraud detection

Advanced Analytics that Makes Us Money
• Product recommendation engines
• Location-based real-time offers
• Target-based pricing strategy

Traditional Advanced Analytics Process
• Project Definition: Problem ID
• Data Preparation: Data Access Request & Discovery, Data Transformation, Data Sampling
• Model Development: Model Creation, Model Evaluation
• Model Deployment: Deploy Model
[Diagram label: Time-to-Insight]

Challenges and Requirements

Accessing the Right Data is Difficult
[Diagram labels: Multi-structured or External Data; Structured Internal Data; Data Warehouse]

“Are we there yet?”
1. Find the data
2. Get access to data
3. Learn about the data
4. Move data to ADW and process data
5. Data Modeling
6. Model Deployment
[Diagram label: Data Discovery]

Silo’d Platforms Challenge Collaboration & Mgmt
[Diagram labels: Data Sources; Enterprise Apps; Departmental Warehouse (two instances); Reporting; Silo’d Analytics (three instances); Non-Agile Models]
Opaque schemas accumulate over time

Impact of Status Quo
• Executives: “We don’t have the information we need to answer key business questions.”
• Data Scientists: “I’m sick of waiting for my data, I’m going to make my own copy.”
• DBA/DW Admins: “I need to make sure the DW is secure & compliant for the mission critical reports.”

Cloudera’s Enterprise Data Hub

Use All Your Data
• Use more data, and more types of data, with existing tools
• Reduce the need to limit or move large datasets
• Centralize information security, metadata, management, and governance

Shorten Analytics Lifecycle
• Facilitate data discovery
• Track data life-cycle in place
• Define, test, deploy, and update models all within a single platform

Do More with Data
• Deliver multi-genre analytics in a single platform
• Apply diverse concurrent analytics to full datasets in place
• Protect existing technology and skillset investments
[Diagram labels: EDH; Search; Machine Learning; BI; SQL Query; In-memory analytics]

Cloudera EDH for Analytics
[Architecture diagram labels: ANALYTIC SQL; SEARCH ENGINE; MACHINE LEARNING; STREAM PROCESSING; BATCH PROCESSING; 3RD PARTY APPS; WORKLOAD MANAGEMENT; STORAGE FOR ANY TYPE OF DATA (Filesystem, Online NoSQL); DATA MANAGEMENT; SYSTEM MANAGEMENT; UNIFIED, ELASTIC, RESILIENT, SECURE]

Cloudera EDH for Analytics: Use all data with centralized mgmt & security
[Architecture diagram labels: ANALYTIC SQL; SEARCH ENGINE; MACHINE LEARNING; STREAM PROCESSING; BATCH PROCESSING; 3RD PARTY APPS; WORKLOAD MANAGEMENT; STORAGE FOR ANY TYPE OF DATA (Filesystem, Online NoSQL); DATA MANAGEMENT; SYSTEM MANAGEMENT; UNIFIED, ELASTIC, RESILIENT, SECURE. Added on this slide: HADOOP; MAPREDUCE; CLOUDERA MANAGER]

Cloudera EDH for Analytics: Faster data discovery
[Architecture diagram labels: ANALYTIC SQL; SEARCH ENGINE; MACHINE LEARNING; STREAM PROCESSING; BATCH PROCESSING; 3RD PARTY APPS; WORKLOAD MANAGEMENT; STORAGE FOR ANY TYPE OF DATA (Filesystem, Online NoSQL); DATA MANAGEMENT; SYSTEM MANAGEMENT; UNIFIED, ELASTIC, RESILIENT, SECURE. Added on this slide: SEARCH; NAVIGATOR]

Cloudera EDH for Analytics: Multiple tools on one platform
[Architecture diagram labels: ANALYTIC SQL; SEARCH ENGINE; MACHINE LEARNING; STREAM PROCESSING; BATCH PROCESSING; 3RD PARTY APPS; WORKLOAD MANAGEMENT; STORAGE FOR ANY TYPE OF DATA (Filesystem, Online NoSQL); DATA MANAGEMENT; SYSTEM MANAGEMENT; UNIFIED, ELASTIC, RESILIENT, SECURE. Added on this slide: IMPALA; SPARK / ORYX / MAHOUT]

Cloudera EDH for Analytics: Operationalize Models
[Architecture diagram labels: ANALYTIC SQL; SEARCH ENGINE; MACHINE LEARNING; STREAM PROCESSING; BATCH PROCESSING; 3RD PARTY APPS; WORKLOAD MANAGEMENT; STORAGE FOR ANY TYPE OF DATA (Filesystem, Online NoSQL); DATA MANAGEMENT; SYSTEM MANAGEMENT; UNIFIED, ELASTIC, RESILIENT, SECURE. Added on this slide: SPARK STREAMING / FLUME]

Cloudera Enterprise
[Architecture diagram labels: CLOUDERA ENTERPRISE; ANALYTIC SQL; SEARCH ENGINE; MACHINE LEARNING; STREAM PROCESSING; BATCH PROCESSING; 3RD PARTY APPS; WORKLOAD MANAGEMENT; STORAGE FOR ANY TYPE OF DATA (Filesystem, Online NoSQL); DATA MANAGEMENT; SYSTEM MANAGEMENT; UNIFIED, ELASTIC, RESILIENT, SECURE]

Capabilities of Cloudera Enterprise
[Diagram label: APACHE HADOOP™]

Analytics Process with EDH
• Project Definition: Problem ID
• Data Preparation: Data Access Request & Discovery, Data Transformation, Data Sampling
• Model Development: Model Creation, Model Evaluation
• Model Deployment: Deploy Model
[Diagram shown across three consecutive slides; the first two carry the label Time-to-Insight, the last carries the label Deliver Insights Sooner]

Business Value Delivered
Audiences: Data Scientists, Executives, DBA/DW Admins
• Acquire data necessary for projects
• Acquire necessary information sooner to make critical business decisions
• Support both reporting and analytics needs
• Develop analysis/models with better lift faster
• Share data sets to empower others
• Save resources with shared security and management

Case Studies

Ask Bigger Questions: How can we prevent re-admittance?
Kaiser Permanente helps providers recommend at-home action based on real-time data to prevent hospital visits.

Kaiser Makes Medical Data Actionable
Kaiser Permanente helps providers recommend at-home action based on real-time data to prevent hospital visits.
The Challenge:
• Re-admittance is expensive, reflects sub-par provider-to-patient communications
• IT infrastructures can’t accommodate 24x7 data streams from devices
• Diverse medical ontologies present data challenge
The Solution:
• Cloudera EDH provides a scalable, flexible platform for collection, ingestion & dissemination of healthcare information
• Ingests real-time data streams of multistructured data

Ask Bigger Questions: How do we feed the world?
Monsanto can automate data-driven R&D decisions to reduce time to market from years to months.

Monsanto feeds our growing, global population
Monsanto can automate data-driven R&D decisions to reduce time to market to months from years.
The Challenge:
• 1,000+ research scientists developing products in silos
• Data processing bottleneck slows development
• Time to market for new product is 5-10 years
The Solution:
• Cloudera Enterprise + Search + Impala: PB-scale platform for single view of all R&D data
• Integration: Exadata, spatial awareness & visualization
• Scientists directly access CDH; Navigator offers auditing & access control

ARE YOU READY TO START?
Answer questions using ALL YOUR DATA

QUESTIONS?
Type in the “Chat” panel to ask a question
Recording will be available on-demand at cloudera.com
• Try Cloudera today: cloudera.com/downloads
• Learn more: http://tinyurl.com/membtaw
• Tweet @cloudera
• Follow Josh @josh_wills
• Follow Sandy @sandyliiwozniak
Register now for Data Analysts Training: university.cloudera.com
• Use discount code Analytics10 to save 10% on new enrollments in classes delivered by Cloudera until May 2014*
• Use discount code 15off2 to save 15% on enrollments in two or more classes delivered by Cloudera until May 2014*
* Excludes classes sold or delivered by Cloudera Partners

Thank You!
Josh Wills @josh_wills
Sandy Lii @sandyliiwozniak
