20140228 - Singapore - BDAS - Ensuring Hadoop Production Success

Technology

Published on February 28, 2014

Author: allenday

Source: slideshare.net

© 2014 MapR Technologies, confidential

TREND 1: Hadoop is Providing Value Across Organizations

ENTERPRISE DATA HUB
• Multi-structured data staging & archive
• ETL / DW optimization
• Mainframe optimization
• Data exploration

MARKETING ANALYTICS
• Recommendation engines & targeting
• Ad optimization
• Pricing analysis
• Lead scoring

RISK ANALYTICS
• Network security monitoring
• Security information & event management
• Fraudulent behavioral analysis

OPERATIONS INTELLIGENCE
• Supply chain & logistics
• System log analysis
• Manufacturing quality assurance
• Preventative maintenance
• Sensor analysis

Sellers Cloud • Advertising Automation Cloud • Buyers Cloud
90B AD AUCTIONS per day

TREND 2: Organizations Have Many Workload-specific Systems

ENTERPRISE USERS

OPERATIONAL SYSTEMS
• Mission-critical reliability
• Transaction guarantees
• Deep security
• Real-time performance
• Backup and recovery

ANALYTICAL SYSTEMS
• Interactive SQL
• Rich analytics
• Mixed workload management
• Data governance
• Security
• Backup and recovery

REALITY: Hadoop Can Relieve the Pressure from Enterprise Systems

[Diagram labels: ENTERPRISE USERS, OPERATIONAL SYSTEMS, ANALYTICAL SYSTEMS]

• Data staging
• Archive
• Data transformation
• Data exploration
• Streaming, interactions

Keys for Production Success
• Data protection and recovery
• Inter-operability
• Read-write performance
• Supports operations and analytics

Fortune 100 Financial Services Company
104M CARD MEMBERS

REALITY 2: Most Hadoop Projects are Still Science Experiments

[Chart: Number of Companies vs. Cluster Size. Labels: Development/Testing; Focus: Educ/Svc; 1st Production Use Case; 1 – 10 Nodes; Wide-scale Production; 10 – 2000 Nodes]

Largest Biometric Database in the World
1.2B PEOPLE

REALITY 3: Going Big Requires a Rock-Solid Architecture

FOUNDATION
• Enterprise-grade
• Multi-tenancy
• High Performance
• Open Standards for Interoperability
• Data Protection
• Operational & Analytical

MapR Distribution for Hadoop

APACHE HADOOP ECOSYSTEM: Hive/Stinger/Tez, Drill, Impala, Shark, Hue, Flume, Mahout, Cascading, Solr, Spark, Storm, Sentry, Zookeeper, Sqoop, Whirr, Pig, YARN, MapReduce, Oozie, HBase, ...
Management

MapR Data Platform: MAPR-FS FILES, MAPR-DB TABLES
Patent Pending

Platform attributes: Enterprise-grade, Performance, Multi-tenancy, Data Protection, Inter-operability, Operational & Analytical

• High availability
• Data protection
• Disaster recovery
• Performance 2X-5X
• Standard file access
• Standard database access
• Pluggable services
• Broad developer support
• Ability to logically divide a cluster to support different use cases, job types, user groups, and administrators
• Enterprise security authorization
• Wire-level authentication
• Data governance
• Ability to support predictive analytics, real-time database operations, and support high arrival rate data
• Unit of work framework to provide transactional integrity

Apache Hadoop NameNode High Availability (HA): HDFS-based Distributions

[Diagram: NameNode configurations labeled w/o HA, HDFS HA (Primary and Standby NameNode, NAS Appliance), and HDFS Federation, each over DataNodes holding blocks A-F]

Annotations:
• Single point of failure
• Multiple single points of failure
• Only one active NameNode
• Limited to 50-200 million files
• Needs 20 NameNodes for 1 Billion files
• Metadata must fit in memory
• Double the block reports
• Commercial NAS needed / Commercial NAS possibly needed
• Performance bottleneck
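The "Needs 20 NameNodes for 1 Billion files" figure can be checked against the "50-200 million files" limit quoted on the same slide. The following is a minimal sketch of that arithmetic only; the function name `namenodes_needed` is hypothetical, and the per-NameNode capacities are simply the slide's bounds, not measured values.

```python
def namenodes_needed(total_files: int, files_per_namenode: int) -> int:
    """Number of federated NameNodes needed to hold total_files,
    assuming each NameNode tops out at files_per_namenode (ceiling division)."""
    return -(-total_files // files_per_namenode)

ONE_BILLION = 1_000_000_000
LOW_CAPACITY = 50_000_000    # lower bound quoted on the slide
HIGH_CAPACITY = 200_000_000  # upper bound quoted on the slide

print(namenodes_needed(ONE_BILLION, LOW_CAPACITY))   # 20
print(namenodes_needed(ONE_BILLION, HIGH_CAPACITY))  # 5
```

The slide's count of 20 matches the lower bound of 50 million files per NameNode; at the upper bound the same calculation gives 5.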

No NameNode Architecture

[Diagram: DataNodes holding blocks A-F; labels: NameNode, DataNode]

• No special config to enable HA
• Up to 1T files (> 5000x advantage)
• Automatic failover & re-replication
• Metadata is persisted to disk
• Significantly less hardware & OpEx
• Higher performance
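The "> 5000x advantage" appears to follow from comparing the 1T file figure with the 50-200 million file limit cited for a single NameNode on the preceding slide. A minimal sketch of that ratio, assuming "1T" means 10^12 files (the constant and function names are hypothetical):

```python
MAPR_MAX_FILES = 10**12        # "Up to 1T files", read here as 10^12 (assumption)
HDFS_LOW = 50 * 10**6          # lower bound of the 50-200 million file limit
HDFS_HIGH = 200 * 10**6        # upper bound of the 50-200 million file limit

def scale_advantage(new_limit: int, old_limit: int) -> float:
    """How many times larger new_limit is than old_limit."""
    return new_limit / old_limit

print(scale_advantage(MAPR_MAX_FILES, HDFS_HIGH))  # 5000.0
print(scale_advantage(MAPR_MAX_FILES, HDFS_LOW))   # 20000.0
```

Under these assumptions the ratio is 5000x against the upper bound and 20000x against the lower bound, which is consistent with the slide's "> 5000x".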

Comparative Study of Hadoop Distributions: I/O Performance
Read and Write Throughput Benchmarks

[Bar charts: DFSIO Read Throughput and DFSIO Write Throughput, in MB per second, for IDH 2.4.1, HDP 1.3, CDH 4.3, and MapR M5 2.1.3. Bar values as extracted, assignment to distributions not recoverable: 262, 276, 212, 465, 475, 59, 69, 64]

Source: Flux7 Labs Study, October 2013

World Record Performance

NEW MINUTESORT WORLD RECORD: With a Fraction of the Hardware
1.65 TB IN 1 MINUTE, 298 NODES
PREVIOUS RECORD: 1.6 TB with 2200 nodes
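To make "a fraction of the hardware" concrete, the two records can be compared per node. This is a rough sketch using only the slide's figures; it assumes decimal units (1 TB = 1000 GB) and that both runs sorted their data in exactly one minute, and the helper name `gb_per_node` is hypothetical.

```python
def gb_per_node(total_tb: float, nodes: int) -> float:
    """Data sorted per node in the one-minute window, in GB (1 TB = 1000 GB)."""
    return total_tb * 1000 / nodes

new_record = gb_per_node(1.65, 298)    # MapR record from the slide
old_record = gb_per_node(1.6, 2200)    # previous record from the slide

print(f"{new_record:.2f} GB per node")               # 5.54 GB per node
print(f"{old_record:.2f} GB per node")               # 0.73 GB per node
print(f"{new_record / old_record:.1f}x per node")    # 7.6x per node
print(f"{298 / 2200:.1%} of the node count")         # 13.5% of the node count
```

On these numbers the new record sorts slightly more data with about 13.5% of the nodes, or roughly 7.6 times as much data per node.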

HBase Apps: High Performance with Consistent Low Latency

[Chart legend: M7 Read Latency; Others Read Latency]

MapR M7: The Best In-Hadoop Database

• NoSQL Columnar Store
• Apache HBase API
• In-Hadoop database

[Diagram: software stacks beneath a BigData Application. Other Distros labels: HBase Interface, HBase, JVM, HDFS Interface, HDFS, JVM, ext3/ext4, Disks. MapR M7 labels: Tables/Files, Disks]

The most scalable, enterprise-grade, NoSQL database that supports online applications and analytics

Opportunity to Revolutionize Enterprise Data Architecture
From Redundant Processing Silos and Data Science Experiments…

The Production Enterprise BigData Platform
… to Consolidated Operational and Analytical Workloads

Q&A

Engage with us!
@allenday, @mapr
linkedin.com/in/allenday
allenday@mapr.com
tsheng@mapr.com
mdarling@mapr.com
