Day 1 big data & hadoop By SoApt

Published on March 21, 2014

Author: KumarVivek10

Source: slideshare.net

Big Data & Hadoop

❑ LIVE online classes
❑ Class recordings made available for a lifetime
❑ Quizzes and assignments at the end of each chapter
❑ Technical support
❑ Project work
❑ Assessment and certification
❑ Post-training guidance and support
❑ Assistance in finding a relevant job

Course schedule (columns: Day 1, Day 2)
Week 1: Understanding Big Data; Hadoop Architecture; Hadoop Cluster; Data Loading Techniques
Week 2: Basic MapReduce; Advanced MapReduce; YARN 2.0
Week 3: Pig Latin; Hive
Week 4: NoSQL Databases, HBase and ZooKeeper; Project Work

❏ NYSE generates about one terabyte of new trade data per day, which is used to perform stock trading analytics and determine trends for optimal trades.

Big Data: Volume, Variety, Velocity

Storage - Backup / Read - Write Processing (ETL) Usage / Visualization

Diagram: OLTP, RDBMS, Social, Logs; Reports
Callouts: Expensive storage and processing. A lot of data discarded. Storage spread across; not easily accessible. Limited storage capacity.

Diagram: OLTP, RDBMS, Social, Logs; Hadoop; DW; Reports (Batch); Reports
Callout: A lot of data discarded.

Reading 1 TB of data:
1 machine, 4 I/O channels, each channel 100 MBps: 45 minutes
100 machines, 4 I/O channels each, each channel 100 MBps: 0.45 minutes
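The comparison above can be checked with a short calculation. This is a rough sketch that assumes 1 TB = 1,000,000 MB and that every channel reads in parallel at its full rate; the function name and constant are illustrative and do not come from the slides. Under these assumptions a single machine needs a little under 42 minutes, which is in line with the slide's rounded figure of 45 minutes.

```python
def read_time_minutes(data_mb, machines, channels_per_machine, mb_per_sec_per_channel):
    """Minutes needed to read data_mb when all channels on all machines read in parallel."""
    aggregate_mb_per_sec = machines * channels_per_machine * mb_per_sec_per_channel
    return data_mb / aggregate_mb_per_sec / 60


ONE_TB_IN_MB = 1_000_000  # assumption: decimal terabyte

single = read_time_minutes(ONE_TB_IN_MB, machines=1, channels_per_machine=4,
                           mb_per_sec_per_channel=100)
cluster = read_time_minutes(ONE_TB_IN_MB, machines=100, channels_per_machine=4,
                            mb_per_sec_per_channel=100)

print(f"1 machine:    {single:.1f} minutes")
print(f"100 machines: {cluster:.2f} minutes")
```

The point of the slide is the ratio: spreading the same read across 100 machines cuts the time by a factor of 100.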

Story of Hadoop

Characteristics of Hadoop: Reliable, Economical, Scalable, Fault Tolerant

Hadoop Core Components

Name Node: Keeps track of the overall file directory structure and the placement of data blocks.
Name Node (stores metadata only)
METADATA:
/user/doug/hinfo -> 1 3 5
/user/doug/pdetail -> 4 2
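The metadata example above can be pictured as two lookup tables: one from file path to an ordered list of block IDs, and one from block ID to the DataNodes holding a copy. The sketch below is a hypothetical in-memory model, not Hadoop's actual API; the DataNode names (`dn1` to `dn4`) and the `locate` helper are assumptions added for illustration, since the slide gives only the path-to-block mapping.

```python
# File path -> ordered block IDs, taken from the slide's METADATA example.
namespace = {
    "/user/doug/hinfo": [1, 3, 5],
    "/user/doug/pdetail": [4, 2],
}

# Block ID -> DataNodes storing a replica. These placements are invented
# for illustration; the slide does not list them.
block_locations = {
    1: ["dn1", "dn2"],
    2: ["dn2", "dn3"],
    3: ["dn1", "dn4"],
    4: ["dn3", "dn4"],
    5: ["dn2", "dn4"],
}


def locate(path):
    """Return (block_id, datanodes) pairs in file order for the given path."""
    return [(block_id, block_locations[block_id]) for block_id in namespace[path]]


for block_id, nodes in locate("/user/doug/hinfo"):
    print(f"block {block_id} is stored on {', '.join(nodes)}")
```

Note that the model holds no file contents at all, which matches the slide's point that the Name Node stores metadata only.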

NameNode: Edit Logs, FSImage

NameNode, Secondary NameNode, File System Metadata. "It's been an hour?"

Quiz
When the NameNode fails, the Secondary NameNode takes over instantly and prevents cluster failure:
❑ TRUE
❑ FALSE
Answer: False. The Secondary NameNode is used for creating NameNode checkpoints. The NameNode can be manually recovered using the ‘edits’ and ‘FSImage’ stored on the Secondary NameNode.
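The checkpoint idea in the quiz answer can be sketched as replaying a log of changes on top of the last saved snapshot. This is a deliberately simplified model: the real FSImage and edits files use Hadoop's own on-disk formats, and the operation names (`create`, `delete`) and the `apply_edits` function below are hypothetical.

```python
def apply_edits(fsimage, edits):
    """Replay logged operations on a copy of the last snapshot and return the new snapshot."""
    image = dict(fsimage)  # leave the old snapshot untouched
    for op, path, blocks in edits:
        if op == "create":
            image[path] = blocks
        elif op == "delete":
            image.pop(path, None)
    return image


# Last saved snapshot of the namespace (path -> block IDs).
fsimage = {"/user/doug/hinfo": [1, 3, 5]}

# Changes recorded since that snapshot was written.
edits = [
    ("create", "/user/doug/pdetail", [4, 2]),
    ("delete", "/user/doug/hinfo", None),
]

checkpoint = apply_edits(fsimage, edits)
edits = []  # after a checkpoint, the edit log can start again from empty

print(checkpoint)
```

Recovery after a NameNode failure follows the same replay: load the most recent snapshot, then apply whatever edits were saved after it. Anything logged after the last copy reached the Secondary NameNode is not covered, which is one reason it is not an instant failover.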

JobTracker

Diagram: Rack 1, Rack 2, Rack 3; Block A, Block B, Block C
Topology script: set the property topology.script.file.name in core-site.xml
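A topology script is an executable that Hadoop calls with one or more IP addresses or host names as arguments; it prints one rack path per argument on standard output. The sketch below shows the general shape of such a script. The IP addresses and rack names in `RACKS` are invented for illustration, and a real cluster would typically load the mapping from a file maintained by the administrator.

```python
#!/usr/bin/env python
import sys

# Hypothetical host-to-rack mapping; replace with the cluster's real layout.
RACKS = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.21": "/rack2",
    "10.0.3.31": "/rack3",
}

# Rack reported for hosts missing from the mapping.
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Return the rack path for each host, in the same order as the input."""
    return [RACKS.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes the hosts as command-line arguments.
    print(" ".join(resolve(sys.argv[1:])))
```

The script must be executable by the Hadoop user, and its full path is the value given to topology.script.file.name.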

Legend: Green - GA versions; Black - not released by Apache yet; Red - commercial

❏ http://blog.cloudera.com/blog/2012/01/an-update-on-apache-hadoop-1-0/
❏ https://blogs.apache.org/bigtop/entry/all_you_wanted_to_know
❏ https://hadoop.apache.org/releases.html
❏ http://hortonworks.com/blog/apache-hadoop-2-is-ga/

Class 2 Pre-work
❏ Set up the Hadoop environment using the documents provided on Google Drive
❏ CDH3 (recommended) or CDH4
❏ Execute basic Linux commands
❏ Execute the HDFS hands-on commands
❏ Attempt the class-1 assignment

Thank you! See you in the next class.
