An introduction to Apache Gora

Information about An introduction to Apache Gora
Technology

Published on December 31, 2013

Author: mikejf12

Source: slideshare.net

Description

A short introduction to Apache Gora: what is it and how does it work?
How can it provide data store abstraction and persistence for big data?

Apache Gora
● What is it?
● Gora – Nutch
● Supports
● Data Access
● APIs
www.semtech-solutions.co.nz info@semtech-solutions.co.nz

Apache Gora – What is it?
● Provides for Big Data
  – In memory data model
  – Persistence
  – Data store abstraction
● Supports persisting to
  – Column stores
  – Key/value stores
  – Document stores
  – RDBMSs
● Supports use of Hadoop

Apache Gora – What is it?
● Released under the Apache 2 license
● Written in Java
● Offers a persistence framework
● Designed for big data applications
● Used by Nutch 2.x for web crawl data storage
● Used for
  – Persistence
  – Indexing
  – Analytics

Apache Gora – Nutch
● Nutch 2.x now uses Gora
  – Abstracted storage
  – Data store independence
  – Handles object to persistent mappings
  – Use various NoSQL solutions

Apache Gora – Supports
● Gora supports the following
  – Apache Accumulo
  – Apache Cassandra
  – Apache HBase
  – Amazon DynamoDB
  – Pig
  – Hive
  – Cascading
  – MapReduce

Apache Gora – Data Access
● Java API for data access
  – Independent of location
● Core Gora APIs
  – Store
  – Persistency
  – Query
  – MapReduce

Apache Gora – Store API
● Java API
  – org.apache.gora.store.*
  – DataStore handles object persistence
  – DataStore methods process objects
    ● Persist
    ● Fetch
    ● Query
    ● Delete
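
The four operations on this slide can be illustrated with a small, self-contained sketch. Everything named here (`SketchStore`, `InMemorySketchStore`, the method names and the inclusive key-range query) is hypothetical and chosen for illustration; it is not Gora's actual `DataStore` interface, only a stand-in that shows how calling code can stay unaware of the backend behind the store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical interface: a simplified stand-in for the persist / fetch /
// query / delete operations on the slide. Not Gora's real DataStore.
interface SketchStore<K, T> {
    void put(K key, T value);             // persist
    T get(K key);                         // fetch
    List<T> query(K startKey, K endKey);  // query over an inclusive key range
    boolean delete(K key);                // delete; true if a row was removed
}

// One possible backend: an in-memory sorted map. A column store or RDBMS
// backend would implement the same interface.
class InMemorySketchStore<K extends Comparable<K>, T> implements SketchStore<K, T> {
    private final TreeMap<K, T> rows = new TreeMap<>();

    @Override
    public void put(K key, T value) {
        rows.put(key, value);
    }

    @Override
    public T get(K key) {
        return rows.get(key);
    }

    @Override
    public List<T> query(K startKey, K endKey) {
        // subMap returns a view; copy it so callers get a stable list.
        return new ArrayList<>(rows.subMap(startKey, true, endKey, true).values());
    }

    @Override
    public boolean delete(K key) {
        return rows.remove(key) != null;
    }
}

class StoreSketchDemo {
    public static void main(String[] args) {
        // The caller only depends on the interface, not on the backend.
        SketchStore<String, String> store = new InMemorySketchStore<>();
        store.put("page/a", "content of A");
        store.put("page/b", "content of B");
        System.out.println(store.get("page/a"));
        System.out.println(store.query("page/a", "page/b"));
        System.out.println(store.delete("page/a"));
    }
}
```

Swapping the backend means providing another implementation of the interface; the calling code in `main` would not change, which is the data store independence the slides describe.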

Apache Gora – Persistency API
● Java API
  – org.apache.gora.persistency.*
  – Core classes
    ● BeanFactory – Construct keys
    ● Persistent – Persist objects
    ● State – State managed through StateManager
      – NEW, CLEAN (UNMODIFIED)
      – DIRTY (MODIFIED), DELETED
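
The state names on this slide suggest a simple life cycle for a persistent object. The sketch below is one plausible reading of that life cycle, not Gora's implementation: `SketchState`, `TrackedRecord` and the transition rules (a flushed object becomes CLEAN, changing a CLEAN object makes it DIRTY) are assumptions made for illustration.

```java
// Hypothetical enum mirroring the four state names listed on the slide.
enum SketchState { NEW, CLEAN, DIRTY, DELETED }

// Hypothetical record that tracks its own state. The transition rules
// below are assumed for illustration, not taken from Gora.
class TrackedRecord {
    private SketchState state = SketchState.NEW;
    private String value;

    SketchState getState() {
        return state;
    }

    String getValue() {
        return value;
    }

    // Changing a CLEAN record makes it DIRTY; a NEW record stays NEW
    // because it has never been written anyway.
    void setValue(String newValue) {
        this.value = newValue;
        if (state == SketchState.CLEAN) {
            state = SketchState.DIRTY;
        }
    }

    void markDeleted() {
        state = SketchState.DELETED;
    }

    // Called after a store has written the record out.
    void markFlushed() {
        if (state != SketchState.DELETED) {
            state = SketchState.CLEAN;
        }
    }

    // Only NEW and DIRTY records carry changes the store has not seen.
    boolean needsWrite() {
        return state == SketchState.NEW || state == SketchState.DIRTY;
    }
}

class StateSketchDemo {
    public static void main(String[] args) {
        TrackedRecord record = new TrackedRecord();
        record.setValue("first");
        System.out.println(record.getState() + " needsWrite=" + record.needsWrite());
        record.markFlushed();
        record.setValue("second");
        System.out.println(record.getState() + " needsWrite=" + record.needsWrite());
    }
}
```

Tracking state this way lets a store skip objects that have not changed since they were last written.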

Apache Gora – Query API
● Java API
  – org.apache.gora.query.*
  – Core classes
    ● Query – Constructed via DataStore
    ● PartitionQuery
      – Divide results of Query into partitions.
      – Run queries on data nodes.
      – Generate Hadoop InputSplits
    ● Result
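
The idea of dividing a query into partitions can be sketched without Hadoop or Gora on the classpath. In the example below, `KeyRange` and `PartitionSketch.partition` are hypothetical names, and splitting a sorted key list into equal-sized contiguous chunks is an assumed strategy; a real store would more likely partition along its own data layout so that each sub-query can run next to the data it reads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical value class: one contiguous, inclusive slice of the key space.
class KeyRange {
    final String startKey;
    final String endKey;

    KeyRange(String startKey, String endKey) {
        this.startKey = startKey;
        this.endKey = endKey;
    }

    @Override
    public String toString() {
        return "[" + startKey + " .. " + endKey + "]";
    }
}

class PartitionSketch {
    // Splits a sorted key list into at most `parts` contiguous ranges.
    // Each range could then back one sub-query or one input split.
    static List<KeyRange> partition(List<String> sortedKeys, int parts) {
        List<KeyRange> ranges = new ArrayList<>();
        int n = sortedKeys.size();
        if (n == 0 || parts <= 0) {
            return ranges;
        }
        int chunk = (n + parts - 1) / parts; // ceiling of n / parts
        for (int i = 0; i < n; i += chunk) {
            int last = Math.min(i + chunk, n) - 1;
            ranges.add(new KeyRange(sortedKeys.get(i), sortedKeys.get(last)));
        }
        return ranges;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e");
        for (KeyRange range : partition(keys, 2)) {
            System.out.println(range);
        }
    }
}
```

Because the ranges do not overlap and together cover every key, the sub-queries can run in parallel and their results can be combined without duplicates.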

Apache Gora – MapReduce API
● Java API
  – org.apache.gora.mapreduce.*
  – GoraMapper
  – GoraReducer
  – ALL Record Counter
  – Reader
  – Writer
  – Hadoop / Avro
    ● Serialise
    ● De-serialise
    ● Persistent

Contact Us
● Feel free to contact us at
  – www.semtech-solutions.co.nz
  – info@semtech-solutions.co.nz
● We offer IT project consultancy
● We are happy to hear about your problems
● You can pay for just the hours that you need to solve your problems
