Google's BigTable

Information about Google's BigTable

Published on July 10, 2008

Author: george.james

Source: slideshare.net

Google’s BigTable Out of the Slipstream :: July 3, 2008

The BigTable Goals

Wide Applicability

Used in more than 60 Google products

Scalability

High Performance

High Availability

The BigTable Arena

Internet Scale

Google :: BigTable and GFS

Apache :: HBase and HDFS

Amazon :: SimpleDB and S3

Facebook :: Cachr and Haystack

The BigTable Features

Dynamic control over data layout and format

Data is uninterpreted strings

“Does not support a full relational model”

Locality of data

Dynamic control over serving data from memory or disk

Sparse, distributed, persistent multidimensional sorted map.

The map is indexed by:

A row key

A column name

A timestamp

Each value in the map is an uninterpreted array of bytes

Column oriented
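The "sparse, sorted map indexed by row key, column name and timestamp" idea can be sketched in a few lines of plain Python. This is only a toy in-memory model, not how BigTable is implemented: the class name TinySortedMap, its methods, and the sample row and column names are all made up for illustration.

```python
import bisect


class TinySortedMap(object):
    """Toy in-memory model of a (row key, column name, timestamp) -> bytes map."""

    def __init__(self):
        self._keys = []    # sorted list of (row, column, -timestamp)
        self._values = {}  # key tuple -> uninterpreted bytes

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)  # negated so newer versions sort first
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, row, column):
        """Return the newest value for a cell, or None if the cell is empty."""
        i = bisect.bisect_left(self._keys, (row, column, float("-inf")))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None

    def scan(self, start_row, end_row):
        """Yield (row, column, timestamp, value) for start_row <= row < end_row."""
        i = bisect.bisect_left(self._keys, (start_row,))
        while i < len(self._keys) and self._keys[i][0] < end_row:
            row, column, neg_ts = self._keys[i]
            yield row, column, -neg_ts, self._values[self._keys[i]]
            i += 1


table = TinySortedMap()
table.put("com.cnn.www", "contents:", 1, b"<html>v1")
table.put("com.cnn.www", "contents:", 2, b"<html>v2")
table.put("com.example", "anchor:home", 1, b"Example")
print(table.get("com.cnn.www", "contents:"))
```

Only cells that were written take up space (sparse), and because keys are kept in sorted order, rows with neighbouring keys sit next to each other, which is what makes range scans and locality of data cheap.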

Architecture (diagram): GFS, SSTables, Tables, Chubby, Clusters, Tablets, Tablet Servers
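In the published Bigtable design, a table is split by row-key range into tablets, and each tablet is served by one tablet server. The lookup can be sketched as a binary search over the tablets' end keys. The split points, server names and the function locate_tablet below are hypothetical and exist only for this illustration.

```python
import bisect

# Hypothetical layout: each tablet covers row keys up to and including its
# end key; the last tablet has no upper bound. Server names are invented.
TABLET_END_KEYS = ["g", "n", "t"]
TABLET_SERVERS = ["server-a", "server-b", "server-c", "server-d"]


def locate_tablet(row_key):
    """Return (tablet index, server name) for the tablet holding row_key."""
    index = bisect.bisect_left(TABLET_END_KEYS, row_key)
    return index, TABLET_SERVERS[index]


for key in ["apple", "google", "zebra"]:
    print(key, locate_tablet(key))
```

Because rows are kept sorted, finding the right tablet needs only the list of split points, not a scan of the data.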

Table Structure (diagram): Columns, Timestamp / version, Key, Table, Indexes, Column Families, Expando Columns

Google App Engine

App Engine

BigTable + Python + AppEngine SDK

Choice of web frameworks:

webapp (pre-installed)

Django

CherryPy

Pylons

Web.py

Google Accounts integration

App Engine SDK for offline development

Offline development environment

Online runtime environment

Free to get started

Priced similarly to Amazon S3

Getting Started

Sign up for an account

Download Python 2.5

Download AppEngine SDK

Local version of BigTable

Web-server

Google user account simulator

Webapp framework

Getting started tutorial

Write your application

Upload to Google

Class Definition

Python code to declare a datastore class:

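from google.appengine.ext import db

class Patient(db.Model):
    # Plain text fields use StringProperty; UserProperty holds a Google account.
    firstName = db.StringProperty()
    lastName = db.StringProperty()
    dateOfBirth = db.DateTimeProperty()
    sex = db.StringProperty()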

Create

Python code to create and store an object:

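import datetime

patient = Patient()
patient.firstName = "George"
patient.lastName = "James"
# DateTimeProperty expects a datetime value, not a string.
patient.dateOfBirth = datetime.datetime(2008, 1, 1)
patient.sex = "M"
patient.put()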

Query

Python code to query a class:

patients = Patient.all()
for patient in patients:
    # write() takes a single string, so format the name first.
    self.response.out.write('Name %s %s.' % (patient.firstName, patient.lastName))

More complex query

Python code to select the 100 youngest male patients:

allPatients = Patient.all()
allPatients.filter('sex =', 'Male')
# Youngest first means the most recent dateOfBirth first: descending order.
allPatients.order('-dateOfBirth')
patients = allPatients.fetch(100)
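What the filter / order / fetch chain is meant to return can be checked against a plain-Python equivalent that runs without the App Engine SDK. This is a sketch only: the namedtuple stands in for the datastore class, the helper name youngest is hypothetical, and the sample records other than the one from the slides are invented. "Youngest" is taken to mean the most recent dateOfBirth first, i.e. descending order.

```python
import datetime
from collections import namedtuple

# Stand-in for the datastore entity; this is not the App Engine db.Model class.
Patient = namedtuple("Patient", ["firstName", "lastName", "dateOfBirth", "sex"])


def youngest(patients, sex, limit):
    """Filter by sex, sort newest date of birth first, keep at most `limit`."""
    matching = [p for p in patients if p.sex == sex]
    matching.sort(key=lambda p: p.dateOfBirth, reverse=True)
    return matching[:limit]


# Invented sample data for the illustration.
sample = [
    Patient("George", "James", datetime.datetime(2008, 1, 1), "Male"),
    Patient("Ann", "Example", datetime.datetime(2007, 6, 1), "Female"),
    Patient("Bob", "Example", datetime.datetime(1970, 3, 15), "Male"),
]
print([p.firstName for p in youngest(sample, "Male", 100)])  # ['George', 'Bob']
```

The datastore does the same work on the server side using an index, rather than loading every entity into memory as this sketch does.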

Query using GQL

GQL = Google Query Language

GQL code to select the 100 youngest male patients:

SELECT * FROM Patient WHERE sex = 'Male' ORDER BY dateOfBirth DESC LIMIT 100

Cannot select specific columns

No joins

Indexes

Development SDK

Index definitions generated automatically based on data access within your application

Index definitions uploaded to the Google server

- kind: Patient
  properties:
  - name: dateOfBirth
    direction: asc
  - name: sex
    direction: desc
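Why a query needs a matching index can be illustrated with a sorted list of index rows: the datastore answers a query by scanning a contiguous range of an index rather than examining every entity. The sketch below is a conceptual model only. The row layout, the entity keys and the function scan_equal are hypothetical, and the property order (sex first, then dateOfBirth) is chosen for the illustration and differs from the index definition shown on the slide.

```python
import bisect

# Each index row is (sex, dateOfBirth, entity key), kept in sorted order.
# Dates are ISO strings so that string order matches date order.
index_rows = sorted([
    ("Female", "2007-06-01", "patient-2"),
    ("Male", "1970-03-15", "patient-3"),
    ("Male", "2008-01-01", "patient-1"),
])


def scan_equal(rows, sex):
    """Return entity keys whose first index column equals `sex`, in index order."""
    start = bisect.bisect_left(rows, (sex,))
    keys = []
    for row in rows[start:]:
        if row[0] != sex:
            break  # rows are sorted, so the matching range has ended
        keys.append(row[2])
    return keys


print(scan_equal(index_rows, "Male"))  # ['patient-3', 'patient-1']
```

With the equality property first, all matching rows are adjacent and already ordered by date of birth, so the result needs no further sorting.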

Data Viewer

Conclusions

BigTable is an Internet Scale solution

Conventional databases are not up to the job

Home grown solutions

Increasing demand

???

Profit

Thank you

Questions?
