In Memory Data Grids, Demystified!

Published on April 22, 2014

Author: uri1803

Source: slideshare.net

Description

The principles and foundations of in memory data grids

In-Memory Data Grids, Demystified
Uri Cohen, Head of Product @ GigaSpaces
@uri1803
github.com/uric

Agenda
• Why IMDG?
• Brief History
• How It Works
  – Data model & placement
  – HA and fault tolerance
  – Consistency
  – Internals

Why IMDG?
Today, more than ever, there are many choices when it comes to storing your data

But There Are Many Solutions
® Copyright 2011 Gigaspaces Ltd. All Rights Reserved

Just A Few Years Back

So Why Indeed??

The Need for Speed, In Real Time…

Some Facts

Memory will always be faster than disk (usually by orders of magnitude)

Recent Survey

67%
• The share of IT managers who think that real-time analysis is the biggest challenge for big data implementations

40%
• Plan to use in-memory technologies for big data projects
• Only 32% mentioned Hadoop

Stream Processing

Hell, Even Gartner Thinks So
“In memory computing (IMC) … provides transformational opportunities. The execution of certain types of hours-long batch processes can be squeezed into minutes or even seconds … Millions of events can be scanned in a matter of a few tens of milliseconds to detect correlations and patterns pointing at emerging opportunities and threats ‘as things happen.’”

And nowadays HW and SW just make it a whole lot cheaper

Some Common Use Cases

Fast, Transactional Data Access
• Inventory management
• Financial reference data
• Real-time transactional data

Real-Time Stream Processing
• Fraud detection
• Click stream analysis
• Real-time analytics
• Continuous calculation

Heavyweight Offline Calculations
• Trade reconciliation
• Pattern analysis and detection
• Number crunching

Caching
• Database offloading
• Content-heavy websites

The Evolution of Data Grids

First There Were Local Caches
Stages: Cache (in-process caching of Key->Value data structure), Distributed Cache (partitioned cache nodes), IMDG (partitioned system of record), IMDG.next()
• Good for repetitive-data reads
• Limited in capacity
• Doesn’t handle write-heavy scenarios
• Reads are only part of the latency path

Then Came Distributed Caches
• Increased capacity
• Still no support for write-heavy scenarios
• Limited to ID-based reads
• Reads are only part of the latency path

In-Memory Data Grids
• Increased capacity
• Write scalability
• Can serve as system of record with querying & transaction semantics
• Still limited in capacity
• Latency can come from other parts of your app

How It Works

Data Models

Data Placement – Fixed Hashing
hash(key) % #nodes
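A minimal Python sketch of the fixed-hashing rule above. The helper names (`stable_hash`, `node_for`) and the sample keys are illustrative assumptions, not part of the original deck. MD5 is used only to get a hash that is stable across processes, since Python's built-in `hash()` is salted per run for strings.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash derived from MD5 (illustrative choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def node_for(key: str, num_nodes: int) -> int:
    """Fixed hashing: hash(key) % #nodes."""
    return stable_hash(key) % num_nodes


# Hypothetical keys placed on a 4-node grid.
keys = [f"user:{i}" for i in range(10)]
placement = {key: node_for(key, 4) for key in keys}

for key, node in placement.items():
    print(f"{key} -> node {node}")
```

Every client that applies the same function to the same key reaches the same node, so no lookup table is needed as long as the node count stays fixed.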

Fixed Hashing – HA
hash(key) % #nodes
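The transcript does not say how the backup copies are placed. The sketch below assumes one simple arrangement, a single backup kept on the next node after the primary, purely to illustrate how fixed hashing and high availability can fit together. The function names and the "next node" rule are assumptions, not the deck's or any product's actual scheme.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash derived from MD5 (illustrative choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def primary_and_backup(key: str, num_nodes: int) -> tuple[int, int]:
    """Primary by fixed hashing; backup on the following node (assumed rule)."""
    if num_nodes < 2:
        raise ValueError("need at least two nodes to hold a backup")
    primary = stable_hash(key) % num_nodes
    backup = (primary + 1) % num_nodes
    return primary, backup


def serving_node(key: str, num_nodes: int, failed: set[int]) -> int:
    """Serve from the primary unless it has failed, then fall back to the backup."""
    primary, backup = primary_and_backup(key, num_nodes)
    if primary not in failed:
        return primary
    if backup not in failed:
        return backup
    raise RuntimeError("both copies of the key are unavailable")


primary, backup = primary_and_backup("user:42", 4)
print(primary, backup, serving_node("user:42", 4, failed={primary}))
```

Because the backup never shares a node with its primary, losing any single node leaves one copy of every key reachable.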

Fixed Hashing – Scaling
Source: http://www.griddynamics.com/distributed-algorithms-in-nosql-databases/
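To see why scaling is the weak spot of fixed hashing, the sketch below counts how many keys change nodes when a grid grows from 4 to 5 nodes. The key names and node counts are made up for illustration. With a uniform hash, a key keeps its node only when `h % 4 == h % 5`, which holds for 4 of every 20 hash values, so about 80% of keys are expected to move.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash derived from MD5 (illustrative choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


def node_for(key: str, num_nodes: int) -> int:
    """Fixed hashing: hash(key) % #nodes."""
    return stable_hash(key) % num_nodes


# Hypothetical workload: 10,000 keys, grid grows from 4 to 5 nodes.
keys = [f"order:{i}" for i in range(10_000)]
moved = sum(1 for key in keys if node_for(key, 4) != node_for(key, 5))
fraction_moved = moved / len(keys)

print(f"{fraction_moved:.0%} of keys changed nodes")
```

Adding one node ideally needs only about a fifth of the data to move, so most of this reshuffling is wasted work.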

Data Placement – Consistent Hashing
Source: http://www.griddynamics.com/distributed-algorithms-in-nosql-databases/
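A minimal consistent-hashing sketch, assuming a ring with virtual nodes. The class name `HashRing`, the 100 virtual nodes per server, and the node names are illustrative assumptions and are not taken from the deck or from any particular product. Each key goes to the first ring position clockwise from its hash.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash derived from MD5 (illustrative choice)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (names are illustrative)."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self.points = []  # sorted ring positions
        self.owner = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each node claims several positions so load spreads more evenly.
        for i in range(self.vnodes):
            point = stable_hash(f"{node}#{i}")
            self.owner[point] = node
            bisect.insort(self.points, point)

    def node_for(self, key):
        # First position clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self.points, stable_hash(key)) % len(self.points)
        return self.owner[self.points[idx]]


keys = [f"order:{i}" for i in range(10_000)]
ring = HashRing(["node-1", "node-2", "node-3", "node-4"])
before = {key: ring.node_for(key) for key in keys}

ring.add("node-5")
after = {key: ring.node_for(key) for key in keys}

moved = [key for key in keys if before[key] != after[key]]
fraction_moved = len(moved) / len(keys)
print(f"{fraction_moved:.0%} of keys moved, all of them to node-5")
```

Only the keys that fall just before the new node's ring positions change owner, and they all move to the new node. The rest of the grid is left alone.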

Data Consistency
Since we’re dealing with distributed data, consistency cannot be taken for granted
• Read after write
• Read after read
• Write-write consistency

Solution 1: Single Master

Solution 2: Read/Write Quorums
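The slide names the technique without detail. The standard rule behind read/write quorums is that with N replicas, a write acknowledged by W of them and a read that contacts R of them must share at least one replica whenever R + W > N. The sketch below checks this by brute force for a hypothetical 3-replica setup. The function names and the (version, value) representation are assumptions made for illustration.

```python
from itertools import combinations


def quorums_always_overlap(n: int, w: int, r: int) -> bool:
    """True if every write set of size w shares a replica with every read set of size r."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


def read_latest(replicas, read_set):
    """Return the value with the highest version among the replicas contacted."""
    version, value = max(replicas[i] for i in read_set)
    return value


N, W, R = 3, 2, 2  # R + W > N
stale_reads = 0
for write_set in combinations(range(N), W):
    replicas = [(0, "old")] * N        # (version, value) held by each replica
    for i in write_set:
        replicas[i] = (1, "new")       # the write reaches only W replicas
    for read_set in combinations(range(N), R):
        if read_latest(replicas, read_set) != "new":
            stale_reads += 1

print(quorums_always_overlap(N, W, R), stale_reads)
```

Lowering W or R so that R + W <= N makes writes or reads cheaper, but some read sets can then miss the latest write entirely.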

Some More Concerns
• Transactions
• Querying
• Failure detection
• Leader election
• Persistency
• Interoperability

IMDG.next() Using IMDG for messaging, BL

IMDG.next() SSD FTW!

Thank You! docs.gigaspaces.com
