NoSQL Slideshare Presentation

Information about NoSQL Slideshare Presentation
Technology

Published on February 25, 2014

Author: EricssonLabs

Source: slideshare.net

Description

Can NoSQL technologies meet the specific requirements that apply to the Telco domain?

This is the Slideshare Presentation by Ericsson Researcher Nicolas Seyvet to accompany his blog "NoSQL for Telco"
http://labs.ericsson.com/blog/nosql-for-telco

NoSQL for Telco
Data Research Day 2013
Prepared by Nicolas Seyvet, with help from N. Hari Kumar, P. Matray

Who am I?
› Software developer, 10+ years at Ericsson
› HLR, PGM, IMS-M, MMS, MTV, BCS
› Joined Research late 2012
– BMUM -> BUSS (5+ years)
– DUCI (<6 months)
› Active member in various /// groups
– Linux (ELX, UMWP, etc.), Agile, SWAN, EQNA
› Open source contributor
Ericsson Internal | 2013-06-03 | Page 2

The Plan
› Why NoSQL?
› CAP
› Research activities
› Market trends

NoSQL: Why? Data Research Day 2013

NoSQL: Why? Trends – Usual Suspects
› Gossip
› SDN
› Internet Hypertext, RSS, wikis, blogs, tagging, user generated content, RDF, ontologies
Gartner Data Center TCO Report, June 2012.

NoSQL: Why? Trends: Architecture
› Multicore
› Parallelization/Distributed
› Cloud
› Schemaless
[Figure: application architecture by decade. 1980s: mainframe applications. 1990s: database as integration hub. 2000s: decoupled services.]

Two Ways to Scale: Go BIG or many?
› PARTITION (replication)
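The "many" option, partitioning combined with replication, can be sketched roughly as follows. This is a minimal illustration and not what any particular store does: the `owners` function, the node names, and the sample keys are all hypothetical, and the replication factor of 3 simply echoes the default mentioned later in the deck.

```python
import hashlib

def owners(key: str, nodes: list, replicas: int = 3) -> list:
    """Return the nodes holding `key`: a hash-chosen primary plus its successors.

    Hypothetical placement scheme for illustration only.
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    primary = int(digest, 16) % len(nodes)      # partition: hash picks the primary
    count = min(replicas, len(nodes))           # cannot replicate to more nodes than exist
    return [nodes[(primary + i) % len(nodes)] for i in range(count)]

# Assumed 8-node cluster and made-up subscriber keys.
nodes = [f"node{i}" for i in range(8)]
placement = {key: owners(key, nodes) for key in ("imsi-001", "imsi-002", "imsi-003")}
```

Each key lands on a deterministic set of distinct nodes, so adding machines spreads the keys rather than requiring one bigger machine.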

CAP: Consistency, Availability, Partition
Data Research Day 2013

CAP Theorem: Brewer’s Conjecture
“Of three properties of shared-data systems – data Consistency, system Availability and tolerance to network Partitions – only two can be achieved at any given moment in time.”
› 2000: Prof. Eric Brewer, PODC conference keynote
› 2002: Seth Gilbert and Nancy Lynch, ACM SIGACT News 33 (2)

CAP Theorem: The business decision
› Partition: CONSISTENT or AVAILABLE

CAP Summary
› CA (Consistent, Available): traditional relational: MySQL, PostgreSQL, etc.
› AP (Available, Partition Tolerance): Dynamo-like systems: Voldemort, Riak, Cassandra, CouchDB
– Requests will complete at any node, possibly violating consistency
› CP (Consistent, Partition Tolerance): HBase, MongoDB, Redis, BigTable-like systems
– Requests will complete at nodes that have quorum
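The CP and AP rules above can be made concrete with a small sketch. It assumes that "quorum" means a strict majority of the cluster, which is a common reading but not something the slide spells out; the `can_serve` function and the 5-node cluster are hypothetical.

```python
def can_serve(reachable_nodes: int, cluster_size: int, mode: str) -> bool:
    """Decide whether one side of a network partition may answer a request."""
    if mode == "CP":
        # Serve only when a strict majority (quorum) of the cluster is reachable.
        return reachable_nodes > cluster_size // 2
    if mode == "AP":
        # Serve from any live node, accepting possibly stale or divergent data.
        return reachable_nodes >= 1
    raise ValueError(f"unknown mode: {mode}")

# Assumed 5-node cluster split by a partition into sides of 3 and 2 nodes.
cluster_size = 5
sides = {"majority": 3, "minority": 2}
decisions = {
    mode: {side: can_serve(n, cluster_size, mode) for side, n in sides.items()}
    for mode in ("CP", "AP")
}
```

Under CP the minority side refuses requests and so gives up availability; under AP both sides answer and may diverge until the partition heals.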

Why NoSQL now?
› Trends
› “Internet size”, cluster friendly
› Rapid development / solution oriented
› Polyglot persistence
› Schemaless

Research Activities: Telco Applicability, Aggregation, Event Streams
Data Research Day 2013

HBase: BigTable/Columnar
› ZooKeeper (cluster): coordination
– Master selection
– Root region lookup
– Node registration
– …
› Hadoop (cluster): data files
– Write-Ahead Log (WAL)
– Rack aware
– Default data replication x3
› HBase: 1 elected master / many region servers
– Master: region allocation, failover, log splitting, load balancing; one active (elected), many standby
– Region servers: hold regions, handle I/O requests, in-memory data (MemStore), split regions, compact regions

Telco Applicability Study: HBase for HLR data?
› Comprehensive report
› Using HBase is DOABLE! OK!

HBase Bulk Processing: Event Processing & Aggregation
› 100 million rows
› Queries evaluated
– SELECT col1 FROM table
– SELECT SUM(col1) FROM table WHERE col2=val2 GROUP BY col3
› Map/Reduce
› Scan
› Co-processor
› CPU, RAM, Network, Schema

Bulk Processing: Scaling out/Horizontally
› 100 million rows
› Linear scaling!
› SELECT SUM(col1) FROM table WHERE col2=val2 GROUP BY col3
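One way to see why this aggregation query can scale out is to express it in map/reduce style: each partition filters and sums its own rows, and only the small per-group partial sums are merged. The sketch below is a plain-Python illustration under that assumption, not the HBase or Hadoop API; the function names and sample rows are hypothetical.

```python
from collections import defaultdict

def map_row(row, val2):
    """Map step: apply WHERE col2 = val2 and emit (col3, col1) pairs."""
    if row["col2"] == val2:
        yield row["col3"], row["col1"]

def reduce_sums(pairs):
    """Reduce step: SUM the values per group key (GROUP BY col3)."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

def run_query(partitions, val2):
    """Aggregate each partition independently, then merge the partial sums."""
    partials = [
        reduce_sums(pair for row in part for pair in map_row(row, val2))
        for part in partitions
    ]
    return reduce_sums((k, v) for partial in partials for k, v in partial.items())

# Made-up rows spread over two partitions (stand-ins for regions or nodes).
partitions = [
    [{"col1": 5, "col2": "x", "col3": "a"}, {"col1": 7, "col2": "y", "col3": "a"}],
    [{"col1": 3, "col2": "x", "col3": "a"}, {"col1": 4, "col2": "x", "col3": "b"}],
]
result = run_query(partitions, "x")
```

Because the per-partition work is independent, adding nodes adds capacity for the expensive step; only the final merge is centralized.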

Read/Write: 100000 iterations
› Periodic degradation
› 150,000,000 rows
› Row = key + 1 column (1K)
› Entire cluster up and running: 8 nodes (1 master / 7 slaves)

Robustness: Killing Them Softly…
› Master
› Slaves

How Much Data Can It Fit? ITK / Constellation / CEA
› Network produces events
– RNC, SGSN, S-&R-KPI
– Traffic DPI
– GTP-C
› CEA (Perfmon)
– Correlated events
› 10,000,000 subscribers
[Figure: event feeders (labels: 1000+ K events/s, 10+ K events/s); Map/Reduce staging data on HDFS for the HBase BulkLoader; repeated Puts through the HBase PutLoader; lookup data]

The Upcoming Fight
› Storkluster: 18 machines
› Bigdata: 2 machines

What about HDFS?
› It scales!
› TestDFSIO benchmark
– > 3000 GB/s read
– > 2000 GB/s write
› But… it is not that simple…
[Figure: panels labeled "Small files (250 B)" and "Larger files (1 KB)"; annotations: CPU, CPU and I/O, Network]

What about End to End? (writing to HBase included)
› It scales!
› And it gets… more complicated
[Figure labels: 100 K events/s, 200 K events/s]

But…
› Within ~2 hours
– Rows/s: 7K/s
– CPU: x2
– IO: 100%

HDFS Curse: Compaction Storm
› Remember what we were doing?
– Hint: creating lots of small files to add to HBase?
› Major compaction storm!
– Manage compaction and region splitting
[Figure: M/R feeding the HBase BulkLoader]

Conclusion
› Scalability… Scalability… Scalability
› It works, but it is not so easy…
› Recommendation:
– Polyglot data storage


NoSQL Data Research Day 2013

NoSQL: The name
› It is not about saying SQL is bad or should not be used
› “An accidental neologism” – Martin Fowler
› A Twitter hashtag
› No prescriptive definition, just observations of common characteristics
– “Any database that is not a Relational Database”
– Running well on clusters (scalable)
– Schemaless
› Polyglot persistence
– Using different stores in different circumstances
The term was coined at a meetup with the creators behind some prominent emerging databases ... then there was a conference ... and a mailing list ... the name caught on ... then there were more conferences ... and here we are!

NoSQL: Why? Trend No 2/4: Connectedness
› Internet Hypertext, RSS, wikis, blogs, tagging, user generated content, RDF, ontologies
› M2M
› Application

NoSQL: Why? Trend No 3/4: Content Individualization
› Individualization of content
› Decentralization
› Schemaless
– Extend at runtime
– De-normalize
– Domain design (not schema migration)

NoSQL Landscape
› 4 emerging categories
– Key-Value
– Graph
– BigTable
– Document
› (NewSQL), (Object), DBN
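To make the four categories less abstract, the sketch below models the same made-up subscriber in each style using plain Python structures. Every key, field, and value here is hypothetical and chosen only for illustration; real products differ in many details.

```python
# Key-value: an opaque value looked up by a single key.
key_value = {"subscriber:46700000001": b'{"plan": "prepaid", "roaming": true}'}

# BigTable/columnar: row key -> column family -> qualifier -> value.
bigtable = {
    "46700000001": {
        "profile": {"plan": "prepaid"},
        "services": {"roaming": "true", "mms": "false"},
    }
}

# Document: a self-describing nested record; fields may differ per document.
document = {"_id": "46700000001", "plan": "prepaid", "services": ["roaming"]}

# Graph: nodes connected by labeled edges.
graph_edges = [("46700000001", "CALLED", "46700000002")]

def neighbours(edges, node, label):
    """Return the nodes reachable from `node` over edges carrying `label`."""
    return [dst for src, lbl, dst in edges if src == node and lbl == label]

called = neighbours(graph_edges, "46700000001", "CALLED")
```

The key-value store can only fetch the whole blob, while the other three expose structure that queries can use.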

Consistency
“A system is consistent if an update is applied to all relevant nodes at the same logical time”
› Strong consistency
– Atomicity, Consistency, Isolation, Durability (ACID)
› Weak consistency
– Eventual consistency (inconsistency window)
› NoSQL solutions DO support transactions
› Standard database replication (or caching) IS NOT strongly consistent; any solution that relies on either is, by definition, eventually consistent at best
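The inconsistency window created by asynchronous replication can be shown with a toy model. This is a minimal sketch assuming one primary, one replica, and a replication log applied later; the class and method names are hypothetical and do not correspond to any real database API.

```python
class AsyncReplicatedStore:
    """Toy primary/replica pair with asynchronous (delayed) replication."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []  # replication log entries not yet applied to the replica

    def write(self, key, value):
        # The write is acknowledged once the primary has it; the replica lags.
        self.primary[key] = value
        self.pending.append((key, value))

    def read_replica(self, key):
        # Reads served by the replica may be stale inside the window.
        return self.replica.get(key)

    def replicate(self):
        # Apply the log in order; afterwards the replica has converged.
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

store = AsyncReplicatedStore()
store.write("plan", "prepaid")
stale = store.read_replica("plan")   # read inside the inconsistency window
store.replicate()
fresh = store.read_replica("plan")   # read after the replica has caught up
```

The interval between `write` and `replicate` is the inconsistency window: the system converges eventually, but a reader hitting the replica in between sees old data.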

Partition Tolerance / Availability
› “The network will be allowed to lose arbitrarily many messages sent from one node to another” [..]
› “For a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response”
Gilbert and Lynch, SIGACT 2002
› CP: Requests will complete at nodes that have quorum
› AP: Requests will complete at any node, possibly violating consistency
› High latency ~= Partition

HBase Bulk Processing: Event Processing & Aggregation
› 100 million rows
› Queries evaluated
– SELECT col1 FROM table
– SELECT SUM(col1) FROM table WHERE col2=val2 GROUP BY col3
