Kafka overview and use cases

Published on June 16, 2016

Author: IndrajeetKumar34

Source: slideshare.net

1. SITE RELIABILITY ENGINEERING ©2016 LinkedIn Corporation. All Rights Reserved.
   Kafka - Overview
   Indrajeet Kumar, Site Reliability Engineer at LinkedIn

2. So what is it?
   It is a high-throughput, low-latency messaging system.

3. And who uses it?
   (The slide's graphic is not preserved in this transcript.)

4. What for?
   • Messaging
   • Website Activity Tracking
   • Metrics
   • Log Aggregation
   • Stream Processing
   • For fun ;)

5. So how does it work?
   ▪ Components
     – Producer
     – Broker
       ▪ Topic
       ▪ Partition
     – Consumer
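
The component list above can be made concrete with a toy in-memory model. This is a conceptual sketch only, not Kafka's implementation or client API: the class names `ToyBroker`, `ToyProducer` and `ToyConsumer`, the topic name, and the use of `zlib.crc32` to pick a partition are all assumptions made for illustration. It relies on two general properties of Kafka that the slide does not spell out: a partition is an append-only log addressed by offset, and messages with the same key go to the same partition.

```python
from collections import defaultdict
import zlib


class ToyBroker:
    """In-memory stand-in for a broker: each topic is a list of partitions,
    and each partition is an append-only list of messages."""

    def __init__(self):
        self.topics = {}  # topic name -> list of partition logs

    def create_topic(self, name, num_partitions):
        self.topics[name] = [[] for _ in range(num_partitions)]

    def append(self, topic, partition, message):
        log = self.topics[topic][partition]
        log.append(message)
        return len(log) - 1  # offset of the message just written

    def read(self, topic, partition, offset):
        # Return every message stored at or after the given offset.
        return self.topics[topic][partition][offset:]


class ToyProducer:
    def __init__(self, broker):
        self.broker = broker

    def send(self, topic, key, message):
        # Same key -> same partition. crc32 is a stable hash chosen for
        # illustration; it is not the hash a real Kafka client uses.
        num_partitions = len(self.broker.topics[topic])
        partition = zlib.crc32(key.encode("utf-8")) % num_partitions
        offset = self.broker.append(topic, partition, message)
        return partition, offset


class ToyConsumer:
    def __init__(self, broker):
        self.broker = broker
        # (topic, partition) -> next offset this consumer will read
        self.positions = defaultdict(int)

    def poll(self, topic, partition):
        start = self.positions[(topic, partition)]
        messages = self.broker.read(topic, partition, start)
        self.positions[(topic, partition)] = start + len(messages)
        return messages


# Hypothetical usage: two events for one member land in the same partition.
broker = ToyBroker()
broker.create_topic("page-views", num_partitions=2)
producer = ToyProducer(broker)
partition, _ = producer.send("page-views", "member-1", "viewed /home")
producer.send("page-views", "member-1", "viewed /jobs")

consumer = ToyConsumer(broker)
print(consumer.poll("page-views", partition))
```

Because the consumer remembers its own position, a second `poll` on the same partition returns only messages written since the first one. The broker never deletes a message when it is read, which is what lets several independent consumers read the same data.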

6. Broker
   [Diagram. Labels: three producers; brokers B1 and B2; partitions P1 and P2, each also shown with an "R" marker; three consumers.]

7. The consumer
   [Diagram. Labels: consumer; brokers B1, B2 and B3; partitions P1, P2 and P3; C1 and C2; P1 with an "R" marker.]
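
The diagram labels suggest two consumers (C1, C2) sharing three partitions (P1, P2, P3). The transcript does not show which consumer reads which partition, so the sketch below is an assumption: a simple round-robin split in which every partition has exactly one owner. The function name `assign_partitions` is hypothetical, and Kafka's own assignment strategies may produce a different split.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over consumers round-robin.

    Every partition ends up with exactly one owner, so no message is
    processed twice within the group.
    """
    if not consumers:
        raise ValueError("need at least one consumer")
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Labels taken from the slide; the resulting split is illustrative only.
assignment = assign_partitions(["P1", "P2", "P3"], ["C1", "C2"])
print(assignment)  # {'C1': ['P1', 'P3'], 'C2': ['P2']}
```

With more consumers than partitions, the extra consumers in this sketch receive an empty list, which mirrors the general rule that the partition count caps how many consumers in one group can read in parallel.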

8. The Producer
   [Diagram. Labels: producer; brokers B1, B2 and B3; partitions P1, P2 and P3; P1 with an "R" marker.]

9. Attributes of a Kafka Cluster
   ▪ Durable
   ▪ Scalable
   ▪ Low Latency
   ▪ Finite Retention
   ▪ No single point of failure

10. Kafka At LinkedIn
   ▪ Multiple Datacenters, Multiple Clusters
   ▪ Mirroring between clusters
   ▪ Message Types
     – Metrics
     – Tracking
     – Queuing
   ▪ Data transport from applications to Hadoop, and back

11. Some numbers!
   ▪ 1800+ Broker machines
   ▪ 79K+ Topics
   ▪ 1.1M+ Partitions
   ▪ 1.3 Trillion messages per day
   ▪ 330 Terabytes in/day
   ▪ 1.2 Petabytes out/day
   ▪ Peak load for a single cluster
     – 2 million messages/sec
     – 4.7 Gigabits/sec inbound
     – 15 Gigabits/sec outbound
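
A few derived figures help put these numbers in perspective. The calculation below is a back-of-the-envelope sketch that uses only the values on the slide. It assumes decimal units (1 terabyte = 10^12 bytes) and treats the "+" figures as exact, so the results are rough estimates rather than numbers reported by LinkedIn.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Fleet-wide daily totals from the slide (decimal units assumed).
messages_per_day = 1.3e12
bytes_in_per_day = 330e12
bytes_out_per_day = 1.2e15

avg_messages_per_sec = messages_per_day / SECONDS_PER_DAY
avg_message_bytes = bytes_in_per_day / messages_per_day
read_to_write_ratio = bytes_out_per_day / bytes_in_per_day

# Peak figures for a single cluster from the slide.
peak_messages_per_sec = 2e6
peak_in_bits_per_sec = 4.7e9
peak_out_bits_per_sec = 15e9

peak_message_bytes = peak_in_bits_per_sec / 8 / peak_messages_per_sec
peak_read_to_write_ratio = peak_out_bits_per_sec / peak_in_bits_per_sec

print(f"average rate, all clusters: {avg_messages_per_sec / 1e6:.1f} million messages/sec")
print(f"average inbound bytes per message: {avg_message_bytes:.0f}")
print(f"bytes out per byte in (daily): {read_to_write_ratio:.1f}")
print(f"inbound bytes per message at peak: {peak_message_bytes:.0f}")
print(f"bytes out per byte in (peak): {peak_read_to_write_ratio:.1f}")
```

Under these assumptions the fleet averages roughly 15 million messages per second at about 254 bytes each, and every byte written is read between three and four times.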

12. Questions
