Synchronous Multi-Master Clusters in WAN


Published on September 25, 2014

Author: MariaDB

Source: slideshare.net

Description

Presented by Alexey Yurchenko, Co-founder & Developer of Galera, Codership, at the MariaDB Roadshow in London, 18 September 2014.

Building Synchronous MySQL Clusters in Cloud and WAN. Alexey Yurchenko, Codership Oy

A Very Dirrrty Word: Sssssssssss... www.codership.com

A Very Dirrrty Word: Synchronous.

A Very Dirrrty Word: Synchronous. What is it good for???

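Data Safety. Asynchronous Replication (diagram: Client, Master, Slave): the client sends COMMIT and the master answers OK without waiting for the slave; the change is replicated and the slave COMMITs afterwards. Potential data loss.

Data Safety. Synchronous Replication (diagram: Client, Master, Slave): the client sends COMMIT, the master replicates and waits for the slave's ACK before answering OK; the slave then COMMITs. Additional latency.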

Data Safety (#1). Disaster Recovery: replication between DC1 and DC2.

Multi-Master (sequence diagram: Client1, Master1, Master2, Client2). Both clients send COMMIT; each write set is replicated and goes through CONFLICT DETECTION and CONFLICT RESOLUTION on both masters. One client receives OK (COMMIT), the other receives DEADLOCK (ROLLBACK).

Access Latency Elimination

Access Latency Elimination (#2)

Benchmark Setup (Amazon EC2): us-east and eu-west, ~6000 km apart, ~90 ms RTT. Two configurations: a single server in us-east, and a cluster spanning us-east and eu-west.

Access Latency Elimination

client location   us-east server   US-EU cluster   change
us-east           28.03 ms         119.80 ms       ~4.3x
eu-west           1953.89 ms       122.92 ms       ~0.06x
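The "change" column can be reproduced from the two latency columns. The sketch below only repeats that arithmetic; the dictionary layout and the function name are illustrative and do not come from the slides.

```python
# Latencies from the benchmark table, in milliseconds.
latency_ms = {
    "us-east": {"single_server": 28.03, "cluster": 119.80},
    "eu-west": {"single_server": 1953.89, "cluster": 122.92},
}


def change_factor(single_server_ms: float, cluster_ms: float) -> float:
    """Cluster latency relative to the single us-east server."""
    return cluster_ms / single_server_ms


factors = {
    location: change_factor(row["single_server"], row["cluster"])
    for location, row in latency_ms.items()
}

for location, factor in factors.items():
    print(f"{location}: ~{factor:.2f}x")
```

The us-east client pays roughly one extra WAN round trip per commit, while the eu-west client no longer sends every statement across the ocean.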

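What Happened? (diagram) With the single us-east server, all SQL traffic (reads, writes, etc.) from eu-west crosses the ~6000 km, ~90 ms RTT link. With the US-EU cluster, SQL traffic stays local and only replication traffic (commits only) crosses the link.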

To Sync or Semi-sync?

Look, Ma! No 2-phase commit! (diagram: Client, Master, Slave) The client sends COMMIT, the master replicates, receives the slave's ACK and answers OK, yet the slave's own COMMIT can still fail: Slave didn't commit!

To Sync or Semi-sync? (diagram: Master, Replicate, Slave)
Synchronous (master rolls back and stops):
● Data redundancy preserved (sort of: slave is dead)
● Availability compromised (!!!)
Semi-synchronous (master continues):
● Data redundancy compromised
● Availability preserved

To Sync or Semi-sync? In production, replication exists to increase service availability by protecting against master loss, not slave loss (slave loss is mitigated by adding more slaves). Ironically, fully synchronous replication is not only impractically slow, it is also detrimental to that availability goal.

Synchronous Replication in WAN: The Latency And How To Deal With It

The Latency And How to Deal With It
● Latency: 1 RTT – 1.5 RTT (100 – 500 ms); <200 ms should be practically possible
● Trx rate <= 1/Latency (10 – 2 transactions per second? Blast!)

The Latency And How to Deal With It
The same way any latency is dealt with:
1) Buffering: AUTOCOMMIT UPDATEs → multi-statement transactions
2) Parallelization: 1 client session → 10 client sessions
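A rough model of why buffering and parallelization help, assuming that each session waits for one commit to finish before starting the next, that commit latency dominates everything else, and that sessions do not conflict with each other. The function and parameter names are illustrative, not part of any Galera or MySQL API.

```python
def max_statements_per_second(commit_latency_s: float,
                              statements_per_trx: int = 1,
                              sessions: int = 1) -> float:
    """Upper bound on statement throughput when every commit costs commit_latency_s."""
    if commit_latency_s <= 0:
        raise ValueError("commit latency must be positive")
    # One session cannot commit more often than once per commit latency.
    commits_per_session = 1.0 / commit_latency_s
    # Buffering multiplies the work done per commit;
    # parallel sessions multiply the number of commits in flight.
    return commits_per_session * statements_per_trx * sessions


# Autocommit, single session: the 10 to 2 transactions per second from the slide.
print(max_statements_per_second(0.1))
print(max_statements_per_second(0.5))

# 10 statements per transaction across 10 sessions at 100 ms commit latency.
print(max_statements_per_second(0.1, statements_per_trx=10, sessions=10))
```

Under these assumptions the bound grows linearly with both the batch size and the session count, which is why neither technique requires a faster link.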

Synchronous Replication in WAN: Galera Cluster for MySQL variants

Galera Cluster for MySQL variants (architecture diagram): mysqld is MySQL with the wsrep patch; through the wsrep API it talks to Galera, a dynamic library, which handles synchronous communication with the cluster (other nodes).

Galera Cluster for MySQL variants: MySQL-wsrep, MariaDB Galera Cluster and Percona XtraDB Cluster, each running on Galera.

Galera Cluster and CAP Theorem: Consistency, Availability, Partition Tolerance. Fixed: timeouts

Synchronous Replication in WAN: DO's and DON'Ts
Goals:
● Disaster Recovery
● Performance
● Service Availability

Synchronous Replication in WAN: DO's
Invest in a good WAN link. (You invest in nodes; the link is as much a part of the cluster as the nodes are.)

Synchronous Replication in WAN: DO's
Categorize your data:
1) Rare, small writes, frequent reads, global data: good.
2) Heavy writes, few reads, local data: bad.

Synchronous Replication in WAN: DO's
Categorize your data (OpenStack):
1) Keystone identity data, Glance image metadata: mostly reads, small writes, data of global interest.
2) Ceilometer monitoring data: almost write-only, no need to share globally; store in MongoDB.
Jay Pipes, “Tales from the Field: Backend Data Storage in OpenStack Clouds”

Synchronous Replication in WAN: DO's
Configure timeouts:
● All Galera timeouts and periods should be no less than the WAN round trip time.
● Defaults should be suitable for networks with RTTs of up to 500 ms.
● The higher the timeouts, the more partition tolerant and the less available the cluster is (CAP theorem).
● Timeouts relation: RTT <= evs.suspect_timeout <= evs.inactive_timeout <= evs.install_timeout
● evs.suspect_timeout is the timeout to detect single node partition/failure.
● Further info: http://galeracluster.com/documentation-webpages/configurationtips.html#wan-replication
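The timeouts relation above can be checked mechanically before a configuration is rolled out. This is a minimal sketch: `check_wan_timeouts` is a hypothetical helper, not part of Galera, it takes plain seconds for simplicity, and the sample values are illustrative rather than recommended settings.

```python
from typing import List


def check_wan_timeouts(rtt_s: float, suspect_s: float,
                       inactive_s: float, install_s: float) -> List[str]:
    """Return violations of
    RTT <= evs.suspect_timeout <= evs.inactive_timeout <= evs.install_timeout."""
    chain = [
        ("RTT", rtt_s),
        ("evs.suspect_timeout", suspect_s),
        ("evs.inactive_timeout", inactive_s),
        ("evs.install_timeout", install_s),
    ]
    problems = []
    # Compare each value with its right-hand neighbour in the chain.
    for (lower_name, lower), (upper_name, upper) in zip(chain, chain[1:]):
        if lower > upper:
            problems.append(
                f"{lower_name} ({lower} s) exceeds {upper_name} ({upper} s)"
            )
    return problems


# Hypothetical values in seconds.
print(check_wan_timeouts(0.09, 5, 15, 15))   # ordering holds
print(check_wan_timeouts(0.5, 0.3, 15, 15))  # suspect timeout is below the RTT
```

An empty list means the ordering holds; each string names the pair that breaks it.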

Synchronous Replication in WAN: DO's
Configure cluster segments. (Diagram: three datacenters, DC1, DC2 and DC3, each with three nodes labeled with that datacenter's segment number: 1, 2 and 3.)

Synchronous Replication in WAN: DO's
Choose an odd number of nodes and an odd number of datacenters:
● Most popular choice: 3x3
● Also observed in the field: 5x3 and 3x5

Synchronous Replication in WAN: DO's
3 is better than 2! (diagram: DC1, DC2, DC3)
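Why three datacenters beat two can be illustrated with a simple majority rule. The sketch assumes that the cluster stays available only while strictly more than half of its nodes can still see each other, and that all nodes carry equal weight; the function names and layouts are illustrative.

```python
from typing import List


def has_quorum(surviving_nodes: int, total_nodes: int) -> bool:
    """Strict majority: more than half of all nodes are still connected."""
    return 2 * surviving_nodes > total_nodes


def survives_any_dc_loss(nodes_per_dc: List[int]) -> bool:
    """True if the remaining nodes keep quorum whichever single datacenter is lost."""
    total = sum(nodes_per_dc)
    return all(has_quorum(total - lost, total) for lost in nodes_per_dc)


layouts = {
    "2 datacenters x 3 nodes": [3, 3],
    "3 datacenters x 3 nodes": [3, 3, 3],
    "3 datacenters x 5 nodes": [5, 5, 5],
    "5 datacenters x 3 nodes": [3, 3, 3, 3, 3],
}

for name, layout in layouts.items():
    print(f"{name}: survives loss of one datacenter = {survives_any_dc_loss(layout)}")
```

With two equally sized datacenters, losing either one leaves exactly half of the nodes, which is not a majority, so the survivors cannot tell a dead peer from a partition.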

Synchronous Replication in WAN: DON'Ts
1) Hot Spots

Synchronous Replication in WAN: DON'Ts (diagram labels: hotspot, 1 RTT)

Synchronous Replication in WAN: DON'Ts
1) Hot Spots
2) Poor Links

Synchronous Replication in WAN: DON'Ts
Synchronous – with whom?
● Full packet loss → the node is not with us
● No packet loss → the node is with us
● ???

Synchronous Replication in WAN: DON'Ts
1) Hot Spots
2) Poor Links
3) Huge Transactions

Synchronous Replication in WAN: DON'Ts
Huge transactions kill concurrency:
a) Long to replicate
b) Long to certify
c) Long to apply on slave → SLAVE LAG

Synchronous Replication in WAN: DON'Ts
1) Hot Spots
2) Poor Links
3) Huge Transactions
4) No Primary Keys

Synchronous Replication in WAN: DON'Ts
No PRIMARY KEY:
mysql> DELETE FROM 10M_rows_no_PK_table;
=> 50 000 000 000 000 rows scanned.
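The figure on this slide is consistent with a simple cost model, which is an assumption here rather than something the slides spell out: without a primary key, the node applying the replicated change has to find each deleted row by scanning the table, and on average it reads half the table per row. The function names are illustrative.

```python
def rows_scanned_without_pk(table_rows: int) -> int:
    """Rows read when each of table_rows deletions scans half the table on average."""
    return table_rows * table_rows // 2


def rows_scanned_with_pk(table_rows: int) -> int:
    """With a primary key each deletion is modelled as a single row lookup."""
    return table_rows


table_rows = 10_000_000
print(f"{rows_scanned_without_pk(table_rows):,}")
print(f"{rows_scanned_with_pk(table_rows):,}")
```

Under this model the cost grows quadratically with table size, so the same DELETE on a table ten times larger would scan a hundred times more rows.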

If Synchronous Doesn't Work Out
Native MySQL asynchronous replication between Galera clusters (log_slave_updates = ON).
Diagram: cluster Galera1 (nodes 1, 2, 3) and cluster Galera2 (nodes A, B, C), joined by an async link between a Master node and a Slave node.

If Synchronous Doesn't Work Out
Native MySQL asynchronous replication between Galera clusters.
Diagram: Galera1 (nodes 1, 2, 3) and Galera2 (nodes A, B, C), joined by an async link between a Master node and a Slave node.

If Synchronous Doesn't Work Out
Native MySQL asynchronous replication between Galera clusters.
Diagram: Galera1 (nodes 1 and 3; node 2 is no longer shown) and Galera2 (nodes A, B, C), joined by an async link between a Master node and a Slave node.

If Synchronous Doesn't Work Out
Native MySQL asynchronous replication between MariaDB Galera clusters (log_slave_updates = OFF).
Diagram: Galera1 (nodes 1, 2, 3) and Galera2 (nodes A, B, C), joined by an async link between a Master node and a Slave node.

Synchronous Replication in WAN: Q & A
