Published on March 9, 2014
Global State Recording
Distributed Systems
Hamed Naeemaei
Models of communication
1. In the FIFO model, each channel acts as a first-in first-out message queue, so a channel preserves message ordering.
2. In the non-FIFO model, a channel acts like a set: the sender process adds messages to it and the receiver process removes them in arbitrary order.
3. A system that supports causal delivery of messages satisfies the following property: “For any two messages mij and mkj, if send(mij) → send(mkj), then rec(mij) → rec(mkj).”
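The difference between the first two models can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the class names `FifoChannel` and `NonFifoChannel` and the helper `drain` are hypothetical, and the non-FIFO channel picks the next message with a seeded pseudo-random generator only to make the arbitrary order visible.

```python
import random
from collections import deque


class FifoChannel:
    """Delivers messages in exactly the order they were sent."""

    def __init__(self):
        self._queue = deque()

    def send(self, msg):
        self._queue.append(msg)

    def receive(self):
        return self._queue.popleft()


class NonFifoChannel:
    """Behaves like a bag: any pending message may be delivered next."""

    def __init__(self, seed=None):
        self._pending = []
        self._rng = random.Random(seed)

    def send(self, msg):
        self._pending.append(msg)

    def receive(self):
        # Pick an arbitrary pending message (assumption: uniform choice).
        index = self._rng.randrange(len(self._pending))
        return self._pending.pop(index)


def drain(channel, messages):
    """Send all messages, then receive the same number back."""
    for m in messages:
        channel.send(m)
    return [channel.receive() for _ in messages]
```

With a `FifoChannel`, `drain` always returns the messages in send order. With a `NonFifoChannel`, it returns the same messages in some permutation.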
System model
• Cij denotes the channel from process Pi to process Pj, and its state is denoted by SCij.
• At any instant, the state of process Pi is denoted by LSi.
• The actions performed by a process are modeled as three types of events: internal events, message send events, and message receive events.
Consistent global state
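The conditions C1 and C2 cited in the following slides are usually stated as below. This is the standard textbook formulation, given here as an assumption, with symbols following the system model above (LSi for the state of Pi, SCij for the state of channel Cij, and ⊕ for exclusive or). A global state is called consistent when both conditions hold for every message mij:

```latex
\begin{align*}
\textbf{C1:}\quad & send(m_{ij}) \in LS_i \;\Rightarrow\; m_{ij} \in SC_{ij} \,\oplus\, rec(m_{ij}) \in LS_j \\
\textbf{C2:}\quad & send(m_{ij}) \notin LS_i \;\Rightarrow\; m_{ij} \notin SC_{ij} \,\wedge\, rec(m_{ij}) \notin LS_j
\end{align*}
```

C1 says that a message recorded as sent appears exactly once, either in the channel state or in the receiver's state. C2 says that a message not recorded as sent appears in neither.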
Cuts
[Figure: space-time diagram of processes P1, P2, and P3 (time axis, instants of local observation, initial values) with three cuts: an ideal (vertical) cut (15), which is not attainable; a consistent cut (15), equivalent to a vertical cut by a rubber-band transformation; and an inconsistent cut (19), which cannot be made vertical because it includes a message from the future.]
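Whether a cut is consistent can be checked mechanically: a cut is inconsistent exactly when it contains the receipt of some message but not its sending (a "message from the future"). The sketch below assumes a hypothetical encoding in which each process numbers its events 1, 2, 3, ..., a cut maps each process to the number of its events that lie before the cut, and each message is a tuple `(sender, send_index, receiver, recv_index)`.

```python
def is_consistent_cut(cut, messages):
    """Return True if no message is received inside the cut but sent outside it.

    cut      -- dict mapping process name to the number of its events in the cut
    messages -- iterable of (sender, send_index, receiver, recv_index),
                with 1-based event indices (hypothetical encoding)
    """
    for sender, send_idx, receiver, recv_idx in messages:
        received_in_cut = recv_idx <= cut[receiver]
        sent_in_cut = send_idx <= cut[sender]
        if received_in_cut and not sent_in_cut:
            # The receive is recorded but the send is not.
            return False
    return True
```

For a single message sent as the second event of P1 and received as the third event of P2, the cut `{"P1": 2, "P2": 3}` is consistent, `{"P1": 2, "P2": 2}` is consistent with the message in transit, and `{"P1": 1, "P2": 3}` is inconsistent.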
Issues in recording a global state
I1: How to distinguish the messages to be recorded in the snapshot from those not to be recorded.
– Any message sent by a process before recording its snapshot must be recorded in the global snapshot (from C1).
– Any message sent by a process after recording its snapshot must not be recorded in the global snapshot (from C2).
I2: How to determine the instant when a process takes its snapshot.
– A process Pj must record its snapshot before processing a message mij that was sent by process Pi after Pi recorded its snapshot.
Spezialetti-Kearns algorithm (FIFO)
Chandy-Lamport extension: exploits concurrently initiated snapshots to reduce the overhead of local snapshot exchange.
• There are two phases in obtaining a global snapshot: locally recording the snapshot at every process, and distributing the resultant global snapshot to all the initiators.
Snapshot recording:
• In the Spezialetti-Kearns algorithm, a marker carries the identifier of the initiator of the algorithm.
• Each process has a variable master to keep track of the initiator of the algorithm.
Spezialetti-Kearns algorithm (cont.)
• A key notion used by the optimizations is that of a region in the system. A region encompasses all the processes whose master field contains the identifier of the same initiator.
• When a process that already has a master receives a marker carrying a different initiator's identifier, the identifier of that concurrent initiator is recorded in a local variable id-border-set.
Spezialetti-Kearns algorithm (cont.)
• The state of the channel is recorded just as in the Chandy-Lamport algorithm.
• Snapshot recording at a process is complete after it has received a marker along each of its channels.
• After every process has recorded its snapshot, the system is partitioned into as many regions as the number of concurrent initiations of the algorithm.
• Variable id-border-set at a process contains the identifiers of the neighboring regions.
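The marker handling described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the class `Process` and its method names are hypothetical, channels are bidirectional links to named neighbors, and recording of local and channel state is omitted so that only `master` and `id_border_set` are tracked.

```python
class Process:
    def __init__(self, pid, neighbors):
        self.pid = pid
        self.neighbors = list(neighbors)
        self.master = None            # initiator whose region this process joins
        self.id_border_set = set()    # initiators of neighboring regions
        self.markers_from = set()     # neighbors whose marker has arrived

    def _markers(self):
        # Every marker carries the identifier of this process's master.
        return [(n, self.master) for n in self.neighbors]

    def initiate(self):
        """Start a snapshot as initiator; returns (destination, initiator_id) markers."""
        if self.master is not None:
            return []
        self.master = self.pid
        return self._markers()

    def on_marker(self, sender, initiator_id):
        """Handle a marker; returns the markers this process sends in response."""
        self.markers_from.add(sender)
        if self.master is None:
            # First marker: join the sender's region and propagate markers.
            self.master = initiator_id
            return self._markers()
        if initiator_id != self.master:
            # Marker from a concurrent initiation: remember the bordering region.
            self.id_border_set.add(initiator_id)
        return []

    def recording_complete(self):
        return self.master is not None and self.markers_from == set(self.neighbors)
```

In a line A, B, C where A and C both initiate and A's marker reaches B first, B ends up in A's region with `id_border_set == {"C"}`, and C learns about region A from the marker B forwards.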
Spezialetti-Kearns algorithm (cont.)
Efficient dissemination of the recorded snapshot
• In the snapshot recording phase, a forest of spanning trees is implicitly created in the system. The initiator of the algorithm is the root of a spanning tree, and all processes in its region belong to its spanning tree.
Spezialetti-Kearns algorithm (cont.)
1) If Pi receives its first marker from Pj, then process Pj is the parent of process Pi in the spanning tree.
2) When an intermediate process in a spanning tree has received the recorded states from all its child processes and has recorded the states of all incoming channels, it forwards its locally recorded state and the locally recorded states of all its descendant processes to its parent.
Spezialetti-Kearns algorithm (cont.)
3) When the initiator receives the locally recorded states of all its descendants from its child processes, it assembles the snapshot for all the processes in its region and the channels incident on these processes.
4) The initiator exchanges the snapshot of its region with the initiators in adjacent regions in rounds.
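Steps 1 to 3 amount to a bottom-up aggregation along each spanning tree. The sketch below is a centralized simulation of that idea, assuming a hypothetical `parent` map (each process mapped to the process from which it received its first marker, or `None` for an initiator) and a `local_states` map. The round-based exchange between initiators in step 4 is not modeled.

```python
from collections import defaultdict


def collect_region_snapshot(parent, local_states, root):
    """Gather the recorded states of root and all its descendants.

    parent       -- dict: process -> parent process, or None for an initiator
    local_states -- dict: process -> locally recorded state
    root         -- the initiator whose region snapshot is assembled
    """
    children = defaultdict(list)
    for child, par in parent.items():
        if par is not None:
            children[par].append(child)

    def gather(node):
        # A node forwards its own state plus everything its children forwarded.
        collected = {node: local_states[node]}
        for child in children[node]:
            collected.update(gather(child))
        return collected

    return gather(root)
```

Because every process has exactly one parent, each recorded state travels along a single path to its initiator, and a forest with several roots yields one disjoint snapshot per region.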
Spezialetti-Kearns algorithm (cont.)
Example [figure]
Snapshot Algorithms for Non-FIFO Channels
Lai-Yang Algorithm
• In a non-FIFO system, a marker cannot be used to separate the messages to be recorded in the global state from those not to be recorded.
• In a non-FIFO system, recording a consistent global snapshot requires either some degree of inhibition or piggybacking of control information on computation messages to capture out-of-sequence messages.
• The Lai-Yang algorithm fulfills the role of a marker in a non-FIFO system by using a coloring scheme on computation messages that works as follows:
Lai-Yang Algorithm (cont.)
– Every process is initially white and turns red while taking a snapshot. The equivalent of the “Marker Sending Rule” is executed when a process turns red.
– Every message sent by a white (red) process is colored white (red).
– Thus, a white (red) message is a message that was sent before (after) the sender of that message recorded its local snapshot.
– Every white process takes its snapshot at its convenience, but no later than the instant it receives a red message.
Lai-Yang Algorithm (cont.)
SCij = { mij | send(mij) ∈ LSi } − { mij | rec(mij) ∈ LSj }
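The coloring rules can be sketched as below. This is a minimal illustration with hypothetical names (`LaiYangProcess`, `channel_state`) and a toy application state (an integer counter that adds up received payloads). It assumes the channel state is computed the way Lai-Yang is usually described: white messages sent by Pi on Cij minus white messages that Pj received before taking its own snapshot.

```python
WHITE, RED = "white", "red"


class LaiYangProcess:
    def __init__(self, pid):
        self.pid = pid
        self.color = WHITE
        self.state = 0            # toy application state (assumption)
        self.snapshot = None
        self.white_sent = []      # (destination, payload) sent while white
        self.white_received = []  # (sender, payload) received while white

    def take_snapshot(self):
        if self.color == WHITE:
            self.snapshot = self.state
            self.color = RED

    def send(self, dest, payload):
        if self.color == WHITE:
            self.white_sent.append((dest, payload))
        # The message carries the sender's current color.
        return (self.pid, self.color, payload)

    def receive(self, message):
        sender, color, payload = message
        if color == RED:
            # Snapshot no later than the receipt of a red message.
            self.take_snapshot()
        elif self.color == WHITE:
            self.white_received.append((sender, payload))
        self.state += payload


def channel_state(sender, receiver):
    """White messages sent on the channel minus those received before the snapshot."""
    in_transit = [p for dest, p in sender.white_sent if dest == receiver.pid]
    for src, p in receiver.white_received:
        if src == sender.pid:
            in_transit.remove(p)
    return in_transit
```

A white message that overtakes nothing but arrives after the receiver has turned red is not added to `white_received`, so it ends up in the channel state, which is the out-of-sequence case a plain marker cannot handle.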
Lai-Yang Algorithm (cont.)
Example [Figure: processes P and Q exchanging messages labeled rw, rr, ww, and wr]
Snapshot Algorithms for Non-FIFO Channels
Mattern’s Algorithm
• Mattern’s algorithm is based on vector clocks, assumes a single initiator process, and works as follows:
– The initiator “ticks” its local clock and selects a future vector time s at which it would like a global snapshot to be recorded. It then broadcasts this time s and freezes all activity until it has received acknowledgements of the receipt of this broadcast from all processes.
– When a process receives the broadcast, it remembers the value s and returns an acknowledgement to the initiator.
– After having received an acknowledgement from every process, the initiator increases its vector clock to s and broadcasts a dummy message to all processes.
Mattern’s Algorithm (cont.)
– The receipt of this dummy message forces each recipient to increase its clock to a value ≥ s if not already ≥ s.
– Each process takes a local snapshot and sends it to the initiator when (just before) its clock increases from a value less than s to a value ≥ s.
– The state of Cij is all messages sent along Cij whose timestamp is smaller than s and which are received by Pj after recording LSj.
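The snapshot trigger at a non-initiator process can be sketched as follows. This is an illustration under stated assumptions: the names `MatternProcess` and `reached` are hypothetical, the application state is a toy counter, and "clock ≥ s" is interpreted as every component of the vector clock being at least the corresponding component of s. The broadcast and acknowledgement transport and the channel-state bookkeeping are not modeled.

```python
def reached(clock, s):
    """True when the vector clock is >= s in every component (assumed reading)."""
    return all(c >= t for c, t in zip(clock, s))


class MatternProcess:
    def __init__(self, index, n):
        self.index = index
        self.clock = [0] * n
        self.state = 0        # toy application state (assumption)
        self.target = None    # the announced snapshot time s
        self.snapshot = None

    def remember_target(self, s):
        """Handle the initiator's broadcast of s; the return value stands for the ack."""
        self.target = list(s)
        return "ack"

    def _advance(self, new_clock):
        crossing = (
            self.target is not None
            and self.snapshot is None
            and not reached(self.clock, self.target)
            and reached(new_clock, self.target)
        )
        if crossing:
            # Record the state just before the clock reaches s.
            self.snapshot = self.state
        self.clock = new_clock

    def local_event(self, delta=0):
        new_clock = list(self.clock)
        new_clock[self.index] += 1
        self._advance(new_clock)
        self.state += delta

    def receive(self, timestamp, delta=0):
        """Handle a computation or dummy message carrying a vector timestamp."""
        merged = [max(a, b) for a, b in zip(self.clock, timestamp)]
        merged[self.index] += 1
        self._advance(merged)
        self.state += delta
```

A dummy message timestamped with s pushes the recipient's clock past s and so forces the snapshot. A computation message whose timestamp is already ≥ s has the same effect, and the snapshot is taken before that message changes the state.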