
Published on January 25, 2008

Author: Monica

Source: authorstream.com

ICS 214B: Transaction Processing and Distributed Data Management
Distributed Database Systems

So far: centralized DB systems
- Software: processors P and memory M on a single machine.
- Simplifications: a single front end, one place to keep locks, if the processor fails the system fails, ...

Next: distributed database systems
- Multiple processors (+ memories)
- Heterogeneity and autonomy of "components"

Why do we need distributed databases?
Example: Big Corp. has offices in London, New York, and Hong Kong.
Employee data: EMP(ENO, NAME, TITLE, SALARY, ...)
Where should the employee data table reside?

Big Corp. data access pattern
- Mostly, employee data is managed at the office where the employee works, e.g., payroll, benefits, hiring and firing.
- Periodically, Big Corp. needs consolidated access to employee data. E.g., Big Corp. changes benefit plans, which affects all employees; the annual bonus depends on global net profit.

Slide 6: If the entire EMP table resides in London while payroll apps run in London, New York, and Hong Kong — problem: the NY and HK payroll apps run very slowly!

Slide 7: Split EMP so that each office holds its own employees (London Emp, NY Emp, HK Emp). Much better!!
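The placement in Slide 7 is just a horizontal split of EMP on the location attribute. A minimal sketch (the table contents and office names are illustrative, not from the slides):

```python
# Horizontal fragmentation of EMP by office: each site stores only the
# tuples of the employees who work there (illustrative sample data).
EMP = [
    ("E1", "Ann",  "London",    50),
    ("E2", "Bob",  "New York",  60),
    ("E3", "Chen", "Hong Kong", 55),
    ("E4", "Dee",  "London",    45),
]

def fragment_by_loc(rows):
    """Partition rows on the location attribute (column 2)."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[2], []).append(row)
    return fragments

frags = fragment_by_loc(EMP)

# A local payroll app touches only the local fragment ...
london_payroll = frags["London"]
# ... while a consolidated app reconstructs EMP as the union of fragments.
reconstructed = [r for f in frags.values() for r in f]
assert sorted(reconstructed) == sorted(EMP)
```

The assertion at the end is exactly the reconstruction property the deck formalizes later: the union of the fragments gives back the original relation.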
Slides 8–9: With the fragments in place, an Annual Bonus app in London can read the London, NY, and HK fragments together — distribution provides opportunities for parallel execution.

Slide 10: Store each fragment at two sites (London: Lon + NY Emp; New York: NY + HK Emp; Hong Kong: HK + Lon Emp) — replication improves availability.

Heterogeneity and autonomy
Example: a portfolio application built over an RDBMS, plain files (history of dividends, ratios, ...), and a stock ticker tape.

Data management with multiple processors and possible autonomy and heterogeneity has an impact on:
- Data organization
- Query processing
- Access structures
- Concurrency control
- Recovery

Transaction monitors
- Coordinate transaction execution over multiple DBMSs
- High performance
- Have workflow facilities
- Manage communications with client "terminals"

DB architectures
(1) Shared memory: the processors P share one memory M.
(2) Shared disk: each processor has its own memory M; the disks are shared.
(3) Shared nothing: nothing is shared between the processor/memory/disk nodes.
(4) Hybrid, e.g., hierarchical or clustered.

Issues for selecting an architecture
- Reliability
- Scalability
- Geographic distribution of data
- Data "clusters"
- Performance
- Cost

Parallel or distributed DB system?
More similarities than differences!
Typically, parallel DBs:
- Fast interconnect
- Homogeneous software
- High performance is the goal
- Transparency is the goal

Typically, distributed DBs:
- Geographically distributed
- Data sharing is the goal (may run into heterogeneity, autonomy)
- Disconnected operation is possible

Distributed database challenges: distributed database design
- Deciding what data goes where; depends on the data access patterns of the major applications.
- Two subproblems — fragmentation: partition tables into fragments; allocation: allocate fragments to nodes.

Distributed database challenges: distributed query processing
- The centralized query-plan goal is to minimize the number of disk I/Os.
- Additional factors in the distributed scenario: communication costs and the opportunity for parallelism.
- The space of possible query plans is much larger!

Distributed database challenges: distributed concurrency control
- Transactions span nodes and must be globally serializable.
- Two main approaches: locking and timestamps.
- Distributed deadlock management.
- Multiple data copies need to be kept in sync when updates occur.

Distributed database challenges: reliability of distributed databases
- Centralized database failure model: the processor fails.
- Distributed database failure model: one or more processors may fail; the network may fail; the network may be partitioned; data must be kept in sync.

To illustrate synchronization problems: the "Two Generals" problem.

The one-general problem (trivial!): a single general G commands all the troops on the battlefield.

The two-general problem: a blue army and a red army, each with its own general, face the enemy and can communicate only via messengers.

Rules:
- The blue and red armies must attack at the same time.
- The blue and red generals synchronize through messengers.
- Messengers can be lost.
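The difficulty can be felt in a toy simulation: however many acknowledgement rounds the generals run, capturing the final messenger leaves the two sides with different knowledge, so no fixed number of rounds yields certainty. (A sketch; the alternating message pattern is an illustrative assumption.)

```python
def exchange(rounds, lost_at):
    """Blue and red alternately send acknowledgements; the messenger
    carrying message `lost_at` (1-based) is captured.  Returns how many
    messages each side has received."""
    blue_got, red_got = 0, 0
    for i in range(1, rounds + 1):
        if i == lost_at:
            break                 # messenger lost; the sender cannot tell
        if i % 2 == 1:
            red_got += 1          # odd messages go blue -> red
        else:
            blue_got += 1         # even messages go red -> blue
    return blue_got, red_got

# Whatever protocol length we pick, dropping the last message leaves the
# system in a state the final sender cannot distinguish from success:
for n in range(1, 10):
    delivered = exchange(n, lost_at=n + 1)   # nothing lost
    dropped = exchange(n, lost_at=n)         # final messenger captured
    assert delivered != dropped              # someone's knowledge differs
```

This is the informal core of the impossibility result: adding one more acknowledgement only moves the uncertainty to the new last message.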
Distributed database challenges: heterogeneity
Example: unable to get statistics for query optimization.

Distributed database challenges: autonomy
Example: the blue general may have a mind of his (or her) own!

Distributed DB design
- Top-down approach: we have the DB; how do we split it and allocate the pieces to the sites?
- Bottom-up approach: multi-database (possibly heterogeneous, autonomous); no design issues!

Two issues in DDB design:
- Fragmentation
- Allocation
Note: the issues are not independent, but we will cover them separately.

Example: employee relation E(#, name, loc, sal, ...)
40% of queries are Qa: select * from E where loc=Sa and ...
40% of queries are Qb: select * from E where loc=Sb and ...

E:
  #  NM     Loc  Sal
  5  Joe    Sa   10
  7  Sally  Sb   25
  8  Tom    Sa   15

F at Sa:
  5  Joe    Sa   10
  8  Tom    Sa   15

F at Sb:
  7  Sally  Sb   25

F = { F1, F2 }, where F1 = σ_loc=Sa(E) and F2 = σ_loc=Sb(E) — this is called primary horizontal fragmentation.

Fragmentation
- Horizontal
  - Primary: depends on local attributes of R.
  - Derived: depends on a foreign relation.
- Vertical

Three common horizontal fragmentation techniques:
- Round robin
- Hash partitioning
- Range partitioning

Round robin: tuples t1, t2, t3, t4, t5, ... of R are assigned to disks D0, D1, D2 in turn.
- Evenly distributes data; good for scanning the full relation.
- Not good for point or range queries.
- Not suitable for databases distributed over a WAN.

Hash partitioning: tuple ti with key ki goes to disk h(ki), e.g., h(k1)=2 sends t1 to D2, h(k2)=0 sends t2 to D0.
- Good for point queries on the key; also for joins on the key.
- Not good for range queries, or for point queries not on the key.
- If the hash function is good, the distribution is even.
- Not suitable for databases distributed over a WAN.

Range partitioning: tuples of R are assigned to D0, D1, D2 by comparing their value of attribute A against a
partitioning vector (V0=4, V1=7): t3 (A=2) goes to D0, t4 (A=3) goes to D0, t1 (A=5) goes to D1, t2 (A=8) goes to D2.
- Good for point queries on A; also for joins on A.
- Good for some range queries on A.
- Need to select a good vector, else the data is unbalanced: data skew and execution skew.

Which are good fragmentations?
Example: F = { F1, F2 } with F1 = σ_sal<10(E), F2 = σ_sal>20(E).
Problem: some tuples are lost (those with 10 ≤ sal ≤ 20)!

Second example: F = { F3, F4 } with F3 = σ_sal<10(E), F4 = σ_sal>5(E).
Tuples with 5 < sal < 10 are duplicated...

A better design: F = { F5, F6, F7 } with F5 = σ_sal≤5(E), F6 = σ_5<sal<10(E), F7 = σ_sal≥10(E).
Then replicate F6 if convenient (part of the allocation problem).

Desired properties for fragmentation R → F = {F1, F2, ..., Fn}:
- Completeness: for every data item x ∈ R there exists Fi ∈ F such that x ∈ Fi.
- Disjointness: if x ∈ Fi, there is no Fj, i ≠ j, such that x ∈ Fj.
- Reconstruction: there is a function g such that R = g(F1, F2, ..., Fn).

Desired properties for horizontal fragmentation R → F = {F1, F2, ..., Fn}:
- Completeness: for every tuple t ∈ R there exists Fi ∈ F such that t ∈ Fi.
- Disjointness: if t ∈ Fi, there is no Fj, i ≠ j, such that t ∈ Fj.
- Reconstruction can safely be ignored: completeness gives R = F1 ∪ F2 ∪ ... ∪ Fn.

How do we get completeness and disjointness?
(1) Check it "manually"! E.g., F1 = σ_sal<10(E); F2 = σ_sal≥10(E).
(2) "Automatically" generate fragments with these properties. Horizontal fragments are defined by selection predicates, so generate a set of selection predicates with the desired properties.

Example of generation. Say the queries use the predicates A<10, A>5, Loc=SA, Loc=SB.
Next: generate "minterm" predicates, then eliminate the useless ones.
Given simple predicates Pr = { p1, p2, ...,
pn }, the minterm predicates are of the form p1* ∧ p2* ∧ ... ∧ pn*, where pk* is pk or ¬pk.

Minterm predicates (part I):
(1) A<10 ∧ A>5 ∧ Loc=SA ∧ Loc=SB
(2) A<10 ∧ A>5 ∧ Loc=SA ∧ ¬(Loc=SB)
(3) A<10 ∧ A>5 ∧ ¬(Loc=SA) ∧ Loc=SB
(4) A<10 ∧ A>5 ∧ ¬(Loc=SA) ∧ ¬(Loc=SB)
(5) A<10 ∧ ¬(A>5) ∧ Loc=SA ∧ Loc=SB
(6) A<10 ∧ ¬(A>5) ∧ Loc=SA ∧ ¬(Loc=SB)
(7) A<10 ∧ ¬(A>5) ∧ ¬(Loc=SA) ∧ Loc=SB
(8) A<10 ∧ ¬(A>5) ∧ ¬(Loc=SA) ∧ ¬(Loc=SB)

Minterm predicates (part II):
(9)  ¬(A<10) ∧ A>5 ∧ Loc=SA ∧ Loc=SB
(10) ¬(A<10) ∧ A>5 ∧ Loc=SA ∧ ¬(Loc=SB)
(11) ¬(A<10) ∧ A>5 ∧ ¬(Loc=SA) ∧ Loc=SB
(12) ¬(A<10) ∧ A>5 ∧ ¬(Loc=SA) ∧ ¬(Loc=SB)
(13) ¬(A<10) ∧ ¬(A>5) ∧ Loc=SA ∧ Loc=SB
(14) ¬(A<10) ∧ ¬(A>5) ∧ Loc=SA ∧ ¬(Loc=SB)
(15) ¬(A<10) ∧ ¬(A>5) ∧ ¬(Loc=SA) ∧ Loc=SB
(16) ¬(A<10) ∧ ¬(A>5) ∧ ¬(Loc=SA) ∧ ¬(Loc=SB)

Final fragments:
F2:  5 < A < 10 ∧ Loc=SA
F3:  5 < A < 10 ∧ Loc=SB
F6:  A ≤ 5 ∧ Loc=SA
F7:  A ≤ 5 ∧ Loc=SB
F10: A ≥ 10 ∧ Loc=SA
F11: A ≥ 10 ∧ Loc=SB

Note: the elimination of useless fragments depends on application semantics. E.g., if Loc could be ≠ SA and ≠ SB, we need to add the fragments
F4:  5 < A < 10 ∧ Loc ≠ SA ∧ Loc ≠ SB
F8:  A ≤ 5 ∧ Loc ≠ SA ∧ Loc ≠ SB
F12: A ≥ 10 ∧ Loc ≠ SA ∧ Loc ≠ SB

Why does this algorithm work? We must prove that the set of fragments is complete and disjoint.

Summary: given simple predicates Pr = { p1, p2, ..., pn }, the minterm predicates are M = { m | m = ∧ pk*, pk ∈ Pr }, where pk* is pk or ¬pk. The fragments σm(R) for all m ∈ M are complete and disjoint.
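The generation step above is mechanical. A sketch that enumerates all 2^n minterms over the simple predicates and checks the completeness/disjointness claim on a small relation (the predicates are modelled as Python functions and the sample domain is illustrative):

```python
from itertools import product

# Simple predicates over a tuple (A, Loc), as in the slides.
predicates = [
    lambda t: t[0] < 10,       # A < 10
    lambda t: t[0] > 5,        # A > 5
    lambda t: t[1] == "SA",    # Loc = SA
    lambda t: t[1] == "SB",    # Loc = SB
]

def minterms(preds):
    """All 2^n conjunctions p1* ∧ ... ∧ pn*, where pk* is pk or ¬pk."""
    for signs in product([True, False], repeat=len(preds)):
        yield lambda t, s=signs: all(p(t) == want
                                     for p, want in zip(preds, s))

# Every tuple satisfies exactly one sign assignment, so the fragments
# σ_m(R) are complete and disjoint by construction:
R = [(a, loc) for a in (3, 7, 12) for loc in ("SA", "SB", "SC")]
ms = list(minterms(predicates))
for t in R:
    assert sum(m(t) for m in ms) == 1
```

Unsatisfiable minterms (e.g., Loc=SA ∧ Loc=SB) simply match no tuple; eliminating them, as the slides note, requires application semantics.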
Distributed commit problem
A transaction T performs actions a1, a2 at one site, a3 at a second, and a4, a5 at a third. The commit must be atomic.

Commit must be atomic despite:
- site failures
- communication failures
- network partitions
- timeout failures

Solution: an atomic commit protocol (ACP) must ensure that despite failures, once all failures are repaired, the transaction commits or aborts at all sites. The most common ACP is two-phase commit (2PC): centralized 2PC, distributed 2PC, linear 2PC, and many other variants...

Terminology
- Resource managers (RMs): usually databases.
- Participants: RMs that did work on behalf of the transaction.
- Coordinator: the component that runs two-phase commit on behalf of the transaction.

States of the transaction
At the coordinator:
- Initiated (I): the transaction is known to the system.
- Preparing (P): the prepare message has been sent to the participants.
- Committed (C): it has committed.
- Aborted (A): it has aborted.
At a participant:
- Initiated (I)
- Prepared (P): prepared to commit, if the coordinator so desires.
- Committed (C)
- Aborted (A)

Protocol database
The coordinator maintains a protocol database (in main memory) for each transaction. The protocol database
- enables the coordinator to execute 2PC;
- answers inquiries by participants about the status of the transaction (cohorts may make such inquiries if they fail during recovery).
The entry for a transaction is deleted when the coordinator is sure that no one will ever inquire about the transaction again (when it has been acked by all the participants).

Two-phase commit (messages)
Coordinator: I → P on commit-request (sends request-prepare to all); P → C when all reply prepared (sends commit to all); P → A if any replies no (sends abort to all); C or A → forgotten (F) once acks arrive from everyone.
Participant: I → P on request-prepare (replies prepared), or I → A (replies no); P → C on commit (sends ack); P → A on abort (sends ack).
(Notation in the state diagrams: incoming message over outgoing message; * = everyone.)

When a participant enters the P state, it must have acquired all resources, and it can only abort or commit if so instructed by a coordinator. The coordinator only
enters the C state if all participants are in P, i.e., it is certain that all will eventually commit.

Two-phase commit — normal actions (coordinator)
- Make an entry for the transaction in the protocol database, marking its status as initiated, when the coordinator first learns about the transaction.
- Add each participant to the cohort list in the protocol database as the coordinator learns about the cohorts.
- Change the status of the transaction to preparing before sending the prepare message (it is assumed that the coordinator knows about all the participants before this step).
- On receipt of a PREPARED message from a cohort, mark the cohort as PREPARED. If all cohorts are PREPARED, change the status to COMMITTED and send COMMIT messages. A commit log record must be forced to disk before sending the commit messages.
- On receipt of an ACK message from a cohort, mark the cohort as ACKED. When all cohorts have acked, delete the transaction's entry from the protocol database. A completed log record must be written to disk before deletion from the protocol database, but there is no need to force the write.

Two-phase commit — normal actions (participant)
- On receipt of the PREPARE message, write a PREPARED log record before sending the PREPARED message; this record must be forced to disk, since the coordinator may now commit.
- On receipt of the COMMIT message, write a COMMIT log record before sending the ACK to the coordinator. The cohort must ensure the log is forced to disk before sending the ack, but there is no great urgency for doing so.

Timeout actions
At various stages of the protocol, the transaction waits for messages at both the coordinator and the participants. If a message is not received, on timeout a timeout action is executed.

Coordinator timeout actions:
- Waiting for the votes of the participants: ABORT the transaction and send aborts to all.
- Waiting for an ack from some participant: forward the transaction to a recovery process that periodically resends COMMIT to the participant.
When the participant recovers and all participants have sent an ACK, the coordinator writes a completion log record and deletes the entry from the protocol database.

Cohort timeout actions:
- Waiting for prepare: abort the transaction and send an abort message to the coordinator. (Alternatively, it could wait for the coordinator to ask for the prepare.)
- Waiting for the decision: forward the transaction to the recovery process, which executes a status-transaction call to the coordinator. Such a transaction is blocked until the failure is repaired. The participant could instead use a different termination protocol, e.g., polling the other participants (cooperative termination).

2PC is blocking
Sample scenario: the coordinator and P1 are down; P2, P3, P4 are in state W (prepared).
- Case I: P1 voted yes (entered W) and the coordinator sent commits, so P1 may be in C.
- Case II: P1 voted NO, so P1 is in A.
P2, P3, P4 (the surviving participants) cannot safely abort or commit the transaction.

Recovery actions (cohort)
- All sites execute the REDO-UNDO pass.
- Detection: a site knows it is a cohort if it finds a prepared log record for a transaction.
- If the log does not contain a commit log record: reacquire all locks for the transaction and ask the coordinator for the status of the transaction.
- If the log contains a commit log record: do nothing.

Recovery actions (coordinator)
If the protocol database were made fault-tolerant by logging every change, we could simply reconstruct it and restart 2PC from the point of failure. However, since we have only logged the commit and completion transitions and nothing else:
- If the log does not contain a commit record: simply abort the transaction. If a cohort asks for the status in the future, the transaction will not be in the protocol database and will be considered aborted.
- If there is a commit log record but no completion log record: recreate the transaction's entry as committed in the protocol database; the recovery process will ask all the participants whether they are still waiting for a commit message.
If no one is waiting, the completion entry is written.
- If there are both a commit log record and a completion log record: do nothing.

2PC analysis
Count the number of messages, log writes, and forced log writes. Normal-processing overhead:
- Coordinator: 2 log writes (commit/abort, complete), 1 of them forced, plus 2 messages per cohort.
- Cohort: 2 log writes, both forced (prepared, committed/aborted), plus 2 messages to the coordinator.

Presumed-abort optimization: if there is no entry in the protocol database, the transaction is presumed to have aborted. If a transaction aborts, delete its entry from the protocol database: no log record is written and no ACKs are required from the cohorts, since the absence of the transaction from the protocol database means the same as abort.

Variants of 2PC
- Linear: the sites form a chain; "ok" votes flow forward and "commit" messages flow back.
- Hierarchical: the sites form a tree rooted at the coordinator.
- Distributed: nodes broadcast all messages, so every node knows when to commit.

Cooperative termination protocol
Bad case: participant P recovers from a failure, has a prepared record for transaction T but no commit or abort record for T, and the coordinator is down. Participant P is blocked until the coordinator recovers.
But perhaps some other participant can help? This requires that the participants "know" each other!
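The normal-case flow just analyzed can be condensed into a toy simulation (in-memory "sites" only; a real implementation would force the indicated log records to disk and exchange real messages):

```python
def two_phase_commit(votes):
    """Toy centralized 2PC.  `votes` maps participant name -> True (yes)
    or False (no).  Returns the decision, each participant's final state,
    and the sequence of log records written along the way."""
    log = []                                   # stand-in for the stable log
    # Phase 1: coordinator sends request-prepare and collects votes.
    prepared = {p for p, yes in votes.items() if yes}
    for p in sorted(prepared):
        log.append((p, "prepared"))            # forced at the participant
    # Phase 2: commit only if *every* participant voted yes.
    decision = "commit" if prepared == set(votes) else "abort"
    log.append(("coordinator", decision))      # forced before sending COMMIT
    states = {p: decision for p in votes}      # all sites reach the decision
    log.append(("coordinator", "complete"))    # written after all ACKs
    return decision, states, log

decision, states, _ = two_phase_commit({"P1": True, "P2": True, "P3": False})
assert decision == "abort" and set(states.values()) == {"abort"}
```

One "no" vote aborts everyone — this is the all-or-nothing property the log records and acks exist to preserve across failures.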
Cooperative termination protocol
- Participant P sends a DECISION-REQUEST message to the other participants.
- Alive participants respond with COMMIT, ABORT, or UNCERTAIN.
- If any participant replies with a decision (COMMIT or ABORT), P acts on the decision and sends the decision to the UNCERTAIN participants.

When P receives a DECISION-REQUEST:
- If it knows the decision, it responds with COMMIT or ABORT.
- If it has not prepared the transaction, it responds ABORT.
- If it is prepared but does not know the decision, it responds UNCERTAIN.

Sample scenarios (coordinator down):
- P1 in C, P2 and P3 in W: P2 and P3 learn the decision from P1 and commit.
- P1 and P2 in W, P3 in A: P1 and P2 learn the decision from P3 and abort.
- P1, P2, P3 all in W: everyone is uncertain, so the protocol still blocks.

Is there a non-blocking protocol?
Theorem: if communication failures or total site failures (i.e., all sites down simultaneously) are possible, then every atomic commit protocol may cause processes to become blocked. Two exceptions:
- If we ignore communication failures, such a protocol can be designed (Skeen et al. 83).
- If we impose some restrictions on transactions (i.e., on what data they can read/write), such a protocol can also be designed (Mehrotra et al. 92).

Next: three-phase commit (3PC) — nonblocking given a reliable network (no communication failures) and no total site failures — and handling communication failures.

Why does 2PC block? Because an operational site that times out in the prepared state does not know whether the failed site(s) committed or aborted the transaction. Polling all operational sites does not work, since all the operational sites might be in doubt.

Approach to making an ACP non-blocking
For a given state S of a transaction T in the ACP, let the concurrency set of S be the set of states that other sites could be in.
For example, in 2PC the concurrency set of the PREPARED state is {PREPARED, ABORT, COMMIT}. To develop a non-blocking protocol, we ensure that:
- The concurrency set of a state does not contain both a commit and an abort.
- There exists no non-committable state whose concurrency set contains a commit. (A state is committable if occupancy of the state by any site implies that everyone has voted to commit the transaction.)
The necessity of these conditions is illustrated by considering a situation with only one site operational: if either condition is violated, there will be blocking. Sufficiency is illustrated by designing a termination protocol that terminates the protocol correctly when the conditions hold.

Three-phase commit
Sample scenario: the coordinator is down; P1, P2, P3 are in W.
3PC inserts a buffering phase between the uncertainty period and commit: the coordinator and participant state diagrams gain a precommit state, and each transition is logged:
- Coordinator logs a start-3PC record (with the participant list) and later a commit record (state C).
- Participant logs a prepared record (state W) and later a committed record (state C).

3PC principle: if ANY operational site is in the "uncertain" state, NO site (operational or failed) could have decided to commit. (Reminder: assume a reliable network.)

Timeout actions (from the state diagrams):
- Coordinator: on timeout waiting for votes, abort; on timeout waiting for acks, ignore.
- Participant: on timeout waiting for prepare, abort; on later timeouts, run the termination protocol.

Process categories — three categories:
- Operational: the process has been up since the start of 3PC.
- Failed: the process has halted since the start of 3PC, or is recovering.
- Recovered: a process that failed and has completed recovery.

Three-phase commit — termination protocol
- Choose a backup coordinator from the remaining operational sites.
- The backup coordinator sends messages to the other operational sites asking them to make the transition to its local state (or to find out that such a transition is not feasible) and waits for the responses.
Based on the responses as well as its local state, it continues to commit or abort the transaction: it commits if its concurrency set includes a commit state, else it aborts.

Termination protocol
- Only operational processes participate in the termination protocol. Recovered processes wait until the decision is reached and then learn the decision.
- Participant states during termination: Abortable (A), Uncertain (U), Precommitted (PC), Committed (C).
- Steps: elect a new coordinator (using an election protocol — coming soon...); the new coordinator sends STATE-REQUEST to the participants, makes the decision using the termination rules, and communicates it to the participants.

Sample scenarios (coordinator down):
- P1, P2, P3 all in W: no one has precommitted, so by the 3PC principle no one could have committed; the new coordinator safely aborts.
- P1 and P2 in W, P3 in PC: someone has precommitted, so everyone voted yes; the new coordinator commits.

Note: 3PC is unsafe with communication failures! If the network partitions with uncertain sites (W, W, W) on one side and precommitted sites (P, P) on the other, one side may abort while the other commits.

After the coordinator receives the DONE messages, it can forget about the transaction, e.g., clean up its control structures.
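The backup coordinator's termination rule above reduces to a small decision function (states abbreviated as in the slides; a sketch of the rule only, not the election or STATE-REQUEST exchange, and it assumes the reliable network the slides require):

```python
def terminate_3pc(states):
    """Decide for the operational sites, given their reported states:
    'A' abortable, 'U' uncertain, 'PC' precommitted, 'C' committed.
    Commit iff the concurrency set includes a commit(table) state."""
    if "C" in states or "PC" in states:
        return "commit"   # a (pre)committed site implies everyone voted yes
    return "abort"        # all abortable/uncertain: no one could have committed

assert terminate_3pc(["U", "U", "U"]) == "abort"     # all uncertain
assert terminate_3pc(["U", "U", "PC"]) == "commit"   # one precommitted
```

The first assertion is exactly where 2PC would block: with every survivor uncertain, the 3PC principle guarantees no failed site committed, so aborting is safe.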
