advertisement

Tut0 Assumed 2008 1 w3

100 %
0 %
advertisement
Information about Tut0 Assumed 2008 1 w3
Education

Published on May 2, 2008

Author: Desiderio

Source: authorstream.com

advertisement

Tutorial 0: Assumed Knowledge:  Tutorial 0: Assumed Knowledge INFS3200/INFS7907 Advanced Database Systems Semester 1, 2008 Whom I am:  Whom I am Name: Kwok Cheung E-mail: kwokc@itee.uq.edu.au Assessment:  Assessment Quiz 20% Group assignment 20% groups of up to 3 people Final Exam 60% (You must pass the final exam to pass this course (i.e. to obtain at least 50%)) How can I help, mates?:  How can I help, mates? Tutorial Sessions Consultations Each two scheduled sessions for the quiz, assignment and final exam By appointment Please, Please, Please do not turn up in my office without notification! Question 1:  Question 1 Quantitative Estimation 2004 Olympic Games: 11,099 athletes 202 countries 301 events in 28 sports 15 days of competition. There have been 26 earlier Olympic games. Q1a How many records are there in the current and historical athlete tables? How many bytes?:  Q1a How many records are there in the current and historical athlete tables? How many bytes? Assumption: 1 kilobytes for each athlete record. on average each of these games was HALF the size of the 2004 Olympics in all dimensions. The Total amount of data is 154 MB 2004 Olympic Game: 11,099 (rounded up to 11,000) records have 11 MB (11,000 *1KB ÷ 1024 KB) approximately. 26 earlier Games: 11,000/2*26 records have 143 MB approximately. Q1b(1) How many records are there in the current and historical results and medals tables? How many bytes?:  Q1b(1) How many records are there in the current and historical results and medals tables? How many bytes? Result Table Schemas Result (EventID, AthleteID, Place) The Result table is for event results. Assumption: Each athlete has 5 events on average. Athletes, events and places are identified by an 8 byte identifier. The size of each record in Result table is 24 bytes (rounded up to 20 bytes). Q1b(2) How many records are there in the current and historical results and medals tables? How many bytes?:  Q1b(2) How many records are there in the current and historical results and medals tables? How many bytes? The total records in the current and historical result tables are 15 MB. 2004 Olympic Game There are around 55,000 results (11,000 athletes*5 events/athlete). The size of the current results is around 1 megabytes (55,000 results*20 bytes/result). The earlier Olympic Games There were around 715,000 results (55,000 results*26/2). The size of the historic results is around 14 megabytes (715,000 results*20 bytes/result). Q1b(3) How many records are there in the current and historical results and medals tables? How many bytes?:  Q1b(3) How many records are there in the current and historical results and medals tables? How many bytes? Medal Table Schema Medal (EventID, AthleteID, Medal) The Medal table is for the top three positions for each event. Obviously Gold, Silver and Bronze correspond to top one, two and three places respectively. As a result, the Medal table is a proper subset of the Result table. The Medal table is an additional dataset, isn’t it ? Not really, we can have a Medal table by View Q1b(1) How many records are there in the current and historical results and medals tables? How many bytes?:  Q1b(1) How many records are there in the current and historical results and medals tables? How many bytes? The View for the Medal CREATE VIEW Medal (EventID, AthleteID, Medal) AS SELECT EventID, AthleteID, “Gold” FROM Result WHERE Place=‘1’ UNION SELECT EventID, AthleteID, “Silver” FROM Result WHERE Place=‘2’ UNION SELECT EventID, AthleteID, “Bronze” FROM Result WHERE Place=‘3’ Q1c How many updates would there be to the current results and medals tables during the 2004 Olympics? How many updates per hour at peak rate?:  Q1c How many updates would there be to the current results and medals tables during the 2004 Olympics? How many updates per hour at peak rate? 2004 Olympic Game: Assumption: 10 hours per day The peak rate of updates is three times the average rate The number of events per hour: 2 events (300 events ÷ 15 days ÷ 10 hour) The number of events per peak hour: 6 events (2 events * 3), i.e. one event per 10 minute. The number of results per event: around 180 results (55,000 results/300 events) The number of the result updates per minute at peak time: 18 results (180 results/10 minute) i.e. Each update takes around 3 seconds Q1d How many queries per second would you expect at the peak rate?:  Q1d How many queries per second would you expect at the peak rate? Assumption: There are 1 billion (109) web users checking the Olympics website each day. The peak rate is 10 times the average rate. The number of queries per second: a little more than 104 [109 queries/(24hours *60 minutes*60seconds)] The number of queries per second at peak rate: around 105 (104*10) Q1e Consider the computers on silde 8 of Lecture 2 as small, medium and large. Which computer would handle the update (at peak rate) of Q1c? Would any of them handle the queries (at peak rate) of Q1d?:  Q1e Consider the computers on silde 8 of Lecture 2 as small, medium and large. Which computer would handle the update (at peak rate) of Q1c? Would any of them handle the queries (at peak rate) of Q1d? Assumption: Each update or query takes 10,000 steps. According to Q1c answer, there are 18-result updates per minute at peak rate, or 3,000 steps/sec (18 * 10,000 steps/update divided by 60) According to Q1d answer, there are 105 queries per second at peak rate, or 109 steps/sec (105 * 10,000 steps) Basic Processes:  Basic Processes Executing a query has a cost, typically measured in time Query executed on a computer, which can perform a certain number of steps per second (processing capacity) Computers differ in processing capacity 100,000 steps/sec, 1 million steps/sec, 10 million steps/sec More capable computers cost more 1 million steps/sec costs say 20 x 100,000 steps/sec 10 million steps/sec costs say 1000 x 100,000 steps/sec So faster computers cost more per step If cost of step on 100,000 step/sec processor = 1 Then step on 1 million s/s costs 2 And step on 10 million s/s costs 10 Query Processing Database Transactions Concurrency Control Question 1e (continues):  Question 1e (continues) Q1f Consider that 1 gigabyte is a negligible amount of disk storage, From a) and b), is the total amount of data for current and historical games an issue?:  Q1f Consider that 1 gigabyte is a negligible amount of disk storage, From a) and b), is the total amount of data for current and historical games an issue? The total amount of data for current and historical games is 169 MB 154 MB for the athlete records (Q1A) 15 MB for the result and medal records (Q1B) Thus, the total storage is not an issue Question 2:  Question 2 Approach First, you are given the statements in a natural language and required to formulate SQL-queries. Next, you need to identify the tables and attributes involved in the queries. Finally, you need to map the statements into the queries accordingly. Q2a:  Q2a Question Formulate a query in SQL for the from the Australian team in the who have competed in . Schemas: CurrentMedals (EventID, AthleteID, Medal) The 2004 Olympics Medal records HistoricalResults (EventID, AthleteID, Placing) The earlier Games Result records CurrentAthletes (AthleteID, Name, Country) The 2004 Games Athlete records medal winner current Olympics previous Olympics Q2a SQL queries Formulate a query in SQL for the medal winners from the Australian team in the current Olympics who have competed in previous Olympics.:  Q2a SQL queries Formulate a query in SQL for the medal winners from the Australian team in the current Olympics who have competed in previous Olympics. Schemas: CurrentMedals (EventID, AthleteID, Medal) The 2004 Olympics Medal records HistoricalResults (EventID, AthleteID, Placing) The earlier Games Result records CurrentAthletes (AthleteID, Name, Country) The 2004 Games Athlete records Solution: SELECT CA.Name FROM CurrentMedals CM, HistoricalResults HR, CurrentAthletes CA WHERE CM.AthleteID = CA.AthleteID AND CM.AthleteID = HR.AthleteID AND CA.Country = ‘Australia’ Q2b:  Q2b Question: Formulate a query in SQL giving the total number of won by in the . Schemas: CurrentMedals (EventID, AthleteID, Medal) CurrentAthletes (AthleteID, Name, Country) medals current Olympics countries Q2b Formulate a query in SQL giving the total number of medals won by countries in the current Olympics.:  Q2b Formulate a query in SQL giving the total number of medals won by countries in the current Olympics. Schemas: CurrentMedals (EventID, AthleteID, Medal) CurrentAthletes (AthleteID, Name, Country) Solution: SELECT CA.Country, COUNT(*) FROM CurrentMedals CM, CurrentAthletes CA WHERE CM.AthleteID=CA.AthleteID GROUP BY CA.Country Question 3:  Question 3 Question Draw an optimized query tree for the query of 2a). Approach The cheapest operation: Select The cheaper operation: Project The expensive operation: Join The most expensive operation: Cartesian Product SELECT CA.Name FROM CurrentMedals CM, HistoricalResults HR, CurrentAthletes CA WHERE CM.AthleteID = CA.AthleteID AND CM.AthleteID = HR.AthleteID AND CA.Country = ‘Australia’ :  SELECT CA.Name FROM CurrentMedals CM, HistoricalResults HR, CurrentAthletes CA WHERE CM.AthleteID = CA.AthleteID AND CM.AthleteID = HR.AthleteID AND CA.Country = ‘Australia’ CM HR x CM.AID =CA.AID CA CA.Ctry = ‘AU’ CA.Ctry = ‘AU’ ∏AID, Name ∏AID ∏AID CM.AID = HR.AID x ∏Name Q3b What effect does doing the selection Country=‘Australia’ early have on the size of the join following it?:  Q3b What effect does doing the selection Country=‘Australia’ early have on the size of the join following it? CM HR CA CA.Ctry = ‘AU’ ∏AID, Name ∏AID ∏AID ∏Name CM HR CA CA.Ctry = ‘AU’ ∏AID, Name ∏AID ∏AID ∏Name The RHS is far more efficient – The Australian team is perhaps 300 in 11,000, so it reduces the size of the table by more than 97% What’s difference between the query trees ? Question 4 – Two-Phase Locking (2PL) for Concurrency Control:  Question 4 – Two-Phase Locking (2PL) for Concurrency Control 2PL – A transaction is said to follow the 2PL if all locking operations (read_lock, write_lock) precede the first unlock operation in the transaction. Serializability guaranteed by 2PL Phases: First: an expanding or growing phase For example, adding read_lock or write lock, or upgrading from read_lock to write lock. Second: shrinking phase For example, unlock or downgrading from write_lock to read_lock. Q4a Sketch the transaction which updates Tally from the results of a completed event. Include reads, writes, locks and unlocks on rows as resources. Make sure the transaction follows two-phase locking.:  Q4a Sketch the transaction which updates Tally from the results of a completed event. Include reads, writes, locks and unlocks on rows as resources. Make sure the transaction follows two-phase locking. Schema Tally (Country, MedalType, Number) Q4a What might happen if two-phase locking were not used and two events interfered with each other? Show a concrete example.:  Solution One of the updates, say the gold medal count, might be lost Q4a What might happen if two-phase locking were not used and two events interfered with each other? Show a concrete example. Thank You:  Thank You

Add a comment

Related presentations

Related pages

2008 - W3C

Comments are welcome through 1 December 2008. ... and incorporates the already broader role that Mauro has assumed at the Consortium since he was hired in ...
Read more

W3C XML Schema Definition Language (XSD) 1.1 Part 1 ...

http://www.w3.org/TR/2008/WD-xmlschema11-1 ... This specification depends on XML Schema Definition Language 1.1 Part ... bindings are assumed.
Read more

W3 Schools' SQL Tutorial - W3Schools Online Web Tutorials

SQL Tutorial « W3Schools Home. ... W3.CSS Reference Browser Statistics PHP Reference HTML Colors HTML Character Sets jQuery Reference AngularJS Reference.
Read more

Form W-2 Electronic Filing Requirements For Tax Year 2008

IP 2008(17.1) Department of ... (Form CT-W3, Line 1). ... Publication supersedes IP 2008(17), Form W-2 Electronic Filing Requirements for Tax Year 2008.
Read more

On Coincidence and Fixed-Point Theorems in Symmetric Spaces

On Coincidence and Fixed-Point Theorems in ... the authors assumed axiom W3 ,andin 6 the author assumed axioms W3 , ... 9/10/2008 1:52:07 PM ...
Read more

W3C Markup Validation Service -- Ampersand (&) in URL from ...

W3C home > Mailing lists > Public > www-validator@w3.org > June 2008. W3C Markup Validation Service -- Ampersand (&) in ... chapter=1 §ion=2 ...
Read more

XML Syntax Rules - W3Schools Online Web Tutorials

XML HOME XML Introduction XML How to use XML Tree XML Syntax XML Elements XML Attributes XML ... Read more

spec/Overview.html 1.1234 Update CVS server to latest ...

spec/Overview.html 1.1234 Update CVS server to latest ... poot Date: Tue, 12 Aug 2008 18:07 ... is + assumed to be 0..1, ...
Read more

Mixed-effects modeling with crossed random effects for ...

Available online 3 March 2008 ... s2 w2 Long 566 522.2 0 14.2 29.7 0 0.1 s2 w3 Long 577 522 ... Mixed-effects modeling with crossed random effects for ...
Read more

windows - D9024 make unrecognized source file type - Stack ...

D9024 make unrecognized source file type. ... source file type 'C:/MinGW/msys/1.0/W3', object file assumed cl : ... file assumed cl is the MS VS 2008 ...
Read more