Kushal Data Warehousing PPT


Published on March 6, 2014

Author: KushalSingh7

Source: slideshare.net


In computing, a data warehouse (DW or DWH), also called an enterprise data warehouse (EDW), is a database used for reporting and data analysis. It is a central repository created by integrating data from one or more disparate sources. Data warehouses store current and historical data and are used to produce trend reports for senior management, such as annual and quarterly comparisons.

By Kushal Singh, Acute Informatics Pvt

What is Business Intelligence?
BI stands for Business Intelligence: bringing the right information at the right time to the right people in the right format.

What is Data Warehousing?
A data warehouse is a subject-oriented, integrated, non-volatile, and time-variant collection of data in support of management's decisions.


The architecture
Typical architecture of a data warehouse (diagram), with these components:
• Operational data: operational data source 1, operational data source 2, ..., operational data source n, and the operational data store (ODS)
• Managers: Load Manager, Warehouse Manager, Query Manager
• Data held in the DBMS: meta-data, detailed data, lightly summarized data, highly summarized data, archive/backup data
• End-user access tools: reporting, query, application development, and EIS (executive information system) tools; OLAP (online analytical processing) tools; data mining tools

The benefits of data warehousing
The potential benefits of data warehousing are:
• high returns on investment
• substantial competitive advantage
• increased productivity of corporate decision-makers

Data Warehouse Characteristics
Key characteristics of a data warehouse:
• Subject-oriented
• Integrated
• Time-variant
• Non-volatile

Subject Oriented
Example for an insurance company:
• Applications area: Auto and Fire Policy Processing Systems, Commercial and Life Insurance Systems, Accounting System, Billing System, Claims Processing System
• Data warehouse (data organized by subject): Policy, Customer, Losses, Premium

Integrated
• Data is stored once in a single integrated location.
• Example for an insurance company: customer data is stored in several databases (Auto Policy Processing System; Fire Policy Processing System; FACTS, LIFE, Commercial, and Accounting Applications). In the data warehouse database it is brought together under one subject: Customer.

Time-Variant
• Data is stored as a series of snapshots or views that record how it was collected across time.
• Key: data is tagged with some element of time, such as a creation date or an as-of date.
• Data is available online for long periods of time (for example, five or more years) for trend analysis and forecasting.

Non-Volatile
• Existing data in the warehouse is not overwritten or updated.
• Production databases (used by production applications) support Update, Insert, and Delete.
• The data warehouse database (fed from production databases and external sources) supports only Load and Read-Only access.

Comparison of OLTP systems and data warehousing systems

OLTP systems
• Hold current data
• Store detailed data
• Data is dynamic
• Repetitive processing
• High level of transaction throughput
• Predictable pattern of usage
• Transaction-driven
• Application-oriented
• Support day-to-day decisions
• Serve a large number of clerical/operational users

Data warehousing systems
• Hold historical data
• Store detailed, lightly summarized, and highly summarized data
• Data is largely static
• Ad hoc, unstructured, and heuristic processing
• Medium to low level of transaction throughput
• Unpredictable pattern of usage
• Analysis-driven
• Subject-oriented
• Support strategic decisions
• Serve a relatively low number of managerial users

OLTP Online Transaction Processing

Online Transaction Processing
• What is a transaction?
  – A logical unit of work
  – Examples: withdrawing money from a bank account, booking a seat on an airline

Transactions
• A transaction is a unit of program execution that accesses and possibly updates various data items.
• A transaction is a logical unit of work that performs some useful function for a user.
• At the end of the transaction the system must be either:
  – in its prior state (if the transaction failed), or
  – in a state that reflects the successful completion (if the transaction succeeded).
• May take a database from one consistent

Characteristics of Transactions (ACID)
• Atomicity
• Consistency
• Isolation
• Durability
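To make the "prior state on failure, new state on success" rule concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `account` table, the `transfer` function, and the balances are illustrative assumptions, not part of the original slides.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one logical unit of work."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

# Hypothetical data: two accounts in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)       # succeeds: both updates are committed together
failed = transfer(conn, 1, 2, 500)  # fails: both updates are rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After the failed transfer the balances are the same as before it started, which is the atomicity property: either both updates happen or neither does.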

OLAP Online Analytical Processing

Types of OLAP
• ROLAP (Relational Online Analytical Processing)
• MOLAP (Multidimensional Online Analytical Processing)
• HOLAP (Hybrid Online Analytical Processing)

ROLAP
• ROLAP (Relational Online Analytical Processing)
• Used for reporting
• Tools: Report Studio

MOLAP
• MOLAP (Multidimensional Online Analytical Processing)
• Used to build cubes
• Tools: PowerPlay, Transformer

HOLAP
• HOLAP (Hybrid Online Analytical Processing)
• Used for data modeling
• Supports both MOLAP and ROLAP
• Tools: Framework Manager, Query Studio

Dimensions
• A dimension is descriptive information about a measure, such as product, location, or customer.

Types of Dimensions
• Conformed Dimensions
• Degenerate Dimensions
• Junk Dimensions

Facts
• A fact table contains measures and IDs (keys that reference the dimensions).
• Examples of measures: revenue, cost, amount

Measure Types
• Additive measures: can be added across all dimensions
• Non-additive measures: cannot be added across any dimension
• Semi-additive measures: can be added across some dimensions but not across others
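The three measure types can be illustrated with a small sketch. The sample rows below are hypothetical: revenue is treated as additive, an account balance as semi-additive (it can be summed across accounts but not across days), and a margin ratio as non-additive.

```python
# Hypothetical sales rows: (day, product, revenue, cost)
sales = [
    ("2014-03-01", "P1", 50, 40),
    ("2014-03-02", "P1", 100, 50),
]
# Hypothetical balance snapshots: (day, account, balance)
snapshots = [
    ("2014-03-01", "A", 100), ("2014-03-01", "B", 200),
    ("2014-03-02", "A", 150), ("2014-03-02", "B", 200),
]

# Additive: revenue can be summed across every dimension (day, product, ...).
total_revenue = sum(revenue for _, _, revenue, _ in sales)
total_cost = sum(cost for _, _, _, cost in sales)

# Semi-additive: balances can be summed across accounts for a single day...
balance_on_day2 = sum(b for day, _, b in snapshots if day == "2014-03-02")
# ...but summing one account across days is meaningless, so take the
# latest snapshot instead (ISO dates sort chronologically as strings).
latest_balance_a = max(
    (day, b) for day, account, b in snapshots if account == "A"
)[1]

# Non-additive: a ratio cannot be summed; recompute it from additive parts.
row_margins = [(revenue - cost) / revenue for _, _, revenue, cost in sales]
overall_margin = (total_revenue - total_cost) / total_revenue
```

Here the sum of the per-row margins differs from `overall_margin`, which is why ratios are usually stored as their additive numerator and denominator and divided at query time.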


Star Schema

Dimension Tables

Region_Dimension_Table
region_id   region_doc
NE          Northeast
NW          Northwest
SE          Southeast
SW          Southwest

Product_Dimension_Table
prod_grp_id   prod_id   prod_grp_desc    prod_desc
10            100       Fewer devices    Power supply
20            140       Circuit boards   Motherboard
30            220       Components       Co-processor

Account_Dimension_Table
account_id   account_doc
100000       ABC Electronics
110000       Midway Electric
120000       Victor Components
130000       Washburn, Inc.
140000       Zerox

Time_Dimension_Table
month     mo_in_fiscal_yr   month_name
01-1996   4                 January
02-1996   5                 February
03-1996   6                 March

Vendor_Dimension_Table
vend_id   vendor_desc
100       PowerAge, Inc.
200       Advanced Micro Devices
300       Farad Incorporated

Fact Table

Monthly_Sales_Summary_Table
month     prod_id   region_id   account_id   vend_id   net-sales   gross_sales
01-1996   100       SW          100000       100       30,000      50,000
02-1996   140       NE          110000       200       23,000      42,000
03-1996   220       SW          100000       300       32,000      49,000
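A star schema is queried by joining the fact table to whichever dimension tables the report needs. The sketch below loads the slide's fact rows and region dimension into an in-memory SQLite database and totals sales by region. The lowercase table names and the `net_sales` spelling (underscore instead of the slide's hyphen) are adaptations made so the SQL is valid, not the author's names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE region_dimension (
    region_id  TEXT PRIMARY KEY,
    region_doc TEXT
);
CREATE TABLE monthly_sales_summary (
    month       TEXT,
    prod_id     INTEGER,
    region_id   TEXT REFERENCES region_dimension(region_id),
    account_id  INTEGER,
    vend_id     INTEGER,
    net_sales   INTEGER,
    gross_sales INTEGER
);
""")
conn.executemany(
    "INSERT INTO region_dimension VALUES (?, ?)",
    [("NE", "Northeast"), ("NW", "Northwest"),
     ("SE", "Southeast"), ("SW", "Southwest")],
)
conn.executemany(
    "INSERT INTO monthly_sales_summary VALUES (?, ?, ?, ?, ?, ?, ?)",
    [("01-1996", 100, "SW", 100000, 100, 30000, 50000),
     ("02-1996", 140, "NE", 110000, 200, 23000, 42000),
     ("03-1996", 220, "SW", 100000, 300, 32000, 49000)],
)
conn.commit()

# Join the fact table to one dimension and aggregate the measures.
sales_by_region = conn.execute("""
    SELECT r.region_doc, SUM(f.net_sales), SUM(f.gross_sales)
    FROM monthly_sales_summary AS f
    JOIN region_dimension AS r ON r.region_id = f.region_id
    GROUP BY r.region_doc
    ORDER BY r.region_doc
""").fetchall()
```

Regions with no sales (Northwest and Southeast in this data) do not appear in the result because the query uses an inner join; a LEFT JOIN starting from the dimension table would include them with NULL totals.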


Factless Fact Table
• A factless fact table contains no measures; it acts as a bridge that joins dimension tables.
• It can only record that an event occurred.
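As a rough illustration, a factless fact table for class attendance might hold only dimension keys, with analysis done by counting rows. The students, classes, and attendance rows below are hypothetical.

```python
from collections import Counter

# Hypothetical dimension tables.
students = {1: "Asha", 2: "Ravi"}
classes = {10: "SQL Basics", 20: "ETL Design"}

# Factless fact table: each row is (student_id, class_id, date) and
# carries no measure. The existence of the row is the recorded event.
attendance = [
    (1, 10, "2014-03-03"),
    (2, 10, "2014-03-03"),
    (1, 20, "2014-03-04"),
]

# With no measure to sum, questions are answered by counting events.
attendance_per_class = Counter(classes[class_id] for _, class_id, _ in attendance)
classes_per_student = Counter(students[student_id] for student_id, _, _ in attendance)
```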

SCD (Slowly Changing Dimensions)
• Type 0
• Type 1
• Type 2
• Type 3
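The slide only names the types. Under the commonly used definitions (assumed here, since the slide does not spell them out), Type 0 keeps the original value, Type 1 overwrites it, Type 2 adds a new row per version, and Type 3 adds a column holding the previous value. A minimal sketch with a hypothetical customer dimension:

```python
from copy import deepcopy

def scd_type1(rows, key, attr, new_value):
    """Type 1: overwrite the attribute in place; no history is kept."""
    rows = deepcopy(rows)
    for row in rows:
        if row["customer_id"] == key:
            row[attr] = new_value
    return rows

def scd_type2(rows, key, attr, new_value, change_date):
    """Type 2: expire the current row and append a new current version."""
    rows = deepcopy(rows)
    for row in rows:
        if row["customer_id"] == key and row["is_current"]:
            row["is_current"] = False
            row["valid_to"] = change_date
            rows.append({**row, attr: new_value, "valid_from": change_date,
                         "valid_to": None, "is_current": True})
            break
    return rows

def scd_type3(rows, key, attr, new_value):
    """Type 3: keep only the previous value, in an extra column."""
    rows = deepcopy(rows)
    for row in rows:
        if row["customer_id"] == key:
            row["previous_" + attr] = row[attr]
            row[attr] = new_value
    return rows

# Hypothetical dimension with one customer who moves from Pune to Mumbai.
# Type 0 would simply leave this row unchanged.
dim = [{"customer_id": 1, "city": "Pune", "valid_from": "2013-01-01",
        "valid_to": None, "is_current": True}]

t1 = scd_type1(dim, 1, "city", "Mumbai")
t2 = scd_type2(dim, 1, "city", "Mumbai", "2014-03-06")
t3 = scd_type3(dim, 1, "city", "Mumbai")
```

Type 2 is the only variant here that preserves full history, at the cost of multiple rows per customer, so fact rows typically reference a surrogate key per version rather than `customer_id` alone.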

ETL (Extract, Transform, and Load)
• Tools: Informatica

Designing FRAMEWORK MANAGER Relational Database & DMR

