Data Warehouse Project Lifecycle
Here is the typical lifecycle for data warehouse deployment project:
Phase-1: Project Scoping and Planning
Project Triangle - Scope, Time and Resource.
* Determine the scope of the project - what you would like to accomplish? This can be defined by questions to be answered. The number of logical star and number of the OLTP sources
* Time - What is the target date for the system to be available to the users
* Resource - What is our budget? What is the role and profile requirement of the resources needed to make this happen.
Phase-2: Requirement
* What are the business questions? How does the answers of these questions can change the business decision or trigger actions.
* What are the role of the users? How often do they use the system? Do they do any interactive reporting or just view the defined reports in guided navigation?
* How do you measure? What are the metrics?
Phase-3: Front-End Design
* The front end design needs for both interactive analysis and the designed analytics workflow.
* How does the user interact with the system?
* What are their analysis process?
Phase-4: Warehouse Schema Design
* Dimensional modeling - define the dimensions and fact and define the grain of each star schema.
* Define the physical schema - depending on the technology decision. If you use the relational tecknology, design the database tables
Phase-5: OLTP to data warehouse mapping
* Logical mapping - table to table and column to column mapping. Also define the transformation rules
* You may need to perform OLTP data profiling. How often the data changes? What are the data distribution?
* ETL Design -include data staging and the detail ETL process flow.
Phase-6: Implementation
* Create the warehouse and ETL staging schema
* Develop the ETL programs
* Create the logical to physical mapping in the repository
* Build the end user dashboard and reports
Phase-7: Deployment
* Install the Analytics reporting and the ETL tools.
* Specific Setup and Configuration for OLTP, ETL, and data warehouse.
* Sizing of the system and database
* Performance Tuning and Optimization