微軟數據倉庫介紹

Slide Title,Body Text,Second level,Third level,Fourth level,Fifth level,Module,1,Introduction to,D,ata Warehousing,Module Overview,數據倉庫概述,考慮數據倉庫解決方案,Lesson 1,:,數據倉庫概述,商業(yè)難題,什么是數據倉庫?,數據倉庫架構,數據倉庫解決方案組件,數據倉庫項目,數據倉庫項目角色,SQL,Server,作為數據倉庫平臺,業(yè)務難題,關鍵業(yè)務數據分布在多個業(yè)務系統(tǒng),找到業(yè)務決策的信息是耗時的和容易出錯的,基本的業(yè)務問題很難回答,?,What Is,a,Data Warehouse?,一,個集中存放用于報表和數據的信息容器,通常,一個數據倉庫,:,包含大量的歷史數據,優(yōu)化了數據查詢,(,而,不是插入和更新,),定期加載新的業(yè)務數據,為企業(yè)商務智能解決方案提供依據,Data Warehouse,Architectures,Centralized Data Warehouse,Departmental Data Mart,Hub and,Spoke,Components of a Data,Warehousing,Solution,從業(yè)務系統(tǒng)和其他數據源抽取數據加載,數據通常最終加載到數據倉庫,數據清洗和重復數據的刪除,確保數據倉庫中數據的質量,MDM,提供確切的業(yè)務數據實體,Data Warehouse,Staging Database,ETL Load Process,ETL Staging Process,Master,D,ata Management,10,1,10,00110,Data Cleansing,Reporting and Analysis,Data Sources,Data,Warehousing Projects,首先確定數據倉庫需要解決的業(yè)務問題,確定回答這些問題所需的數據,識別所需數據的數據源,評估關鍵業(yè)務目標價值可行性,從現(xiàn)在的數據回答每個問題,對大量數據的項目,使用增量更新比較有效,:,把項目分解為多個子項目,每,個子項目處理一個特定的主題,Data,Warehousing Project,Roles,Project manager,Solution architect,Data modeler,Database,administrator,Infrastructure,specialist,ETL,developer,Business users/analyst,Testers,Data stewards,SQL Server As a Data Warehousing Platform,SQL Server,Analysis Services,SQL Server Database Engine,Microsoft SQL Server Integration Services,SQL Server Master,D,ata Services,10,1,10,00110,SQL Server Data Quality Services,Microsoft SQL Azure,and the Windows Azure,M,arketplace,Microsoft SharePoint Server,Microsoft PowerPivot Technologies,Microsoft Excel,Data Mining Add-In,PowerPivot Add-In,MDS Add-In,Power View,SQL Server,Reporting Services,Reports,KPIs,and Dashboards,Interactive data visualizations,Interactive data analysis,Data Warehousing,Business Intelligence,Lesson,2,:Considerations for a,Data Warehouse Solution,Data Warehouse Database and Storage,Data Sources,Extract,Transform,and Load Processes,Data Quality,and Master,Data,Management,Data Warehouse Database,and Storage,考慮數據倉庫包括,:,Database schema,Logical:typically denormalized,for optimal read performance,Physical:,often partitioned for performance and management,Hardware,Query processing and memory,Storage,Network,High,availability,and,disaster recovery,Hardware redundancy,Backup strategy,Security,Server access,Data permissions,Data Sources,數據源連接類型,憑證和權限,數據格式,數據采集窗口,Extract,Transform,and Load,Processes,臨時表:,存放,臨時數據,所需的轉換,:,提取數據時所需的數據轉換和清洗,增量,ETL:,數據的變化加載,Data Quality,and Master Data Management,Data quality:,Cleansing data:,Validating data values,Ensuring data consistency,Identifying missing values,Deduplicating data,Master data management:,Ensuring consistent business entity definitions across multiple systems,Applying business rules to ensure data validity,10,1,10,00110,Module Review and Takeaways,Why might you consider including a staging area in your ETL solution?,What options might you consider for performing data transformations in an ETL solution?,Why would you assign the data steward role to a business user rather than a database technology specialist?,。