Building a Scalable Data Warehouse with Data Vault 2.0
The Data Vault was invented by Dan Linstedt at the U.S. Department of Defense, and the standard has been successfully applied to data warehousing projects at organizations of all sizes, from small businesses to large corporations. Due to its simplified design, which is adapted from nature, the Data...
Other Authors: | Michael Olschimke |
---|---|
Format: | eBook |
Language: | English |
Published: | Amsterdam, [Netherlands] : Morgan Kaufmann, 2016 |
Edition: | 1st edition |
See on Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009629896106719 |
Table of Contents:
- Cover
- Title Page
- Copyright Page
- Contents
- Authors Biography
  - Daniel Linstedt
  - Michael Olschimke
- Foreword
- Preface
- Acknowledgments
  - Daniel Linstedt
  - Michael Olschimke
- Chapter 1 - Introduction to Data Warehousing
- 1.1 - History of Data Warehousing
- 1.1.1 - Decision Support Systems
- 1.1.2 - Data Warehouse Systems
- 1.2 - The Enterprise Data Warehouse Environment
- 1.2.1 - Access
- 1.2.2 - Multiple Subject Areas
- 1.2.3 - Single Version of Truth
- 1.2.4 - Single Version of Facts
- 1.2.5 - Mission Criticality
- 1.2.6 - Scalability
- 1.2.7 - Big Data
- 1.2.8 - Performance Issues
- 1.2.9 - Complexity
- 1.2.10 - Auditing and Compliance
- 1.2.11 - Costs
- 1.2.12 - Other Business Requirements
- 1.3 - Introduction to Data Vault 2.0
- 1.4 - Data Warehouse Architecture
- 1.4.1 - Typical Two-Layer Architecture
- 1.4.2 - Typical Three-Layer Architecture
- References
- Chapter 2 - Scalable Data Warehouse Architecture
- 2.1 - Dimensions of Scalable Data Warehouse Architectures
- 2.1.1 - Workload
- 2.1.2 - Data Complexity
- 2.1.3 - Analytical Complexity
- 2.1.4 - Query Complexity
- 2.1.5 - Availability
- 2.1.6 - Security
- 2.2 - Data Vault 2.0 Architecture
- 2.2.1 - Business Rules Definition
- 2.2.2 - Business Rules Application
- 2.2.3 - Staging Area Layer
- 2.2.4 - Data Warehouse Layer
- 2.2.5 - Information Mart Layer
- 2.2.6 - Metrics Vault
- 2.2.7 - Business Vault
- 2.2.8 - Operational Vault
- 2.2.9 - Managed Self-Service BI
- 2.2.10 - Other Features
- References
- Chapter 3 - The Data Vault 2.0 Methodology
- 3.1 - Project Planning
- 3.1.1 - Capability Maturity Model Integration
- 3.1.1.1 - Capability Levels
- 3.1.1.2 - Maturity Levels
- 3.1.1.3 - Advancing to Maturity Level 5
- 3.1.1.4 - Integrating CMMI in the Data Vault 2.0 Methodology
- 3.1.2 - Managing the Project
- 3.1.2.1 - Scrum
- 3.1.2.2 - Iterative Approach
- 3.1.2.3 - Product and Sprint Backlog
- 3.1.2.4 - Integrating Scrum with the Data Vault 2.0 Methodology
- 3.1.3 - Defining the Project
- 3.1.3.1 - Agile Requirements Gathering
- 3.1.4 - Estimation of the Project
- 3.1.4.1 - Function Point Analysis
- 3.1.4.2 - Measuring with Function Points
- 3.1.4.3 - Function Point Analysis for Data Warehousing
- 3.1.4.4 - Boundaries in Data Warehousing
- 3.1.4.5 - Estimating Size
- 3.1.4.6 - Assessing ETL Complexity Factors
- 3.1.4.7 - Applying Function Point Analysis to Data Warehousing
- 3.1.4.8 - Function Points for Enterprise Data Warehouse
- 3.2 - Project Execution
- 3.2.1 - Traditional Software Development Life-Cycle
- 3.2.1.1 - Requirements Engineering
- 3.2.1.2 - Design
- 3.2.1.3 - Implementation and Unit Testing
- 3.2.1.4 - Integration and System Testing
- 3.2.1.5 - Operation and Maintenance
- 3.2.2 - Applying Software Development Life-Cycle to the Data Vault 2.0 Methodology
- 3.2.3 - Parallel Teams
- 3.2.4 - Technical Numbering
- 3.3 - Review and Improvement
- 3.3.1 - Six Sigma
- 3.3.1.1 - Applying Six Sigma to Software
- 3.3.1.2 - Six Sigma Framework
- 3.3.1.3 - DMAIC Improvement
- 3.3.1.4 - Applying Six Sigma to Data Warehousing
- 3.3.2 - Total Quality Management
- 3.3.2.1 - Data Quality Dimensions
- 3.3.2.2 - Total Data Quality Management
- 3.3.2.3 - Data Warehouse Quality
- 3.3.2.4 - Integrating TQM with the Data Vault 2.0 Methodology
- References
- Chapter 4 - Data Vault 2.0 Modeling
- 4.1 - Introduction to Data Vault Modeling
- 4.2 - Data Vault Modeling Vocabulary
- 4.2.1 - Hub Entities
- 4.2.2 - Link Entities
- 4.2.3 - Satellite Entities
- 4.3 - Hub Definition
- 4.3.1 - Definition of a Business Key
- 4.3.1.1 - Composite Keys (aka Smart Keys, Intelligent Keys)
- 4.3.1.2 - Business Key Identification Process
- 4.3.1.3 - Scope of Business Keys
- 4.3.1.4 - Difference between Business Keys and Surrogate Keys
- 4.3.2 - Hub Entity Structure
- 4.3.2.1 - Hash Key
- 4.3.2.2 - Business Key
- 4.3.2.3 - Load Date
- 4.3.2.4 - Record Source
- 4.3.2.5 - Last Seen Date
- 4.3.3 - Hub Examples
- 4.4 - Link Definition
- 4.4.1 - Reasons for Many-to-Many Relationships
- 4.4.2 - Flexibility of Links
- 4.4.3 - Granularity of Links
- 4.4.4 - Link Unit-of-Work
- 4.4.5 - Link Entity Structure
- 4.4.5.1 - Hash Key
- 4.4.5.2 - Dependent Child Key
- 4.4.6 - Link Examples
- 4.5 - Satellite Definition
- 4.5.1 - Importance of Keeping History
- 4.5.2 - Splitting Satellites
- 4.5.2.1 - Splitting by Source System
- 4.5.2.2 - Splitting by Rate of Change
- 4.5.3 - Satellite Entity Structure
- 4.5.3.1 - Parent Hash Key
- 4.5.3.2 - Load Date
- 4.5.3.3 - Load End Date
- 4.5.3.4 - Hash Difference
- 4.5.3.5 - Extract Date
- 4.5.4 - Satellite Examples
- 4.5.5 - Link Driving Key
- References
- Chapter 5 - Intermediate Data Vault Modeling
- 5.1 - Hub Applications
- 5.1.1 - Business Key Consolidation
- 5.2 - Link Applications
- 5.2.1 - Link-on-link
- 5.2.2 - Same-as Links
- 5.2.3 - Hierarchical Links
- 5.2.4 - Nonhistorized Links
- 5.2.5 - Nondescriptive Links
- 5.2.6 - Computed Aggregate Links
- 5.2.7 - Exploration Links
- 5.3 - Satellite Applications
- 5.3.1 - Overloaded Satellites
- 5.3.2 - Multi-Active Satellites
- 5.3.3 - Status Tracking Satellites
- 5.3.4 - Effectivity Satellites
- 5.3.5 - Record Tracking Satellites
- 5.3.6 - Computed Satellites
- References
- Chapter 6 - Advanced Data Vault Modeling
- 6.1 - Point-in-Time Tables
- 6.1.1 - Point-in-Time Table Structure
- 6.1.2 - Managed PIT Window
- 6.2 - Bridge Tables
- 6.2.1 - Bridge Table Structure
- 6.2.2 - Comparing PIT Tables with Bridge Tables
- 6.3 - Reference Tables
- 6.3.1 - No-History Reference Tables
- 6.3.2 - History-Based Reference Tables
- 6.3.3 - Code and Descriptions
- 6.3.3.1 - Code and Descriptions with History
- Reference
- Chapter 7 - Dimensional Modeling
- 7.1 - Introduction
- 7.2 - Star Schemas
- 7.2.1 - Fact Tables
- 7.2.1.1 - The Grain of a Fact Table
- 7.2.2 - Dimension Tables
- 7.2.3 - Querying Star Schemas
- 7.3 - Multiple Stars
- 7.3.1 - Conformed Dimensions
- 7.4 - Dimension Design
- 7.4.1 - Slowly Changing Dimensions
- 7.4.2 - Hierarchies
- 7.4.3 - Snowflake Design
- References
- Chapter 8 - Physical Data Warehouse Design
- 8.1 - Database Workloads
- 8.1.1 - Workload Characteristics
- 8.2 - Separate Environments for Development, Testing, and Production
- 8.2.1 - Blue-Green Deployment
- 8.3 - Microsoft Azure Cloud Computing Platform
- 8.4 - Physical Data Warehouse Architecture on Premise
- 8.4.1 - Hardware Architectures and Databases
- 8.4.2 - Processor Options
- 8.4.3 - Memory Options
- 8.4.4 - Storage Options
- 8.4.5 - Network Options
- 8.5 - Database Options
- 8.5.1 - tempdb Options
- 8.5.2 - Partitioning
- 8.5.3 - Filegroups
- 8.5.4 - Data Compression
- 8.6 - Setting up the Data Warehouse
- 8.6.1 - Setting up the Stage Area
- 8.6.1.1 - Hardware Considerations for Stage Area
- 8.6.1.2 - Stage Database Setup
- 8.6.2 - Setting up the Data Vault
- 8.6.2.1 - Hardware Considerations for Data Vault Layer
- 8.6.2.2 - Backing Up the Data Vault
- 8.6.2.3 - Data Vault Database Setup
- 8.6.3 - Setting up Information Marts
- 8.6.3.1 - Hardware Considerations for Information Marts
- 8.6.3.2 - Information Mart Database Setup
- 8.6.4 - Setting up the Meta, Metrics, and Error Marts
- 8.6.4.1 - Hardware Considerations for Meta, Metrics, and Error Marts
- 8.6.4.2 - Meta, Metrics, and Error Marts Database Setup
- References
- Chapter 9 - Master Data Management
- 9.1 - Definitions
- 9.1.1 - Master Data
- 9.1.2 - Data Management
- 9.1.3 - Master Data Management
- 9.2 - Master Data Management Goals
- 9.3 - Drivers for Managing Master Data
- 9.4 - Operational vs. Analytical Master Data Management
- 9.5 - Master Data Management as an Enabler for Managed Self-Service BI
- 9.6 - Master Data Management as an Enabler for Total Quality Management
- 9.6.1 - MDS Object Model
- 9.6.1.1 - Models
- 9.6.1.2 - Entities
- 9.6.1.3 - Members
- 9.6.1.4 - Attributes
- 9.6.1.5 - Attribute Groups
- 9.6.1.6 - Hierarchies
- 9.6.2 - Master Data Manager
- 9.6.3 - Explorer
- 9.6.3.1 - Microsoft Excel and MDS
- 9.6.4 - Version Management
- 9.6.5 - Integration Management
- 9.6.6 - System Administration
- 9.6.7 - User and Group Permissions
- 9.7 - Creating a Model
- 9.7.1 - Creating Entities
- 9.7.2 - Creating Business Rules
- 9.8 - Importing a Model
- 9.9 - Integrating MDS with the Data Vault and Operational Systems
- 9.9.1 - Stage Tables
- 9.9.1.1 - Using T-SQL to Stage Master Data into Microsoft Master Data Services
- 9.9.1.2 - Using SQL Server Integration Services to Stage Master Data into Microsoft Master Data Services
- 9.9.2 - Subscription Views
- References
- Chapter 10 - Metadata Management
- 10.1 - What Is Metadata?
- 10.1.1 - Business Metadata
- 10.1.2 - Technical Metadata
- 10.1.3 - Process Execution Metadata
- 10.2 - Implementing the Meta Mart
- 10.2.1 - SQL Server BI Metadata Toolkit
- 10.2.2 - Naming Conventions
- 10.2.3 - Capturing Source System Definitions
- 10.2.4 - Capturing Hard Rules
- 10.2.5 - Capturing Metadata for the Staging Area
- 10.2.6 - Capturing Requirements to Source Tables
- 10.2.7 - Capturing Source Tables to Data Vault Tables
- 10.2.7.1 - Metadata for Loading Hub Entities
- 10.2.7.2 - Metadata for Loading Link Entities
- 10.2.7.3 - Metadata for Loading Satellite Entities on Hubs