Engineering Data Mesh in Azure Cloud: Implement Data Mesh Using Microsoft Azure's Cloud Adoption Framework
Overcome data mesh adoption challenges using the cloud-scale analytics framework, and make your data analytics landscape agile and efficient by using standard architecture patterns for diverse analytical workloads. Key Features: Delve into core data mesh concepts and apply them to real-world situations...
Other Authors: | |
---|---|
Format: | E-book |
Language: | English |
Published: | Birmingham, England: Packt Publishing, [2024] |
Edition: | First edition |
Subjects: | |
View in Universitat Ramon Llull Library: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009810645406719 |
Table of Contents:
- Cover
- Title Page
- Copyright
- Dedication
- Contributors
- Table of Contents
- Preface
- Part 1: Rolling Out the Data Mesh in the Azure Cloud
- Chapter 1: Introducing Data Meshes
- Exploring the evolution of modern data analytics
- Discovering the challenges of modern-day enterprises
- DaaP
- Data domains
- The data mesh solution
- Summary
- Chapter 2: Building a Data Mesh Strategy
- Is a data mesh for everybody?
- Aligning your analytics strategy with your business strategy
- Understanding data maturity models
- Stage 1
- Stage 2
- Stage 3
- Stage 4
- Building the technology stack
- The analytics team
- Data governance
- Approaches to building your data mesh
- Summary
- Chapter 3: Deploying a Data Mesh Using the Azure Cloud-Scale Analytics Framework
- Introduction to Azure CSA
- Understanding landing zones
- Organizing resources
- Designing a cloud management structure
- Hierarchical policies
- Diving deeper into landing zones in CSA
- Data management landing zone
- Data landing zone
- Automating landing zone deployment
- IaC
- Organizing resources in a landing zone
- Networking topologies
- Security and access control
- Streamlining deployment through DevOps
- Summary
- Chapter 4: Building the Data Mesh Governance Framework Using Microsoft Azure Services
- Data mesh governance requirements
- Data catalog
- Collecting and managing metadata
- Step 1 - ensure accuracy and completeness
- Step 2 - verify data classification
- Step 3 - add a business glossary
- Step 4 - add lineage information
- Monitoring and managing data quality
- Implementing data observability
- Summary
- Chapter 5: Security Architecture for Data Meshes
- Understanding the security requirements of data mesh architecture
- Understanding authentication and authorization in Azure
- Managing data access
- SQL Database
- Data lakes
- Data lake structure
- Managing data privacy
- Data masking
- Data retention
- Summary
- Chapter 6: Automating Deployment through Azure Resource Manager and Azure DevOps
- Azure Resource Manager templates for landing zones
- Understanding the ARM template structure
- Source code control for ARM templates
- Azure DevOps pipelines for deploying infrastructure
- Base data product templates
- T-shirt sizing
- Landing zone requests
- Landing zone approval
- Landing zone deployment
- Self-service portal
- Customized templates
- Summary
- Chapter 7: Building a Self-Service Portal for Common Data Mesh Operations
- Why do we need a self-service portal?
- Gathering requirements for the self-service portal
- Requesting a data product zone
- Browse and reuse pipeline
- Data discovery
- Access management
- Requesting landing zones or data products
- Data catalog
- Hosting common data pipeline templates
- Azure Data Factory
- Azure Data Factory instance
- Integration runtime
- Creating linked services
- Create a sequence of activities
- Parameterize the pipeline
- Continuous integration/continuous development
- Data mesh portal integration
- Other common features of a self-service portal
- Architecting the self-service portal
- Active Directory and Domain Name System (DNS)
- Application Gateway
- Azure App Service
- Azure Cosmos DB
- Git Repo and Azure DevOps pipelines
- Network and security
- Azure Cache for Redis (optional)
- Azure SQL DB (optional)
- Summary
- Part 2: Practical Challenges of Implementing a Data Mesh
- Chapter 8: How to Design, Build, and Manage Data Contracts
- What are data contracts?
- What are the contents of a data contract?
- Who creates and owns a data contract?
- Who consumes the data contract?
- How do we store data and access contracts?
- How do we link data contracts to data consumption or pipelines?
- Catalog and contract document design
- Set up Cosmos DB
- Write the integration code
- Searching contracts and data assets
- Put the pieces together
- Summary
- Chapter 9: Data Quality Management
- Why is data quality important?
- How is data quality defined?
- How to manage data quality
- Accuracy
- Completeness
- Consistency
- Timeliness
- Validity
- Uniqueness
- Reliability
- Data quality management systems
- Completely decentralized
- Completely centralized
- The hybrid approach
- Build versus buy
- Popular data quality frameworks and tools
- Summary
- Chapter 10: Master Data Management
- Single source of truth
- What causes discrepancies in master data?
- MDM design patterns
- MDM architecture for a data mesh
- Build versus buy
- Popular MDM tools
- Summary
- Chapter 11: Monitoring and Data Observability
- Piecing it all together - the importance of data mesh monitoring and data observability
- How data mesh monitoring differs
- Baking diagnostic logging into the landing zone templates
- Azure Platform Metrics
- Azure platform logs
- Enabling diagnostic settings in an ARM template
- Designing a data mesh operations center
- Step 1 - collection
- Step 2 - rank the critical metrics and events
- Step 3 - build a threshold logic for each service in a data product
- Step 4 - build a monitoring view for each resource
- Step 5 - build a threshold logic for each data product
- Step 6 - build a threshold logic for each data landing zone
- Step 7 - set up alerts for critical metrics
- Step 8 - host the dashboards in one location
- Tooling for the DMOC
- Azure Monitor
- Log Analytics
- Azure Data Explorer
- Grafana
- Power BI
- Data observability
- Setting up alerts
- Piecing it all together
- Summary
- Chapter 12: Monitoring Data Mesh Costs and Building a Cross-Charging Model
- Components of data mesh costs
- Cost models in a data mesh
- Overview of cost management in Azure
- Allocating costs to different data product groups and domains
- How to determine the cost of shared resources
- Summary
- Chapter 13: Understanding Data-Sharing Topologies in a Data Mesh
- What is in-place sharing?
- Understanding data-sharing challenges in a data mesh
- Latency
- Security and access control
- Data formats and protocols
- Exploring different methods available for sharing data
- In-place access
- Data pipelines
- Data APIs
- Data Share
- Picking the right data-sharing topologies
- In-place sharing
- Data pipelines
- Data APIs
- Data sharing
- Summary
- Part 3: Popular Data Product Architectures
- Chapter 14: Advanced Analytics Using Azure Machine Learning, Databricks, and the Lakehouse Architecture
- Requirements
- Architecture
- Components
- Source data
- Azure Data Factory
- Azure Data Lake Storage Gen2
- Azure Databricks
- Azure Machine Learning
- Azure Kubernetes Service (AKS)
- Power BI
- Azure Data Share
- Data flow
- Scenarios
- Summary
- Chapter 15: Big Data Analytics Using Azure Synapse Analytics
- Requirements
- Architecture
- Components
- Source data
- Azure Synapse pipelines
- Azure Data Lake Storage Gen2
- Azure Synapse
- Azure Cosmos DB
- Azure AI Search
- Power BI
- Azure Data Share
- Data flow
- Scenarios
- Summary
- Chapter 16: Event-Driven Analytics Using Azure Event Hubs, Azure Stream Analytics, and Azure Machine Learning
- Requirements
- Architecture
- Components
- Source data
- Azure Event Hubs
- Azure IoT Hub
- Azure Stream Analytics
- Azure Data Explorer
- Azure Machine Learning
- Azure Cosmos DB
- Power BI
- Data flow
- Combining architectures for real-time and big data analytics
- Scenarios
- Summary
- Chapter 17: AI Using Azure Cognitive Services and Azure OpenAI
- Requirements
- Architecture
- Components
- Source data
- Azure Data Factory
- Azure Translator
- Azure AI Document Intelligence
- Azure OpenAI embedding models
- Azure Redis Cache
- Azure App Service
- Semantic Kernel
- Azure OpenAI
- Bing search
- Content filtering and security
- Data flow/interactions
- Scenarios
- Summary
- Index
- About PACKT
- Other Books You May Enjoy