MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203

Prepare for the Azure Data Engineering certification, and an exciting new career in analytics, with this must-have study aid. In the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203, accomplished data engineer and tech educator Benjamin Perkins delivers a hands-on, prac...

Bibliographic Details
Main author: Perkins, Benjamin
Format: E-book
Language: English
Published: Newark : John Wiley & Sons, Incorporated 2023.
Edition: 1st ed.
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009769038006719
Table of Contents:
  • Cover Page
  • Title Page
  • Copyright Page
  • Acknowledgments
  • About the Author
  • About the Technical Editor
  • Contents at a Glance
  • Contents
  • Table of Exercises
  • Introduction
  • Part I Azure Data Engineer Certification and Azure Products
  • Chapter 1 Gaining the Azure Data Engineer Associate Certification
  • The Journey to Certification
  • How to Pass Exam DP-203
  • Understanding the Exam Expectations and Requirements
  • Use Azure Daily
  • Read Azure Articles to Stay Current
  • Have an Understanding of All Azure Products
  • Azure Product Name Recognition
  • Azure Data Analytics
  • Azure Synapse Analytics
  • Azure Databricks
  • Azure HDInsight
  • Azure Analysis Services
  • Azure Data Factory
  • Azure Event Hubs
  • Azure Stream Analytics
  • Other Products
  • Azure Storage Products
  • Azure Data Lake Storage
  • Azure Storage
  • Other Products
  • Azure Databases
  • Azure Cosmos DB
  • Azure SQL Server Products
  • Additional Azure Databases
  • Other Products
  • Azure Security
  • Azure Active Directory
  • Role-Based Access Control
  • Attribute-Based Access Control
  • Azure Key Vault
  • Other Products
  • Azure Networking
  • Virtual Networks
  • Other Products
  • Azure Compute
  • Azure Virtual Machines
  • Azure Virtual Machine Scale Sets
  • Azure App Service Web Apps
  • Azure Functions
  • Azure Batch
  • Azure Management and Governance
  • Azure Monitor
  • Azure Purview
  • Azure Policy
  • Azure Blueprints (Preview)
  • Azure Lighthouse
  • Azure Cost Management and Billing
  • Other Products
  • Summary
  • Exam Essentials
  • Review Questions
  • Chapter 2 CREATE DATABASE dbName
  • The Brainjammer
  • A Historical Look at Data
  • Variety
  • Velocity
  • Volume
  • Data Locations
  • Data File Formats
  • Data Structures, Types, and Concepts
  • Data Structures
  • Data Types and Management
  • Data Concepts
  • Data Programming and Querying for Data Engineers
  • Data Programming
  • Querying Data
  • Understanding Big Data Processing
  • Big Data Stages
  • ETL, ELT, ELTL
  • Analytics Types
  • Big Data Layers
  • Summary
  • Exam Essentials
  • Review Questions
  • Part II Design and Implement Data Storage
  • Chapter 3 Data Sources and Ingestion
  • Where Does Data Come From?
  • Design a Data Storage Structure
  • Design an Azure Data Lake Solution
  • Recommended File Types for Storage
  • Recommended File Types for Analytical Queries
  • Design for Efficient Querying
  • Design for Data Pruning
  • Design a Folder Structure That Represents the Levels of Data Transformation
  • Design a Distribution Strategy
  • Design a Data Archiving Solution
  • Design a Partition Strategy
  • Design a Partition Strategy for Files
  • Design a Partition Strategy for Analytical Workloads
  • Design a Partition Strategy for Efficiency and Performance
  • Design a Partition Strategy for Azure Synapse Analytics
  • Identify When Partitioning Is Needed in Azure Data Lake Storage Gen2
  • Design the Serving/Data Exploration Layer
  • Design Star Schemas
  • Design Slowly Changing Dimensions
  • Design a Dimensional Hierarchy
  • Design a Solution for Temporal Data
  • Design for Incremental Loading
  • Design Analytical Stores
  • Design Metastores in Azure Synapse Analytics and Azure Databricks
  • The Ingestion of Data into a Pipeline
  • Azure Synapse Analytics
  • Azure Data Factory
  • Azure Databricks
  • Event Hubs and IoT Hub
  • Azure Stream Analytics
  • Apache Kafka for HDInsight
  • Migrating and Moving Data
  • Summary
  • Exam Essentials
  • Review Questions
  • Chapter 4 The Storage of Data
  • Implement Physical Data Storage Structures
  • Implement Compression
  • Implement Partitioning
  • Implement Sharding
  • Implement Different Table Geometries with Azure Synapse Analytics Pools
  • Implement Data Redundancy
  • Implement Distributions
  • Implement Data Archiving
  • Azure Synapse Analytics Develop Hub
  • Implement Logical Data Structures
  • Build a Temporal Data Solution
  • Build a Slowly Changing Dimension
  • Build a Logical Folder Structure
  • Build External Tables
  • Implement File and Folder Structures for Efficient Querying and Data Pruning
  • Implement a Partition Strategy
  • Implement a Partition Strategy for Files
  • Implement a Partition Strategy for Analytical Workloads
  • Implement a Partition Strategy for Streaming Workloads
  • Implement a Partition Strategy for Azure Synapse Analytics
  • Design and Implement the Data Exploration Layer
  • Deliver Data in a Relational Star Schema
  • Deliver Data in Parquet Files
  • Maintain Metadata
  • Implement a Dimensional Hierarchy
  • Create and Execute Queries by Using a Compute Solution That Leverages SQL Serverless and Spark Cluster
  • Recommend Azure Synapse Analytics Database Templates
  • Implement Azure Synapse Analytics Database Templates
  • Additional Data Storage Topics
  • Storing Raw Data in Azure Databricks for Transformation
  • Storing Data Using Azure HDInsight
  • Storing Prepared, Trained, and Modeled Data
  • Summary
  • Exam Essentials
  • Review Questions
  • Part III Develop Data Processing
  • Chapter 5 Transform, Manage, and Prepare Data
  • Ingest and Transform Data
  • Transform Data Using Azure Synapse Pipelines
  • Transform Data Using Azure Data Factory
  • Transform Data Using Apache Spark
  • Transform Data Using Transact-SQL
  • Transform Data Using Stream Analytics
  • Cleanse Data
  • Split Data
  • Shred JSON
  • Encode and Decode Data
  • Configure Error Handling for the Transformation
  • Normalize and Denormalize Values
  • Transform Data by Using Scala
  • Perform Exploratory Data Analysis
  • Transformation and Data Management Concepts
  • Transformation
  • Data Management
  • Azure Databricks
  • Data Modeling and Usage
  • Data Modeling with Machine Learning
  • Usage
  • Summary
  • Exam Essentials
  • Review Questions
  • Chapter 6 Create and Manage Batch Processing and Pipelines
  • Design and Develop a Batch Processing Solution
  • Design a Batch Processing Solution
  • Develop Batch Processing Solutions
  • Create Data Pipelines
  • Handle Duplicate Data
  • Handle Missing Data
  • Handle Late-Arriving Data
  • Upsert Data
  • Configure the Batch Size
  • Configure Batch Retention
  • Design and Develop Slowly Changing Dimensions
  • Design and Implement Incremental Data Loads
  • Integrate Jupyter/IPython Notebooks into a Data Pipeline
  • Revert Data to a Previous State
  • Handle Security and Compliance Requirements
  • Design and Create Tests for Data Pipelines
  • Scale Resources
  • Design and Configure Exception Handling
  • Debug Spark Jobs Using the Spark UI
  • Implement Azure Synapse Link and Query the Replicated Data
  • Use PolyBase to Load Data to a SQL Pool
  • Read from and Write to a Delta Table
  • Manage Batches and Pipelines
  • Trigger Batches
  • Schedule Data Pipelines
  • Validate Batch Loads
  • Implement Version Control for Pipeline Artifacts
  • Manage Data Pipelines
  • Manage Spark Jobs in a Pipeline
  • Handle Failed Batch Loads
  • Summary
  • Exam Essentials
  • Review Questions
  • Chapter 7 Design and Implement a Data Stream Processing Solution
  • Develop a Stream Processing Solution
  • Design a Stream Processing Solution
  • Create a Stream Processing Solution
  • Process Time Series Data
  • Design and Create Windowed Aggregates
  • Process Data Within One Partition
  • Process Data Across Partitions
  • Upsert Data
  • Handle Schema Drift
  • Configure Checkpoints/Watermarking During Processing
  • Replay Archived Stream Data
  • Design and Create Tests for Data Pipelines
  • Monitor for Performance and Functional Regressions
  • Optimize Pipelines for Analytical or Transactional Purposes
  • Scale Resources
  • Design and Configure Exception Handling
  • Handle Interruptions
  • Ingest and Transform Data
  • Transform Data Using Azure Stream Analytics
  • Monitor Data Storage and Data Processing
  • Monitor Stream Processing
  • Summary
  • Exam Essentials
  • Review Questions
  • Part IV Secure, Monitor, and Optimize Data Storage and Data Processing
  • Chapter 8 Keeping Data Safe and Secure
  • Design Security for Data Policies and Standards
  • Design a Data Auditing Strategy
  • Design a Data Retention Policy
  • Design for Data Privacy
  • Design to Purge Data Based on Business Requirements
  • Design Data Encryption for Data at Rest and in Transit
  • Design Row-Level and Column-Level Security
  • Design a Data Masking Strategy
  • Design Access Control for Azure Data Lake Storage Gen2
  • Implement Data Security
  • Implement a Data Auditing Strategy
  • Manage Sensitive Information
  • Implement a Data Retention Policy
  • Encrypt Data at Rest and in Motion
  • Implement Row-Level and Column-Level Security
  • Implement Data Masking
  • Manage Identities, Keys, and Secrets Across Different Data Platform Technologies
  • Implement Access Control for Azure Data Lake Storage Gen2
  • Implement Secure Endpoints (Private and Public)
  • Implement Resource Tokens in Azure Databricks
  • Load a DataFrame with Sensitive Information
  • Write Encrypted Data to Tables or Parquet Files
  • Develop a Batch Processing Solution
  • Handle Security and Compliance Requirements
  • Design and Implement the Data Exploration Layer
  • Browse and Search Metadata in Microsoft Purview Data Catalog
  • Push New or Updated Data Lineage to Microsoft Purview
  • Summary
  • Exam Essentials
  • Review Questions
  • Chapter 9 Monitoring Azure Data Storage and Processing
  • Monitoring Data Storage and Data Processing
  • Implement Logging Used by Azure Monitor