MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203
Prepare for the Azure Data Engineering certification--and an exciting new career in analytics--with this must-have study aide. In the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203, accomplished data engineer and tech educator Benjamin Perkins delivers a hands-on, prac...
Main author: | Benjamin Perkins
---|---
Format: | E-book
Language: | English
Published: | Newark : John Wiley & Sons, Incorporated, 2023
Edition: | 1st ed.
Subjects: |
View at Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009769038006719
Table of Contents:
- Cover Page
- Title Page
- Copyright Page
- Acknowledgments
- About the Author
- About the Technical Editor
- Contents at a Glance
- Contents
- Table of Exercises
- Introduction
- Part I Azure Data Engineer Certification and Azure Products
- Chapter 1 Gaining the Azure Data Engineer Associate Certification
- The Journey to Certification
- How to Pass Exam DP-203
- Understanding the Exam Expectations and Requirements
- Use Azure Daily
- Read Azure Articles to Stay Current
- Have an Understanding of All Azure Products
- Azure Product Name Recognition
- Azure Data Analytics
- Azure Synapse Analytics
- Azure Databricks
- Azure HDInsight
- Azure Analysis Services
- Azure Data Factory
- Azure Event Hubs
- Azure Stream Analytics
- Other Products
- Azure Storage Products
- Azure Data Lake Storage
- Azure Storage
- Other Products
- Azure Databases
- Azure Cosmos DB
- Azure SQL Server Products
- Additional Azure Databases
- Other Products
- Azure Security
- Azure Active Directory
- Role-Based Access Control
- Attribute-Based Access Control
- Azure Key Vault
- Other Products
- Azure Networking
- Virtual Networks
- Other Products
- Azure Compute
- Azure Virtual Machines
- Azure Virtual Machine Scale Sets
- Azure App Service Web Apps
- Azure Functions
- Azure Batch
- Azure Management and Governance
- Azure Monitor
- Azure Purview
- Azure Policy
- Azure Blueprints (Preview)
- Azure Lighthouse
- Azure Cost Management and Billing
- Other Products
- Summary
- Exam Essentials
- Review Questions
- Chapter 2 CREATE DATABASE dbName
- The Brainjammer
- A Historical Look at Data
- Variety
- Velocity
- Volume
- Data Locations
- Data File Formats
- Data Structures, Types, and Concepts
- Data Structures
- Data Types and Management
- Data Concepts
- Data Programming and Querying for Data Engineers
- Data Programming
- Querying Data
- Understanding Big Data Processing
- Big Data Stages
- ETL, ELT, ELTL
- Analytics Types
- Big Data Layers
- Summary
- Exam Essentials
- Review Questions
- Part II Design and Implement Data Storage
- Chapter 3 Data Sources and Ingestion
- Where Does Data Come From?
- Design a Data Storage Structure
- Design an Azure Data Lake Solution
- Recommended File Types for Storage
- Recommended File Types for Analytical Queries
- Design for Efficient Querying
- Design for Data Pruning
- Design a Folder Structure That Represents the Levels of Data Transformation
- Design a Distribution Strategy
- Design a Data Archiving Solution
- Design a Partition Strategy
- Design a Partition Strategy for Files
- Design a Partition Strategy for Analytical Workloads
- Design a Partition Strategy for Efficiency and Performance
- Design a Partition Strategy for Azure Synapse Analytics
- Identify When Partitioning Is Needed in Azure Data Lake Storage Gen2
- Design the Serving/Data Exploration Layer
- Design Star Schemas
- Design Slowly Changing Dimensions
- Design a Dimensional Hierarchy
- Design a Solution for Temporal Data
- Design for Incremental Loading
- Design Analytical Stores
- Design Metastores in Azure Synapse Analytics and Azure Databricks
- The Ingestion of Data into a Pipeline
- Azure Synapse Analytics
- Azure Data Factory
- Azure Databricks
- Event Hubs and IoT Hub
- Azure Stream Analytics
- Apache Kafka for HDInsight
- Migrating and Moving Data
- Summary
- Exam Essentials
- Review Questions
- Chapter 4 The Storage of Data
- Implement Physical Data Storage Structures
- Implement Compression
- Implement Partitioning
- Implement Sharding
- Implement Different Table Geometries with Azure Synapse Analytics Pools
- Implement Data Redundancy
- Implement Distributions
- Implement Data Archiving
- Azure Synapse Analytics Develop Hub
- Implement Logical Data Structures
- Build a Temporal Data Solution
- Build a Slowly Changing Dimension
- Build a Logical Folder Structure
- Build External Tables
- Implement File and Folder Structures for Efficient Querying and Data Pruning
- Implement a Partition Strategy
- Implement a Partition Strategy for Files
- Implement a Partition Strategy for Analytical Workloads
- Implement a Partition Strategy for Streaming Workloads
- Implement a Partition Strategy for Azure Synapse Analytics
- Design and Implement the Data Exploration Layer
- Deliver Data in a Relational Star Schema
- Deliver Data in Parquet Files
- Maintain Metadata
- Implement a Dimensional Hierarchy
- Create and Execute Queries by Using a Compute Solution That Leverages SQL Serverless and Spark Cluster
- Recommend Azure Synapse Analytics Database Templates
- Implement Azure Synapse Analytics Database Templates
- Additional Data Storage Topics
- Storing Raw Data in Azure Databricks for Transformation
- Storing Data Using Azure HDInsight
- Storing Prepared, Trained, and Modeled Data
- Summary
- Exam Essentials
- Review Questions
- Part III Develop Data Processing
- Chapter 5 Transform, Manage, and Prepare Data
- Ingest and Transform Data
- Transform Data Using Azure Synapse Pipelines
- Transform Data Using Azure Data Factory
- Transform Data Using Apache Spark
- Transform Data Using Transact-SQL
- Transform Data Using Stream Analytics
- Cleanse Data
- Split Data
- Shred JSON
- Encode and Decode Data
- Configure Error Handling for the Transformation
- Normalize and Denormalize Values
- Transform Data by Using Scala
- Perform Exploratory Data Analysis
- Transformation and Data Management Concepts
- Transformation
- Data Management
- Azure Databricks
- Data Modeling and Usage
- Data Modeling with Machine Learning
- Usage
- Summary
- Exam Essentials
- Review Questions
- Chapter 6 Create and Manage Batch Processing and Pipelines
- Design and Develop a Batch Processing Solution
- Design a Batch Processing Solution
- Develop Batch Processing Solutions
- Create Data Pipelines
- Handle Duplicate Data
- Handle Missing Data
- Handle Late-Arriving Data
- Upsert Data
- Configure the Batch Size
- Configure Batch Retention
- Design and Develop Slowly Changing Dimensions
- Design and Implement Incremental Data Loads
- Integrate Jupyter/IPython Notebooks into a Data Pipeline
- Revert Data to a Previous State
- Handle Security and Compliance Requirements
- Design and Create Tests for Data Pipelines
- Scale Resources
- Design and Configure Exception Handling
- Debug Spark Jobs Using the Spark UI
- Implement Azure Synapse Link and Query the Replicated Data
- Use PolyBase to Load Data to a SQL Pool
- Read from and Write to a Delta Table
- Manage Batches and Pipelines
- Trigger Batches
- Schedule Data Pipelines
- Validate Batch Loads
- Implement Version Control for Pipeline Artifacts
- Manage Data Pipelines
- Manage Spark Jobs in a Pipeline
- Handle Failed Batch Loads
- Summary
- Exam Essentials
- Review Questions
- Chapter 7 Design and Implement a Data Stream Processing Solution
- Develop a Stream Processing Solution
- Design a Stream Processing Solution
- Create a Stream Processing Solution
- Process Time Series Data
- Design and Create Windowed Aggregates
- Process Data Within One Partition
- Process Data Across Partitions
- Upsert Data
- Handle Schema Drift
- Configure Checkpoints/Watermarking During Processing
- Replay Archived Stream Data
- Design and Create Tests for Data Pipelines
- Monitor for Performance and Functional Regressions
- Optimize Pipelines for Analytical or Transactional Purposes
- Scale Resources
- Design and Configure Exception Handling
- Handle Interruptions
- Ingest and Transform Data
- Transform Data Using Azure Stream Analytics
- Monitor Data Storage and Data Processing
- Monitor Stream Processing
- Summary
- Exam Essentials
- Review Questions
- Part IV Secure, Monitor, and Optimize Data Storage and Data Processing
- Chapter 8 Keeping Data Safe and Secure
- Design Security for Data Policies and Standards
- Design a Data Auditing Strategy
- Design a Data Retention Policy
- Design for Data Privacy
- Design to Purge Data Based on Business Requirements
- Design Data Encryption for Data at Rest and in Transit
- Design Row-Level and Column-Level Security
- Design a Data Masking Strategy
- Design Access Control for Azure Data Lake Storage Gen2
- Implement Data Security
- Implement a Data Auditing Strategy
- Manage Sensitive Information
- Implement a Data Retention Policy
- Encrypt Data at Rest and in Motion
- Implement Row-Level and Column-Level Security
- Implement Data Masking
- Manage Identities, Keys, and Secrets Across Different Data Platform Technologies
- Implement Access Control for Azure Data Lake Storage Gen2
- Implement Secure Endpoints (Private and Public)
- Implement Resource Tokens in Azure Databricks
- Load a DataFrame with Sensitive Information
- Write Encrypted Data to Tables or Parquet Files
- Develop a Batch Processing Solution
- Handle Security and Compliance Requirements
- Design and Implement the Data Exploration Layer
- Browse and Search Metadata in Microsoft Purview Data Catalog
- Push New or Updated Data Lineage to Microsoft Purview
- Summary
- Exam Essentials
- Review Questions
- Chapter 9 Monitoring Azure Data Storage and Processing
- Monitoring Data Storage and Data Processing
- Implement Logging Used by Azure Monitor