Hands-On Data Warehousing with Azure Data Factory: ETL techniques to load and transform data from various sources, both on-premises and on cloud

Leverage the power of Microsoft Azure Data Factory v2 to build hybrid data solutions. About This Book: Combine the power of Azure Data Factory v2 and SQL Server Integration Services. Design and enhance performance and scalability of a modern ETL hybrid solution. Interact with the loaded data in data war...

Bibliographic Details
Other Authors: Cote, Christian, author; Gutzait, Michelle Kamrat, author; Ciaburro, Giuseppe, author
Format: E-book
Language: English
Published: Birmingham ; Mumbai : Packt [2018]
Edition: 1st edition
Subjects:
View at Universitat Ramon Llull Library: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630755306719
Table of Contents:
  • Cover
  • Title Page
  • Copyright and Credits
  • Packt Upsell
  • Contributors
  • Table of Contents
  • Preface
  • Chapter 1: The Modern Data Warehouse
  • The need for a data warehouse
  • Driven by IT
  • Self-service BI
  • Cloud-based BI - big data and artificial intelligence
  • The modern data warehouse
  • Main components of a data warehouse
  • Staging area
  • Data warehouse
  • Cubes
  • Consumption layer - BI and analytics
  • What is Azure Data Factory
  • Limitations of ADF V1.0
  • What's new in V2.0?
  • Integration runtime
  • Linked services
  • Datasets
  • Pipelines
  • Activities
  • Parameters
  • Expressions
  • Controlling the flow of activities
  • SSIS package deployment in Azure
  • Spark cluster data store
  • Summary
  • Chapter 2: Getting Started with Our First Data Factory
  • Resource group
  • Azure Data Factory
  • Datasets
  • Linked services
  • Integration runtimes
  • Activities
  • Monitoring the data factory pipeline runs
  • Azure Blob storage
  • Blob containers
  • Types of blobs
  • Block blobs
  • Page blobs
  • Replication of storage
  • Creating an Azure Blob storage account
  • SQL Azure database
  • Creating the Azure SQL Server
  • Attaching the BACPAC to our database
  • Copying data using our data factory
  • Summary
  • Chapter 3: SSIS Lift and Shift
  • SSIS in ADF
  • Sample setup
  • Sample databases
  • SSIS components
  • Integration services catalog setup
  • Sample solution in Visual Studio
  • Deploying the project on-premises
  • Leveraging our package in ADF V2
  • Integration runtimes
  • Azure integration runtime
  • Self-hosted runtime
  • SSIS integration runtime
  • Adding an SSIS integration runtime to the factory
  • SSIS execution from a pipeline
  • Summary
  • Chapter 4: Azure Data Lake
  • Creating and configuring Data Lake Store
  • Next Steps
  • Ways to copy/import data from a database to the Data Lake
  • Ways to store imported data in files in the Data Lake
  • Easily moving data to the Data Lake Store
  • Ways to directly copy files into the Data Lake
  • Prerequisites for the next steps
  • Creating a Data Lake Analytics resource
  • Using the data factory to manipulate data in the Data Lake
  • Task 1 - copy/import data from SQL Server to a blob storage file using data factory
  • Task 2 - run a U-SQL task from the data factory pipeline to summarize data
  • Service principal authentication
  • Run U-SQL from a job in the Data Lake Analytics
  • Summary
  • Chapter 5: Machine Learning on the Cloud
  • Machine learning overview
  • Machine learning algorithms
  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning
  • Machine learning tasks
  • Making predictions with regression algorithms
  • Automated classification using machine learning
  • Identifying groups using clustering methods
  • Dimensionality reduction to improve performance
  • Feature selection
  • Feature extraction
  • Azure Machine Learning Studio
  • Azure Machine Learning Studio account
  • Azure Machine Learning Studio experiment
  • Dataset
  • Module
  • Work area
  • Breast cancer detection
  • Get the data
  • Prepare the data
  • Train the model
  • Score and evaluate the model
  • Summary
  • Chapter 6: Introduction to Azure Databricks
  • Azure Databricks setup
  • Prepare the data to ingest
  • Setting up the folder in the Azure storage account
  • Self-hosted integration runtime
  • Linked service setup
  • Datasets setup
  • SQL Server dataset
  • Blob storage dataset
  • Linked service
  • Dataset
  • Copy data from SQL Server to sales-data
  • Publish and trigger the copy activity
  • Databricks notebook
  • Calling Databricks notebook execution in ADF
  • Summary
  • Chapter 7: Reporting on the Modern Data Warehouse
  • Different types of BI
  • Self-service - personal
  • Team BI - sharing personal BI data
  • Corporate BI
  • Power BI Premium
  • Power BI Report Server
  • Power BI consumption
  • Creating our Power BI reports
  • Reporting with on-premise data sources
  • Incorporating Spark data
  • Summary
  • Index