Databricks ML in Action: Learn How Databricks Supports the Entire ML Lifecycle End to End from Data Ingestion to the Model Deployment

Get to grips with autogenerating code, deploying ML algorithms, and leveraging various ML lifecycle features on the Databricks Platform, guided by best practices and reusable code for you to try, alter, and build on. Key Features: Build machine learning solutions faster than peers only using documenta...

Bibliographic Details
Other Authors: Rivera, Stephanie (author)
Format: E-book
Language: English
Published: Birmingham : Packt Publishing Ltd [2024]
Edition: First edition
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009820531206719
Table of Contents:
  • Cover
  • Title Page
  • Copyright and Credits
  • Dedication
  • Contributors
  • Table of Contents
  • Part 1: Overview of the Databricks Unified Lakehouse Platform
  • Chapter 1: Getting Started with This Book and Lakehouse Concepts
  • The components of the Data Intelligence Platform
  • The advantages of the Databricks Platform
  • Open source features
  • Databricks AutoML
  • Reusability and reproducibility
  • Open file formats give you flexibility
  • Applying our learning
  • Technical requirements
  • Getting to know your data
  • Project - streaming transactions
  • Project - Favorita sales forecasting
  • Project - multilabel image classification
  • Project - a retrieval augmented generation chatbot
  • Summary
  • Questions
  • Answers
  • Further reading
  • Chapter 2: Designing Databricks: Day One
  • Planning your platform
  • Defining a workspace
  • Selecting the metastore
  • Defining where the data lives, and cloud object storage
  • Discussing source control
  • Discussing data preparation
  • Planning to create features
  • Modeling in Databricks
  • Monitoring data and models
  • Applying our learning
  • Technical requirements
  • Setting up your workspace
  • Kaggle setup
  • Starting the projects
  • Project: Favorita store sales - time series forecasting
  • Project: Streaming Transactions
  • Project: Retrieval-Augmented Generation Chatbot
  • Project: Multilabel Image Classification
  • Summary
  • Questions
  • Answers
  • Further reading
  • Chapter 3: Building Out Our Bronze Layer
  • Revisiting the Medallion architecture pattern
  • Transforming data to Delta with Auto Loader
  • Schema evolution
  • DLT, starting with Bronze
  • DLT benefits and features
  • Bronze data with DLT
  • Maintaining and optimizing Delta tables
  • VACUUM
  • Liquid clustering
  • OPTIMIZE
  • Predictive optimization
  • Applying our learning
  • Technical requirements
  • Project - streaming transactions
  • Project - Favorita store sales - time series forecasting
  • Project - a retrieval augmented generation chatbot
  • Project - multilabel image classification
  • Summary
  • Questions
  • Answers
  • Further reading
  • Part 2: Heavily Use Case-Focused
  • Chapter 4: Getting to Know Your Data
  • Improving data integrity with DLT
  • Monitoring data quality with Databricks Lakehouse Monitoring
  • Mechanics of Lakehouse Monitoring
  • Visualization and alerting
  • Creating a monitor
  • Exploring data with Databricks Assistant
  • Generating data profiles with AutoML
  • Using embeddings to understand unstructured data
  • Enhancing data retrieval with Databricks Vector Search
  • Flexibility in embedding model support
  • Setting up a vector search
  • Applying our learning
  • Technical requirements
  • Project - Favorita Store Sales - time-series forecasting
  • Project - streaming transactions
  • Project - RAG chatbot
  • Project - multilabel image classification
  • Summary
  • Questions
  • Answers
  • Further reading
  • Chapter 5: Feature Engineering on Databricks
  • Databricks Feature Engineering in Unity Catalog
  • Feature engineering on a stream
  • Employing point-in-time lookups for time series feature tables
  • Computing on-demand features
  • Publishing features to the Databricks Online Store
  • Applying our learning
  • Technical requirements
  • Project - Streaming Transactions
  • Project - Favorita Store Sales - time series forecasting
  • Summary
  • Questions
  • Answers
  • Further reading
  • Chapter 6: Searching for a Signal
  • Technical requirements
  • Baselining with AutoML
  • Tracking experiments with MLflow
  • Classifying beyond the basic
  • Integrating innovation
  • Applying our learning
  • Parkinson's FOG
  • Forecasting Favorita sales
  • Summary
  • Questions
  • Answers
  • Further reading
  • Chapter 7: Productionizing ML on Databricks
  • Deploying the MLOps inner loop
  • Registering a model
  • Collaborative development
  • Deploying the MLOps outer loop
  • Workflows
  • DABs
  • REST API
  • Deploying your model
  • Model Inference
  • Model serving
  • Applying our learning
  • Technical requirements
  • Project - Favorita Sales forecasting
  • Project - Streaming Transactions
  • Project - multilabel image classification
  • Project - retrieval augmented generation chatbot
  • Summary
  • Questions
  • Answers
  • Further reading
  • Chapter 8: Monitoring, Evaluating, and More
  • Monitoring your models
  • Building gold layer visualizations
  • Leveraging Lakeview dashboards
  • Visualizing big data with Databricks SQL dashboards
  • Python UDFs
  • Connecting your applications
  • Incorporating LLMs for analysts with SQL AI Functions
  • Applying our learning
  • Technical requirements
  • Project: Favorita store sales
  • Project - streaming transactions
  • Project: retrieval-augmented generation chatbot
  • Summary
  • Questions
  • Answers
  • Further reading
  • Index
  • Other Books You May Enjoy