Machine Learning Engineering on AWS: Build, Scale, and Secure Machine Learning Systems and MLOps Pipelines in Production
Work seamlessly with production-ready machine learning systems and pipelines on AWS by addressing key pain points encountered in the ML life cycle.

Key Features:
- Gain practical knowledge of managing ML workloads on AWS using Amazon SageMaker, Amazon EKS, and more
- Use container and serverless services...
| Main author: | |
|---|---|
| Format: | eBook |
| Language: | English |
| Published: | Birmingham : Packt Publishing, Limited, 2022. |
| Edition: | 1st ed |
| Subjects: | |
| View at Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009710831306719 |
Table of Contents:
- Cover
- Title Page
- Copyright and Credits
- Contributors
- Table of Contents
- Preface
- Part 1: Getting Started with Machine Learning Engineering on AWS
- Chapter 1: Introduction to ML Engineering on AWS
- Technical requirements
- What is expected from ML engineers?
- How ML engineers can get the most out of AWS
- Essential prerequisites
- Creating the Cloud9 environment
- Increasing Cloud9's storage
- Installing the Python prerequisites
- Preparing the dataset
- Generating a synthetic dataset using a deep learning model
- Exploratory data analysis
- Train-test split
- Uploading the dataset to Amazon S3
- AutoML with AutoGluon
- Setting up and installing AutoGluon
- Performing your first AutoGluon AutoML experiment
- Getting started with SageMaker and SageMaker Studio
- Onboarding with SageMaker Studio
- Adding a user to an existing SageMaker Domain
- No-code machine learning with SageMaker Canvas
- AutoML with SageMaker Autopilot
- Summary
- Further reading
- Chapter 2: Deep Learning AMIs
- Technical requirements
- Getting started with Deep Learning AMIs
- Launching an EC2 instance using a Deep Learning AMI
- Locating the framework-specific DLAMI
- Choosing the instance type
- Ensuring a default secure configuration
- Launching the instance and connecting to it using EC2 Instance Connect
- Downloading the sample dataset
- Training an ML model
- Loading and evaluating the model
- Cleaning up
- Understanding how AWS pricing works for EC2 instances
- Using multiple smaller instances to reduce the overall cost of running ML workloads
- Using spot instances to reduce the cost of running training jobs
- Summary
- Further reading
- Chapter 3: Deep Learning Containers
- Technical requirements
- Getting started with AWS Deep Learning Containers
- Essential prerequisites
- Preparing the Cloud9 environment
- Downloading the sample dataset
- Using AWS Deep Learning Containers to train an ML model
- Serverless ML deployment with Lambda's container image support
- Building the custom container image
- Testing the container image
- Pushing the container image to Amazon ECR
- Running ML predictions on AWS Lambda
- Completing and testing the serverless API setup
- Summary
- Further reading
- Part 2: Solving Data Engineering and Analysis Requirements
- Chapter 4: Serverless Data Management on AWS
- Technical requirements
- Getting started with serverless data management
- Preparing the essential prerequisites
- Opening a text editor on your local machine
- Creating an IAM user
- Creating a new VPC
- Uploading the dataset to S3
- Running analytics at scale with Amazon Redshift Serverless
- Setting up a Redshift Serverless endpoint
- Opening Redshift query editor v2
- Creating a table
- Loading data from S3
- Querying the database
- Unloading data to S3
- Setting up Lake Formation
- Creating a database
- Creating a table using an AWS Glue Crawler
- Using Amazon Athena to query data in Amazon S3
- Setting up the query result location
- Running SQL queries using Athena
- Summary
- Further reading
- Chapter 5: Pragmatic Data Processing and Analysis
- Technical requirements
- Getting started with data processing and analysis
- Preparing the essential prerequisites
- Downloading the Parquet file
- Preparing the S3 bucket
- Automating data preparation and analysis with AWS Glue DataBrew
- Creating a new dataset
- Creating and running a profile job
- Creating a project and configuring a recipe
- Creating and running a recipe job
- Verifying the results
- Preparing ML data with Amazon SageMaker Data Wrangler
- Accessing Data Wrangler
- Importing data
- Transforming the data
- Analyzing the data
- Exporting the data flow
- Turning off the resources
- Verifying the results
- Summary
- Further reading
- Part 3: Diving Deeper with Relevant Model Training and Deployment Solutions
- Chapter 6: SageMaker Training and Debugging Solutions
- Technical requirements
- Getting started with the SageMaker Python SDK
- Preparing the essential prerequisites
- Creating a service limit increase request
- Training an image classification model with the SageMaker Python SDK
- Creating a new Notebook in SageMaker Studio
- Downloading the training, validation, and test datasets
- Uploading the data to S3
- Using the SageMaker Python SDK to train an ML model
- Using the %store magic to store data
- Using the SageMaker Python SDK to deploy an ML model
- Using the Debugger Insights Dashboard
- Utilizing Managed Spot Training and Checkpoints
- Cleaning up
- Summary
- Further reading
- Chapter 7: SageMaker Deployment Solutions
- Technical requirements
- Getting started with model deployments in SageMaker
- Preparing the pre-trained model artifacts
- Preparing the SageMaker script mode prerequisites
- Preparing the inference.py file
- Preparing the requirements.txt file
- Preparing the setup.py file
- Deploying a pre-trained model to a real-time inference endpoint
- Deploying a pre-trained model to a serverless inference endpoint
- Deploying a pre-trained model to an asynchronous inference endpoint
- Creating the input JSON file
- Adding an artificial delay to the inference script
- Deploying and testing an asynchronous inference endpoint
- Cleaning up
- Deployment strategies and best practices
- Summary
- Further reading
- Part 4: Securing, Monitoring, and Managing Machine Learning Systems and Environments
- Chapter 8: Model Monitoring and Management Solutions
- Technical prerequisites
- Registering models to SageMaker Model Registry
- Creating a new notebook in SageMaker Studio
- Registering models to SageMaker Model Registry using the boto3 library
- Deploying models from SageMaker Model Registry
- Enabling data capture and simulating predictions
- Scheduled monitoring with SageMaker Model Monitor
- Analyzing the captured data
- Deleting an endpoint with a monitoring schedule
- Cleaning up
- Summary
- Further reading
- Chapter 9: Security, Governance, and Compliance Strategies
- Managing the security and compliance of ML environments
- Authentication and authorization
- Network security
- Encryption at rest and in transit
- Managing compliance reports
- Vulnerability management
- Preserving data privacy and model privacy
- Federated Learning
- Differential Privacy
- Privacy-preserving machine learning
- Other solutions and options
- Establishing ML governance
- Lineage Tracking and reproducibility
- Model inventory
- Model validation
- ML explainability
- Bias detection
- Model monitoring
- Traceability, observability, and auditing
- Data quality analysis and reporting
- Data integrity management
- Summary
- Further reading
- Part 5: Designing and Building End-to-end MLOps Pipelines
- Chapter 10: Machine Learning Pipelines with Kubeflow on Amazon EKS
- Technical requirements
- Diving deeper into Kubeflow, Kubernetes, and EKS
- Preparing the essential prerequisites
- Preparing the IAM role for the EC2 instance of the Cloud9 environment
- Attaching the IAM role to the EC2 instance of the Cloud9 environment
- Updating the Cloud9 environment with the essential prerequisites
- Setting up Kubeflow on Amazon EKS
- Running our first Kubeflow pipeline
- Using the Kubeflow Pipelines SDK to build ML workflows
- Cleaning up
- Recommended strategies and best practices
- Summary
- Further reading
- Chapter 11: Machine Learning Pipelines with SageMaker Pipelines
- Technical requirements
- Diving deeper into SageMaker Pipelines
- Preparing the essential prerequisites
- Running our first pipeline with SageMaker Pipelines
- Defining and preparing our first ML pipeline
- Running our first ML pipeline
- Creating Lambda functions for deployment
- Preparing the Lambda function for deploying a model to a new endpoint
- Preparing the Lambda function for checking whether an endpoint exists
- Preparing the Lambda function for deploying a model to an existing endpoint
- Testing our ML inference endpoint
- Completing the end-to-end ML pipeline
- Defining and preparing the complete ML pipeline
- Running the complete ML pipeline
- Cleaning up
- Recommended strategies and best practices
- Summary
- Further reading
- Index
- Other Books You May Enjoy