Machine learning bookcamp: build a portfolio of real-life projects

Machine Learning Bookcamp presents realistic, practical machine learning scenarios, along with crystal-clear coverage of key concepts. In it, you'll complete engaging projects, such as creating a car price predictor using linear regression and deploying a churn prediction service. You'll g...


Bibliographic Details
Other Authors: Grigorev, Alexey (author); Massaron, Luca (writer of foreword)
Format: E-book
Language: English
Published: Shelter Island, New York : Manning Publications [2021]
Edition: 1st edition
Subjects:
View in Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009644271106719
Table of Contents:
  • Intro
  • inside front cover
  • Machine Learning Bookcamp
  • Copyright
  • brief contents
  • contents
  • front matter
  • foreword
  • preface
  • acknowledgments
  • about this book
  • Who should read this book
  • How this book is organized: a roadmap
  • About the code
  • liveBook discussion forum
  • Other online resources
  • about the author
  • about the cover illustration
  • 1 Introduction to machine learning
  • 1.1 Machine learning
  • 1.1.1 Machine learning vs. rule-based systems
  • 1.1.2 When machine learning isn't helpful
  • 1.1.3 Supervised machine learning
  • 1.2 Machine learning process
  • 1.2.1 Business understanding
  • 1.2.2 Data understanding
  • 1.2.3 Data preparation
  • 1.2.4 Modeling
  • 1.2.5 Evaluation
  • 1.2.6 Deployment
  • 1.2.7 Iterate
  • 1.3 Modeling and model validation
  • Summary
  • 2 Machine learning for regression
  • 2.1 Car-price prediction project
  • 2.1.1 Downloading the dataset
  • 2.2 Exploratory data analysis
  • 2.2.1 Exploratory data analysis toolbox
  • 2.2.2 Reading and preparing data
  • 2.2.3 Target variable analysis
  • 2.2.4 Checking for missing values
  • 2.2.5 Validation framework
  • 2.3 Machine learning for regression
  • 2.3.1 Linear regression
  • 2.3.2 Training linear regression model
  • 2.4 Predicting the price
  • 2.4.1 Baseline solution
  • 2.4.2 RMSE: Evaluating model quality
  • 2.4.3 Validating the model
  • 2.4.4 Simple feature engineering
  • 2.4.5 Handling categorical variables
  • 2.4.6 Regularization
  • 2.4.7 Using the model
  • 2.5 Next steps
  • 2.5.1 Exercises
  • 2.5.2 Other projects
  • Summary
  • Answers to exercises
  • 3 Machine learning for classification
  • 3.1 Churn prediction project
  • 3.1.1 Telco churn dataset
  • 3.1.2 Initial data preparation
  • 3.1.3 Exploratory data analysis
  • 3.1.4 Feature importance
  • 3.2 Feature engineering
  • 3.2.1 One-hot encoding for categorical variables
  • 3.3 Machine learning for classification
  • 3.3.1 Logistic regression
  • 3.3.2 Training logistic regression
  • 3.3.3 Model interpretation
  • 3.3.4 Using the model
  • 3.4 Next steps
  • 3.4.1 Exercises
  • 3.4.2 Other projects
  • Summary
  • Answers to exercises
  • 4 Evaluation metrics for classification
  • 4.1 Evaluation metrics
  • 4.1.1 Classification accuracy
  • 4.1.2 Dummy baseline
  • 4.2 Confusion table
  • 4.2.1 Introduction to the confusion table
  • 4.2.2 Calculating the confusion table with NumPy
  • 4.2.3 Precision and recall
  • 4.3 ROC curve and AUC score
  • 4.3.1 True positive rate and false positive rate
  • 4.3.2 Evaluating a model at multiple thresholds
  • 4.3.3 Random baseline model
  • 4.3.4 The ideal model
  • 4.3.5 ROC curve
  • 4.3.6 Area under the ROC curve (AUC)
  • 4.4 Parameter tuning
  • 4.4.1 K-fold cross-validation
  • 4.4.2 Finding best parameters
  • 4.5 Next steps
  • 4.5.1 Exercises
  • 4.5.2 Other projects
  • Summary
  • Answers to exercises
  • 5 Deploying machine learning models
  • 5.1 Churn-prediction model
  • 5.1.1 Using the model
  • 5.1.2 Using Pickle to save and load the model
  • 5.2 Model serving
  • 5.2.1 Web services
  • 5.2.2 Flask
  • 5.2.3 Serving churn model with Flask
  • 5.3 Managing dependencies
  • 5.3.1 Pipenv
  • 5.3.2 Docker
  • 5.4 Deployment
  • 5.4.1 AWS Elastic Beanstalk
  • 5.5 Next steps
  • 5.5.1 Exercises
  • 5.5.2 Other projects
  • Summary
  • 6 Decision trees and ensemble learning
  • 6.1 Credit risk scoring project
  • 6.1.1 Credit scoring dataset
  • 6.1.2 Data cleaning
  • 6.1.3 Dataset preparation
  • 6.2 Decision trees
  • 6.2.1 Decision tree classifier
  • 6.2.2 Decision tree learning algorithm
  • 6.2.3 Parameter tuning for decision tree
  • 6.3 Random forest
  • 6.3.1 Training a random forest
  • 6.3.2 Parameter tuning for random forest
  • 6.4 Gradient boosting
  • 6.4.1 XGBoost: Extreme gradient boosting
  • 6.4.2 Model performance monitoring
  • 6.4.3 Parameter tuning for XGBoost
  • 6.4.4 Testing the final model
  • 6.5 Next steps
  • 6.5.1 Exercises
  • 6.5.2 Other projects
  • Summary
  • Answers to exercises
  • 7 Neural networks and deep learning
  • 7.1 Fashion classification
  • 7.1.1 GPU vs. CPU
  • 7.1.2 Downloading the clothing dataset
  • 7.1.3 TensorFlow and Keras
  • 7.1.4 Loading images
  • 7.2 Convolutional neural networks
  • 7.2.1 Using a pretrained model
  • 7.2.2 Getting predictions
  • 7.3 Internals of the model
  • 7.3.1 Convolutional layers
  • 7.3.2 Dense layers
  • 7.4 Training the model
  • 7.4.1 Transfer learning
  • 7.4.2 Loading the data
  • 7.4.3 Creating the model
  • 7.4.4 Training the model
  • 7.4.5 Adjusting the learning rate
  • 7.4.6 Saving the model and checkpointing
  • 7.4.7 Adding more layers
  • 7.4.8 Regularization and dropout
  • 7.4.9 Data augmentation
  • 7.4.10 Training a larger model
  • 7.5 Using the model
  • 7.5.1 Loading the model
  • 7.5.2 Evaluating the model
  • 7.5.3 Getting the predictions
  • 7.6 Next steps
  • 7.6.1 Exercises
  • 7.6.2 Other projects
  • Summary
  • Answers to exercises
  • 8 Serverless deep learning
  • 8.1 Serverless: AWS Lambda
  • 8.1.1 TensorFlow Lite
  • 8.1.2 Converting the model to TF Lite format
  • 8.1.3 Preparing the images
  • 8.1.4 Using the TensorFlow Lite model
  • 8.1.5 Code for the lambda function
  • 8.1.6 Preparing the Docker image
  • 8.1.7 Pushing the image to AWS ECR
  • 8.1.8 Creating the lambda function
  • 8.1.9 Creating the API Gateway
  • 8.2 Next steps
  • 8.2.1 Exercises
  • 8.2.2 Other projects
  • Summary
  • 9 Serving models with Kubernetes and Kubeflow
  • 9.1 Kubernetes and Kubeflow
  • 9.2 Serving models with TensorFlow Serving
  • 9.2.1 Overview of the serving architecture
  • 9.2.2 The saved_model format
  • 9.2.3 Running TensorFlow Serving locally
  • 9.2.4 Invoking the TF Serving model from Jupyter
  • 9.2.5 Creating the Gateway service
  • 9.3 Model deployment with Kubernetes
  • 9.3.1 Introduction to Kubernetes
  • 9.3.2 Creating a Kubernetes cluster on AWS
  • 9.3.3 Preparing the Docker images
  • 9.3.4 Deploying to Kubernetes
  • 9.3.5 Testing the service
  • 9.4 Model deployment with Kubeflow
  • 9.4.1 The model: Uploading it to S3
  • 9.4.2 Deploying TensorFlow models with KFServing
  • 9.4.3 Accessing the model
  • 9.4.4 KFServing transformers
  • 9.4.5 Testing the transformer
  • 9.4.6 Deleting the EKS cluster
  • 9.5 Next steps
  • 9.5.1 Exercises
  • 9.5.2 Other projects
  • Summary
  • Appendix A. Preparing the environment
  • A.1 Installing Python and Anaconda
  • A.1.1 Installing Python and Anaconda on Linux
  • A.1.2 Installing Python and Anaconda on Windows
  • A.1.3 Installing Python and Anaconda on macOS
  • A.2 Running Jupyter
  • A.2.1 Running Jupyter on Linux
  • A.2.2 Running Jupyter on Windows
  • A.2.3 Running Jupyter on macOS
  • A.3 Installing the Kaggle CLI
  • A.4 Accessing the source code
  • A.5 Installing Docker
  • A.5.1 Installing Docker on Linux
  • A.5.2 Installing Docker on Windows
  • A.5.3 Installing Docker on macOS
  • A.6 Renting a server on AWS
  • A.6.1 Registering on AWS
  • A.6.2 Accessing billing information
  • A.6.3 Creating an EC2 instance
  • A.6.4 Connecting to the instance
  • A.6.5 Shutting down the instance
  • A.6.6 Configuring AWS CLI
  • Appendix B. Introduction to Python
  • B.1 Variables
  • B.1.1 Control flow
  • B.1.2 Collections
  • B.1.3 Code reusability
  • B.1.4 Installing libraries
  • B.1.5 Python programs
  • Appendix C. Introduction to NumPy
  • C.1 NumPy
  • C.1.1 NumPy arrays
  • C.1.2 Two-dimensional NumPy arrays
  • C.1.3 Randomly generated arrays
  • C.2 NumPy operations
  • C.2.1 Element-wise operations
  • C.2.2 Summarizing operations
  • C.2.3 Sorting
  • C.2.4 Reshaping and combining
  • C.2.5 Slicing and filtering
  • C.3 Linear algebra
  • C.3.1 Multiplication
  • C.3.2 Matrix inverse
  • C.3.3 Normal equation
  • Appendix D. Introduction to Pandas
  • D.1 Pandas
  • D.1.1 DataFrame
  • D.1.2 Series
  • D.1.3 Index
  • D.1.4 Accessing rows
  • D.1.5 Splitting a DataFrame
  • D.2 Operations
  • D.2.1 Element-wise operations
  • D.2.2 Filtering
  • D.2.3 String operations
  • D.2.4 Summarizing operations
  • D.2.5 Missing values
  • D.2.6 Sorting
  • D.2.7 Grouping
  • Appendix E. AWS SageMaker
  • E.1 AWS SageMaker Notebooks
  • E.1.1 Increasing the GPU quota limits
  • E.1.2 Creating a notebook instance
  • E.1.3 Training a model
  • E.1.4 Turning off the notebook
  • index