Python Machine Learning By Example: Unlock Machine Learning Best Practices with Real-World Use Cases

The fourth edition of Python Machine Learning By Example is a comprehensive guide for beginners and experienced machine learning practitioners who want to learn more advanced techniques, such as multimodal modeling. Written by experienced machine learning author and ex-Google machine learning engineer Yuxi Liu.


Bibliographic Details
Other Authors: Liu, Yuxi (author)
Format: eBook
Language: English
Published: Birmingham, England: Packt Publishing, [2024]
Edition: Fourth edition
Series: Expert insight
View in Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009841739206719
Table of Contents:
  • Cover
  • Copyright
  • Contributors
  • Table of Contents
  • Preface
  • Chapter 1: Getting Started with Machine Learning and Python
  • An introduction to machine learning
  • Understanding why we need machine learning
  • Differentiating between machine learning and automation
  • Machine learning applications
  • Knowing the prerequisites
  • Getting started with three types of machine learning
  • A brief history of the development of machine learning algorithms
  • Digging into the core of machine learning
  • Generalizing with data
  • Overfitting, underfitting, and the bias-variance trade-off
  • Overfitting
  • Underfitting
  • The bias-variance trade-off
  • Avoiding overfitting with cross-validation
  • Avoiding overfitting with regularization
  • Avoiding overfitting with feature selection and dimensionality reduction
  • Data preprocessing and feature engineering
  • Preprocessing and exploration
  • Dealing with missing values
  • Label encoding
  • One-hot encoding
  • Dense embedding
  • Scaling
  • Feature engineering
  • Polynomial transformation
  • Binning
  • Combining models
  • Voting and averaging
  • Bagging
  • Boosting
  • Stacking
  • Installing software and setting up
  • Setting up Python and environments
  • Installing the main Python packages
  • NumPy
  • SciPy
  • pandas
  • scikit-learn
  • TensorFlow
  • PyTorch
  • Summary
  • Exercises
  • Chapter 2: Building a Movie Recommendation Engine with Naïve Bayes
  • Getting started with classification
  • Binary classification
  • Multiclass classification
  • Multi-label classification
  • Exploring Naïve Bayes
  • Bayes' theorem by example
  • The mechanics of Naïve Bayes
  • Implementing Naïve Bayes
  • Implementing Naïve Bayes from scratch
  • Implementing Naïve Bayes with scikit-learn
  • Building a movie recommender with Naïve Bayes
  • Preparing the data
  • Training a Naïve Bayes model
  • Evaluating classification performance
  • Tuning models with cross-validation
  • Summary
  • Exercises
  • References
  • Chapter 3: Predicting Online Ad Click-Through with Tree-Based Algorithms
  • A brief overview of ad click-through prediction
  • Getting started with two types of data - numerical and categorical
  • Exploring a decision tree from the root to the leaves
  • Constructing a decision tree
  • The metrics for measuring a split
  • Gini Impurity
  • Information gain
  • Implementing a decision tree from scratch
  • Implementing a decision tree with scikit-learn
  • Predicting ad click-through with a decision tree
  • Ensembling decision trees - random forests
  • Ensembling decision trees - gradient-boosted trees
  • Summary
  • Exercises
  • Chapter 4: Predicting Online Ad Click-Through with Logistic Regression
  • Converting categorical features to numerical - one-hot encoding and ordinal encoding
  • Classifying data with logistic regression
  • Getting started with the logistic function
  • Jumping from the logistic function to logistic regression
  • Training a logistic regression model
  • Training a logistic regression model using gradient descent
  • Predicting ad click-through with logistic regression using gradient descent
  • Training a logistic regression model using stochastic gradient descent (SGD)
  • Training a logistic regression model with regularization
  • Feature selection using L1 regularization
  • Feature selection using random forest
  • Training on large datasets with online learning
  • Handling multiclass classification
  • Implementing logistic regression using TensorFlow
  • Summary
  • Exercises
  • Chapter 5: Predicting Stock Prices with Regression Algorithms
  • What is regression?
  • Mining stock price data
  • A brief overview of the stock market and stock prices
  • Getting started with feature engineering
  • Acquiring data and generating features
  • Estimating with linear regression
  • How does linear regression work?
  • Implementing linear regression from scratch
  • Implementing linear regression with scikit-learn
  • Implementing linear regression with TensorFlow
  • Estimating with decision tree regression
  • Transitioning from classification trees to regression trees
  • Implementing decision tree regression
  • Implementing a regression forest
  • Evaluating regression performance
  • Predicting stock prices with the three regression algorithms
  • Summary
  • Exercises
  • Chapter 6: Predicting Stock Prices with Artificial Neural Networks
  • Demystifying neural networks
  • Starting with a single-layer neural network
  • Layers in neural networks
  • Activation functions
  • Backpropagation
  • Adding more layers to a neural network: DL
  • Building neural networks
  • Implementing neural networks from scratch
  • Implementing neural networks with scikit-learn
  • Implementing neural networks with TensorFlow
  • Implementing neural networks with PyTorch
  • Picking the right activation functions
  • Preventing overfitting in neural networks
  • Dropout
  • Early stopping
  • Predicting stock prices with neural networks
  • Training a simple neural network
  • Fine-tuning the neural network
  • Summary
  • Exercises
  • Chapter 7: Mining the 20 Newsgroups Dataset with Text Analysis Techniques
  • How computers understand language - NLP
  • What is NLP?
  • The history of NLP
  • NLP applications
  • Touring popular NLP libraries and picking up NLP basics
  • Installing famous NLP libraries
  • Corpora
  • Tokenization
  • PoS tagging
  • NER
  • Stemming and lemmatization
  • Semantics and topic modeling
  • Getting the newsgroups data
  • Exploring the newsgroups data
  • Thinking about features for text data
  • Counting the occurrence of each word token
  • Text preprocessing
  • Dropping stop words
  • Reducing inflectional and derivational forms of words
  • Visualizing the newsgroups data with t-SNE
  • What is dimensionality reduction?
  • t-SNE for dimensionality reduction
  • Representing words with dense vectors - word embedding
  • Building embedding models using shallow neural networks
  • Utilizing pre-trained embedding models
  • Summary
  • Exercises
  • Chapter 8: Discovering Underlying Topics in the Newsgroups Dataset with Clustering and Topic Modeling
  • Learning without guidance - unsupervised learning
  • Getting started with k-means clustering
  • How does k-means clustering work?
  • Implementing k-means from scratch
  • Implementing k-means with scikit-learn
  • Choosing the value of k
  • Clustering newsgroups dataset
  • Clustering newsgroups data using k-means
  • Describing the clusters using GPT
  • Discovering underlying topics in newsgroups
  • Topic modeling using NMF
  • Topic modeling using LDA
  • Summary
  • Exercises
  • Chapter 9: Recognizing Faces with Support Vector Machine
  • Finding the separating boundary with SVM
  • Scenario 1 - identifying a separating hyperplane
  • Scenario 2 - determining the optimal hyperplane
  • Scenario 3 - handling outliers
  • Implementing SVM
  • Scenario 4 - dealing with more than two classes
  • One-vs-rest
  • One-vs-one
  • Multiclass cases in scikit-learn
  • Scenario 5 - solving linearly non-separable problems with kernels
  • Choosing between linear and RBF kernels
  • Classifying face images with SVM
  • Exploring the face image dataset
  • Building an SVM-based image classifier
  • Boosting image classification performance with PCA
  • Estimating with support vector regression
  • Implementing SVR
  • Summary
  • Exercises
  • Chapter 10: Machine Learning Best Practices
  • Machine learning solution workflow
  • Best practices in the data preparation stage
  • Best practice 1 - Completely understanding the project goal
  • Best practice 2 - Collecting all fields that are relevant
  • Best practice 3 - Maintaining the consistency and normalization of field values
  • Best practice 4 - Dealing with missing data
  • Best practice 5 - Storing large-scale data
  • Best practices in the training set generation stage
  • Best practice 6 - Identifying categorical features with numerical values
  • Best practice 7 - Deciding whether to encode categorical features
  • Best practice 8 - Deciding whether to select features and, if so, how to do so
  • Best practice 9 - Deciding whether to reduce dimensionality and, if so, how to do so
  • Best practice 10 - Deciding whether to rescale features
  • Best practice 11 - Performing feature engineering with domain expertise
  • Best practice 12 - Performing feature engineering without domain expertise
  • Binarization and discretization
  • Interaction
  • Polynomial transformation
  • Best practice 13 - Documenting how each feature is generated
  • Best practice 14 - Extracting features from text data
  • tf and tf-idf
  • Word embedding
  • Word2Vec embedding
  • Best practices in the model training, evaluation, and selection stage
  • Best practice 15 - Choosing the right algorithm(s) to start with
  • Naïve Bayes
  • Logistic regression
  • SVM
  • Random forest (or decision tree)
  • Neural networks
  • Best practice 16 - Reducing overfitting
  • Best practice 17 - Diagnosing overfitting and underfitting
  • Best practice 18 - Modeling on large-scale datasets
  • Best practices in the deployment and monitoring stage
  • Best practice 19 - Saving, loading, and reusing models
  • Saving and restoring models using pickle
  • Saving and restoring models in TensorFlow
  • Saving and restoring models in PyTorch
  • Best practice 20 - Monitoring model performance
  • Best practice 21 - Updating models regularly
  • Summary