Human-in-the-loop machine learning: active learning and annotation for human-centered AI
Format: E-book
Language: English
Published: Shelter Island, New York : Manning Publications, [2021]
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009634718706719
Table of Contents:
- Intro
- inside front cover
- Human-in-the-Loop Machine Learning
- Copyright
- brief contents
- contents
- front matter
- foreword
- preface
- acknowledgments
- about this book
- Who should read this book
- How this book is organized: A road map
- About the code
- liveBook discussion forum
- Other online resources
- about the author
- Part 1 First steps
- 1 Introduction to human-in-the-loop machine learning
- 1.1 The basic principles of human-in-the-loop machine learning
- 1.2 Introducing annotation
- 1.2.1 Simple and more complicated annotation strategies
- 1.2.2 Plugging the gap in data science knowledge
- 1.2.3 Quality human annotation: Why is it hard?
- 1.3 Introducing active learning: Improving the speed and reducing the cost of training data
- 1.3.1 Three broad active learning sampling strategies: Uncertainty, diversity, and random
- 1.3.2 What is a random selection of evaluation data?
- 1.3.3 When to use active learning
- 1.4 Machine learning and human-computer interaction
- 1.4.1 User interfaces: How do you create training data?
- 1.4.2 Priming: What can influence human perception?
- 1.4.3 The pros and cons of creating labels by evaluating machine learning predictions
- 1.4.4 Basic principles for designing annotation interfaces
- 1.5 Machine-learning-assisted humans vs. human-assisted machine learning
- 1.6 Transfer learning to kick-start your models
- 1.6.1 Transfer learning in computer vision
- 1.6.2 Transfer learning in NLP
- 1.7 What to expect in this text
- Summary
- 2 Getting started with human-in-the-loop machine learning
- 2.1 Beyond hacktive learning: Your first active learning algorithm
- 2.2 The architecture of your first system
- 2.3 Interpreting model predictions and data to support active learning
- 2.3.1 Confidence ranking
- 2.3.2 Identifying outliers
- 2.3.3 What to expect as you iterate
- 2.4 Building an interface to get human labels
- 2.4.1 A simple interface for labeling text
- 2.4.2 Managing machine learning data
- 2.5 Deploying your first human-in-the-loop machine learning system
- 2.5.1 Always get your evaluation data first
- 2.5.2 Every data point gets a chance
- 2.5.3 Select the right strategies for your data
- 2.5.4 Retrain the model and iterate
- Summary
- Part 2 Active learning
- 3 Uncertainty sampling
- 3.1 Interpreting uncertainty in a machine learning model
- 3.1.1 Why look for uncertainty in your model?
- 3.1.2 Softmax and probability distributions
- 3.1.3 Interpreting the success of active learning
- 3.2 Algorithms for uncertainty sampling
- 3.2.1 Least confidence sampling
- 3.2.2 Margin of confidence sampling
- 3.2.3 Ratio sampling
- 3.2.4 Entropy (classification entropy)
- 3.2.5 A deep dive on entropy
- 3.3 Identifying when different types of models are confused
- 3.3.1 Uncertainty sampling with logistic regression and MaxEnt models
- 3.3.2 Uncertainty sampling with SVMs
- 3.3.3 Uncertainty sampling with Bayesian models
- 3.3.4 Uncertainty sampling with decision trees and random forests
- 3.4 Measuring uncertainty across multiple predictions
- 3.4.1 Uncertainty sampling with ensemble models
- 3.4.2 Query by Committee and dropouts
- 3.4.3 The difference between aleatoric and epistemic uncertainty
- 3.4.4 Multilabeled and continuous value classification
- 3.5 Selecting the right number of items for human review
- 3.5.1 Budget-constrained uncertainty sampling
- 3.5.2 Time-constrained uncertainty sampling
- 3.5.3 When do I stop if I'm not time- or budget-constrained?
- 3.6 Evaluating the success of active learning
- 3.6.1 Do I need new test data?
- 3.6.2 Do I need new validation data?
- 3.7 Uncertainty sampling cheat sheet
- 3.8 Further reading
- 3.8.1 Further reading for least confidence sampling
- 3.8.2 Further reading for margin of confidence sampling
- 3.8.3 Further reading for ratio of confidence sampling
- 3.8.4 Further reading for entropy-based sampling
- 3.8.5 Further reading for other machine learning models
- 3.8.6 Further reading for ensemble-based uncertainty sampling
- Summary
- 4 Diversity sampling
- 4.1 Knowing what you don't know: Identifying gaps in your model's knowledge
- 4.1.1 Example data for diversity sampling
- 4.1.2 Interpreting neural models for diversity sampling
- 4.1.3 Getting information from hidden layers in PyTorch
- 4.2 Model-based outlier sampling
- 4.2.1 Use validation data to rank activations
- 4.2.2 Which layers should I use to calculate model-based outliers?
- 4.2.3 The limitations of model-based outliers
- 4.3 Cluster-based sampling
- 4.3.1 Cluster members, centroids, and outliers
- 4.3.2 Any clustering algorithm in the universe
- 4.3.3 K-means clustering with cosine similarity
- 4.3.4 Reduced feature dimensions via embeddings or PCA
- 4.3.5 Other clustering algorithms
- 4.4 Representative sampling
- 4.4.1 Representative sampling is rarely used in isolation
- 4.4.2 Simple representative sampling
- 4.4.3 Adaptive representative sampling
- 4.5 Sampling for real-world diversity
- 4.5.1 Common problems in training data diversity
- 4.5.2 Stratified sampling to ensure diversity of demographics
- 4.5.3 Represented and representative: Which matters?
- 4.5.4 Per-demographic accuracy
- 4.5.5 Limitations of sampling for real-world diversity
- 4.6 Diversity sampling with different types of models
- 4.6.1 Model-based outliers with different types of models
- 4.6.2 Clustering with different types of models
- 4.6.3 Representative sampling with different types of models
- 4.6.4 Sampling for real-world diversity with different types of models
- 4.7 Diversity sampling cheat sheet
- 4.8 Further reading
- 4.8.1 Further reading for model-based outliers
- 4.8.2 Further reading for cluster-based sampling
- 4.8.3 Further reading for representative sampling
- 4.8.4 Further reading for sampling for real-world diversity
- Summary
- 5 Advanced active learning
- 5.1 Combining uncertainty sampling and diversity sampling
- 5.1.1 Least confidence sampling with cluster-based sampling
- 5.1.2 Uncertainty sampling with model-based outliers
- 5.1.3 Uncertainty sampling with model-based outliers and clustering
- 5.1.4 Representative sampling with cluster-based sampling
- 5.1.5 Sampling from the highest-entropy cluster
- 5.1.6 Other combinations of active learning strategies
- 5.1.7 Combining active learning scores
- 5.1.8 Expected error reduction sampling
- 5.2 Active transfer learning for uncertainty sampling
- 5.2.1 Making your model predict its own errors
- 5.2.2 Implementing active transfer learning
- 5.2.3 Active transfer learning with more layers
- 5.2.4 The pros and cons of active transfer learning
- 5.3 Applying active transfer learning to representative sampling
- 5.3.1 Making your model predict what it doesn't know
- 5.3.2 Active transfer learning for adaptive representative sampling
- 5.3.3 The pros and cons of active transfer learning for representative sampling
- 5.4 Active transfer learning for adaptive sampling
- 5.4.1 Making uncertainty sampling adaptive by predicting uncertainty
- 5.4.2 The pros and cons of ATLAS
- 5.5 Advanced active learning cheat sheets
- 5.6 Further reading for active transfer learning
- Summary
- 6 Applying active learning to different machine learning tasks
- 6.1 Applying active learning to object detection
- 6.1.1 Accuracy for object detection: Label confidence and localization
- 6.1.2 Uncertainty sampling for label confidence and localization in object detection
- 6.1.3 Diversity sampling for label confidence and localization in object detection
- 6.1.4 Active transfer learning for object detection
- 6.1.5 Setting a low object detection threshold to avoid perpetuating bias
- 6.1.6 Creating training data samples for representative sampling that are similar to your predictions
- 6.1.7 Sampling for image-level diversity in object detection
- 6.1.8 Considering tighter masks when using polygons
- 6.2 Applying active learning to semantic segmentation
- 6.2.1 Accuracy for semantic segmentation
- 6.2.2 Uncertainty sampling for semantic segmentation
- 6.2.3 Diversity sampling for semantic segmentation
- 6.2.4 Active transfer learning for semantic segmentation
- 6.2.5 Sampling for image-level diversity in semantic segmentation
- 6.3 Applying active learning to sequence labeling
- 6.3.1 Accuracy for sequence labeling
- 6.3.2 Uncertainty sampling for sequence labeling
- 6.3.3 Diversity sampling for sequence labeling
- 6.3.4 Active transfer learning for sequence labeling
- 6.3.5 Stratified sampling by confidence and tokens
- 6.3.6 Creating training data samples for representative sampling that are similar to your predictions
- 6.3.7 Full-sequence labeling
- 6.3.8 Sampling for document-level diversity in sequence labeling
- 6.4 Applying active learning to language generation
- 6.4.1 Calculating accuracy for language generation systems
- 6.4.2 Uncertainty sampling for language generation
- 6.4.3 Diversity sampling for language generation
- 6.4.4 Active transfer learning for language generation
- 6.5 Applying active learning to other machine learning tasks
- 6.5.1 Active learning for information retrieval
- 6.5.2 Active learning for video
- 6.5.3 Active learning for speech