Modern Computer Vision with PyTorch: A Practical Roadmap from Deep Learning Fundamentals to Advanced Applications and Generative AI

Whether you are a beginner or are looking to progress in your computer vision career, this book guides you through the fundamentals of neural networks (NNs) and PyTorch and how to implement state-of-the-art architectures for real-world tasks. The second edition of Modern Computer Vision with PyTorch...


Bibliographic Details
Main Author:	Ayyadevara, V. Kishore
Other Authors:	Reddy, Yeshwanth
Format:	E-book
Language:	English
Published:	Birmingham : Packt Publishing, Limited 2023.
Edition:	2nd ed.
Series:	Expert insight.
Subjects:	PyTorch. Computer vision. Machine learning.
View in the Universitat Ramon Llull Library:	https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009828023106719

Table of Contents:

Cover
Copyright
Contributors
Table of Contents
Preface
Section 1: Fundamentals of Deep Learning for Computer Vision
Chapter 1: Artificial Neural Network Fundamentals
Comparing AI and traditional machine learning
Learning about the ANN building blocks
Implementing feedforward propagation
Calculating the hidden layer unit values
Applying the activation function
Calculating the output layer values
Calculating loss values
Calculating loss during continuous variable prediction
Calculating loss during categorical variable prediction
Feedforward propagation in code
Activation functions in code
Loss functions in code
Implementing backpropagation
Gradient descent in code
Implementing backpropagation using the chain rule
Putting feedforward propagation and backpropagation together
Understanding the impact of the learning rate
Learning rate of 0.01
Learning rate of 0.1
Learning rate of 1
Summarizing the training process of a neural network
Summary
Questions
Chapter 2: PyTorch Fundamentals
Installing PyTorch
PyTorch tensors
Initializing a tensor
Operations on tensors
Auto gradients of tensor objects
Advantages of PyTorch's tensors over NumPy's ndarrays
Building a neural network using PyTorch
Dataset, DataLoader, and batch size
Predicting on new data points
Implementing a custom loss function
Fetching the values of intermediate layers
Using a sequential method to build a neural network
Saving and loading a PyTorch model
Using state_dict
Saving
Loading
Summary
Questions
Chapter 3: Building a Deep Neural Network with PyTorch
Representing an image
Converting images into structured arrays and scalars
Creating a structured array for colored images
Why leverage neural networks for image analysis?
Preparing our data for image classification
Training a neural network
Scaling a dataset to improve model accuracy
Understanding the impact of varying the batch size
Batch size of 32
Batch size of 10,000
Understanding the impact of varying the loss optimizer
Building a deeper neural network
Understanding the impact of batch normalization
Very small input values without batch normalization
Very small input values with batch normalization
The concept of overfitting
Impact of adding dropout
Impact of regularization
L1 regularization
L2 regularization
Summary
Questions
Section 2: Object Classification and Detection
Chapter 4: Introducing Convolutional Neural Networks
The problem with traditional deep neural networks
Building blocks of a CNN
Convolution
Filters
Strides and padding
Strides
Padding
Pooling
Putting them all together
How convolution and pooling help in image translation
Implementing a CNN
Classifying images using deep CNNs
Visualizing the outcome of feature learning
Building a CNN for classifying real-world images
Impact on the number of images used for training
Summary
Questions
Chapter 5: Transfer Learning for Image Classification
Introducing transfer learning
Understanding the VGG16 architecture
Implementing VGG16
Understanding the ResNet architecture
Implementing ResNet18
Implementing facial keypoint detection
2D and 3D facial keypoint detection
Implementing age estimation and gender classification
Introducing the torch_snippets library
Summary
Questions
Chapter 6: Practical Aspects of Image Classification
Generating CAMs
Understanding the impact of data augmentation and batch normalization
Coding up road sign detection
Practical aspects to take care of during model implementation
Imbalanced data
The size of the object within an image
The difference between training and validation data
The number of nodes in the flatten layer
Image size
OpenCV utilities
Summary
Questions
Chapter 7: Basics of Object Detection
Introducing object detection
Creating a bounding-box ground truth for training
Understanding region proposals
Leveraging SelectiveSearch to generate region proposals
Implementing SelectiveSearch to generate region proposals
Understanding IoU
Non-max suppression
Mean average precision
Training R-CNN-based custom object detectors
Working details of R-CNN
Implementing R-CNN for object detection on a custom dataset
Downloading the dataset
Preparing the dataset
Fetching region proposals and the ground truth of offset
Creating the training data
R-CNN network architecture
Predicting on a new image
Training Fast R-CNN-based custom object detectors
Working details of Fast R-CNN
Implementing Fast R-CNN for object detection on a custom dataset
Summary
Questions
Chapter 8: Advanced Object Detection
Components of modern object detection algorithms
Anchor boxes
Region proposal network
Classification and regression
Training Faster R-CNN on a custom dataset
Working details of YOLO
Training YOLO on a custom dataset
Installing Darknet
Setting up the dataset format
Configuring the architecture
Training and testing the model
Working details of SSD
Components in SSD code
SSD300
MultiBoxLoss
Training SSD on a custom dataset
Summary
Questions
Chapter 9: Image Segmentation
Exploring the U-Net architecture
Performing upscaling
Implementing semantic segmentation using U-Net
Exploring the Mask R-CNN architecture
RoI Align
Mask head
Implementing instance segmentation using Mask R-CNN
Predicting multiple instances of multiple classes
Summary
Questions
Chapter 10: Applications of Object Detection and Segmentation
Multi-object instance segmentation
Fetching and preparing data
Training the model for instance segmentation
Making inferences on a new image
Human pose detection
Crowd counting
Implementing crowd counting
Image colorization
3D object detection with point clouds
Theory
Input encoding
Output encoding
Training the YOLO model for 3D object detection
Data format
Data inspection
Training
Testing
Action recognition from video
Identifying an action in a given video
Training a recognizer on a custom dataset
Summary
Questions
Section 3: Image Manipulation
Chapter 11: Autoencoders and Image Manipulation
Understanding autoencoders
How autoencoders work
Implementing vanilla autoencoders
Implementing convolutional autoencoders
Grouping similar images using t-SNE
Understanding variational autoencoders
The need for VAEs
How VAEs work
KL divergence
Building a VAE
Performing an adversarial attack on images
Understanding neural style transfer
How neural style transfer works
Performing neural style transfer
Understanding deepfakes
How deepfakes work
Generating a deepfake
Summary
Questions
Chapter 12: Image Generation Using GANs
Introducing GANs
Using GANs to generate handwritten digits
Using DCGANs to generate face images
Implementing conditional GANs
Summary
Questions
Chapter 13: Advanced GANs to Manipulate Images
Leveraging the Pix2Pix GAN
Leveraging CycleGAN
How CycleGAN works
Implementing CycleGAN
Leveraging StyleGAN on custom images
The evolution of StyleGAN
Implementing StyleGAN
Introducing SRGAN
Architecture
Coding SRGAN
Summary
Questions
Section 4: Combining Computer Vision with Other Techniques
Chapter 14: Combining Computer Vision and Reinforcement Learning
Learning the basics of reinforcement learning
Calculating the state value
Calculating the state-action value
Implementing Q-learning
Defining the Q-value
Understanding the Gym environment
Building a Q-table
Leveraging exploration-exploitation
Implementing deep Q-learning
Understanding the CartPole environment
Performing CartPole balancing
Implementing deep Q-learning with the fixed targets model
Understanding the use case
Coding up an agent to play Pong
Implementing an agent to perform autonomous driving
Setting up the CARLA environment
Installing the CARLA binaries
Installing the CARLA Gym environment
Training a self-driving agent
Creating model.py
Creating actor.py
Training a DQN with fixed targets
Summary
Questions
Chapter 15: Combining Computer Vision and NLP Techniques
Introducing transformers
Basics of transformers
Encoder block
Decoder block
How ViTs work
Implementing ViTs
Transcribing handwritten images
Handwriting transcription workflow
Handwriting transcription in code
Document layout analysis
Understanding LayoutLM
Implementing LayoutLMv3
Visual question answering
Introducing BLIP2
Representation learning
Generative learning
Implementing BLIP2
Summary
Questions
Chapter 16: Foundation Models in Computer Vision
Introducing CLIP
How CLIP works
Building a CLIP model from scratch
Leveraging OpenAI CLIP
Introducing SAM
How SAM works
Implementing SAM
How FastSAM works
All-instance segmentation
Prompt-guided selection
Implementing FastSAM
Introducing diffusion models
How diffusion models work
Diffusion model architecture
