Computer vision projects with PyTorch: design and develop production-grade models
Design and develop end-to-end, production-grade computer vision projects for real-world industry problems. This book discusses computer vision algorithms and their applications using PyTorch. The book begins with the fundamentals of computer vision: convolutional neural nets, ResNet, YOLO, data augm...
Format: | E-book |
---|---|
Language: | English |
Published: | New York, New York : Apress, [2022] |
View at Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009671497006719 |
Table of Contents:
- Intro
- Table of Contents
- About the Authors
- About the Technical Reviewer
- Introduction
- Chapter 1: The Building Blocks of Computer Vision
- What Is Computer Vision
- Applications
- Classification
- Object Detection and Localization
- Image Segmentation
- Anomaly Detection
- Video Analysis
- Channels
- Convolutional Neural Networks
- Receptive Field
- Local Receptive Field
- Global Receptive Field
- Pooling
- Max Pooling
- Average Pooling
- Global Average Pooling
- Calculation: Feature Map and Receptive Fields
- Kernel
- Stride
- Pooling
- Padding
- Input and Output
- Calculation of Receptive Field
- Understanding the CNN Architecture Type
- Understanding Types of Architecture
- AlexNet
- VGG
- ResNet
- Inception Architectures
- Working with Deep Learning Model Techniques
- Batch Normalization
- Dropouts
- Data Augmentation Techniques
- Introduction to PyTorch
- Installation
- Basic Start
- Summary
- Chapter 2: Image Classification
- Topics to Cover
- Defining the Problem
- Overview of the Approach
- Creating an Image Classification Pipeline
- First Basic Model
- Data
- Data Exploration
- Data Loader
- Define the Model
- The Training Process
- The Second Variation of Model
- The Third Variation of the Model
- The Fourth Variation of the Model
- Summary
- Chapter 3: Building an Object Detection Model
- Object Detection Using Boosted Cascade
- R-CNN
- The Region Proposal Network
- Fast Region-Based Convolutional Neural Network
- How the Region Proposal Network Works
- The Anchor Generation Layer
- The Region Proposal Layer
- Mask R-CNN
- Prerequisites
- YOLO
- YOLO V2/V3
- Project Code Snippets
- Step 1: Getting Annotated Data
- Step 2: Fixing the Configuration File and Training
- The Model File
- Summary
- Chapter 4: Building an Image Segmentation Model
- Image Segmentation
- Pretrained Support from PyTorch
- Semantic Segmentation
- Instance Segmentation
- Fine-Tuning the Model
- Summary
- Chapter 5: Image-Based Search and Recommendation System
- Problem Statement
- Approach and Methodology
- Implementation
- The Dataset
- Installing and Importing Libraries
- Importing and Understanding the Data
- Feature Engineering
- ResNet18
- Calculating Similarity and Ranking
- Visualizing the Recommendations
- Taking Image Input from Users and Recommending Similar Products
- Summary
- Chapter 6: Pose Estimation
- Top-Down Approach
- Bottom-Up Approach
- OpenPose
- Branch-1
- Branch-2
- HRNet (High-Resolution Net)
- Higher HRNet
- PoseNet
- How Does PoseNet Work?
- Single Person Pose Estimation
- Multi-Person Pose Estimation
- Pros and Cons of PoseNet
- Applications of Pose Estimation
- Test Cases Performed on Retail Store Videos
- Implementation
- Step 1: Identify the List of Human Keypoints to Track
- Step 2: Identify the Possible Connections Between the Keypoints
- Step 3: Load the Pretrained Model from the PyTorch Library
- Step 4: Input Image Preprocessing and Modeling
- Step 5: Build Custom Functions to Plot the Output
- Step 6: Plot the Output on the Input Image
- Summary
- Chapter 7: Image Anomaly Detection
- Anomaly Detection
- Approach 1: Using a Pretrained Classification Model
- Step 1: Import the Required Libraries
- Step 2: Create the Seed and Deterministic Functions
- Step 3: Set the Hyperparameter
- Step 4: Import the Dataset
- Step 5: Image Preprocessing Stage
- Step 6: Load the Pretrained Model
- Step 7: Freeze the Model
- Step 8: Train the Model
- Step 9: Evaluate the Model
- Approach 2: Using Autoencoder
- Step 1: Prepare the Dataset Object
- Step 2: Build the Autoencoder Network
- Step 3: Train the Autoencoder Network
- Step 4: Calculate the Reconstruction Loss Based on the Original Data
- Step 5: Select the Most Anomalous Digit Based on the Error Metric Score
- Output
- Summary
- Chapter 8: Image Super-Resolution
- Up-Scaling Using the Nearest Neighbor Concept
- Understanding Bilinear Up-Scaling
- Variational Autoencoders
- Generative Adversarial Networks
- The Model Code
- Model Development
- Imports
- Running the Application
- Summary
- Chapter 9: Video Analytics
- Problem Statement
- Approach
- Implementation
- Data
- Uploading the Required Videos to Google Colab
- Convert the Video to a Series of Images
- Image Extraction
- Data Preparation
- Identify the Hotspots in a Retail Store
- Importing Images
- Getting Crowd Counts
- Security and Surveillance
- Identify the Demographics (Age and Gender)
- Summary
- Chapter 10: Explainable AI for Computer Vision
- Grad-CAM
- Grad-CAM++
- NBDT
- Step 1
- Step 2
- Steps 3 and 4
- Grad-CAM and Grad-CAM++ Implementation
- Grad-CAM and Grad-CAM++ Implementation on a Single Image
- NBDT Implementation on a Single Image
- Summary
- Index