Computer vision projects with PyTorch: design and develop production-grade models

Design and develop end-to-end, production-grade computer vision projects for real-world industry problems. This book discusses computer vision algorithms and their applications using PyTorch. The book begins with the fundamentals of computer vision: convolutional neural nets, ResNet, YOLO, data augm...


Bibliographic Details
Other Authors: Kulkarni, Akshay (author); Shivananda, Adarsha (author); Sharma, Nitin Ranjan (author)
Format: E-book
Language: English
Published: New York, New York : Apress, [2022]
View in the Universitat Ramon Llull Library: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009671497006719
Table of Contents:
  • Intro
  • Table of Contents
  • About the Authors
  • About the Technical Reviewer
  • Introduction
  • Chapter 1: The Building Blocks of Computer Vision
  • What Is Computer Vision
  • Applications
  • Classification
  • Object Detection and Localization
  • Image Segmentation
  • Anomaly Detection
  • Video Analysis
  • Channels
  • Convolutional Neural Networks
  • Receptive Field
  • Local Receptive Field
  • Global Receptive Field
  • Pooling
  • Max Pooling
  • Average Pooling
  • Global Average Pooling
  • Calculation: Feature Map and Receptive Fields
  • Kernel
  • Stride
  • Pooling
  • Padding
  • Input and Output
  • Calculation of Receptive Field
  • Understanding the CNN Architecture Type
  • Understanding Types of Architecture
  • AlexNet
  • VGG
  • ResNet
  • Inception Architectures
  • Working with Deep Learning Model Techniques
  • Batch Normalization
  • Dropouts
  • Data Augmentation Techniques
  • Introduction to PyTorch
  • Installation
  • Basic Start
  • Summary
  • Chapter 2: Image Classification
  • Topics to Cover
  • Defining the Problem
  • Overview of the Approach
  • Creating an Image Classification Pipeline
  • First Basic Model
  • Data
  • Data Exploration
  • Data Loader
  • Define the Model
  • The Training Process
  • The Second Variation of Model
  • The Third Variation of the Model
  • The Fourth Variation of the Model
  • Summary
  • Chapter 3: Building an Object Detection Model
  • Object Detection Using Boosted Cascade
  • R-CNN
  • The Region Proposal Network
  • Fast Region-Based Convolutional Neural Network
  • How the Region Proposal Network Works
  • The Anchor Generation Layer
  • The Region Proposal Layer
  • Mask R-CNN
  • Prerequisites
  • YOLO
  • YOLO V2/V3
  • Project Code Snippets
  • Step 1: Getting Annotated Data
  • Step 2: Fixing the Configuration File and Training
  • The Model File
  • Summary
  • Chapter 4: Building an Image Segmentation Model
  • Image Segmentation
  • Pretrained Support from PyTorch
  • Semantic Segmentation
  • Instance Segmentation
  • Fine-Tuning the Model
  • Summary
  • Chapter 5: Image-Based Search and Recommendation System
  • Problem Statement
  • Approach and Methodology
  • Implementation
  • The Dataset
  • Installing and Importing Libraries
  • Importing and Understanding the Data
  • Feature Engineering
  • ResNet18
  • Calculating Similarity and Ranking
  • Visualizing the Recommendations
  • Taking Image Input from Users and Recommending Similar Products
  • Summary
  • Chapter 6: Pose Estimation
  • Top-Down Approach
  • Bottom-Up Approach
  • OpenPose
  • Branch-1
  • Branch-2
  • HRNet (High-Resolution Net)
  • Higher HRNet
  • PoseNet
  • How Does PoseNet Work?
  • Single Person Pose Estimation
  • Multi-Person Pose Estimation
  • Pros and Cons of PoseNet
  • Applications of Pose Estimation
  • Test Cases Performed Retail Store Videos
  • Implementation
  • Step 1: Identify the List of Human Keypoints to Track
  • Step 2: Identify the Possible Connections Between the Keypoints
  • Step 3: Load the Pretrained Model from the PyTorch Library
  • Step 4: Input Image Preprocessing and Modeling
  • Step 5: Build Custom Functions to Plot the Output
  • Step 6: Plot the Output on the Input Image
  • Summary
  • Chapter 7: Image Anomaly Detection
  • Anomaly Detection
  • Approach 1: Using a Pretrained Classification Model
  • Step 1: Import the Required Libraries
  • Step 2: Create the Seed and Deterministic Functions
  • Step 3: Set the Hyperparameter
  • Step 4: Import the Dataset
  • Step 5: Image Preprocessing Stage
  • Step 6: Load the Pretrained Model
  • Step 7: Freeze the Model
  • Step 8: Train the Model
  • Step 9: Evaluate the Model
  • Approach 2: Using Autoencoder
  • Step 1: Prepare the Dataset Object
  • Step 2: Build the Autoencoder Network
  • Step 3: Train the Autoencoder Network
  • Step 4: Calculate the Reconstruction Loss Based on the Original Data
  • Step 5: Select the Most Anomalous Digit Based on the Error Metric Score
  • Output
  • Summary
  • Chapter 8: Image Super-Resolution
  • Up-Scaling Using the Nearest Neighbor Concept
  • Understanding Bilinear Up-Scaling
  • Variational Autoencoders
  • Generative Adversarial Networks
  • The Model Code
  • Model Development
  • Imports
  • Running the Application
  • Summary
  • Chapter 9: Video Analytics
  • Problem Statement
  • Approach
  • Implementation
  • Data
  • Uploading the Required Videos to Google Colab
  • Convert the Video to a Series of Images
  • Image Extraction
  • Data Preparation
  • Identify the Hotspots in a Retail Store
  • Importing Images
  • Getting Crowd Counts
  • Security and Surveillance
  • Identify the Demographics (Age and Gender)
  • Summary
  • Chapter 10: Explainable AI for Computer Vision
  • Grad-CAM
  • Grad-CAM++
  • NBDT
  • Step 1
  • Step 2
  • Steps 3 and 4
  • Grad-CAM and Grad-CAM++ Implementation
  • Grad-CAM and Grad-CAM++ Implementation on a Single Image
  • NBDT Implementation on a Single Image
  • Summary
  • Index