Graph Based Multimedia Analysis

Graph Based Multimedia Analysis applies concepts from graph theory to the problems of analyzing overabundant video data. Video data can be quite diverse: exocentric (captured by a standard camera) or egocentric (captured by a wearable device like Google Glass); of various durations (ranging from a f...


Bibliographic Details
Other Authors: Chowdhury, Ananda S. (author); Sahu, Abhimanyu (author)
Format: Electronic book
Language: English
Published: Cambridge, MA : Morgan Kaufmann [2025]
Edition: First edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009842233906719
Table of Contents:
  • Front Cover
  • Graph Based Multimedia Analysis
  • Copyright
  • Contents
  • List of figures
  • List of tables
  • Biography
  • Foreword
  • Preface
  • 1 Introduction
  • 1.1 Motivation
  • 1.2 Chapter organization
  • 1.3 Basics of multimedia
  • 1.4 Preliminaries of a video
  • 1.5 Multimedia problems
  • 1.6 Graph based solutions
  • 1.7 Other solution models
  • 1.8 Organization of the book
  • References
  • 2 Theoretical foundations
  • 2.1 Motivation
  • 2.2 Organization
  • 2.3 Graph basics
  • 2.4 Delaunay graph
  • 2.5 Bipartite graph
  • 2.6 Minimum spanning tree
  • 2.7 Optimum path forest
  • 2.8 Random walks on a graph
  • 2.9 Knapsack problems
  • 2.10 Elementary game theory
  • References
  • 3 Exocentric video summarization
  • 3.1 Motivation
  • 3.2 Chapter organization
  • 3.3 Related works
  • 3.3.1 Related works for exocentric video summarization
  • 3.3.2 Related works for scalable exocentric video summarization
  • 3.4 Method I: Delaunay graph based solutions for exocentric video summarization
  • 3.4.1 Method IA: Constrained Delaunay graph clustering based summary
  • 3.4.1.1 Video frame presampling
  • 3.4.1.2 Feature extraction
  • 3.4.1.3 Elimination of redundant frames
  • 3.4.1.4 Delaunay graph based constrained clustering
  • 3.4.1.5 Key frame extraction
  • 3.4.2 Method IB: Delaunay graph based summary with user customization
  • 3.4.3 Method IC: Delaunay graph based summary in enhanced feature space
  • 3.5 Method II: A graph modularity based clustering for exocentric video summarization
  • 3.5.1 Compressed domain feature extraction
  • 3.5.2 Multi-feature fusion
  • 3.5.3 Graph modularity based clustering
  • 3.5.4 Key frame extraction
  • 3.6 Scalable exocentric video summarization with skeleton graph and random walk
  • 3.6.1 Extraction of skeleton graph
  • 3.6.2 Clustering of skeleton graph via MST
  • 3.6.3 Label propagation with random walks
  • 3.6.4 Key frame selection and ranking
  • 3.7 Time-complexity analysis
  • 3.7.1 Complexity analysis of an exocentric video summarization algorithm
  • 3.7.2 Complexity analysis of the scalable exocentric video summarization algorithm
  • 3.8 Experimental test bed
  • 3.8.1 Dataset(s)
  • 3.8.2 Performance measures
  • 3.8.2.1 Objective measures
  • 3.8.2.2 Subjective measures
  • 3.9 Results of Delaunay graph based exocentric video summarization methods
  • 3.9.1 Results of constrained Delaunay graph clustering based summary
  • 3.9.1.1 Performance analysis with information theoretic presampling
  • 3.9.1.2 Performance analysis with deviation ratio constraint
  • 3.9.1.3 Performance comparison with state-of-the-art methods
  • 3.9.1.4 Performance comparison with K-means clustering
  • 3.9.1.5 Clustering performance analysis
  • 3.9.1.6 Tuning of the parameters
  • 3.9.1.7 Key frame visualization
  • 3.9.2 Results of Delaunay graph based summary with user customization
  • 3.9.3 Results of Delaunay graph based summary in enhanced feature space
  • 3.9.3.1 Performance analysis with semantic features
  • 3.9.3.2 Performance analysis with CCA
  • 3.10 Results of graph modularity based solution
  • 3.11 Results of scalable exocentric video summarization
  • 3.11.1 Objective evaluations
  • 3.11.2 Subjective evaluations
  • 3.11.3 Comparison of execution times
  • 3.12 Summary
  • 3.12.1 Summary of exocentric video summarization
  • 3.12.2 Summary of scalable exocentric video summarization
  • References
  • 4 Multi-view exocentric video summarization
  • 4.1 Motivation
  • 4.2 Chapter organization
  • 4.3 Related work
  • 4.4 Proposed method
  • 4.4.1 Video preprocessing
  • 4.4.1.1 Shot detection and representation
  • 4.4.1.2 Feature extraction
  • 4.4.2 Unimportant shot elimination using Gaussian entropy
  • 4.4.3 Multi-view correlation using bipartite matching
  • 4.4.4 Shot clustering by OPF
  • 4.5 Time-complexity analysis
  • 4.6 Experimental results
  • 4.6.1 Datasets
  • 4.6.2 Performance measures
  • 4.6.3 Ablation study
  • 4.6.4 Comparison with mono-view methods
  • 4.6.5 Comparison with multi-view methods
  • 4.7 Summary
  • References
  • 5 Egocentric video summarization
  • 5.1 Motivation
  • 5.2 Chapter organization
  • 5.3 Related work
  • 5.4 Proposed methods
  • 5.4.1 Method I: Egocentric video summarization with different graph representations
  • 5.4.1.1 Graph based shot boundary detection
  • 5.4.1.2 Graph based representative frame selection
  • 5.4.1.3 Graph based center-surround model
  • 5.4.1.4 Graph based feature extraction
  • 5.4.1.5 Construction of the VSG
  • 5.4.1.6 MST based clustering with a new measure of edge inadmissibility
  • 5.4.2 Method II: Egocentric video summarization with deep features and optimal clustering
  • 5.4.2.1 Feature extraction using deep learning
  • 5.4.2.2 Set of number of clusters
  • 5.4.2.3 CSMIK K-means
  • 5.4.2.3.1 Center-surround model
  • 5.4.2.3.2 Integer knapsack formulation
  • 5.4.2.3.3 CSMIK K-means
  • 5.5 Time-complexity analysis
  • 5.5.1 Method-I: Time complexity analysis
  • 5.5.2 Method-II: Time complexity analysis
  • 5.6 Experimental results
  • 5.6.1 Datasets
  • 5.6.2 Performance measures
  • 5.6.3 Tuning of the parameters
  • 5.6.3.1 Method-I: Tuning of the parameters
  • 5.6.3.2 Method-II: Tuning of the parameters
  • 5.6.4 Method-I: Experimental results and analysis
  • 5.6.4.1 Ablation studies
  • 5.6.4.2 Cluster validation
  • 5.6.4.3 Results on SumMe dataset
  • 5.6.4.4 Results on TvSum50 dataset
  • 5.6.5 Method-II: Experimental results and analysis
  • 5.6.5.1 Ablation studies
  • 5.6.5.2 Cluster validation
  • 5.6.5.3 Results on SumMe dataset
  • 5.6.5.4 Results on TvSum50 dataset
  • 5.6.5.5 Results on ADL dataset
  • 5.6.5.6 Results on Base jumping from CoSum dataset
  • 5.6.5.7 Comparison with human performance
  • 5.6.5.8 Test of statistical significance
  • 5.6.5.9 Execution times
  • 5.6.5.10 Keyframe visualization
  • 5.7 Summary
  • References
  • 6 Egocentric video cosummarization
  • 6.1 Motivation
  • 6.2 Chapter organization
  • 6.3 Related work
  • 6.4 Proposed methods
  • 6.4.1 Shot segmentation
  • 6.4.2 Center-surround model
  • 6.4.3 Method I: Egocentric video cosummarization with bipartite graph matching and game theory
  • 6.4.3.1 A game-theoretic model of visual similarity
  • 6.4.3.2 Shot correspondence using bipartite graph matching
  • 6.4.4 Method II: Egocentric video cosummarization with random walks on a constrained graph and transfer learning
  • 6.4.4.1 Feature extraction using transfer learning
  • 6.4.4.2 A video representation graph
  • 6.4.4.3 Must-link and cannot-link constraints
  • 6.4.4.4 Must-link constrained modified graph
  • 6.4.4.5 Shot clustering by random walk with label refinement
  • 6.5 Time-complexity analysis
  • 6.5.1 Method-I: Time-complexity analysis
  • 6.5.2 Method-II: Time-complexity analysis
  • 6.6 Experimental results
  • 6.6.1 Datasets
  • 6.6.2 Performance measures
  • 6.6.3 Tuning of the parameters
  • 6.6.4 Method-I: Experimental results and analysis
  • 6.6.4.1 Ablation study
  • 6.6.4.2 Comparisons with other approaches
  • 6.6.4.3 Test of statistical significance
  • 6.6.4.4 Execution times
  • 6.6.5 Method-II: Experimental results and analysis
  • 6.6.5.1 Implementation details
  • 6.6.5.2 Ablation studies
  • 6.6.5.3 Results on short duration videos
  • 6.6.5.4 Results on long duration videos
  • 6.6.5.5 Comparison with human performance
  • 6.6.5.6 Test of statistical significance
  • 6.6.5.7 Execution times
  • 6.7 Summary
  • References
  • 7 Action recognition in egocentric video
  • 7.1 Motivation
  • 7.2 Chapter organization
  • 7.3 Related work
  • 7.4 Proposed method
  • 7.4.1 Method I: Action recognition in egocentric video with shallow feature and video similarity graph
  • 7.4.1.1 PHOG feature extraction
  • 7.4.1.2 Features from the center-surround model
  • 7.4.1.3 Construction of the VSG graph
  • 7.4.1.4 Random walk on VSG
  • 7.4.2 Method II: Action recognition in egocentric video with deep features and video representation graph
  • 7.4.2.1 Center-surround model
  • 7.4.2.2 Superpixel extraction
  • 7.4.2.3 Feature extraction using deep learning
  • 7.4.2.4 Video representation graph
  • 7.4.2.5 Random walk based action labeling
  • 7.4.2.6 Action summary
  • 7.5 Time-complexity analysis
  • 7.5.1 Method-I: Time-complexity analysis
  • 7.5.2 Method-II: Time-complexity analysis
  • 7.6 Experimental results
  • 7.6.1 Dataset
  • 7.6.2 Performance measures
  • 7.6.3 Tuning of the parameters
  • 7.6.3.1 Method-I: Tuning of the parameters
  • 7.6.3.2 Method-II: Tuning of the parameters
  • 7.6.4 Method-I: Experimental results and analysis
  • 7.6.4.1 Results on ADL dataset
  • 7.6.5 Method-II: Experimental results and analysis
  • 7.6.5.1 Ablation studies for action recognition
  • 7.6.5.2 External comparisons for action recognition
  • Results on the ADL dataset:
  • Results on the GTEA dataset:
  • Results on the EGTEA Gaze+ dataset:
  • Results on the EgoGesture dataset:
  • Results on the EPIC-Kitchens dataset:
  • 7.6.5.3 Action localization
  • 7.6.5.4 Comparisons for action summarization
  • 7.7 Summary
  • References
  • 8 Conclusions
  • 8.1 Concluding remarks
  • 8.2 Future research directions
  • References
  • A Source codes
  • A.1 Organization
  • A.2 Source codes - constrained Delaunay graph clustering for exocentric video summarization
  • A.3 Source codes - optimum-path forest clustering for multi-view exocentric video summarization
  • A.4 Source codes - different graph representations for egocentric video summarization
  • A.5 Source codes - deep feature and integer knapsack for egocentric video summarization