Intelligent speech signal processing

Intelligent Speech Signal Processing investigates the utilization of speech analytics across several systems and real-world activities, including sharing data analytics, creating collaboration networks between several participants, and implementing video-conferencing in different application areas....

Bibliographic Details
Other Authors:	Dey, Nilanjan (author, editor)
Format:	E-book
Language:	English
Published:	London, England : Academic Press [2019]
Edition:	First edition
Subjects:	Automatic speech recognition.
View at Biblioteca Universitat Ramon Llull:	https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630433606719

Table of Contents:

Front Cover
Intelligent Speech Signal Processing
Copyright
Contents
Contributors
About the Editor
Preface
Chapter 1: Speech Processing in Healthcare: Can We Integrate?
References
Chapter 2: End-to-End Acoustic Modeling Using Convolutional Neural Networks
2.1. Introduction
2.2. Related Work
2.3. Various Architecture of ASR
2.3.1. GMM/DNN
2.3.2. Attention Mechanism
2.3.3. Connectionist Temporal Classification
2.4. Convolutional Neural Networks
2.4.1. Type of Pooling
2.4.1.1. Max Pooling
2.4.1.2. Average Pooling
2.4.1.3. Stochastic Pooling
2.4.1.4. Lp Pooling
2.4.1.5. Mixed Pooling
2.4.1.6. Multiscale Orderless Pooling
2.4.1.7. Spectral Pooling
2.4.2. Types of Nonlinear Functions
2.4.2.1. Sigmoid Neurons
2.4.2.2. Maxout Neurons
2.4.2.3. Rectified Linear Units
2.4.2.4. Parameterized Rectified Linear Units
2.4.2.5. Dropout
2.5. CNN-Based End-to-End Approach
2.6. Experiments and Their Results
2.7. Conclusion
References
Chapter 3: A Real-Time DSP-Based System for Voice Activity Detection and Background Noise Reduction
3.1. Introduction
3.2. Microchip dsPIC33 Digital Signal Controller
3.2.1. VAD and Noise Suppression Algorithm
3.3. High Pass Filter
3.4. Fast Fourier Transform
3.5. Channel Energy Computation
3.6. Channel SNR Computation
3.7. VAD Decision
3.8. VAD Hangover
3.9. Computation of Scaling Factor
3.10. Scaling of Frequency Channels
3.11. Inverse Fourier Transform
3.12. Application Programming Interface
3.13. Resource Requirements
3.14. Microchip PIC Programmer
3.15. Audio Components
3.16. VAD and Background Noise Reduction Techniques
3.17. Results and Discussion
3.18. Conclusion and Discussion
References
Further Reading
Chapter 4: Disambiguating Conflicting Classification Results in AVSR
4.1. Introduction
4.2. Detection of Conflicting Classes
4.3. Complementary Models for Classification
4.4. Proposed Cascade of Classifiers
4.5. Audio-Visual Databases
4.5.1. AV-CMU Database
4.5.2. AV-UNR Database
4.5.3. AVLetters Database
4.6. Experimental Results
4.6.1. Hidden Markov Models
4.6.2. Random Forest
4.6.3. Support Vector Machine
4.6.4. AdaBoost
4.6.5. Analysis and Comparison
4.7. Conclusions
References
Chapter 5: A Deep Dive Into Deep Learning Techniques for Solving Spoken Language Identification Problems
5.1. Introduction
5.2. Spoken Language Identification
5.3. Cues for Spoken Language Identification
5.4. Stages in Spoken Language Identification
5.5. Deep Learning
5.6. Artificial and Deep Neural Network
5.7. Comparison of Spoken LID System Implementations with Deep Learning Techniques
5.8. Discussion
5.9. Conclusion
References
Chapter 6: Voice Activity Detection-Based Home Automation System for People With Special Needs
6.1. Introduction
6.2. Conceptual Design of the System
6.3. System Implementation
6.3.1. Speech Recognition
6.3.2. System Automation
6.3.3. Other Applications
6.3.4. Results and Discussion
6.4. Significance/Contribution
6.5. Conclusion
References
Further Reading
Chapter 7: Speech Summarization for Tamil Language
7.1. Introduction
7.2. Extractive Summarization
7.2.1. Supervised Summarization Methods
7.2.2. Unsupervised Summarization Methods
7.3. Abstractive Summarization
7.3.1. Structured Approach
7.3.1.1. Tree-Based Technique
7.3.1.2. Template-Based Technique
7.3.1.3. Ontology-Based Technique
7.3.1.4. Lead and Body Phrase Technique
7.3.1.5. Rule-Based Technique
7.3.2. Semantic-Based Approach
7.3.2.1. Multimodal Semantic Technique
7.3.2.2. Information Item-Based Technique
7.3.2.3. Semantic Graph-Based Technique
7.4. Need for Speech Summarization
7.5. Issues in the Summarization of a Spoken Document
7.6. Tamil Language
7.6.1. Tamil Unicode
7.7. System Design for Summarization of Speech Data in Tamil Language
7.7.1. Speech Recognition Techniques
7.7.2. Isolated Tamil Speech Recognition
7.7.2.1. Related Work on Tamil Speech Recognition
7.7.3. Features Used for Summarization
7.7.3.1. Acoustic/Prosodic Feature
7.7.3.2. Lexical Feature
7.7.3.3. Part-of-Speech Tagging
7.7.3.4. Stop Word Removal
7.7.3.5. Structural Feature
7.7.3.6. Discourse Feature
7.7.3.7. Relevance Feature
7.7.4. Related Work on Tamil Text Summarization
7.8. Evaluation Metrics
7.8.1. Rouge
7.8.1.1. ROUGE-n
7.8.1.2. ROUGE-L
7.8.1.3. ROUGE-SU
7.8.2. Precision, Recall, and F-Measure
7.8.3. Word Error Rate
7.8.4. Word Recognition Rate
7.9. Speech Corpora for Tamil Language
7.10. Conclusion
References
Further Reading
Chapter 8: Classifying Recurrent Dynamics on Emotional Speech Signals
8.1. Introduction
8.2. Data Collection and Processing
8.3. Research Methodology
8.3.1. Phase Space Reconstruction
8.3.2. Recurrence Plot Analysis
8.4. Numerical Experiments and Results
8.4.1. Noise-Free Environment
8.4.2. Noisy Environment
8.5. Conclusion
References
Chapter 9: Intelligent Speech Processing in the Time-Frequency Domain
9.1. Wavelet Packet Decomposition
9.1.1. Spectral Analysis
9.1.2. Pitch Detection
9.2. Empirical Mode Decomposition
9.2.1. EMD of Synthetic and Speech Signal
9.2.2. Spectrum of IMFs
9.2.3. Estimation of Pitch
9.3. Variational Mode Decomposition
9.3.1. Voiced and Unvoiced Detection of Speech
9.3.2. Estimation of Pitch Period
9.4. Synchrosqueezing Wavelet Transform: EMD Like a Tool
9.4.1. Reconstructions of Speech Signal Using Synchrosqueezed Transform
9.4.2. Advantage of Synchrosqueezed Wavelet Transforms
9.5. Applications of the Decomposition Technique
9.5.1. Feature Extraction
9.5.2. Clinical Diagnosis and Pathological Speech Processing
9.5.3. Automatic Speech Recognition
9.5.4. Other Application Area
9.6. Conclusion
References
Further Reading
Chapter 10: A Framework for Artificially Intelligent Customized Voice Response System Design using Speech Synthesis Marku ...
10.1. Introduction
10.2. Literature Survey
10.3. AWS IoT
10.4. Amazon Voice Service (AVS)
10.5. AWS Lambda
10.6. Message Queuing Telemetry Transport (MQTT)
10.7. Proposed Architecture
10.8. Conclusion
References
Index
Back Cover
