Intelligent speech signal processing
Intelligent Speech Signal Processing investigates the utilization of speech analytics across several systems and real-world activities, including sharing data analytics, creating collaboration networks between several participants, and implementing video-conferencing in different application areas....
Other Authors: | |
---|---|
Format: | E-book |
Language: | English |
Published: | London, England : Academic Press [2019] |
Edition: | First edition |
Subjects: | |
View at Universitat Ramon Llull Library: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630433606719 |
Table of Contents:
- Front Cover
- Intelligent Speech Signal Processing
- Copyright
- Contents
- Contributors
- About the Editor
- Preface
- Chapter 1: Speech Processing in Healthcare: Can We Integrate?
- References
- Chapter 2: End-to-End Acoustic Modeling Using Convolutional Neural Networks
- 2.1. Introduction
- 2.2. Related Work
- 2.3. Various Architecture of ASR
- 2.3.1. GMM/DNN
- 2.3.2. Attention Mechanism
- 2.3.3. Connectionist Temporal Classification
- 2.4. Convolutional Neural Networks
- 2.4.1. Type of Pooling
- 2.4.1.1. Max Pooling
- 2.4.1.2. Average Pooling
- 2.4.1.3. Stochastic Pooling
- 2.4.1.4. Lp Pooling
- 2.4.1.5. Mixed Pooling
- 2.4.1.6. Multiscale Orderless Pooling
- 2.4.1.7. Spectral Pooling
- 2.4.2. Types of Nonlinear Functions
- 2.4.2.1. Sigmoid Neurons
- 2.4.2.2. Maxout Neurons
- 2.4.2.3. Rectified Linear Units
- 2.4.2.4. Parameterized Rectified Linear Units
- 2.4.2.5. Dropout
- 2.5. CNN-Based End-to-End Approach
- 2.6. Experiments and Their Results
- 2.7. Conclusion
- References
- Chapter 3: A Real-Time DSP-Based System for Voice Activity Detection and Background Noise Reduction
- 3.1. Introduction
- 3.2. Microchip dsPIC33 Digital Signal Controller
- 3.2.1. VAD and Noise Suppression Algorithm
- 3.3. High Pass Filter
- 3.4. Fast Fourier Transform
- 3.5. Channel Energy Computation
- 3.6. Channel SNR Computation
- 3.7. VAD Decision
- 3.8. VAD Hangover
- 3.9. Computation of Scaling Factor
- 3.10. Scaling of Frequency Channels
- 3.11. Inverse Fourier Transform
- 3.12. Application Programming Interface
- 3.13. Resource Requirements
- 3.14. Microchip PIC Programmer
- 3.15. Audio Components
- 3.16. VAD and Background Noise Reduction Techniques
- 3.17. Results and Discussion
- 3.18. Conclusion and Discussion
- References
- Further Reading
- Chapter 4: Disambiguating Conflicting Classification Results in AVSR
- 4.1. Introduction
- 4.2. Detection of Conflicting Classes
- 4.3. Complementary Models for Classification
- 4.4. Proposed Cascade of Classifiers
- 4.5. Audio-Visual Databases
- 4.5.1. AV-CMU Database
- 4.5.2. AV-UNR Database
- 4.5.3. AVLetters Database
- 4.6. Experimental Results
- 4.6.1. Hidden Markov Models
- 4.6.2. Random Forest
- 4.6.3. Support Vector Machine
- 4.6.4. AdaBoost
- 4.6.5. Analysis and Comparison
- 4.7. Conclusions
- References
- Chapter 5: A Deep Dive Into Deep Learning Techniques for Solving Spoken Language Identification Problems
- 5.1. Introduction
- 5.2. Spoken Language Identification
- 5.3. Cues for Spoken Language Identification
- 5.4. Stages in Spoken Language Identification
- 5.5. Deep Learning
- 5.6. Artificial and Deep Neural Network
- 5.7. Comparison of Spoken LID System Implementations with Deep Learning Techniques
- 5.8. Discussion
- 5.9. Conclusion
- References
- Chapter 6: Voice Activity Detection-Based Home Automation System for People With Special Needs
- 6.1. Introduction
- 6.2. Conceptual Design of the System
- 6.3. System Implementation
- 6.3.1. Speech Recognition
- 6.3.2. System Automation
- 6.3.3. Other Applications
- 6.3.4. Results and Discussion
- 6.4. Significance/Contribution
- 6.5. Conclusion
- References
- Further Reading
- Chapter 7: Speech Summarization for Tamil Language
- 7.1. Introduction
- 7.2. Extractive Summarization
- 7.2.1. Supervised Summarization Methods
- 7.2.2. Unsupervised Summarization Methods
- 7.3. Abstractive Summarization
- 7.3.1. Structured Approach
- 7.3.1.1. Tree-Based Technique
- 7.3.1.2. Template-Based Technique
- 7.3.1.3. Ontology-Based Technique
- 7.3.1.4. Lead and Body Phrase Technique
- 7.3.1.5. Rule-Based Technique
- 7.3.2. Semantic-Based Approach
- 7.3.2.1. Multimodal Semantic Technique
- 7.3.2.2. Information Item-Based Technique
- 7.3.2.3. Semantic Graph-Based Technique
- 7.4. Need for Speech Summarization
- 7.5. Issues in the Summarization of a Spoken Document
- 7.6. Tamil Language
- 7.6.1. Tamil Unicode
- 7.7. System Design for Summarization of Speech Data in Tamil Language
- 7.7.1. Speech Recognition Techniques
- 7.7.2. Isolated Tamil Speech Recognition
- 7.7.2.1. Related Work on Tamil Speech Recognition
- 7.7.3. Features Used for Summarization
- 7.7.3.1. Acoustic/Prosodic Feature
- 7.7.3.2. Lexical Feature
- 7.7.3.3. Part-of-Speech Tagging
- 7.7.3.4. Stop Word Removal
- 7.7.3.5. Structural Feature
- 7.7.3.6. Discourse Feature
- 7.7.3.7. Relevance Feature
- 7.7.4. Related Work on Tamil Text Summarization
- 7.8. Evaluation Metrics
- 7.8.1. Rouge
- 7.8.1.1. ROUGE-n
- 7.8.1.2. ROUGE-L
- 7.8.1.3. ROUGE-SU
- 7.8.2. Precision, Recall, and F-Measure
- 7.8.3. Word Error Rate
- 7.8.4. Word Recognition Rate
- 7.9. Speech Corpora for Tamil Language
- 7.10. Conclusion
- References
- Further Reading
- Chapter 8: Classifying Recurrent Dynamics on Emotional Speech Signals
- 8.1. Introduction
- 8.2. Data Collection and Processing
- 8.3. Research Methodology
- 8.3.1. Phase Space Reconstruction
- 8.3.2. Recurrence Plot Analysis
- 8.4. Numerical Experiments and Results
- 8.4.1. Noise-Free Environment
- 8.4.2. Noisy Environment
- 8.5. Conclusion
- References
- Chapter 9: Intelligent Speech Processing in the Time-Frequency Domain
- 9.1. Wavelet Packet Decomposition
- 9.1.1. Spectral Analysis
- 9.1.2. Pitch Detection
- 9.2. Empirical Mode Decomposition
- 9.2.1. EMD of Synthetic and Speech Signal
- 9.2.2. Spectrum of IMFs
- 9.2.3. Estimation of Pitch
- 9.3. Variational Mode Decomposition
- 9.3.1. Voiced and Unvoiced Detection of Speech
- 9.3.2. Estimation of Pitch Period
- 9.4. Synchrosqueezing Wavelet Transform: EMD Like a Tool
- 9.4.1. Reconstructions of Speech Signal Using Synchrosqueezed Transform
- 9.4.2. Advantage of Synchrosqueezed Wavelet Transforms
- 9.5. Applications of the Decomposition Technique
- 9.5.1. Feature Extraction
- 9.5.2. Clinical Diagnosis and Pathological Speech Processing
- 9.5.3. Automatic Speech Recognition
- 9.5.4. Other Application Area
- 9.6. Conclusion
- References
- Further Reading
- Chapter 10: A Framework for Artificially Intelligent Customized Voice Response System Design using Speech Synthesis Marku ...
- 10.1. Introduction
- 10.2. Literature Survey
- 10.3. AWS IoT
- 10.4. Amazon Voice Service (AVS)
- 10.5. AWS Lambda
- 10.6. Message Queuing Telemetry Transport (MQTT)
- 10.7. Proposed Architecture
- 10.8. Conclusion
- References
- Index
- Back Cover