Next-Generation Sequencing Data Analysis

Bibliographic Details
Other Authors: Wang, Xinkun, 1966- (author)
Format: E-book
Language: English
Published: Boca Raton, FL : CRC Press [2024]
Edition: Second edition
Subjects:
View at Universitat Ramon Llull Library: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009810646906719
Table of Contents:
  • Cover
  • Half Title
  • Title Page
  • Copyright Page
  • Table of Contents
  • Preface to the Second Edition
  • Author
  • Part I Introduction to Cellular and Molecular Biology
  • 1 The Cellular System and the Code of Life
  • 1.1 The Cellular Challenge
  • 1.2 How Cells Meet the Challenge
  • 1.3 Molecules in Cells
  • 1.4 Intracellular Structures or Spaces
  • 1.4.1 Nucleus
  • 1.4.2 Cell Membrane
  • 1.4.3 Cytoplasm
  • 1.4.4 Endosome, Lysosome, and Peroxisome
  • 1.4.5 Ribosome
  • 1.4.6 Endoplasmic Reticulum
  • 1.4.7 Golgi Apparatus
  • 1.4.8 Cytoskeleton
  • 1.4.9 Mitochondrion
  • 1.4.10 Chloroplast
  • 1.5 The Cell as a System
  • 1.5.1 The Cellular System
  • 1.5.2 Systems Biology of the Cell
  • 1.5.3 How to Study the Cellular System
  • References
  • 2 DNA Sequence: The Genome Base
  • 2.1 The DNA Double Helix and Base Sequence
  • 2.2 How DNA Molecules Replicate and Maintain Fidelity
  • 2.3 How the Genetic Information Stored in DNA Is Transferred to Protein
  • 2.4 The Genomic Landscape
  • 2.4.1 The Minimal Genome
  • 2.4.2 Genome Sizes
  • 2.4.3 Protein-Coding Regions of the Genome
  • 2.4.4 Non-Coding Genomic Elements
  • 2.5 DNA Packaging, Sequence Access, and DNA-Protein Interactions
  • 2.5.1 DNA Packaging
  • 2.5.2 Sequence Access
  • 2.5.3 DNA-Protein Interactions
  • 2.6 DNA Sequence Mutation and Polymorphism
  • 2.7 Genome Evolution
  • 2.8 Epigenome and DNA Methylation
  • 2.9 Genome Sequencing and Disease Risk
  • 2.9.1 Mendelian (Single-Gene) Diseases
  • 2.9.2 Complex Diseases That Involve Multiple Genes
  • 2.9.3 Diseases Caused by Genome Instability
  • 2.9.4 Epigenomic/Epigenetic Diseases
  • References
  • 3 RNA: The Transcribed Sequence
  • 3.1 RNA as the Messenger
  • 3.2 The Molecular Structure of RNA
  • 3.3 Generation, Processing, and Turnover of RNA as a Messenger
  • 3.3.1 DNA Template
  • 3.3.2 Transcription of Prokaryotic Genes
  • 3.3.3 Pre-mRNA Transcription of Eukaryotic Genes
  • 3.3.4 Maturation of mRNA
  • 3.3.5 Transport and Localization
  • 3.3.6 Stability and Decay
  • 3.3.7 Major Steps of mRNA Transcript Level Regulation
  • 3.4 RNA Is More Than a Messenger
  • 3.4.1 Ribozyme
  • 3.4.2 snRNA and snoRNA
  • 3.4.3 RNA for Telomere Replication
  • 3.4.4 RNAi and Small Non-Coding RNAs
  • 3.4.4.1 miRNA
  • 3.4.4.2 siRNA
  • 3.4.4.3 piRNA
  • 3.4.5 Long Non-Coding RNAs
  • 3.4.6 Other Non-Coding RNAs
  • 3.5 The Cellular Transcriptional Landscape
  • References
  • Part II Introduction to Next-Generation Sequencing (NGS) and NGS Data Analysis
  • 4 Next-Generation Sequencing (NGS) Technologies: Ins and Outs
  • 4.1 How to Sequence DNA: From First Generation to the Next
  • 4.2 Ins and Outs of Different NGS Platforms
  • 4.2.1 Illumina Reversible Terminator Short-Read Sequencing
  • 4.2.1.1 Sequencing Principle
  • 4.2.1.2 Implementation
  • 4.2.1.3 Error Rate, Read Length, Data Output, and Cost
  • 4.2.1.4 Sequence Data Generation
  • 4.2.2 Pacific Biosciences Single-Molecule Real-Time (SMRT) Long-Read Sequencing
  • 4.2.2.1 Sequencing Principle
  • 4.2.2.2 Implementation
  • 4.2.2.3 Error Rate, Read Length, Data Output, and Cost
  • 4.2.2.4 Sequence Data Generation
  • 4.2.3 Oxford Nanopore Technologies (ONT) Long-Read Sequencing
  • 4.2.3.1 Sequencing Principle
  • 4.2.3.2 Implementation
  • 4.2.3.3 Error Rate, Read Length, Data Output, and Cost
  • 4.2.3.4 Sequence Data Generation
  • 4.2.4 Ion Torrent Semiconductor Sequencing
  • 4.2.4.1 Sequencing Principle
  • 4.2.4.2 Implementation
  • 4.2.4.3 Error Rate, Read Length, Data Output, and Cost
  • 4.2.4.4 Sequence Data Generation
  • 4.3 A Typical NGS Workflow
  • 4.4 Biases and Other Adverse Factors That May Affect NGS Data Accuracy
  • 4.4.1 Biases in Library Construction
  • 4.4.2 Biases and Other Factors in Sequencing
  • 4.5 Major Applications of NGS
  • 4.5.1 Transcriptomic Profiling (Bulk and Single-Cell RNA-Seq)
  • 4.5.2 Genetic Mutation and Variation Identification
  • 4.5.3 De Novo Genome Assembly
  • 4.5.4 Protein-DNA Interaction Analysis (ChIP-Seq)
  • 4.5.5 Epigenomics and DNA Methylation Study (Methyl-Seq)
  • 4.5.6 Metagenomics
  • References
  • 5 Early-Stage Next-Generation Sequencing (NGS) Data Analysis: Common Steps
  • 5.1 Basecalling, FASTQ File Format, and Base Quality Score
  • 5.2 NGS Data Quality Control and Preprocessing
  • 5.3 Read Mapping
  • 5.3.1 Mapping Approaches and Algorithms
  • 5.3.2 Selection of Mapping Algorithms and Reference Genome Sequences
  • 5.3.3 SAM/BAM as the Standard Mapping File Format
  • 5.3.4 Mapping File Examination and Operation
  • 5.4 Tertiary Analysis
  • References
  • 6 Computing Needs for Next-Generation Sequencing (NGS) Data Management and Analysis
  • 6.1 NGS Data Storage, Transfer, and Sharing
  • 6.2 Computing Power Required for NGS Data Analysis
  • 6.3 Cloud Computing
  • 6.4 Software Needs for NGS Data Analysis
  • 6.4.1 Parallel Computing
  • 6.5 Bioinformatics Skills Required for NGS Data Analysis
  • References
  • Part III Application-Specific NGS Data Analysis
  • 7 Transcriptomics by Bulk RNA-Seq
  • 7.1 Principle of RNA-Seq
  • 7.2 Experimental Design
  • 7.2.1 Factorial Design
  • 7.2.2 Replication and Randomization
  • 7.2.3 Sample Preparation and Sequencing Library Preparation
  • 7.2.4 Sequencing Strategy
  • 7.3 RNA-Seq Data Analysis
  • 7.3.1 Read Mapping
  • 7.3.2 Quantification of Reads
  • 7.3.3 Normalization
  • 7.3.4 Batch Effect Removal
  • 7.3.5 Identification of Differentially Expressed Genes
  • 7.3.6 Multiple Testing Correction
  • 7.3.7 Gene Clustering
  • 7.3.8 Functional Analysis of Identified Genes
  • 7.3.9 Differential Splicing Analysis
  • 7.4 Visualization of RNA-Seq Data
  • 7.5 RNA-Seq as a Discovery Tool
  • References
  • 8 Transcriptomics by Single-Cell RNA-Seq
  • 8.1 Experimental Design
  • 8.1.1 Single-Cell RNA-Seq General Approaches
  • 8.1.2 Cell Number and Sequencing Depth
  • 8.1.3 Batch Effects Minimization and Sample Replication
  • 8.2 Single-Cell Preparation, Library Construction, and Sequencing
  • 8.2.1 Single-Cell Preparation
  • 8.2.2 Single Nuclei Preparation
  • 8.2.3 Library Construction and Sequencing
  • 8.3 Preprocessing of scRNA-Seq Data
  • 8.3.1 Initial Data Preprocessing and Quality Control
  • 8.3.2 Alignment and Transcript Counting
  • 8.3.3 Data Cleanup Post Alignment
  • 8.3.4 Normalization
  • 8.3.5 Batch Effects Correction
  • 8.3.6 Signal Imputation
  • 8.4 Feature Selection, Dimension Reduction, and Visualization
  • 8.4.1 Feature Selection
  • 8.4.2 Dimension Reduction
  • 8.4.3 Visualization
  • 8.5 Cell Clustering, Cell Identity Annotation, and Compositional Analysis
  • 8.5.1 Cell Clustering
  • 8.5.2 Cell Identity Annotation
  • 8.5.3 Compositional Analysis
  • 8.6 Differential Expression Analysis
  • 8.7 Trajectory Inference
  • 8.8 Advanced Analyses
  • 8.8.1 SNV/CNV Detection and Allele-Specific Expression Analysis
  • 8.8.2 Alternative Splicing Analysis
  • 8.8.3 Gene Regulatory Network Inference
  • References
  • 9 Small RNA Sequencing
  • 9.1 Small RNA NGS Data Generation and Upstream Processing
  • 9.1.1 Data Generation
  • 9.1.2 Preprocessing
  • 9.1.3 Mapping
  • 9.1.4 Identification of Known and Putative Small RNA Species
  • 9.1.5 Normalization
  • 9.2 Identification of Differentially Expressed Small RNAs
  • 9.3 Functional Analysis of Identified Known Small RNAs
  • References
  • 10 Genotyping and Variation Discovery by Whole Genome/Exome Sequencing
  • 10.1 Data Preprocessing, Mapping, Realignment, and Recalibration
  • 10.2 Single Nucleotide Variant (SNV) and Short Indel Calling
  • 10.2.1 Germline SNV and Indel Calling
  • 10.2.2 Somatic Mutation Detection
  • 10.2.3 Variant Calling From RNA Sequencing Data
  • 10.2.4 Variant Call Format (VCF)
  • 10.2.5 Evaluating VCF Results
  • 10.3 Structural Variant (SV) Calling
  • 10.3.1 Short-Read-Based SV Calling
  • 10.3.2 Long-Read-Based SV Calling
  • 10.3.3 CNV Detection
  • 10.3.4 Integrated SV Analysis
  • 10.4 Annotation of Called Variants
  • References
  • 11 Clinical Sequencing and Detection of Actionable Variants
  • 11.1 Clinical Sequencing Data Generation
  • 11.1.1 Patient Sample Collection
  • 11.1.2 Library Preparation and Sequencing Approaches
  • 11.2 Read Mapping and Variant Calling
  • 11.3 Variant Filtering
  • 11.3.1 Frequency of Occurrence
  • 11.3.2 Functional Consequence
  • 11.3.3 Existing Evidence of Relationship to Human Disease
  • 11.3.4 Clinical Phenotype Match
  • 11.3.5 Mode of Inheritance
  • 11.4 Variant Ranking and Prioritization
  • 11.5 Classification of Variants Based on Pathogenicity
  • 11.5.1 Classification of Germline Variants
  • 11.5.2 Classification of Somatic Variants
  • 11.6 Clinical Review and Reporting
  • 11.6.1 Use of Artificial Intelligence in Variant Reporting
  • 11.6.2 Expert Review
  • 11.6.3 Generation of Testing Report
  • 11.6.4 Variant Validation
  • 11.6.5 Incorporation Into a Patient's Electronic Health Record
  • 11.6.6 Reporting of Secondary Findings
  • 11.6.7 Patient Counseling and Periodic Report Updates
  • 11.7 Bioinformatics Pipeline Validation
  • References
  • 12 De Novo Genome Assembly With Long and/or Short Reads
  • 12.1 Genomic Factors and Sequencing Strategies for De Novo Assembly
  • 12.1.1 Genomic Factors That Affect De Novo Assembly
  • 12.1.2 Sequencing Strategies for De Novo Assembly
  • 12.2 Assembly of Contigs
  • 12.2.1 Sequence Data Preprocessing, Error Correction, and Assessment of Genome Characteristics
  • 12.2.2 Contig Assembly Algorithms
  • 12.2.3 Polishing
  • 12.3 Scaffolding and Gap Closure
  • 12.4 Assembly Quality Evaluation