Mastering Transformers: The Journey from BERT to Large Language Models and Stable Diffusion

Explore transformer-based language models from BERT to GPT, delving into NLP and computer vision tasks, while tackling challenges effectively.

Key Features:
  • Understand the complexity of deep learning architecture and transformers architecture
  • Create solutions to industrial natural language processing...

Full description

Bibliographic Details
Other Authors: Yıldırım, Savaş (author); Asgari-Chenaghlu, Meysam (author)
Format: Electronic book
Language: English
Published: Birmingham, England: Packt Publishing, [2024]
Edition: Second edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009825849206719
Table of Contents:
  • Cover
  • Title page
  • Copyright and Credits
  • Contributors
  • Table of Contents
  • Preface
  • Part 1: Recent Developments in the Field, Installations, and Hello World Applications
  • Chapter 1: From Bag-of-Words to the Transformers
  • Evolution of NLP approaches
  • Recalling traditional NLP approaches
  • Language modeling and generation
  • Leveraging DL
  • Considering the word order with RNN models
  • LSTMs and gated recurrent units
  • Contextual word embeddings and TL
  • Overview of the Transformer architecture
  • Attention mechanism
  • Multi-head attention mechanisms
  • Using TL with Transformers
  • Multimodal learning
  • Summary
  • References
  • Chapter 2: A Hands-On Introduction to the Subject
  • Technical requirements
  • Installing Transformers with Anaconda
  • Installation on Linux
  • Installation on Windows
  • Installation on macOS
  • Installing TensorFlow, PyTorch, and Transformers
  • Installing and using Google Colab
  • Working with language models and tokenizers
  • Working with community-provided models
  • Working with multimodal transformers
  • Working with benchmarks and datasets
  • Important benchmarks
  • GLUE benchmark
  • SuperGLUE benchmark
  • XTREME benchmark
  • XGLUE benchmark
  • SQuAD benchmark
  • Accessing the datasets with an application programming interface
  • Data manipulation using the datasets library
  • Sorting, indexing, and shuffling
  • Caching and reusability
  • Dataset filter and map function
  • Processing data with the map function
  • Working with local files
  • Preparing a dataset for model training
  • Benchmarking for speed and memory
  • Summary
  • Part 2: Transformer Models: From Autoencoders to Autoregressive Models
  • Chapter 3: Autoencoding Language Models
  • Technical requirements
  • BERT - one of the autoencoding language models
  • BERT language model pretraining tasks
  • A deeper look into the BERT language model
  • Autoencoding language model training for any language
  • Sharing models with the community
  • Other autoencoding models
  • Introducing ALBERT
  • RoBERTa
  • ELECTRA
  • DeBERTa
  • Working with tokenization algorithms
  • BPE
  • WordPiece tokenization
  • SentencePiece tokenization
  • The tokenizers library
  • Summary
  • Chapter 4: From Generative Models to Large Language Models
  • Technical requirements
  • An introduction to GLMs
  • Working with GLMs
  • GPT model family
  • Transformer-XL
  • XLNet
  • Working with text-to-text models
  • Multi-task learning with T5
  • Zero-shot text generalization with T0
  • Another denoising-based Seq2Seq model - BART
  • GLM training
  • NLG using AR models
  • Summary
  • References
  • Chapter 5: Fine-Tuning Language Models for Text Classification
  • Technical requirements
  • Introduction to text classification
  • Fine-tuning a BERT model for single-sentence binary classification
  • Training a classification model with native PyTorch
  • Fine-tuning BERT for multi-class classification with custom datasets
  • Fine-tuning the BERT model for sentence-pair regression
  • Multilabel text classification
  • Utilizing run_glue.py to fine-tune the models
  • Summary
  • References
  • Chapter 6: Fine-Tuning Language Models for Token Classification
  • Technical requirements
  • Introduction to token classification
  • Understanding NER
  • Understanding POS tagging
  • Understanding QA
  • Fine-tuning language models for NER
  • Question answering using token classification
  • Question answering for many tasks
  • Summary
  • Chapter 7: Text Representation
  • Technical requirements
  • Introduction to sentence embeddings
  • Cross-encoder versus bi-encoder
  • Benchmarking sentence similarity models
  • Using BART for zero-shot learning
  • Semantic similarity experiment with FLAIR
  • Average word embeddings
  • RNN-based document embeddings
  • Transformer-based BERT embeddings
  • SBERT embeddings
  • Text clustering with Sentence-BERT
  • Topic modeling with BERTopic
  • Semantic search with SBERT
  • Instruction fine-tuned embedding models
  • Summary
  • Further reading
  • Chapter 8: Boosting Model Performance
  • Technical requirements
  • Improving performance with data augmentation
  • Character-level augmentation
  • Word-level augmentation
  • Sentence-level augmentation
  • Boosting IMDB text classification with augmentation
  • Adapting the model to the domain
  • Optimizing the parameters with HPO
  • Summary
  • Chapter 9: Parameter Efficient Fine-Tuning
  • Technical requirements
  • Introduction to PEFT
  • Understanding types of PEFT
  • Additive methods
  • Selective methods
  • Low-rank fine-tuning
  • Hands-on PEFT experiments
  • Fine-tuning a BERT checkpoint with adapter tuning
  • Efficiently fine-tuning FLAN-T5 for an NLI task with LoRA
  • Tuning with QLoRA
  • Summary
  • References
  • Part 3: Advanced Topics
  • Chapter 10: Large Language Models
  • Technical requirements
  • Why large language models?
  • Importance of reward function
  • The instruction-following ability of LLMs
  • Fine-tuning large language models
  • Summary
  • Chapter 11: Explainable AI (XAI) in NLP
  • Technical requirements
  • Interpreting attention heads
  • Visualizing attention heads with exBERT
  • Multiscale visualization of attention heads with BertViz
  • Understanding the inner parts of BERT with probing classifiers
  • Explaining the model decision
  • Interpreting Transformers' decisions with LIME
  • Interpreting Transformers' decisions with SHAP
  • Summary
  • Chapter 12: Working with Efficient Transformers
  • Technical requirements
  • Introduction to efficient, light, and fast transformers
  • Implementation for model size reduction
  • Working with DistilBERT for knowledge distillation
  • Pruning transformers
  • Quantization
  • Working with efficient self-attention
  • Sparse attention with fixed patterns
  • Learnable patterns
  • Low-rank factorization, kernel methods, and other approaches
  • Easier quantization using bitsandbytes
  • Summary
  • References
  • Chapter 13: Cross-Lingual and Multilingual Language Modeling
  • Technical requirements
  • Translation language modeling and cross-lingual knowledge sharing
  • XLM and mBERT
  • mBERT
  • XLM
  • Cross-lingual similarity tasks
  • Cross-lingual text similarity
  • Visualizing cross-lingual textual similarity
  • Cross-lingual classification
  • Cross-lingual zero-shot learning
  • Massive multilingual translation
  • Fine-tuning the performance of multilingual models
  • Summary
  • References
  • Chapter 14: Serving Transformer Models
  • Technical requirements
  • FastAPI Transformer model serving
  • Dockerizing APIs
  • Faster Transformer model serving using TFX
  • Load testing using Locust
  • Faster inference using ONNX
  • SageMaker inference
  • Summary
  • Further reading
  • Chapter 15: Model Tracking and Monitoring
  • Technical requirements
  • Tracking model metrics
  • Tracking model training with TensorBoard
  • Tracking model training live with W&B
  • Summary
  • Further reading
  • Part 4: Transformers beyond NLP
  • Chapter 16: Vision Transformers
  • Technical requirements
  • Vision transformers
  • Image classification using transformers
  • Semantic segmentation and object detection using transformers
  • Visual prompt models
  • Summary
  • Chapter 17: Multimodal Generative Transformers
  • Technical requirements
  • Multimodal learning
  • Generative multimodal AI
  • Stable Diffusion for text-to-image generation
  • Stable Diffusion in action
  • Music generation using MusicGen
  • Text-to-speech generation using transformers
  • Summary
  • Chapter 18: Revisiting Transformers Architecture for Time Series
  • Technical requirements
  • Understanding time series concepts
  • Transformers and time series modeling
  • Summary
  • Index
  • Other Books You May Enjoy