Mastering Transformers: The Journey from BERT to Large Language Models and Stable Diffusion

Explore transformer-based language models from BERT to GPT, delving into NLP and computer vision tasks, while tackling challenges effectively.

Key Features:
  • Understand the complexity of deep learning architecture and transformers architecture
  • Create solutions to industrial natural language processing...

Full description

Bibliographic Details
Other Authors: Yıldırım, Savaş (author); Asgari-Chenaghlu, Meysam (author)
Format: Electronic book
Language: English
Published: Birmingham, England: Packt Publishing, [2024]
Edition: Second edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009825849206719
Table of Contents:
  • Cover
  • Title page
  • Copyright and Credits
  • Contributors
  • Table of Contents
  • Preface
  • Part 1: Recent Developments in the Field, Installations, and Hello World Applications
  • Chapter 1: From Bag-of-Words to the Transformers
  • Evolution of NLP approaches
  • Recalling traditional NLP approaches
  • Language modeling and generation
  • Leveraging DL
  • Considering the word order with RNN models
  • LSTMs and gated recurrent units
  • Contextual word embeddings and TL
  • Overview of the Transformer architecture
  • Attention mechanism
  • Multi-head attention mechanisms
  • Using TL with Transformers
  • Multimodal learning
  • Summary
  • References
  • Chapter 2: A Hands-On Introduction to the Subject
  • Technical requirements
  • Installing Transformers with Anaconda
  • Installation on Linux
  • Installation on Windows
  • Installation on macOS
  • Installing TensorFlow, PyTorch, and Transformers
  • Installing and using Google Colab
  • Working with language models and tokenizers
  • Working with community-provided models
  • Working with multimodal transformers
  • Working with benchmarks and datasets
  • Important benchmarks
  • GLUE benchmark
  • SuperGLUE benchmark
  • XTREME benchmark
  • XGLUE benchmark
  • SQuAD benchmark
  • Accessing the datasets with an application programming interface
  • Data manipulation using the datasets library
  • Sorting, indexing, and shuffling
  • Caching and reusability
  • Dataset filter and map function
  • Processing data with the map function
  • Working with local files
  • Preparing a dataset for model training
  • Benchmarking for speed and memory
  • Summary
  • Part 2: Transformer Models: From Autoencoders to Autoregressive Models
  • Chapter 3: Autoencoding Language Models
  • Technical requirements
  • BERT - one of the autoencoding language models
  • BERT language model pretraining tasks
  • A deeper look into the BERT language model
  • Autoencoding language model training for any language
  • Sharing models with the community
  • Other autoencoding models
  • Introducing ALBERT
  • RoBERTa
  • ELECTRA
  • DeBERTa
  • Working with tokenization algorithms
  • BPE
  • WordPiece tokenization
  • SentencePiece tokenization
  • The tokenizers library
  • Summary
  • Chapter 4: From Generative Models to Large Language Models
  • Technical requirements
  • An introduction to GLMs
  • Working with GLMs
  • GPT model family
  • Transformer-XL
  • XLNet
  • Working with text-to-text models
  • Multi-task learning with T5
  • Zero-shot text generalization with T0
  • Another denoising-based Seq2Seq model - BART
  • GLM training
  • NLG using AR models
  • Summary
  • References
  • Chapter 5: Fine-Tuning Language Models for Text Classification
  • Technical requirements
  • Introduction to text classification
  • Fine-tuning a BERT model for single-sentence binary classification
  • Training a classification model with native PyTorch
  • Fine-tuning BERT for multi-class classification with custom datasets
  • Fine-tuning the BERT model for sentence-pair regression
  • Multilabel text classification
  • Utilizing run_glue.py to fine-tune the models
  • Summary
  • References
  • Chapter 6: Fine-Tuning Language Models for Token Classification
  • Technical requirements
  • Introduction to token classification
  • Understanding NER
  • Understanding POS tagging
  • Understanding QA
  • Fine-tuning language models for NER
  • Question answering using token classification
  • Question answering for many tasks
  • Summary
  • Chapter 7: Text Representation
  • Technical requirements
  • Introduction to sentence embeddings
  • Cross-encoder versus bi-encoder
  • Benchmarking sentence similarity models
  • Using BART for zero-shot learning
  • Semantic similarity experiment with FLAIR
  • Average word embeddings
  • RNN-based document embeddings
  • Transformer-based BERT embeddings
  • SBERT embeddings
  • Text clustering with Sentence-BERT
  • Topic modeling with BERTopic
  • Semantic search with SBERT
  • Instruction fine-tuned embedding models
  • Summary
  • Further reading
  • Chapter 8: Boosting Model Performance
  • Technical requirements
  • Improving performance with data augmentation
  • Character-level augmentation
  • Word-level augmentation
  • Sentence-level augmentation
  • Boosting IMDB text classification with augmentation
  • Adapting the model to the domain
  • Optimizing the parameters with HPO
  • Summary
  • Chapter 9: Parameter Efficient Fine-Tuning
  • Technical requirements
  • Introduction to PEFT
  • Understanding types of PEFT
  • Additive methods
  • Selective methods
  • Low-rank fine-tuning
  • Hands-on PEFT experiments
  • Fine-tuning a BERT checkpoint with adapter tuning
  • Efficiently fine-tuning FLAN-T5 for an NLI task with LoRA
  • Tuning with QLoRA
  • Summary
  • References
  • Part 3: Advanced Topics
  • Chapter 10: Large Language Models
  • Technical requirements
  • Why large language models?
  • Importance of reward function
  • The instruction-following ability of LLMs
  • Fine-tuning large language models
  • Summary
  • Chapter 11: Explainable AI (XAI) in NLP
  • Technical requirements
  • Interpreting attention heads
  • Visualizing attention heads with exBERT
  • Multiscale visualization of attention heads with BertViz
  • Understanding the inner parts of BERT with probing classifiers
  • Explaining the model decision
  • Interpreting Transformers' decisions with LIME
  • Interpreting Transformers' decisions with SHAP
  • Summary
  • Chapter 12: Working with Efficient Transformers
  • Technical requirements
  • Introduction to efficient, light, and fast transformers
  • Implementation for model size reduction
  • Working with DistilBERT for knowledge distillation
  • Pruning transformers
  • Quantization
  • Working with efficient self-attention
  • Sparse attention with fixed patterns
  • Learnable patterns
  • Low-rank factorization, kernel methods, and other approaches
  • Easier quantization using bitsandbytes
  • Summary
  • References
  • Chapter 13: Cross-Lingual and Multilingual Language Modeling
  • Technical requirements
  • Translation language modeling and cross-lingual knowledge sharing
  • XLM and mBERT
  • mBERT
  • XLM
  • Cross-lingual similarity tasks
  • Cross-lingual text similarity
  • Visualizing cross-lingual textual similarity
  • Cross-lingual classification
  • Cross-lingual zero-shot learning
  • Massive multilingual translation
  • Fine-tuning the performance of multilingual models
  • Summary
  • References
  • Chapter 14: Serving Transformer Models
  • Technical requirements
  • FastAPI Transformer model serving
  • Dockerizing APIs
  • Faster Transformer model serving using TFX
  • Load testing using Locust
  • Faster inference using ONNX
  • SageMaker inference
  • Summary
  • Further reading
  • Chapter 15: Model Tracking and Monitoring
  • Technical requirements
  • Tracking model metrics
  • Tracking model training with TensorBoard
  • Tracking model training live with W&B
  • Summary
  • Further reading
  • Part 4: Transformers beyond NLP
  • Chapter 16: Vision Transformers
  • Technical requirements
  • Vision transformers
  • Image classification using transformers
  • Semantic segmentation and object detection using transformers
  • Visual prompt models
  • Summary
  • Chapter 17: Multimodal Generative Transformers
  • Technical requirements
  • Multimodal learning
  • Generative multimodal AI
  • Stable Diffusion for text-to-image generation
  • Stable Diffusion in action
  • Music generation using MusicGen
  • Text-to-speech generation using transformers
  • Summary
  • Chapter 18: Revisiting Transformers Architecture for Time Series
  • Technical requirements
  • Understanding time series concepts
  • Transformers and time series modeling
  • Summary
  • Index
  • Other Books You May Enjoy