Transformers for Natural Language Processing and Computer Vision: Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3

Transformers for Natural Language Processing and Computer Vision, Third Edition, explores Large Language Model (LLM) architectures, applications, and various platforms (Hugging Face, OpenAI, and Google Vertex AI) used for Natural Language Processing (NLP) and Computer Vision (CV). The book guides you...


Bibliographic Details
Other Authors: Rothman, Denis (author)
Format: eBook
Language: English
Published: Birmingham, England: Packt Publishing Ltd, [2024]
Edition: Third edition
Series: Expert Insight
View in the Universitat Ramon Llull Library: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009849137706719
Table of Contents:
  • Cover
  • Copyright
  • Contributors
  • Table of Contents
  • Preface
  • Chapter 1: What Are Transformers?
  • How constant time complexity O(1) changed our lives forever
  • O(1) attention conquers O(n) recurrent methods
  • Attention layer
  • Recurrent layer
  • The magic of the computational time complexity of an attention layer
  • Computational time complexity with a CPU
  • Computational time complexity with a GPU
  • Computational time complexity with a TPU
  • TPU-LLM
  • A brief journey from recurrent to attention
  • A brief history
  • From one token to an AI revolution
  • From one token to everything
  • Foundation Models
  • From general purpose to specific tasks
  • The role of AI professionals
  • The future of AI professionals
  • What resources should we use?
  • Decision-making guidelines
  • The rise of transformer seamless APIs and assistants
  • Choosing ready-to-use API-driven libraries
  • Choosing a cloud platform and transformer model
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 2: Getting Started with the Architecture of the Transformer Model
  • The rise of the Transformer: Attention Is All You Need
  • The encoder stack
  • Input embedding
  • Positional encoding
  • Sublayer 1: Multi-head attention
  • Sublayer 2: Feedforward network
  • The decoder stack
  • Output embedding and position encoding
  • The attention layers
  • The FFN sublayer, the post-LN, and the linear layer
  • Training and performance
  • Hugging Face transformer models
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 3: Emergent vs Downstream Tasks: The Unseen Depths of Transformers
  • The paradigm shift: What is an NLP task?
  • Inside the head of the attention sublayer of a transformer
  • Exploring emergence with ChatGPT
  • Investigating the potential of downstream tasks
  • Evaluating models with metrics
  • Accuracy score
  • F1-score
  • MCC
  • Human evaluation
  • Benchmark tasks and datasets
  • Defining the SuperGLUE benchmark tasks
  • Running downstream tasks
  • The Corpus of Linguistic Acceptability (CoLA)
  • Stanford Sentiment TreeBank (SST-2)
  • Microsoft Research Paraphrase Corpus (MRPC)
  • Winograd schemas
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 4: Advancements in Translations with Google Trax, Google Translate, and Gemini
  • Defining machine translation
  • Human transductions and translations
  • Machine transductions and translations
  • Evaluating machine translations
  • Preprocessing a WMT dataset
  • Preprocessing the raw data
  • Finalizing the preprocessing of the datasets
  • Evaluating machine translations with BLEU
  • Geometric evaluations
  • Applying a smoothing technique
  • Translations with Google Trax
  • Installing Trax
  • Creating the Original Transformer model
  • Initializing the model using pretrained weights
  • Tokenizing a sentence
  • Decoding from the Transformer
  • De-tokenizing and displaying the translation
  • Translation with Google Translate
  • Translation with a Google Translate AJAX API Wrapper
  • Implementing googletrans
  • Translation with Gemini
  • Gemini's potential
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 5: Diving into Fine-Tuning through BERT
  • The architecture of BERT
  • The encoder stack
  • Preparing the pretraining input environment
  • Pretraining and fine-tuning a BERT model
  • Fine-tuning BERT
  • Defining a goal
  • Hardware constraints
  • Installing Hugging Face Transformers
  • Importing the modules
  • Specifying CUDA as the device for torch
  • Loading the CoLA dataset
  • Creating sentences, label lists, and adding BERT tokens
  • Activating the BERT tokenizer
  • Processing the data
  • Creating attention masks
  • Splitting the data into training and validation sets
  • Converting all the data into torch tensors
  • Selecting a batch size and creating an iterator
  • BERT model configuration
  • Loading the Hugging Face BERT uncased base model
  • Optimizer grouped parameters
  • The hyperparameters for the training loop
  • The training loop
  • Training evaluation
  • Predicting and evaluating using the holdout dataset
  • Exploring the prediction process
  • Evaluating using the Matthews correlation coefficient
  • Matthews correlation coefficient evaluation for the whole dataset
  • Building a Python interface to interact with the model
  • Saving the model
  • Creating an interface for the trained model
  • Interacting with the model
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 6: Pretraining a Transformer from Scratch through RoBERTa
  • Training a tokenizer and pretraining a transformer
  • Building KantaiBERT from scratch
  • Step 1: Loading the dataset
  • Step 2: Installing Hugging Face transformers
  • Step 3: Training a tokenizer
  • Step 4: Saving the files to disk
  • Step 5: Loading the trained tokenizer files
  • Step 6: Checking resource constraints: GPU and CUDA
  • Step 7: Defining the configuration of the model
  • Step 8: Reloading the tokenizer in transformers
  • Step 9: Initializing a model from scratch
  • Exploring the parameters
  • Step 10: Building the dataset
  • Step 11: Defining a data collator
  • Step 12: Initializing the trainer
  • Step 13: Pretraining the model
  • Step 14: Saving the final model (+tokenizer + config) to disk
  • Step 15: Language modeling with FillMaskPipeline
  • Pretraining a Generative AI customer support model on X data
  • Step 1: Downloading the dataset
  • Step 2: Installing Hugging Face transformers
  • Step 3: Loading and filtering the data
  • Step 4: Checking resource constraints: GPU and CUDA
  • Step 5: Defining the configuration of the model
  • Step 6: Creating and processing the dataset
  • Step 7: Initializing the trainer
  • Step 8: Pretraining the model
  • Step 9: Saving the model
  • Step 10: User interface to chat with the Generative AI agent
  • Further pretraining
  • Limitations
  • Next steps
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 7: The Generative AI Revolution with ChatGPT
  • GPTs as GPTs
  • Improvement
  • Diffusion
  • New application sectors
  • Self-service assistants
  • Development assistants
  • Pervasiveness
  • The architecture of OpenAI GPT transformer models
  • The rise of billion-parameter transformer models
  • The increasing size of transformer models
  • Context size and maximum path length
  • From fine-tuning to zero-shot models
  • Stacking decoder layers
  • GPT models
  • OpenAI models as assistants
  • ChatGPT provides source code
  • GitHub Copilot code assistant
  • General-purpose prompt examples
  • Getting started with ChatGPT - GPT-4 as an assistant
  • 1. GPT-4 helps to explain how to write source code
  • 2. GPT-4 creates a function to show the YouTube presentation of GPT-4 by Greg Brockman on March 14, 2023
  • 3. GPT-4 creates an application for WikiArt to display images
  • 4. GPT-4 creates an application to display IMDb reviews
  • 5. GPT-4 creates an application to display a newsfeed
  • 6. GPT-4 creates a k-means clustering (KMC) algorithm
  • Getting started with the GPT-4 API
  • Running our first NLP task with GPT-4
  • Step 1: Installing OpenAI and Step 2: Entering the API key
  • Step 3: Running an NLP task with GPT-4
  • Key hyperparameters
  • Running multiple NLP tasks
  • Retrieval Augmented Generation (RAG) with GPT-4
  • Installation
  • Document retrieval
  • Augmented retrieval generation
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 8: Fine-Tuning OpenAI GPT Models
  • Risk management
  • Fine-tuning a GPT model for completion (generative)
  • 1. Preparing the dataset
  • 1.1. Preparing the data in JSON
  • 1.2. Converting the data to JSONL
  • 2. Fine-tuning an original model
  • 3. Running the fine-tuned GPT model
  • 4. Managing fine-tuned jobs and models
  • Before leaving
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 9: Shattering the Black Box with Interpretable Tools
  • Transformer visualization with BertViz
  • Running BertViz
  • Step 1: Installing BertViz and importing the modules
  • Step 2: Load the models and retrieve attention
  • Step 3: Head view
  • Step 4: Processing and displaying attention heads
  • Step 5: Model view
  • Step 6: Displaying the output probabilities of attention heads
  • Streaming the output of the attention heads
  • Visualizing word relationships using attention scores with pandas
  • exBERT
  • Interpreting Hugging Face transformers with SHAP
  • Introducing SHAP
  • Explaining Hugging Face outputs with SHAP
  • Transformer visualization via dictionary learning
  • Transformer factors
  • Introducing LIME
  • The visualization interface
  • Other interpretable AI tools
  • LIT
  • PCA
  • Running LIT
  • OpenAI LLMs explain neurons in transformers
  • Limitations and human control
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 10: Investigating the Role of Tokenizers in Shaping Transformer Models
  • Matching datasets and tokenizers
  • Best practices
  • Step 1: Preprocessing
  • Step 2: Quality control
  • Step 3: Continuous human quality control
  • Word2Vec tokenization
  • Case 0: Words in the dataset and the dictionary
  • Case 1: Words not in the dataset or the dictionary
  • Case 2: Noisy relationships
  • Case 3: Words in a text but not in the dictionary
  • Case 4: Rare words
  • Case 5: Replacing rare words
  • Exploring sentence and WordPiece tokenizers to understand the efficiency of subword tokenizers for transformers