Transformers for Natural Language Processing: Build, train, and fine-tune deep neural network architectures for NLP with Python, PyTorch, TensorFlow, BERT, and GPT-3
Transformers are a game-changer for natural language understanding (NLU) and have become one of the pillars of artificial intelligence. Transformers for Natural Language Processing, 2nd Edition, investigates deep learning for machine translation, speech-to-text, text-to-speech, language modeling, question answering, and more.
Other Authors:
Format: E-book
Language: English
Published: Birmingham : Packt Publishing, Limited, [2022]
Edition: 2nd ed.
Series: Expert insight.
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009655514606719
Table of Contents:
- Intro
- Copyright
- Foreword
- Contributors
- Table of Contents
- Preface
- Chapter 1: What are Transformers?
- The ecosystem of transformers
- Industry 4.0
- Foundation models
- Is programming becoming a sub-domain of NLP?
- The future of artificial intelligence specialists
- Optimizing NLP models with transformers
- The background of transformers
- What resources should we use?
- The rise of Transformer 4.0 seamless APIs
- Choosing ready-to-use API-driven libraries
- Choosing a Transformer Model
- The role of Industry 4.0 artificial intelligence specialists
- Summary
- Questions
- References
- Chapter 2: Getting Started with the Architecture of the Transformer Model
- The rise of the Transformer: Attention is All You Need
- The encoder stack
- Input embedding
- Positional encoding
- Sublayer 1: Multi-head attention
- Sublayer 2: Feedforward network
- The decoder stack
- Output embedding and position encoding
- The attention layers
- The FFN sublayer, the post-LN, and the linear layer
- Training and performance
- Transformer models in Hugging Face
- Summary
- Questions
- References
- Chapter 3: Fine-Tuning BERT Models
- The architecture of BERT
- The encoder stack
- Preparing the pretraining input environment
- Pretraining and fine-tuning a BERT model
- Fine-tuning BERT
- Hardware constraints
- Installing the Hugging Face PyTorch interface for BERT
- Importing the modules
- Specifying CUDA as the device for torch
- Loading the dataset
- Creating sentences, label lists, and adding BERT tokens
- Activating the BERT tokenizer
- Processing the data
- Creating attention masks
- Splitting the data into training and validation sets
- Converting all the data into torch tensors
- Selecting a batch size and creating an iterator
- BERT model configuration
- Loading the Hugging Face BERT uncased base model
- Optimizer grouped parameters
- The hyperparameters for the training loop
- The training loop
- Training evaluation
- Predicting and evaluating using the holdout dataset
- Evaluating using the Matthews Correlation Coefficient
- The scores of individual batches
- Matthews evaluation for the whole dataset
- Summary
- Questions
- References
- Chapter 4: Pretraining a RoBERTa Model from Scratch
- Training a tokenizer and pretraining a transformer
- Building KantaiBERT from scratch
- Step 1: Loading the dataset
- Step 2: Installing Hugging Face transformers
- Step 3: Training a tokenizer
- Step 4: Saving the files to disk
- Step 5: Loading the trained tokenizer files
- Step 6: Checking resource constraints: GPU and CUDA
- Step 7: Defining the configuration of the model
- Step 8: Reloading the tokenizer in transformers
- Step 9: Initializing a model from scratch
- Exploring the parameters
- Step 10: Building the dataset
- Step 11: Defining a data collator
- Step 12: Initializing the trainer
- Step 13: Pretraining the model
- Step 14: Saving the final model (+tokenizer + config) to disk
- Step 15: Language modeling with FillMaskPipeline
- Next steps
- Summary
- Questions
- References
- Chapter 5: Downstream NLP Tasks with Transformers
- Transduction and the inductive inheritance of transformers
- The human intelligence stack
- The machine intelligence stack
- Transformer performances versus Human Baselines
- Evaluating models with metrics
- Accuracy score
- F1-score
- Matthews Correlation Coefficient (MCC)
- Benchmark tasks and datasets
- From GLUE to SuperGLUE
- Introducing higher Human Baselines standards
- The SuperGLUE evaluation process
- Defining the SuperGLUE benchmark tasks
- BoolQ
- Commitment Bank (CB)
- Multi-Sentence Reading Comprehension (MultiRC)
- Reading Comprehension with Commonsense Reasoning Dataset (ReCoRD)
- Recognizing Textual Entailment (RTE)
- Words in Context (WiC)
- The Winograd schema challenge (WSC)
- Running downstream tasks
- The Corpus of Linguistic Acceptability (CoLA)
- Stanford Sentiment TreeBank (SST-2)
- Microsoft Research Paraphrase Corpus (MRPC)
- Winograd schemas
- Summary
- Questions
- References
- Chapter 6: Machine Translation with the Transformer
- Defining machine translation
- Human transductions and translations
- Machine transductions and translations
- Preprocessing a WMT dataset
- Preprocessing the raw data
- Finalizing the preprocessing of the datasets
- Evaluating machine translation with BLEU
- Geometric evaluations
- Applying a smoothing technique
- Chencherry smoothing
- Translation with Google Translate
- Translations with Trax
- Installing Trax
- Creating the original Transformer model
- Initializing the model using pretrained weights
- Tokenizing a sentence
- Decoding from the Transformer
- De-tokenizing and displaying the translation
- Summary
- Questions
- References
- Chapter 7: The Rise of Suprahuman Transformers with GPT-3 Engines
- Suprahuman NLP with GPT-3 transformer models
- The architecture of OpenAI GPT transformer models
- The rise of billion-parameter transformer models
- The increasing size of transformer models
- Context size and maximum path length
- From fine-tuning to zero-shot models
- Stacking decoder layers
- GPT-3 engines
- Generic text completion with GPT-2
- Step 9: Interacting with GPT-2
- Training a custom GPT-2 language model
- Step 12: Interactive context and completion examples
- Running OpenAI GPT-3 tasks
- Running NLP tasks online
- Getting started with GPT-3 engines
- Running our first NLP task with GPT-3
- NLP tasks and examples
- Comparing the output of GPT-2 and GPT-3
- Fine-tuning GPT-3
- Preparing the data
- Step 1: Installing OpenAI
- Step 2: Entering the API key
- Step 3: Activating OpenAI's data preparation module
- Fine-tuning GPT-3
- Step 4: Creating an OS environment
- Step 5: Fine-tuning OpenAI's Ada engine
- Step 6: Interacting with the fine-tuned model
- The role of an Industry 4.0 AI specialist
- Initial conclusions
- Summary
- Questions
- References
- Chapter 8: Applying Transformers to Legal and Financial Documents for AI Text Summarization
- Designing a universal text-to-text model
- The rise of text-to-text transformer models
- A prefix instead of task-specific formats
- The T5 model
- Text summarization with T5
- Hugging Face
- Hugging Face transformer resources
- Initializing the T5-large transformer model
- Getting started with T5
- Exploring the architecture of the T5 model
- Summarizing documents with T5-large
- Creating a summarization function
- A general topic sample
- The Bill of Rights sample
- A corporate law sample
- Summarization with GPT-3
- Summary
- Questions
- References
- Chapter 9: Matching Tokenizers and Datasets
- Matching datasets and tokenizers
- Best practices
- Step 1: Preprocessing
- Step 2: Quality control
- Continuous human quality control
- Word2Vec tokenization
- Case 0: Words in the dataset and the dictionary
- Case 1: Words not in the dataset or the dictionary
- Case 2: Noisy relationships
- Case 3: Words in the text but not in the dictionary
- Case 4: Rare words
- Case 5: Replacing rare words
- Case 6: Entailment
- Standard NLP tasks with specific vocabulary
- Generating unconditional samples with GPT-2
- Generating trained conditional samples
- Controlling tokenized data
- Exploring the scope of GPT-3
- Summary
- Questions
- References
- Chapter 10: Semantic Role Labeling with BERT-Based Transformers
- Getting started with SRL
- Defining semantic role labeling
- Visualizing SRL
- Running a pretrained BERT-based model
- The architecture of the BERT-based model
- Setting up the BERT SRL environment
- SRL experiments with the BERT-based model
- Basic samples
- Sample 1
- Sample 2
- Sample 3
- Difficult samples
- Sample 4
- Sample 5
- Sample 6
- Questioning the scope of SRL
- The limit of predicate analysis
- Redefining SRL
- Summary
- Questions
- References
- Chapter 11: Let Your Data Do the Talking: Story, Questions, and Answers
- Methodology
- Transformers and methods
- Method 0: Trial and error
- Method 1: NER first
- Using NER to find questions
- Location entity questions
- Person entity questions
- Method 2: SRL first
- Question-answering with ELECTRA
- Project management constraints
- Using SRL to find questions
- Next steps
- Exploring Haystack with a RoBERTa model
- Exploring Q&A with a GPT-3 engine
- Summary
- Questions
- References
- Chapter 12: Detecting Customer Emotions to Make Predictions
- Getting started: Sentiment analysis transformers
- The Stanford Sentiment Treebank (SST)
- Sentiment analysis with RoBERTa-large
- Predicting customer behavior with sentiment analysis
- Sentiment analysis with DistilBERT
- Sentiment analysis with Hugging Face's models' list
- DistilBERT for SST
- MiniLM-L12-H384-uncased
- RoBERTa-large-mnli
- BERT-base multilingual model
- Sentiment analysis with GPT-3
- Some Pragmatic I4.0 thinking before we leave
- Investigating with SRL
- Investigating with Hugging Face
- Investigating with the GPT-3 playground
- GPT-3 code
- Summary
- Questions
- References
- Chapter 13: Analyzing Fake News with Transformers
- Emotional reactions to fake news
- Cognitive dissonance triggers emotional reactions