Generative AI on AWS: Building Context-Aware Multimodal Reasoning Applications

Companies today are moving rapidly to integrate generative AI into their products and services. But there's a great deal of hype (and misunderstanding) about the impact and promise of this technology. With this book, Chris Fregly, Antje Barth, and Shelbee Eigenbrode from AWS help CTOs, ML practitioners…

Bibliographic Details
Other Authors: Fregly, Chris (author); Barth, Antje (author); Eigenbrode, Shelbee (author)
Format: eBook
Language: English
Published: [Sebastopol, California]: O'Reilly Media, Inc., 2023.
Edition: First edition
Subjects:
View at the Universitat Ramon Llull Library: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009784594106719
Table of Contents:
  • Cover
  • Copyright
  • Table of Contents
  • Preface
  • Conventions Used in This Book
  • Using Code Examples
  • O'Reilly Online Learning
  • How to Contact Us
  • Acknowledgments
  • Chris
  • Antje
  • Shelbee
  • Chapter 1. Generative AI Use Cases, Fundamentals, and Project Life Cycle
  • Use Cases and Tasks
  • Foundation Models and Model Hubs
  • Generative AI Project Life Cycle
  • Generative AI on AWS
  • Why Generative AI on AWS?
  • Building Generative AI Applications on AWS
  • Summary
  • Chapter 2. Prompt Engineering and In-Context Learning
  • Prompts and Completions
  • Tokens
  • Prompt Engineering
  • Prompt Structure
  • Instruction
  • Context
  • In-Context Learning with Few-Shot Inference
  • Zero-Shot Inference
  • One-Shot Inference
  • Few-Shot Inference
  • In-Context Learning Gone Wrong
  • In-Context Learning Best Practices
  • Prompt-Engineering Best Practices
  • Inference Configuration Parameters
  • Summary
  • Chapter 3. Large-Language Foundation Models
  • Large-Language Foundation Models
  • Tokenizers
  • Embedding Vectors
  • Transformer Architecture
  • Inputs and Context Window
  • Embedding Layer
  • Encoder
  • Self-Attention
  • Decoder
  • Softmax Output
  • Types of Transformer-Based Foundation Models
  • Pretraining Datasets
  • Scaling Laws
  • Compute-Optimal Models
  • Summary
  • Chapter 4. Memory and Compute Optimizations
  • Memory Challenges
  • Data Types and Numerical Precision
  • Quantization
  • fp16
  • bfloat16
  • fp8
  • int8
  • Optimizing the Self-Attention Layers
  • FlashAttention
  • Grouped-Query Attention
  • Distributed Computing
  • Distributed Data Parallel
  • Fully Sharded Data Parallel
  • Performance Comparison of FSDP over DDP
  • Distributed Computing on AWS
  • Fully Sharded Data Parallel with Amazon SageMaker
  • AWS Neuron SDK and AWS Trainium
  • Summary
  • Chapter 5. Fine-Tuning and Evaluation
  • Instruction Fine-Tuning
  • Llama 2-Chat
  • Falcon-Chat
  • FLAN-T5
  • Instruction Dataset
  • Multitask Instruction Dataset
  • FLAN: Example Multitask Instruction Dataset
  • Prompt Template
  • Convert a Custom Dataset into an Instruction Dataset
  • Instruction Fine-Tuning
  • Amazon SageMaker Studio
  • Amazon SageMaker JumpStart
  • Amazon SageMaker Estimator for Hugging Face
  • Evaluation
  • Evaluation Metrics
  • Benchmarks and Datasets
  • Summary
  • Chapter 6. Parameter-Efficient Fine-Tuning
  • Full Fine-Tuning Versus PEFT
  • LoRA and QLoRA
  • LoRA Fundamentals
  • Rank
  • Target Modules and Layers
  • Applying LoRA
  • Merging LoRA Adapter with Original Model
  • Maintaining Separate LoRA Adapters
  • Full Fine-Tuning Versus LoRA Performance
  • QLoRA
  • Prompt Tuning and Soft Prompts
  • Summary
  • Chapter 7. Fine-Tuning with Reinforcement Learning from Human Feedback
  • Human Alignment: Helpful, Honest, and Harmless
  • Reinforcement Learning Overview
  • Train a Custom Reward Model
  • Collect Training Dataset with Human-in-the-Loop
  • Sample Instructions for Human Labelers