Using Stable Diffusion with Python: Leverage Python to Control and Automate High-Quality AI Image Generation Using Stable Diffusion
Master AI image generation by leveraging GenAI tools and techniques such as diffusers, LoRA, textual inversion, ControlNet, and prompt design. Key Features: master the art of generating stunning AI artwork with the help of expert guidance and ready-to-run Python code; get instant access to emerging ext...
Other authors:
Format: eBook
Language: English
Published: Birmingham, England: Packt Publishing, [2024]
Edition: First edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009825843906719
Table of Contents:
- Intro
- Title Page
- Copyright and Credits
- Dedication
- Foreword
- Contributors
- Table of Contents
- Preface
- Part 1 - A Whirlwind of Stable Diffusion
- Chapter 1: Introducing Stable Diffusion
- Evolution of the Diffusion model
- Before Transformer and Attention
- Transformer transforms machine learning
- CLIP from OpenAI makes a big difference
- Generate images
- DALL-E 2 and Stable Diffusion
- Why Stable Diffusion
- Which Stable Diffusion to use
- Why this book
- References
- Chapter 2: Setting Up the Environment for Stable Diffusion
- Hardware requirements to run Stable Diffusion
- GPU
- System memory
- Storage
- Software requirements
- CUDA installation
- Installing Python for Windows, Linux, and macOS
- Installing PyTorch
- Running a Stable Diffusion pipeline
- Using Google Colaboratory
- Using Google Colab to run a Stable Diffusion pipeline
- Summary
- References
- Chapter 3: Generating Images Using Stable Diffusion
- Logging in to Hugging Face
- Generating an image
- Generation seed
- Sampling scheduler
- Changing a model
- Guidance scale
- Summary
- References
- Chapter 4: Understanding the Theory Behind Diffusion Models
- Understanding the image-to-noise process
- A more efficient forward diffusion process
- The noise-to-image training process
- The noise-to-image sampling process
- Understanding Classifier Guidance denoising
- Summary
- References
- Chapter 5: Understanding How Stable Diffusion Works
- Stable Diffusion in latent space
- Generating latent vectors using diffusers
- Generating text embeddings using CLIP
- Initializing time step embeddings
- Initializing the Stable Diffusion UNet
- Implementing a text-to-image Stable Diffusion inference pipeline
- Implementing a text-guided image-to-image Stable Diffusion inference pipeline
- Summary
- References
- Additional reading
- Chapter 6: Using Stable Diffusion Models
- Technical requirements
- Loading the Diffusers model
- Loading model checkpoints from safetensors and ckpt files
- Using ckpt and safetensors files with Diffusers
- Turning off the model safety checker
- Converting the checkpoint model file to the Diffusers format
- Using Stable Diffusion XL
- Summary
- References
- Part 2 - Improving Diffusers with Custom Features
- Chapter 7: Optimizing Performance and VRAM Usage
- Setting the baseline
- Optimization solution 1 - using the float16 or bfloat16 data type
- Optimization solution 2 - enabling VAE tiling
- Optimization solution 3 - enabling Xformers or using PyTorch 2.0
- Optimization solution 4 - enabling sequential CPU offload
- Optimization solution 5 - enabling model CPU offload
- Optimization solution 6 - Token Merging (ToMe)
- Summary
- References
- Chapter 8: Using Community-Shared LoRAs
- Technical requirements
- How does LoRA work?
- Using LoRA with Diffusers
- Applying a LoRA weight during loading
- Diving into the internal structure of LoRA
- Finding the A and B weight matrix from the LoRA file
- Finding the corresponding checkpoint model layer name
- Updating the checkpoint model weights
- Making a function to load LoRA
- Why LoRA works
- Summary
- References
- Chapter 9: Using Textual Inversion
- Diffusers inference using TI
- How TI works
- Building a custom TI loader
- TI in the pt file format
- TI in bin file format
- Detailed steps to build a TI loader
- Putting all of the code together
- Summary
- References
- Chapter 10: Overcoming 77-Token Limitations and Enabling Prompt Weighting
- Understanding the 77-token limitation
- Overcoming the 77-token limitation
- Putting all the code together into a function
- Enabling long prompts with weighting
- Verifying the work
- Overcoming the 77-token limitation using community pipelines
- Summary
- References
- Chapter 11: Image Restore and Super-Resolution
- Understanding the terminologies
- Upscaling images using Img2img diffusion
- One-step super-resolution
- Multiple-step super-resolution
- A super-resolution result comparison
- Img-to-Img limitations
- ControlNet Tile image upscaling
- Steps to use ControlNet Tile to upscale an image
- The ControlNet Tile upscaling result
- Additional ControlNet Tile upscaling samples
- Summary
- References
- Chapter 12: Scheduled Prompt Parsing
- Technical requirements
- Using the Compel package
- Building a custom scheduled prompt pipeline
- A scheduled prompt parser
- Filling in the missing steps
- A Stable Diffusion pipeline supporting scheduled prompts
- Summary
- References
- Part 3 - Advanced Topics
- Chapter 13: Generating Images with ControlNet
- What is ControlNet and how is it different?
- Usage of ControlNet
- Using multiple ControlNets in one pipeline
- How ControlNet works
- Further usage
- More ControlNets with SD
- SDXL ControlNets
- Summary
- References
- Chapter 14: Generating Video Using Stable Diffusion
- Technical requirements
- The principles of text-to-video generation
- Practical applications of AnimateDiff
- Utilizing Motion LoRA to control animation motion
- Summary
- References
- Chapter 15: Generating Image Descriptions Using BLIP-2 and LLaVA
- Technical requirements
- BLIP-2 - Bootstrapping Language-Image Pre-training
- How BLIP-2 works
- Using BLIP-2 to generate descriptions
- LLaVA - Large Language and Vision Assistant
- How LLaVA works
- Installing LLaVA
- Using LLaVA to generate image descriptions
- Summary
- References
- Chapter 16: Exploring Stable Diffusion XL
- What's new in SDXL?
- The VAE of the SDXL
- The UNet of SDXL
- Two text encoders in SDXL
- The two-stage design
- Using SDXL
- Using SDXL community models
- Using SDXL image-to-image to enhance an image
- Using SDXL LoRA models
- Using SDXL with an unlimited prompt
- Summary
- References
- Chapter 17: Building Optimized Prompts for Stable Diffusion
- What makes a good prompt?
- Be clear and specific
- Be descriptive
- Using consistent terminology
- Reference artworks and styles
- Incorporate negative prompts
- Iterate and refine
- Using LLMs to generate better prompts
- Summary
- References
- Part 4 - Building Stable Diffusion into an Application
- Chapter 18: Applications - Object Editing and Style Transferring
- Editing images using Stable Diffusion
- Replacing image background content
- Removing the image background
- Object and style transferring
- Loading up a Stable Diffusion pipeline with IP-Adapter
- Transferring style
- Summary
- References
- Chapter 19: Generation Data Persistence
- Exploring and understanding the PNG file structure
- Saving extra text data in a PNG image file
- PNG extra data storage limitation
- Summary
- References
- Chapter 20: Creating Interactive User Interfaces
- Introducing Gradio
- Getting started with Gradio
- Gradio fundamentals
- Gradio Blocks
- Inputs and outputs
- Building a progress bar
- Building a Stable Diffusion text-to-image pipeline with Gradio
- Summary
- References
- Chapter 21: Diffusion Model Transfer Learning
- Technical requirements
- Training a neural network model with PyTorch
- Preparing the training data
- Preparing for training
- Training a model
- Training a model with Hugging Face's Accelerate
- Applying Hugging Face's Accelerate
- Putting code together
- Training a model with multiple GPUs using Accelerate
- Training a Stable Diffusion V1.5 LoRA
- Defining training hyperparameters
- Preparing the Stable Diffusion components
- Loading the training data
- Defining the training components
- Training a Stable Diffusion V1.5 LoRA
- Kicking off the training
- Verifying the result
- Summary
- References
- Chapter 22: Exploring Beyond Stable Diffusion
- What sets this AI wave apart
- The enduring value of mathematics and programming
- Staying current with AI innovations
- Cultivating responsible, ethical, private, and secure AI
- Our evolving relationship with AI
- Summary
- References
- Index
- Other Books You May Enjoy