RAG-Driven Generative AI: Build Custom Retrieval Augmented Generation Pipelines with LlamaIndex, Deep Lake, and Pinecone

Minimize AI hallucinations and build accurate, custom generative AI pipelines with RAG using embedded vector databases and integrated human feedback. Purchase of the print or Kindle book includes a free eBook in PDF format. Key Features: Implement RAG's traceable outputs, linking each response to...


Bibliographic Details
Other Authors: Rothman, Denis (author)
Format: Electronic book
Language: English
Published: Birmingham, England : Packt Publishing Ltd [2024]
Edition: First edition
Series: Expert insight.
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009853390006719
Table of Contents:
  • Cover
  • Copyright Page
  • Contributors
  • Table of Contents
  • Preface
  • Chapter 1: Why Retrieval Augmented Generation?
  • What is RAG?
  • Naïve, advanced, and modular RAG configurations
  • RAG versus fine-tuning
  • The RAG ecosystem
  • The retriever (D)
  • Collect (D1)
  • Process (D2)
  • Storage (D3)
  • Retrieval query (D4)
  • The generator (G)
  • Input (G1)
  • Augmented input with HF (G2)
  • Prompt engineering (G3)
  • Generation and output (G4)
  • The evaluator (E)
  • Metrics (E1)
  • Human feedback (E2)
  • The trainer (T)
  • Naïve, advanced, and modular RAG in code
  • Part 1: Foundations and basic implementation
  • 1. Environment
  • 2. The generator
  • 3. The data
  • 4. The query
  • Part 2: Advanced techniques and evaluation
  • 1. Retrieval metrics
  • 2. Naïve RAG
  • 3. Advanced RAG
  • 4. Modular RAG
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 2: RAG Embedding Vector Stores with Deep Lake and OpenAI
  • From raw data to embeddings in vector stores
  • Organizing RAG in a pipeline
  • A RAG-driven generative AI pipeline
  • Building a RAG pipeline
  • Setting up the environment
  • The installation packages and libraries
  • The components involved in the installation process
  • 1. Data collection and preparation
  • Collecting the data
  • Preparing the data
  • 2. Data embedding and storage
  • Retrieving a batch of prepared documents
  • Verifying if the vector store exists and creating it if not
  • The embedding function
  • Adding data to the vector store
  • Vector store information
  • 3. Augmented input generation
  • Input and query retrieval
  • Augmented input
  • Evaluating the output with cosine similarity
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 3: Building Index-Based RAG with LlamaIndex, Deep Lake, and OpenAI
  • Why use index-based RAG?
  • Architecture
  • Building a semantic search engine and generative agent for drone technology
  • Installing the environment
  • Pipeline 1: Collecting and preparing the documents
  • Pipeline 2: Creating and populating a Deep Lake vector store
  • Pipeline 3: Index-based RAG
  • User input and query parameters
  • Cosine similarity metric
  • Vector store index query engine
  • Query response and source
  • Optimized chunking
  • Performance metric
  • Tree index query engine
  • Performance metric
  • List index query engine
  • Performance metric
  • Keyword index query engine
  • Performance metric
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 4: Multimodal Modular RAG for Drone Technology
  • What is multimodal modular RAG?
  • Building a multimodal modular RAG program for drone technology
  • Loading the LLM dataset
  • Initializing the LLM query engine
  • Loading and visualizing the multimodal dataset
  • Navigating the multimodal dataset structure
  • Selecting and displaying an image
  • Adding bounding boxes and saving the image
  • Building a multimodal query engine
  • Creating a vector index and query engine
  • Running a query on the VisDrone multimodal dataset
  • Processing the response
  • Selecting and processing the image of the source node
  • Multimodal modular summary
  • Performance metric
  • LLM performance metric
  • Multimodal performance metric
  • Multimodal modular RAG performance metric
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 5: Boosting RAG Performance with Expert Human Feedback
  • Adaptive RAG
  • Building hybrid adaptive RAG in Python
  • 1. Retriever
  • 1.1. Installing the retriever's environment
  • 1.2.1. Preparing the dataset
  • 1.2.2. Processing the data
  • 1.3. Retrieval process for user input
  • 2. Generator
  • 2.1. Integrating HF-RAG for augmented document inputs
  • 2.2. Input
  • 2.3. Mean ranking simulation scenario
  • 2.4.-2.5. Installing the generative AI environment
  • 2.6. Content generation
  • 3. Evaluator
  • 3.1. Response time
  • 3.2. Cosine similarity score
  • 3.3. Human user rating
  • 3.4. Human-expert evaluation
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 6: Scaling RAG Bank Customer Data with Pinecone
  • Scaling with Pinecone
  • Architecture
  • Pipeline 1: Collecting and preparing the dataset
  • 1. Collecting and processing the dataset
  • Installing the environment for Kaggle
  • Collecting the dataset
  • 2. Exploratory data analysis
  • 3. Training an ML model
  • Data preparation and clustering
  • Implementation and evaluation of clustering
  • Pipeline 2: Scaling a Pinecone index (vector store)
  • The challenges of vector store management
  • Installing the environment
  • Processing the dataset
  • Chunking and embedding the dataset
  • Chunking
  • Embedding
  • Duplicating data
  • Creating the Pinecone index
  • Upserting
  • Querying the Pinecone index
  • Pipeline 3: RAG generative AI
  • RAG with GPT-4o
  • Querying the dataset
  • Querying a target vector
  • Extracting relevant texts
  • Augmented prompt
  • Augmented generation
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 7: Building Scalable Knowledge-Graph-Based RAG with Wikipedia API and LlamaIndex
  • The architecture of RAG for knowledge-graph-based semantic search
  • Building graphs from trees
  • Pipeline 1: Collecting and preparing the documents
  • Retrieving Wikipedia data and metadata
  • Preparing the data for upsertion
  • Pipeline 2: Creating and populating the Deep Lake vector store
  • Pipeline 3: Knowledge graph index-based RAG
  • Generating the knowledge graph index
  • Displaying the graph
  • Interacting with the knowledge graph index
  • Installing the similarity score packages and defining the functions
  • Re-ranking
  • Example metrics
  • Metric calculation and display
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 8: Dynamic RAG with Chroma and Hugging Face Llama
  • The architecture of dynamic RAG
  • Installing the environment
  • Hugging Face
  • Chroma
  • Activating session time
  • Downloading and preparing the dataset
  • Embedding and upserting the data in a Chroma collection
  • Selecting a model
  • Embedding and storing the completions
  • Displaying the embeddings
  • Querying the collection
  • Prompt and retrieval
  • RAG with Llama
  • Deleting the collection
  • Total session time
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 9: Empowering AI Models: Fine-Tuning RAG Data and Human Feedback
  • The architecture of fine-tuning static RAG data
  • The RAG ecosystem
  • Installing the environment
  • 1. Preparing the dataset for fine-tuning
  • 1.1. Downloading and visualizing the dataset
  • 1.2. Preparing the dataset for fine-tuning
  • 2. Fine-tuning the model
  • 2.1. Monitoring the fine-tunes
  • 3. Using the fine-tuned OpenAI model
  • Metrics
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 10: RAG for Video Stock Production with Pinecone and OpenAI
  • The architecture of RAG for video production
  • The environment of the video production ecosystem
  • Importing modules and libraries
  • GitHub
  • OpenAI
  • Pinecone
  • Pipeline 1: Generator and Commentator
  • The AI-generated video dataset
  • How does a diffusion transformer work?
  • Analyzing the diffusion transformer model video dataset
  • The Generator and the Commentator
  • Step 1. Displaying the video
  • Step 2. Splitting video into frames
  • Step 3. Commenting on the frames
  • Pipeline 1 controller
  • Pipeline 2: The Vector Store Administrator
  • Querying the Pinecone index
  • Pipeline 3: The Video Expert
  • Summary
  • Questions
  • References
  • Further reading
  • Appendix
  • Packt Page
  • Other Books You May Enjoy
  • Index