RAG-Driven Generative AI: Build Custom Retrieval Augmented Generation Pipelines with LlamaIndex, Deep Lake, and Pinecone

Minimize AI hallucinations and build accurate, custom generative AI pipelines with RAG using embedded vector databases and integrated human feedback. Purchase of the print or Kindle book includes a free eBook in PDF format. Key Features: Implement RAG's traceable outputs, linking each response to...


Bibliographic Details
Other Authors: Rothman, Denis (author)
Format: Electronic book
Language: English
Published: Birmingham, England : Packt Publishing Ltd [2024]
Edition: First edition
Series: Expert insight.
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009853390006719
Table of Contents:
  • Cover
  • Copyright Page
  • Contributors
  • Table of Contents
  • Preface
  • Chapter 1: Why Retrieval Augmented Generation?
  • What is RAG?
  • Naïve, advanced, and modular RAG configurations
  • RAG versus fine-tuning
  • The RAG ecosystem
  • The retriever (D)
  • Collect (D1)
  • Process (D2)
  • Storage (D3)
  • Retrieval query (D4)
  • The generator (G)
  • Input (G1)
  • Augmented input with HF (G2)
  • Prompt engineering (G3)
  • Generation and output (G4)
  • The evaluator (E)
  • Metrics (E1)
  • Human feedback (E2)
  • The trainer (T)
  • Naïve, advanced, and modular RAG in code
  • Part 1: Foundations and basic implementation
  • 1. Environment
  • 2. The generator
  • 3. The data
  • 4. The query
  • Part 2: Advanced techniques and evaluation
  • 1. Retrieval metrics
  • 2. Naïve RAG
  • 3. Advanced RAG
  • 4. Modular RAG
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 2: RAG Embedding Vector Stores with Deep Lake and OpenAI
  • From raw data to embeddings in vector stores
  • Organizing RAG in a pipeline
  • A RAG-driven generative AI pipeline
  • Building a RAG pipeline
  • Setting up the environment
  • The installation packages and libraries
  • The components involved in the installation process
  • 1. Data collection and preparation
  • Collecting the data
  • Preparing the data
  • 2. Data embedding and storage
  • Retrieving a batch of prepared documents
  • Verifying if the vector store exists and creating it if not
  • The embedding function
  • Adding data to the vector store
  • Vector store information
  • 3. Augmented input generation
  • Input and query retrieval
  • Augmented input
  • Evaluating the output with cosine similarity
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 3: Building Index-Based RAG with LlamaIndex, Deep Lake, and OpenAI
  • Why use index-based RAG?
  • Architecture
  • Building a semantic search engine and generative agent for drone technology
  • Installing the environment
  • Pipeline 1: Collecting and preparing the documents
  • Pipeline 2: Creating and populating a Deep Lake vector store
  • Pipeline 3: Index-based RAG
  • User input and query parameters
  • Cosine similarity metric
  • Vector store index query engine
  • Query response and source
  • Optimized chunking
  • Performance metric
  • Tree index query engine
  • Performance metric
  • List index query engine
  • Performance metric
  • Keyword index query engine
  • Performance metric
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 4: Multimodal Modular RAG for Drone Technology
  • What is multimodal modular RAG?
  • Building a multimodal modular RAG program for drone technology
  • Loading the LLM dataset
  • Initializing the LLM query engine
  • Loading and visualizing the multimodal dataset
  • Navigating the multimodal dataset structure
  • Selecting and displaying an image
  • Adding bounding boxes and saving the image
  • Building a multimodal query engine
  • Creating a vector index and query engine
  • Running a query on the VisDrone multimodal dataset
  • Processing the response
  • Selecting and processing the image of the source node
  • Multimodal modular summary
  • Performance metric
  • LLM performance metric
  • Multimodal performance metric
  • Multimodal modular RAG performance metric
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 5: Boosting RAG Performance with Expert Human Feedback
  • Adaptive RAG
  • Building hybrid adaptive RAG in Python
  • 1. Retriever
  • 1.1. Installing the retriever's environment
  • 1.2.1. Preparing the dataset
  • 1.2.2. Processing the data
  • 1.3. Retrieval process for user input
  • 2. Generator
  • 2.1. Integrating HF-RAG for augmented document inputs
  • 2.2. Input
  • 2.3. Mean ranking simulation scenario
  • 2.4.-2.5. Installing the generative AI environment
  • 2.6. Content generation
  • 3. Evaluator
  • 3.1. Response time
  • 3.2. Cosine similarity score
  • 3.3. Human user rating
  • 3.4. Human-expert evaluation
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 6: Scaling RAG Bank Customer Data with Pinecone
  • Scaling with Pinecone
  • Architecture
  • Pipeline 1: Collecting and preparing the dataset
  • 1. Collecting and processing the dataset
  • Installing the environment for Kaggle
  • Collecting the dataset
  • 2. Exploratory data analysis
  • 3. Training an ML model
  • Data preparation and clustering
  • Implementation and evaluation of clustering
  • Pipeline 2: Scaling a Pinecone index (vector store)
  • The challenges of vector store management
  • Installing the environment
  • Processing the dataset
  • Chunking and embedding the dataset
  • Chunking
  • Embedding
  • Duplicating data
  • Creating the Pinecone index
  • Upserting
  • Querying the Pinecone index
  • Pipeline 3: RAG generative AI
  • RAG with GPT-4o
  • Querying the dataset
  • Querying a target vector
  • Extracting relevant texts
  • Augmented prompt
  • Augmented generation
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 7: Building Scalable Knowledge-Graph-Based RAG with Wikipedia API and LlamaIndex
  • The architecture of RAG for knowledge-graph-based semantic search
  • Building graphs from trees
  • Pipeline 1: Collecting and preparing the documents
  • Retrieving Wikipedia data and metadata
  • Preparing the data for upsertion
  • Pipeline 2: Creating and populating the Deep Lake vector store
  • Pipeline 3: Knowledge graph index-based RAG
  • Generating the knowledge graph index
  • Displaying the graph
  • Interacting with the knowledge graph index
  • Installing the similarity score packages and defining the functions
  • Re-ranking
  • Example metrics
  • Metric calculation and display
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 8: Dynamic RAG with Chroma and Hugging Face Llama
  • The architecture of dynamic RAG
  • Installing the environment
  • Hugging Face
  • Chroma
  • Activating session time
  • Downloading and preparing the dataset
  • Embedding and upserting the data in a Chroma collection
  • Selecting a model
  • Embedding and storing the completions
  • Displaying the embeddings
  • Querying the collection
  • Prompt and retrieval
  • RAG with Llama
  • Deleting the collection
  • Total session time
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 9: Empowering AI Models: Fine-Tuning RAG Data and Human Feedback
  • The architecture of fine-tuning static RAG data
  • The RAG ecosystem
  • Installing the environment
  • 1. Preparing the dataset for fine-tuning
  • 1.1. Downloading and visualizing the dataset
  • 1.2. Preparing the dataset for fine-tuning
  • 2. Fine-tuning the model
  • 2.1. Monitoring the fine-tunes
  • 3. Using the fine-tuned OpenAI model
  • Metrics
  • Summary
  • Questions
  • References
  • Further reading
  • Chapter 10: RAG for Video Stock Production with Pinecone and OpenAI
  • The architecture of RAG for video production
  • The environment of the video production ecosystem
  • Importing modules and libraries
  • GitHub
  • OpenAI
  • Pinecone
  • Pipeline 1: Generator and Commentator
  • The AI-generated video dataset
  • How does a diffusion transformer work?
  • Analyzing the diffusion transformer model video dataset
  • The Generator and the Commentator
  • Step 1. Displaying the video
  • Step 2. Splitting video into frames
  • Step 3. Commenting on the frames
  • Pipeline 1 controller
  • Pipeline 2: The Vector Store Administrator
  • Querying the Pinecone index
  • Pipeline 3: The Video Expert
  • Summary
  • Questions
  • References
  • Further reading
  • Appendix
  • Packt Page
  • Other Books You May Enjoy
  • Index