Building AI Intensive Python Applications: Create Intelligent Apps with LLMs and Vector Databases

Master retrieval-augmented generation architecture and fine-tune your AI stack, and discover real-world use cases and best practices for creating powerful AI apps. Key Features: Get to grips with the fundamentals of LLMs, vector databases, and Python frameworks; Implement effective retrieval-aug...

Full description

Bibliographic Details
Other Authors: Palmer, Rachelle (author)
Format: Electronic book
Language: English
Published: Birmingham, England : Packt Publishing [2024]
Edition: First edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009849136706719
Table of Contents:
  • Cover
  • FM
  • Table of Contents
  • Preface
  • Chapter 1: Getting Started with Generative AI
  • Technical requirements
  • Defining the terminology
  • The generative AI stack
  • Python and GenAI
  • OpenAI API
  • MongoDB with Vector Search
  • Important features of generative AI
  • Why use generative AI?
  • The ethics and risks of GenAI
  • Summary
  • Chapter 2: Building Blocks of Intelligent Applications
  • Technical requirements
  • Defining intelligent applications
  • The building blocks of intelligent applications
  • LLMs - reasoning engines for intelligent apps
  • Use cases for LLM reasoning engines
  • Diverse capabilities of LLMs
  • Multi-modal language models
  • A paradigm shift in AI development
  • Embedding models and vector databases - semantic long-term memory
  • Embedding models
  • Vector databases
  • Model hosting
  • Your (soon-to-be) intelligent app
  • Sample application - RAG chatbot
  • Implications of intelligent applications for software engineering
  • Summary
  • Part 1
  • Foundations of AI: LLMs, Embedding Models, Vector Databases, and Application Design
  • Chapter 3: Large Language Models
  • Technical requirements
  • Probabilistic framework
  • n-gram language models
  • Machine learning for language modelling
  • Artificial neural networks
  • Training an artificial neural network
  • ANNs for natural language processing
  • Tokenization
  • Embedding
  • Predicting probability distributions
  • Dealing with sequential data
  • Recurrent neural networks
  • Transformer architecture
  • LLMs in practice
  • The evolving field of LLMs
  • Prompting, fine-tuning, and RAG
  • Summary
  • Chapter 4: Embedding Models
  • Technical requirements
  • What is an embedding model?
  • How do embedding models differ from LLMs?
  • When to use embedding models versus LLMs
  • Types of embedding models
  • Choosing embedding models
  • Task requirements
  • Dataset characteristics
  • Computational resources
  • Vector representations
  • Embedding model leaderboards
  • Embedding models overview
  • Do you always need an embedding model?
  • Executing code from LangChain
  • Best practices
  • Summary
  • Chapter 5: Vector Databases
  • Technical requirements
  • What is a vector embedding?
  • Vector similarity
  • Exact versus approximate search
  • Measuring search
  • Graph connectivity
  • Navigable small worlds
  • How to search a navigable small world
  • Hierarchical navigable small worlds
  • The need for vector databases
  • How vector search enhances AI models
  • Case studies and real-world applications
  • Okta - natural language access request (semantic search)
  • One AI - language-based AI (RAG over business data)
  • Novo Nordisk - automatic clinical study generation (advanced RAG/RPA)
  • Vector search best practices
  • Data modeling
  • Deployment
  • Summary
  • Chapter 6: AI/ML Application Design
  • Technical requirements
  • Data modeling
  • Enriching data with embeddings
  • Considering search use cases
  • Data storage
  • Determining the type of database cluster
  • Determining IOPS
  • Determining RAM
  • Final cluster configuration
  • Performance and availability versus cost
  • Data flow
  • Handling static data sources
  • Storing operational data enriched with vector embeddings
  • Freshness and retention
  • Real-time updates
  • Data lifecycle
  • Adopting new embedding models
  • Security and RBAC
  • Best practices for AI/ML application design
  • Summary
  • Part 2
  • Building Your Python Application: Frameworks, Libraries, APIs, and Vector Search
  • Chapter 7: Useful Frameworks, Libraries, and APIs
  • Technical requirements
  • Python for AI/ML
  • AI/ML frameworks
  • LangChain
  • LangChain semantic search with score
  • Semantic search with pre-filtering
  • Implementing a basic RAG solution with LangChain
  • LangChain prompt templates and chains
  • Key Python libraries
  • pandas
  • PyMongoArrow
  • PyTorch
  • AI/ML APIs
  • OpenAI API
  • Hugging Face
  • Summary
  • Chapter 8: Implementing Vector Search in AI Applications
  • Technical requirements
  • Information retrieval with MongoDB Atlas Vector Search
  • Vector search tutorial in Python
  • Vector Search tutorial with LangChain
  • Building RAG architecture systems
  • Chunking or document-splitting strategies
  • Simple RAG
  • Advanced RAG
  • Summary
  • Part 3
  • Optimizing AI Applications: Scaling, Fine-Tuning, Troubleshooting, Monitoring, and Analytics
  • Chapter 9: LLM Output Evaluation
  • Technical requirements
  • What is LLM evaluation?
  • Component and end-to-end evaluations
  • Model benchmarking
  • Evaluation datasets
  • Defining a baseline
  • User feedback
  • Synthetic data
  • Evaluation metrics
  • Assertion-based metrics
  • Statistical metrics
  • LLM-as-a-judge evaluations
  • RAG metrics
  • Human review
  • Evaluations as guardrails
  • Summary
  • Chapter 10: Refining the Semantic Data Model to Improve Accuracy
  • Technical requirements
  • Embeddings
  • Experimenting with different embedding models
  • Fine-tuning embedding models
  • Embedding metadata
  • Formatting metadata
  • Including static metadata
  • Extracting metadata programmatically
  • Generating metadata with LLMs
  • Including metadata with query embedding and ingested content embeddings
  • Optimizing retrieval-augmented generation
  • Query mutation
  • Extracting query metadata for pre-filtering
  • Formatting ingested data
  • Advanced retrieval systems
  • Summary
  • Chapter 11: Common Failures of Generative AI
  • Technical requirements
  • Hallucinations
  • Causes of hallucinations
  • Implications of hallucinations
  • Sycophancy
  • Causes of sycophancy
  • Implications of sycophancy
  • Data leakage
  • Causes of data leakage
  • Implications of data leakage
  • Cost
  • Types of costs
  • Tokens
  • Performance issues in generative AI applications
  • Computational load
  • Model serving strategies
  • High I/O operations
  • Summary
  • Chapter 12: Correcting and Optimizing Your Generative AI Application
  • Technical requirements
  • Baselining
  • Training and evaluation datasets
  • Few-shot prompting
  • Retrieval and reranking
  • Late interaction strategies
  • Query rewriting
  • Testing and red teaming
  • Testing
  • Red teaming
  • Information post-processing
  • Other remedies
  • Summary
  • Appendix: Further Reading
  • Index
  • Other Books You May Enjoy