Building AI Intensive Python Applications: Create Intelligent Apps with LLMs and Vector Databases
Master retrieval-augmented generation architecture and fine-tune your AI stack, along with discovering real-world use cases and best practices to create powerful AI apps.

Key Features:
- Get to grips with the fundamentals of LLMs, vector databases, and Python frameworks
- Implement effective retrieval-aug...
Other Authors: | |
Format: | Electronic book |
Language: | English |
Published: | Birmingham, England : Packt Publishing, [2024] |
Edition: | First edition |
Subjects: | |
View at Biblioteca Universitat Ramon Llull: | https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009849136706719 |
Table of Contents:
- Cover
- FM
- Table of Contents
- Preface
- Chapter 1: Getting Started with Generative AI
- Technical requirements
- Defining the terminology
- The generative AI stack
- Python and GenAI
- OpenAI API
- MongoDB with Vector Search
- Important features of generative AI
- Why use generative AI?
- The ethics and risks of GenAI
- Summary
- Chapter 2: Building Blocks of Intelligent Applications
- Technical requirements
- Defining intelligent applications
- The building blocks of intelligent applications
- LLMs - reasoning engines for intelligent apps
- Use cases for LLM reasoning engines
- Diverse capabilities of LLMs
- Multi-modal language models
- A paradigm shift in AI development
- Embedding models and vector databases - semantic long-term memory
- Embedding models
- Vector databases
- Model hosting
- Your (soon-to-be) intelligent app
- Sample application - RAG chatbot
- Implications of intelligent applications for software engineering
- Summary
- Part 1: Foundations of AI: LLMs, Embedding Models, Vector Databases, and Application Design
- Chapter 3: Large Language Models
- Technical requirements
- Probabilistic framework
- n-gram language models
- Machine learning for language modelling
- Artificial neural networks
- Training an artificial neural network
- ANNs for natural language processing
- Tokenization
- Embedding
- Predicting probability distributions
- Dealing with sequential data
- Recurrent neural networks
- Transformer architecture
- LLMs in practice
- The evolving field of LLMs
- Prompting, fine-tuning, and RAG
- Summary
- Chapter 4: Embedding Models
- Technical requirements
- What is an embedding model?
- How do embedding models differ from LLMs?
- When to use embedding models versus LLMs
- Types of embedding models
- Choosing embedding models
- Task requirements
- Dataset characteristics
- Computational resources
- Vector representations
- Embedding model leaderboards
- Embedding models overview
- Do you always need an embedding model?
- Executing code from LangChain
- Best practices
- Summary
- Chapter 5: Vector Databases
- Technical requirements
- What is a vector embedding?
- Vector similarity
- Exact versus approximate search
- Measuring search
- Graph connectivity
- Navigable small worlds
- How to search a navigable small world
- Hierarchical navigable small worlds
- The need for vector databases
- How vector search enhances AI models
- Case studies and real-world applications
- Okta - natural language access request (semantic search)
- One AI - language-based AI (RAG over business data)
- Novo Nordisk - automatic clinical study generation (advanced RAG/RPA)
- Vector search best practices
- Data modeling
- Deployment
- Summary
- Chapter 6: AI/ML Application Design
- Technical requirements
- Data modeling
- Enriching data with embeddings
- Considering search use cases
- Data storage
- Determining the type of database cluster
- Determining IOPS
- Determining RAM
- Final cluster configuration
- Performance and availability versus cost
- Data flow
- Handling static data sources
- Storing operational data enriched with vector embeddings
- Freshness and retention
- Real-time updates
- Data lifecycle
- Adopting new embedding models
- Security and RBAC
- Best practices for AI/ML application design
- Summary
- Part 2: Building Your Python Application: Frameworks, Libraries, APIs, and Vector Search
- Chapter 7: Useful Frameworks, Libraries, and APIs
- Technical requirements
- Python for AI/ML
- AI/ML frameworks
- LangChain
- LangChain semantic search with score
- Semantic search with pre-filtering
- Implementing a basic RAG solution with LangChain
- LangChain prompt templates and chains
- Key Python libraries
- pandas
- PyMongoArrow
- PyTorch
- AI/ML APIs
- OpenAI API
- Hugging Face
- Summary
- Chapter 8: Implementing Vector Search in AI Applications
- Technical requirements
- Information retrieval with MongoDB Atlas Vector Search
- Vector search tutorial in Python
- Vector Search tutorial with LangChain
- Building RAG architecture systems
- Chunking or document-splitting strategies
- Simple RAG
- Advanced RAG
- Summary
- Part 3: Optimizing AI Applications: Scaling, Fine-Tuning, Troubleshooting, Monitoring, and Analytics
- Chapter 9: LLM Output Evaluation
- Technical requirements
- What is LLM evaluation?
- Component and end-to-end evaluations
- Model benchmarking
- Evaluation datasets
- Defining a baseline
- User feedback
- Synthetic data
- Evaluation metrics
- Assertion-based metrics
- Statistical metrics
- LLM-as-a-judge evaluations
- RAG metrics
- Human review
- Evaluations as guardrails
- Summary
- Chapter 10: Refining the Semantic Data Model to Improve Accuracy
- Technical requirements
- Embeddings
- Experimenting with different embedding models
- Fine-tuning embedding models
- Embedding metadata
- Formatting metadata
- Including static metadata
- Extracting metadata programmatically
- Generating metadata with LLMs
- Including metadata with query embedding and ingested content embeddings
- Optimizing retrieval-augmented generation
- Query mutation
- Extracting query metadata for pre-filtering
- Formatting ingested data
- Advanced retrieval systems
- Summary
- Chapter 11: Common Failures of Generative AI
- Technical requirements
- Hallucinations
- Causes of hallucinations
- Implications of hallucinations
- Sycophancy
- Causes of sycophancy
- Implications of sycophancy
- Data leakage
- Causes of data leakage
- Implications of data leakage
- Cost
- Types of costs
- Tokens
- Performance issues in generative AI applications
- Computational load
- Model serving strategies
- High I/O operations
- Summary
- Chapter 12: Correcting and Optimizing Your Generative AI Application
- Technical requirements
- Baselining
- Training and evaluation datasets
- Few-shot prompting
- Retrieval and reranking
- Late interaction strategies
- Query rewriting
- Testing and red teaming
- Testing
- Red teaming
- Information post-processing
- Other remedies
- Summary
- Appendix: Further Reading
- Index
- Other Books You May Enjoy