Preprocessing unstructured data for LLMs and RAG systems

Bibliographic Details
Corporate Author: Packt Publishing (publisher)
Other Authors: Dichone, Paulo (instructor)
Format: Online Video
Language: English
Published: [Birmingham, United Kingdom] : Packt Publishing [2024]
Edition: [First edition]
See on Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009852337706719
Description
Summary: This course offers an in-depth exploration of preprocessing unstructured data for large language models and retrieval-augmented generation systems. You'll start by setting up your development environment and configuring essential APIs, ensuring a solid technical foundation. Next, you'll dive into data preprocessing techniques, tackling challenges like content extraction, cleaning, and data normalization, making your data ready for advanced AI models. As you progress, the course provides hands-on experience with various document types such as PDFs, HTML, and PPTX files. You'll learn to transform these unstructured formats into structured data that AI systems can easily process. Advanced modules cover chunking, metadata extraction, and handling complex documents using cutting-edge techniques like visual transformers and document layout detectors. The final section guides you in building a complete RAG system using the skills acquired throughout the course. You'll preprocess diverse documents, implement semantic similarity searches, and save elements to a vector database. By the end, you'll be equipped to create intelligent data pipelines and interact with your documents using AI, significantly enhancing your data-driven projects.
Physical Description: 1 online resource (1 video file (3 hr., 2 min.)) : sound, color
ISBN: 9781836642930