Data Ingestion with Python Cookbook: a practical guide to ingesting, monitoring, and identifying errors in the data ingestion process

Deploy your data ingestion pipeline, orchestrate, and monitor efficiently to prevent loss of data and quality. Purchase of the print or Kindle book includes a free PDF eBook.

Key Features:
  • Harness best practices to create a Python and PySpark data ingestion pipeline
  • Seamlessly automate and orchestrate...

Bibliographic Details
Other Authors: Esppenchutz, Gláucia (author)
Format: E-book
Language: English
Published: Birmingham, England: Packt Publishing Ltd, [2023]
Edition: 1st ed.
Subjects:
View in the Universitat Ramon Llull Library: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009752719606719
Table of Contents:
  • Introduction to Data Ingestion
  • Principals of Data Access – Accessing your Data
  • Data Discovery – Understanding Our Data Before Ingesting It
  • Reading CSV and JSON Files and Solving Problems
  • Ingesting Data from Structured and Unstructured Databases
  • Using PySpark with Defined and Non-Defined Schemas
  • Ingesting Analytical Data
  • Designing Monitored Data Workflows
  • Putting Everything Together with Airflow
  • Logging and Monitoring Your Data Ingest in Airflow
  • Automating Your Data Ingestion Pipelines
  • Using Data Observability for Debugging, Error Handling, and Preventing Downtime