Rebuilding reliable data pipelines through modern tools

Bibliographic Details
Other Authors: Malaska, Ted, author; Babu, Shivnath, author
Format: E-book
Language: English
Published: Sebastopol, CA : O'Reilly Media [2019]
Edition: First edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630882406719
Description
Summary: When data-driven applications fail, identifying the cause is both challenging and time-consuming, especially as data pipelines become increasingly complex. Hunting for the root cause of application failure from messy, raw, and distributed logs is difficult for performance experts and a nightmare for data operations teams. This report examines DataOps processes and tools that enable you to manage modern data pipelines efficiently. Author Ted Malaska describes a data operations framework and shows you the importance of testing and monitoring to plan, rebuild, automate, and then manage robust data pipelines, whether in the cloud, on premises, or in a hybrid configuration. You’ll also learn ways to apply performance monitoring software and AI to your data pipelines in order to keep your applications running reliably. You’ll learn:
- How performance management software can reduce the risk of running modern data applications
- Methods for applying AI to provide insights, recommendations, and automation to operationalize big data systems and data applications
- How to plan, migrate, and operate big data workloads and data pipelines in the cloud and in hybrid deployment models
Physical Description: 1 online resource (1 volume) : illustrations
ISBN: 9781492058175
9781492058168