Essential PySpark for data analytics: a beginner's guide to harnessing the power and ease of PySpark 3.0

Get started with distributed computing using PySpark, a single unified framework to solve end-to-end data analytics at scale. Key Features: Discover how to convert huge amounts of raw data into meaningful and actionable insights. Use Spark's unified analytics engine for end-to-end analytics, from...

Bibliographic Details
Other Authors: Nudurupati, Sreeram (author)
Format: E-book
Language: English
Published: Birmingham, England ; Mumbai : Packt [2021]
Edition: 1st edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009635717106719
Table of Contents:
  • Distributed Computing Primer
  • Data Ingestion
  • Data Cleansing and Integration
  • Real-time Data Analytics
  • Scalable Machine Learning with PySpark
  • Feature Engineering – Extraction, Transformation, and Selection
  • Supervised Machine Learning
  • Unsupervised Machine Learning
  • Machine Learning Life Cycle Management
  • Scaling Out Single-Node Machine Learning Using PySpark
  • Data Visualization with PySpark
  • Spark SQL Primer
  • Integrating External Tools with Spark SQL
  • The Data Lakehouse