Advanced analytics with Spark: patterns for learning from data at scale

In this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by example. You'll...

Bibliographic Details
Other Authors: Ryza, Sandy (author)
Format: E-book
Language: English
Published: Sebastopol, California: O'Reilly, 2015.
Edition: First edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009628947706719
Table of Contents:
  • Analyzing big data
  • Introduction to data analysis with Scala and Spark
  • Recommending music and the Audioscrobbler data set
  • Predicting forest cover with decision trees
  • Anomaly detection in network traffic with K-means clustering
  • Understanding Wikipedia with latent semantic analysis
  • Analyzing co-occurrence networks with GraphX
  • Geospatial and temporal data analysis on the New York City taxi trip data
  • Estimating financial risk through Monte Carlo simulation
  • Analyzing genomics data and the BDG project
  • Analyzing neuroimaging data with PySpark and Thunder