Advanced analytics with Spark

In the second edition of this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world data sets together to teach you how to approach analytics problems by...

Bibliographic Details
Other Authors: Ryza, Sandy (author)
Format: E-book
Language: English
Published: Beijing, [China] : O'Reilly 2017.
Edition: 2nd edition
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630361306719
Table of Contents:
  • Analyzing big data
  • Introduction to data analysis with Scala and Spark
  • Recommending music and the Audioscrobbler data set
  • Predicting forest cover with decision trees
  • Anomaly detection in network traffic with K-means clustering
  • Understanding Wikipedia with latent semantic analysis
  • Analyzing co-occurrence networks with GraphX
  • Geospatial and temporal data analysis on the New York City taxi trip data
  • Estimating financial risk through Monte Carlo simulation
  • Analyzing genomics data and the BDG project
  • Analyzing neuroimaging data with PySpark and Thunder