Data analytics with Hadoop : an introduction for data scientists

Ready to use statistical and machine-learning techniques across large data sets? This practical guide shows you why the Hadoop ecosystem is perfect for the job. Instead of deployment, operations, or software development usually associated with distributed computing, you’ll focus on particular analys...


Bibliographic Details
Other Authors: Bengfort, Benjamin (author); Kim, Jenny (author)
Format: E-book
Language: English
Published: Beijing, [China] : O'Reilly Media, 2016.
Edition: First edition
Subjects:
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009630353506719
Table of Contents:
  • The age of the data product
  • An operating system for big data
  • A framework for Python and Hadoop streaming
  • In-memory computing with Spark
  • Distributed analysis and patterns
  • Data mining and warehousing
  • Data ingestion
  • Analytics with higher-level APIs
  • Machine learning
  • Summary : doing distributed data science.