Parallel R

It's tough to argue with R as a high-quality, cross-platform, open source statistical software product, unless you're in the business of crunching Big Data. This concise book introduces you to several strategies for using R to analyze large datasets. You'll learn the basics of Snow, M...


Bibliographic Details
Main Author: McCallum, Q. Ethan
Other Authors: Weston, Stephen (illustrator), Loukides, Michael Kosta, Blanchette, Meghan, Romano, Robert (illustrator)
Format: eBook
Language: English
Published: Sebastopol, CA : O'Reilly 2011.
Edition: First edition
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009628001906719
Table of Contents:
  • Table of Contents; Preface; Conventions Used in This Book; Using Code Examples; Safari® Books Online; How to Contact Us; Acknowledgments; Q. Ethan McCallum; Stephen Weston; Chapter 1. Getting Started; Why R?; Why Not R?; The Solution: Parallel Execution; A Road Map for This Book; What We'll Cover; Looking Forward...; What We'll Assume You Already Know; In a Hurry?; snow; multicore; parallel; R+Hadoop; RHIPE; Segue; Summary; Chapter 2. snow; Quick Look; How It Works; Setting Up; Working with It; Creating Clusters with makeCluster; Parallel K-Means; Initializing Workers
  • Load Balancing with clusterApplyLB; Task Chunking with parLapply; Vectorizing with clusterSplit; Load Balancing Redux; Functions and Environments; Random Number Generation; snow Configuration; Installing Rmpi; Executing snow Programs on a Cluster with Rmpi; Executing snow Programs with a Batch Queueing System; Troubleshooting snow Programs; When It Works...; ...And When It Doesn't; The Wrap-up; Chapter 3. multicore; Quick Look; How It Works; Setting Up; Working with It; The mclapply Function; The mc.cores Option; The mc.set.seed Option; Load Balancing with mclapply; The pvec Function
  • The parallel and collect Functions; Using collect Options; Parallel Random Number Generation; The Low-Level API; When It Works...; ...And When It Doesn't; The Wrap-up; Chapter 4. parallel; Quick Look; How It Works; Setting Up; Working with It; Getting Started; Creating Clusters with makeCluster; Parallel Random Number Generation; Summary of Differences; When It Works...; ...And When It Doesn't; The Wrap-up; Chapter 5. A Primer on MapReduce and Hadoop; Hadoop at Cruising Altitude; A MapReduce Primer; Thinking in MapReduce: Some Pseudocode Examples; Calculate Average Call Length for Each Date
  • Number of Calls by Each User, on Each Date; Run a Special Algorithm on Each Record; Binary and Whole-File Data: SequenceFiles; No Cluster? No Problem! Look to the Clouds...; The Wrap-up; Chapter 6. R+Hadoop; Quick Look; How It Works; Setting Up; Working with It; Simple Hadoop Streaming (All Text); Streaming, Redux: Indirectly Working with Binary Data; The Java API: Binary Input and Output; Processing Related Groups (the Full Map and Reduce Phases); When It Works...; ...And When It Doesn't; The Wrap-up; Chapter 7. RHIPE; Quick Look; How It Works; Setting Up; Working with It; Phone Call Records, Redux
  • Tweet Brevity; More Complex Tweet Analysis; When It Works...; ...And When It Doesn't; The Wrap-up; Chapter 8. Segue; Quick Look; How It Works; Setting Up; Working with It; Model Testing: Parameter Sweep; When It Works...; ...And When It Doesn't; The Wrap-up; Chapter 9. New and Upcoming; doRedis; RevoScale R and RevoConnectR (RHadoop); cloudNumbers.com