Apache Spark Graph Processing: Build, process, and analyze large-scale graphs with Spark

Build, process and analyze large-scale graph data effectively with Spark. About This Book: Find solutions for every stage of data processing from loading and transforming graph data to Improve the scalability of your graphs with a variety of real-world applications with complete Scala code. A concise...

Bibliographic Details
Other Authors: Ramamonjison, Rindra (author); Lee, Denny (author of foreword)
Format: E-book
Language: English
Published: Birmingham : Packt Publishing 2015.
Edition: 1st edition
Series: Community experience distilled.
Subjects:
View at Universitat Ramon Llull Library: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009629727506719
Table of Contents:
  • Cover; Copyright; Credits; Foreword; About the Author; About the Reviewer; www.PacktPub.com; Table of Contents; Preface
  • Chapter 1: Getting Started with Spark and GraphX; Downloading and installing Spark 1.4.1; Experimenting with the Spark shell; Getting started with GraphX; Building a tiny social network; Loading the data; The property graph; Transforming RDDs to VertexRDD and EdgeRDD; Introducing graph operations; Building and submitting a standalone application; Writing and configuring a Spark program; Building the program with the Scala Build Tool; Deploying and running with spark-submit; Summary
  • Chapter 2: Building and Exploring Graphs; Network datasets; The communication network; Flavor networks; Social ego networks; Graph builders; The Graph factory method; edgeListFile; fromEdges; fromEdgeTuples; Building graphs; Building directed graphs; Building a bipartite graph; Building a weighted social ego network; Computing the degrees of the network nodes; In-degree and out-degree of the Enron email network; Degrees in the bipartite food network; Degree histogram of the social ego networks; Summary
  • Chapter 3: Graph Analysis and Visualization; Network datasets; The graph visualization; Installing the GraphStream and BreezeViz libraries; Visualizing the graph data; Plotting the degree distribution; The analysis of network connectedness; Finding the connected components; Counting triangles and computing clustering coefficients; The network centrality and PageRank; How PageRank works; Ranking web pages; Scala Build Tool revisited; Organizing build definitions; Managing library dependencies; A preview of the steps; Running tasks with SBT commands; Summary
  • Chapter 4: Transforming and Shaping Up Graphs to Your Needs; Transforming the vertex and edge attributes; mapVertices; mapEdges; mapTriplets; Modifying graph structures; The reverse operator; The subgraph operator; The mask operator; The groupEdges operator; Joining graph datasets; joinVertices; outerJoinVertices; Example - Hollywood movie graph; Data operations on VertexRDD and EdgeRDD; Mapping VertexRDD and EdgeRDD; Filtering VertexRDDs; Joining VertexRDDs; Joining EdgeRDDs; Reversing edge directions; Collecting neighboring information; Example - from food network to flavor pairing; Summary
  • Chapter 5: Creating Custom Graph Aggregation Operators; NCAA College Basketball datasets; The aggregateMessages operator; EdgeContext; Abstracting out the aggregation; Keeping things DRY; Coach wants more numbers; Calculating average points per game; Defense stats - D matters as in direction; Joining average stats into a graph; Performance optimization; The MapReduceTriplets operator; Summary
  • Chapter 6: Iterative Graph-Parallel Processing with Pregel; The Pregel computational model; Example - iterating towards the social equality; The Pregel API in GraphX; Community detection through label propagation; The Pregel implementation of PageRank; Summary
  • Chapter 7: Learning Graph Structures