Graph Data Science with Neo4j: Learn how to use Neo4j 5 with Graph Data Science Library 2.0 and its Python driver for your project

The latest version of Neo4j addresses data scientists' ever-increasing need for graph representations to model complex relationships and extract contextual information. This book shows you how to set up a graph machine learning pipeline using Neo4j 5, its Graph Data Science Libra...

Bibliographic Details
Other Authors: Scifo, Estelle (author)
Format: E-book
Language: English
Published: Birmingham, England: Packt Publishing, Limited [2023]
Edition: 1st ed.
View at Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009720313006719
Table of Contents:
  • Cover
  • Copyright
  • Contributors
  • Table of Contents
  • Preface
  • Part 1 - Creating Graph Data in Neo4j
  • Chapter 1: Introducing and Installing Neo4j
  • Technical requirements
  • What is a graph database?
  • Databases
  • Graph database
  • Finding or creating a graph database
  • A note about the graph dataset's format
  • Modeling your data as a graph
  • Neo4j in the graph databases landscape
  • Neo4j ecosystem
  • Setting up Neo4j
  • Downloading and starting Neo4j Desktop
  • Creating our first Neo4j database
  • Creating a database in the cloud - Neo4j Aura
  • Inserting data into Neo4j with Cypher, the Neo4j query language
  • Extracting data from Neo4j with Cypher pattern matching
  • Summary
  • Further reading
  • Exercises
  • Chapter 2: Importing Data into Neo4j to Build a Knowledge Graph
  • Technical requirements
  • Importing CSV data into Neo4j with Cypher
  • Discovering the Netflix dataset
  • Defining the graph schema
  • Importing data
  • Introducing the APOC library to deal with JSON data
  • Browsing the dataset
  • Getting to know and installing the APOC plugin
  • Loading data
  • Dealing with temporal data
  • Discovering the Wikidata public knowledge graph
  • Data format
  • Query language - SPARQL
  • Enriching our graph with Wikidata information
  • Loading data into Neo4j for one person
  • Importing data for all people
  • Dealing with spatial data in Neo4j
  • Importing data in the cloud
  • Summary
  • Further reading
  • Exercises
  • Part 2 - Exploring and Characterizing Graph Data with Neo4j
  • Chapter 3: Characterizing a Graph Dataset
  • Technical requirements
  • Characterizing a graph from its node and edge properties
  • Link direction
  • Link weight
  • Node type
  • Computing the graph degree distribution
  • Definition of a node's degree
  • Computing the node degree with Cypher
  • Visualizing the degree distribution with NeoDash
  • Installing and using the Neo4j Python driver
  • Counting node labels and relationship types in Python
  • Building the degree distribution of a graph
  • Improved degree distribution
  • Learning about other characterizing metrics
  • Triangle count
  • Clustering coefficient
  • Summary
  • Further reading
  • Exercises
  • Chapter 4: Using Graph Algorithms to Characterize a Graph Dataset
  • Technical requirements
  • Digging into the Neo4j GDS library
  • GDS content
  • Installing the GDS library with Neo4j Desktop
  • GDS project workflow
  • Projecting a graph for use by GDS
  • Native projections
  • Cypher projections
  • Computing a node's degree with GDS
  • stream mode
  • The YIELD keyword
  • write mode
  • mutate mode
  • Algorithm configuration
  • Other centrality metrics
  • Understanding a graph's structure by looking for communities
  • Number of components
  • Modularity and the Louvain algorithm
  • Summary
  • Further reading
  • Chapter 5: Visualizing Graph Data
  • Technical requirements
  • The complexity of graph data visualization
  • Physical networks
  • General case
  • Visualizing a small graph with networkx and matplotlib
  • Visualizing a graph with known coordinates
  • Visualizing a graph with unknown coordinates
  • Configuring object display
  • Discovering the Neo4j Bloom graph application
  • What is Bloom?
  • Bloom installation
  • Selecting data with Neo4j Bloom
  • Configuring the scene in Bloom
  • Visualizing large graphs with Gephi
  • Installing Gephi and its required plugin
  • Using APOC Extended to synchronize Neo4j and Gephi
  • Configuring the view in Gephi
  • Summary
  • Further reading
  • Exercises
  • Part 3 - Making Predictions on a Graph
  • Chapter 6: Building a Machine Learning Model with Graph Features
  • Technical requirements
  • Introducing the GDS Python client
  • GDS Python principles
  • Input and output types
  • Creating a projected graph from Python
  • Running GDS algorithms from Python and extracting data in a dataframe
  • write mode
  • stream mode
  • Dropping the projected graph
  • Using features from graph algorithms in a scikit-learn pipeline
  • Machine learning tasks with graphs
  • Our task
  • Computing features
  • Extracting and visualizing data
  • Building the model
  • Summary
  • Further reading
  • Exercise
  • Chapter 7: Automatically Extracting Features with Graph Embeddings for Machine Learning
  • Technical requirements
  • Introducing graph embedding algorithms
  • Defining embeddings
  • Graph embedding classification
  • Using a transductive graph embedding algorithm
  • Understanding the Node2Vec algorithm
  • Using Node2Vec with GDS
  • Training an inductive embedding algorithm
  • Understanding GraphSAGE
  • Introducing the GDS model catalog
  • Training GraphSAGE with GDS
  • Computing new node representations
  • Summary
  • Further reading
  • Exercises
  • Chapter 8: Building a GDS Pipeline for Node Classification Model Training
  • Technical requirements
  • The GDS pipelines
  • What is a pipeline?
  • Building and training a pipeline
  • Creating the pipeline and choosing the features
  • Setting the pipeline configuration
  • Training the pipeline
  • Making predictions
  • Computing the confusion matrix
  • Using embedding features
  • Choosing the graph embedding algorithm to use
  • Training using Node2Vec
  • Training using GraphSAGE
  • Summary
  • Further reading
  • Exercise
  • Chapter 9: Predicting Future Edges
  • Technical requirements
  • Introducing the LP problem
  • LP examples
  • LP with the Netflix dataset
  • Framing an LP problem
  • LP features
  • Topological features
  • Features based on node properties
  • Building an LP pipeline with the GDS
  • Creating and configuring the pipeline
  • Pipeline training and testing
  • Summary
  • Further reading
  • Chapter 10: Writing Your Custom Graph Algorithms with the Pregel API in Java
  • Technical requirements
  • Introducing the Pregel API
  • GDS's features
  • The Pregel API
  • Implementing the PageRank algorithm
  • The PageRank algorithm
  • Simple Python implementation
  • Pregel Java implementation
  • Implementing the tolerance-stopping criteria
  • Testing our code
  • Test for the PageRank class
  • Test for the PageRankTol class
  • Using our algorithm from Cypher
  • Adding annotations
  • Building the JAR file
  • Updating the Neo4j configuration
  • Testing our procedure
  • Summary
  • Further reading
  • Exercises
  • Index
  • Other Books You May Enjoy