Graph data science with Neo4j: learn how to use Neo4j 5 with Graph Data Science Library 2.0 and its Python driver for your project

The latest version of Neo4j addresses data scientists' ever-increasing need for graph representations to model complex relationships and extract contextual information. This book shows you how to set up a graph machine learning pipeline using Neo4j 5, its Graph Data Science Libra...

Bibliographic Details
Other Authors:	Scifo, Estelle, author
Format:	E-book
Language:	English
Published:	Birmingham, England : Packt Publishing, Limited [2023]
Edition:	1st ed.
Subjects:	Non-relational databases. Graphic methods.
View at Universitat Ramon Llull Library:	https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009720313006719

Table of Contents:

Cover
Copyright
Contributors
Table of Contents
Preface
Part 1 - Creating Graph Data in Neo4j
Chapter 1: Introducing and Installing Neo4j
Technical requirements
What is a graph database?
Databases
Graph database
Finding or creating a graph database
A note about the graph dataset's format
Modeling your data as a graph
Neo4j in the graph databases landscape
Neo4j ecosystem
Setting up Neo4j
Downloading and starting Neo4j Desktop
Creating our first Neo4j database
Creating a database in the cloud - Neo4j Aura
Inserting data into Neo4j with Cypher, the Neo4j query language
Extracting data from Neo4j with Cypher pattern matching
Summary
Further reading
Exercises
Chapter 2: Importing Data into Neo4j to Build a Knowledge Graph
Technical requirements
Importing CSV data into Neo4j with Cypher
Discovering the Netflix dataset
Defining the graph schema
Importing data
Introducing the APOC library to deal with JSON data
Browsing the dataset
Getting to know and installing the APOC plugin
Loading data
Dealing with temporal data
Discovering the Wikidata public knowledge graph
Data format
Query language - SPARQL
Enriching our graph with Wikidata information
Loading data into Neo4j for one person
Importing data for all people
Dealing with spatial data in Neo4j
Importing data in the cloud
Summary
Further reading
Exercises
Part 2 - Exploring and Characterizing Graph Data with Neo4j
Chapter 3: Characterizing a Graph Dataset
Technical requirements
Characterizing a graph from its node and edge properties
Link direction
Link weight
Node type
Computing the graph degree distribution
Definition of a node's degree
Computing the node degree with Cypher
Visualizing the degree distribution with NeoDash
Installing and using the Neo4j Python driver
Counting node labels and relationship types in Python
Building the degree distribution of a graph
Improved degree distribution
Learning about other characterizing metrics
Triangle count
Clustering coefficient
Summary
Further reading
Exercises
Chapter 4: Using Graph Algorithms to Characterize a Graph Dataset
Technical requirements
Digging into the Neo4j GDS library
GDS content
Installing the GDS library with Neo4j Desktop
GDS project workflow
Projecting a graph for use by GDS
Native projections
Cypher projections
Computing a node's degree with GDS
stream mode
The YIELD keyword
write mode
mutate mode
Algorithm configuration
Other centrality metrics
Understanding a graph's structure by looking for communities
Number of components
Modularity and the Louvain algorithm
Summary
Further reading
Chapter 5: Visualizing Graph Data
Technical requirements
The complexity of graph data visualization
Physical networks
General case
Visualizing a small graph with networkx and matplotlib
Visualizing a graph with known coordinates
Visualizing a graph with unknown coordinates
Configuring object display
Discovering the Neo4j Bloom graph application
What is Bloom?
Bloom installation
Selecting data with Neo4j Bloom
Configuring the scene in Bloom
Visualizing large graphs with Gephi
Installing Gephi and its required plugin
Using APOC Extended to synchronize Neo4j and Gephi
Configuring the view in Gephi
Summary
Further reading
Exercises
Part 3 - Making Predictions on a Graph
Chapter 6: Building a Machine Learning Model with Graph Features
Technical requirements
Introducing the GDS Python client
GDS Python principles
Input and output types
Creating a projected graph from Python
Running GDS algorithms from Python and extracting data in a dataframe
write mode
stream mode
Dropping the projected graph
Using features from graph algorithms in a scikit-learn pipeline
Machine learning tasks with graphs
Our task
Computing features
Extracting and visualizing data
Building the model
Summary
Further reading
Exercise
Chapter 7: Automatically Extracting Features with Graph Embeddings for Machine Learning
Technical requirements
Introducing graph embedding algorithms
Defining embeddings
Graph embedding classification
Using a transductive graph embedding algorithm
Understanding the Node2Vec algorithm
Using Node2Vec with GDS
Training an inductive embedding algorithm
Understanding GraphSAGE
Introducing the GDS model catalog
Training GraphSAGE with GDS
Computing new node representations
Summary
Further reading
Exercises
Chapter 8: Building a GDS Pipeline for Node Classification Model Training
Technical requirements
The GDS pipelines
What is a pipeline?
Building and training a pipeline
Creating the pipeline and choosing the features
Setting the pipeline configuration
Training the pipeline
Making predictions
Computing the confusion matrix
Using embedding features
Choosing the graph embedding algorithm to use
Training using Node2Vec
Training using GraphSAGE
Summary
Further reading
Exercise
Chapter 9: Predicting Future Edges
Technical requirements
Introducing the LP problem
LP examples
LP with the Netflix dataset
Framing an LP problem
LP features
Topological features
Features based on node properties
Building an LP pipeline with the GDS
Creating and configuring the pipeline
Pipeline training and testing
Summary
Further reading
Chapter 10: Writing Your Custom Graph Algorithms with the Pregel API in Java
Technical requirements
Introducing the Pregel API
GDS's features
The Pregel API
Implementing the PageRank algorithm
The PageRank algorithm
Simple Python implementation
Pregel Java implementation
Implementing the tolerance-stopping criteria
Testing our code
Test for the PageRank class
Test for the PageRankTol class
Using our algorithm from Cypher
Adding annotations
Building the JAR file
Updating the Neo4j configuration
Testing our procedure
Summary
Further reading
Exercises
Index
Other Books You May Enjoy
