How does "gds.graph.project" actually work in Neo4j?

Question

I'm a Neo4j beginner.

While practicing with the Neo4j online course "Neo4j Graph Data Science Fundamentals", I got confused by the following course example.

Course Example

First, create the graph projection.

CALL gds.graph.project('proj',
    ['Person','Movie'],
    {
        ACTED_IN:{orientation:'UNDIRECTED'},
        DIRECTED:{orientation:'UNDIRECTED'}
    }
);

Then we can run Dijkstra’s shortest path.

MATCH (a:Actor)
WHERE a.name IN ['Kevin Bacon', 'Denzel Washington']
WITH collect(id(a)) AS nodeIds
CALL gds.shortestPath.dijkstra.stream('proj', {sourceNode:nodeIds[0], targetNode:nodeIds[1]})
YIELD sourceNode, targetNode, path
RETURN gds.util.asNode(sourceNode).name AS sourceNodeName,
    gds.util.asNode(targetNode).name AS targetNodeName,
    nodes(path) as path;

Question

Regarding gds.graph.project, my understanding is that it creates a subgraph (a subset) of the original graph.

However, in the creation of this subgraph, there are no nodes with the Actor label. So why is it possible to execute MATCH (a:Actor) when performing the Dijkstra algorithm?

score 2 · Accepted Answer · answered May 25 '23 at 07:53

The MATCH (a:Actor) clause queries the full database, not the GDS projection ('proj'). Only procedures whose names start with gds. work with the projection.
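To see the separation for yourself, here is a minimal sketch, assuming the 'proj' projection from the question still exists in the graph catalog. The first statement inspects the in-memory projection through the catalog; the second is ordinary Cypher and reads the stored database. The alias actorsInDatabase is just an illustrative name.

```cypher
// Inspect the in-memory projection via the GDS graph catalog.
CALL gds.graph.list('proj')
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount;

// Plain Cypher reads the stored database, not the projection.
MATCH (a:Actor)
RETURN count(a) AS actorsInDatabase;
```

The two results come from different places, so the counts do not need to match.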

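Also, a node can have any number of labels, so a Person node can additionally carry the Actor label, the Director label, or both. A node that was projected because of its Person label is therefore still found by MATCH (a:Actor) in the database, and its ID can then be passed to the algorithm.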
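A quick way to check this against the course dataset is to list the labels on the matched nodes. This is a sketch that reuses the names from the question; what it returns depends on how the dataset was loaded, but if actors are stored as Person nodes with an extra Actor label, both labels should appear in the list.

```cypher
// List every label on the two nodes matched in the question.
MATCH (a:Actor)
WHERE a.name IN ['Kevin Bacon', 'Denzel Washington']
RETURN a.name AS name, labels(a) AS labels;
```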
