I am trying to load a CSV file into a Neo4j database where the file contains different types of edges between nodes. I would like to load all the different types of edges from one file in one query (as opposed to breaking the file into separate files each containing a different type of edge). For instance:
Source|Target|Relationship
x1|y1|Creates
x2|y2|Uses
x3|y1|Uses
Cypher does not like the following load query:
LOAD CSV WITH HEADERS FROM 'file:///filename.csv' AS line FIELDTERMINATOR '|'
MERGE (x:Node {name: line.Source})
MERGE (y:Node {name: line.Target})
CREATE (x)-[:line.Relationship]->(y)
As suggested here, I can use APOC instead, as follows:
LOAD CSV WITH HEADERS FROM 'file:///filename.csv' AS line FIELDTERMINATOR '|'
MERGE (x:Node {name: line.Source})
MERGE (y:Node {name: line.Target})
CALL apoc.create.relationship(x, line.Relationship, {}, y) YIELD rel
RETURN *
However, this performs very slowly when run on a large scale (50,000) compared to the first example, and I suspect it is related to YIELD rel RETURN *. I am using Neo4j's .NET driver, and executing this query returns a list of all the edges it has created.
Naively dropping YIELD or RETURN results in errors such as the following (see this for some explanation):
Query cannot conclude with CALL together with YIELD
So, I was wondering how best I can improve the above query, ideally without having to return or yield any of the results.
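For reference, here is a minimal sketch of one variant that keeps YIELD and RETURN but aggregates the result, so that a single count row (rather than every node and relationship) is sent back to the driver. It assumes APOC is installed and that apoc.create.relationship takes the arguments (start node, type, properties map, end node). The alias created is only an illustrative name, and whether this removes the slowdown is untested here.

```cypher
// Same load as above, but aggregate instead of RETURN *
LOAD CSV WITH HEADERS FROM 'file:///filename.csv' AS line FIELDTERMINATOR '|'
MERGE (x:Node {name: line.Source})
MERGE (y:Node {name: line.Target})
// Empty map {} = no relationship properties (assumption: none are needed)
CALL apoc.create.relationship(x, line.Relationship, {}, y) YIELD rel
// Only one row, holding the number of created relationships, is returned
RETURN count(rel) AS created
```

With this form the .NET driver would receive one record containing the count instead of a list of all created edges.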