0

Hi everyone on Stackoverflow,

I have simple query to load csv, but browser takes long time to load and still shows loading sign.

Here is my query:

LOAD CSV WITH HEADERS FROM "file:///ratings_small.csv" AS line 
WITH line MERGE (u:User{name:line.userId}) 
MERGE (m:Movie {movieId:line.movieId}) 
MERGE (u)-[r:RATED {rating:line.rating}]->(m) 
RETURN u, m, r

I don't see mistake in the code..

ratings_small.csv has following columns:

userId movieId rating timestamp

1 31 2.5 1260759144

1 1029 3 1260759179

1 1061 3 1260759182

1 1129 2 1260759185

1 1172 4 1260759205

There are 100 004 records.

Kind regards, Anna

Christophe Willemsen
  • 19,399
  • 2
  • 29
  • 36
Anna
  • 1
  • 1
  • 4

1 Answers1

2

I believe you don't have indexes on both :User(name) and :Movie(movieId).

Perform the two following queries and retry your LOAD CSV command, it should then import it in a couple of seconds :

CREATE CONSTRAINT ON (u:User) ASSERT u.name IS UNIQUE;
CREATE CONSTRAINT ON (m:Movie) ASSERT m.moveId IS UNIQUE;

Secondly, I suppose users rate a movie only once, so the MERGE of the relationship is not necessary to happen with the property, you can change your query like this :

LOAD CSV WITH HEADERS FROM "file:///ratings_small.csv" AS line 
WITH line MERGE (u:User{name:line.userId}) 
MERGE (m:Movie {movieId:line.movieId}) 
MERGE (u)-[r:RATED]->(m)
SET r.rating = line.rating
RETURN u, m, r

To avoid Neo4j to try to process the full CSV in one transaction, you can specify the batch size of transactional operations (to avoid memory issues) like this :

USING PERIODIC COMMIT 2000
LOAD CSV WITH HEADERS FROM "file:///ratings_small.csv" AS line 
WITH line MERGE (u:User{name:line.userId}) 
MERGE (m:Movie {movieId:line.movieId}) 
MERGE (u)-[r:RATED]->(m)
SET r.rating = line.rating
RETURN u, m, r

Lastly, I doubt you have any application or even the Neo4j browser that can visualise everything at once, so I would not return anything from the LOAD CSV query :

USING PERIODIC COMMIT 2000
LOAD CSV WITH HEADERS FROM "file:///ratings_small.csv" AS line 
WITH line MERGE (u:User{name:line.userId}) 
MERGE (m:Movie {movieId:line.movieId}) 
MERGE (u)-[r:RATED]->(m)
SET r.rating = line.rating
Christophe Willemsen
  • 19,399
  • 2
  • 29
  • 36
  • Thank you I added constraints, now I have this error: Neo.DatabaseError.Statement.ExecutionFailed - java heap space – Anna Dec 29 '20 at 12:13
  • I set as follows: dbms.memory.heap.initial_size=5G dbms.memory.heap.max_size=5G – Anna Dec 29 '20 at 12:14
  • dbms.memory.pagecache.size=2G – Anna Dec 29 '20 at 12:39
  • The NODES (:Movie, :User) are loaded Ok one by one. Now I run the query: -------> USING PERIODIC COMMIT 2000 LOAD CSV WITH HEADERS FROM "file:///ratings_small.csv" AS line WITH line MATCH (u:User{userId:line.userId}), (m:Movie{movieId:line.movieId}) CREATE (u)-[r:RATED ]->(m) SET r.rating = line.rating ------> It outputs: Executing queries that use periodic commits in an open transaction is not possible – Anna Dec 29 '20 at 12:54
  • Remove the `WITH line` – Christophe Willemsen Dec 29 '20 at 15:22
  • Thank you. It says: The query builds a cartesian product between disconnected patterns – Anna Dec 29 '20 at 21:36