
I'm trying to insert data from my SQL database into Neo4j. I have a CSV file where every row generates 4-5 entities and some relations between them. Entities may be duplicated across rows, and I want to enforce uniqueness.

What I currently do is:

  • create a constraint for each label to enforce uniqueness.
  • iterate the CSV:
    • start transaction
    • create merge statements for the entities
    • create merge statements for the relations
    • commit transaction
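
The steps above can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the actual import code: the `Person`/`Company` labels, the `WORKS_AT` relation, the column names, and the helper names `batches` and `plan_import` are all hypothetical. It only builds the per-transaction list of parameterized MERGE statements and does not call any Neo4j driver.

```python
import csv
import io
from itertools import islice

# Hypothetical schema for illustration: one Person and one Company per row.
# Parameter syntax depends on the Neo4j version ($name here; older releases use {name}).
MERGE_STATEMENT = (
    "MERGE (p:Person {id: $person_id}) "
    "MERGE (c:Company {id: $company_id}) "
    "MERGE (p)-[:WORKS_AT]->(c)"
)


def batches(rows, size):
    """Yield lists of at most `size` rows; each list maps to one transaction."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def plan_import(csv_file, batch_size=100):
    """Return, per transaction, the (statement, parameters) pairs to run."""
    reader = csv.DictReader(csv_file)
    plan = []
    for chunk in batches(reader, batch_size):
        # In a real import: start transaction, run each pair, commit transaction.
        plan.append([(MERGE_STATEMENT, dict(row)) for row in chunk])
    return plan


sample = io.StringIO("person_id,company_id\n1,10\n2,10\n3,11\n")
transactions = plan_import(sample, batch_size=2)
print(len(transactions), [len(t) for t in transactions])
```

Passing the values as parameters keeps the statement text identical for every row, so only the parameter map changes between executions.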

Performance was bad. I then tried committing the transaction every X rows (X was 100, 500, 1000 and 5000). It's better now, but I still have 2 problems:

  • It's slow: on average around 1-1.5 seconds per 100 rows (row = 4-5 entities and 4-5 relations).
  • It gets worse as I keep adding data. I usually start at 400-500 ms per 100 rows, and after ~5000 rows I'm at ~4-5 seconds per 100 rows.

As far as I know, my constraint also creates an index on that field, and that's the field I match on when I create the new node with MERGE. Is there any chance MERGE doesn't use the index?

What's the best practice for improving performance? I saw BatchInserter, but I wasn't sure whether I can use it with MERGE operations.

Thanks

Zach Moshe
  • I'm dealing with this exact same issue. Check out some of my SO questions to see things I've tried. I'm going to try filtering IDs so that I can batch with 1 match and 1 merge. – dcinzona Mar 02 '14 at 16:40
  • Filtering by IDs as recommended in this SO answer: http://stackoverflow.com/questions/22102181/neo4j-best-way-to-batch-relate-nodes-using-cypher/22116658?noredirect=1#22116658 did not work. – dcinzona Mar 02 '14 at 22:47
  • What driver are you using to programmatically generate the insert statements off of your CSV? – Kenny Bastani Mar 04 '14 at 22:09
  • Do you need MERGE? I found that CREATE is an order of magnitude faster and causes fewer db hits. Try running a small query using PROFILE, first with MERGE and then with CREATE, to see if it makes a difference for you. – untitled May 21 '15 at 13:09

0 Answers