I used the Neo4j API to batch insert a few million records, which would normally take much longer if done individually. The import finished considerably faster than expected, but I do not see the millions of records I inserted. Does Neo4j keep some form of queue for batch insertions and insert the records over time?

If so, how can I see the progress of this queue? I am doing a count and I notice the records are increasing, but at a very slow pace.

I am using the Neography gem's batch insertion (https://github.com/maxdemarzi/neography/wiki/Batch) and the code that does the batch insert is below:

# Send one Neography batch request per group of 500 users.
User.find_in_batches(batch_size: 500) do |group|
  $neo4j.batch(*group.map { |user| ["create_unique_node", "users", "id", user.id, user.graph_node_properties] })
end

Running Neo4j 2.1.2 enterprise edition on Ubuntu 12.04 LTS.

Jey Balachandran

1 Answer

Did you shut down the batch inserter correctly after finishing your insert?

Which Neo4j version and OS are you using?

Also make sure your memory config is correct. Configure the memory mapping for the node store and relationship store correctly; see Rik's blog post for that.

It is not a queue, it is writing those records directly. How do you "check" and count? You must not access the database concurrently.

Can you share your code?

Michael Hunger
  • I'm using the Neography gem and its batch insertion: https://github.com/maxdemarzi/neography/wiki/Batch. Neo4j Enterprise 2.1.2 on Ubuntu 12.04 LTS. I'm checking the count via `match (m:User) return count(*)`. – Jey Balachandran Jul 27 '14 at 21:26
  • Can you try `match (m) return count(*)`? I don't think batch mode supports labels; the unique index there is a legacy index. You'd have to run `start n=node:users("id:*") return count(*)`. – Michael Hunger Jul 30 '14 at 09:23
  • You are correct, it was because I wasn't adding the labels. – Jey Balachandran Jul 30 '14 at 15:44
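
For anyone landing here with the same symptom, one way to get labelled nodes out of a batch is to send Cypher `MERGE` statements instead of `create_unique_node`. The sketch below only builds the command list, so it runs without a server. It assumes Neography's batch accepts `[:execute_query, query, params]` entries, which is how I remember the Batch wiki page; `labelled_user_commands` and the shape of the user hashes are hypothetical and not part of the gem or of the original code.

```ruby
# Cypher for Neo4j 2.x: MERGE finds or creates a :User node by id, then
# SET replaces its properties with the supplied map (which includes id).
LABELLED_MERGE = "MERGE (u:User {id: {id}}) SET u = {props}"

# Hypothetical helper: turns plain user hashes into batch commands.
# Assumption: Neography's batch understands [:execute_query, query, params].
def labelled_user_commands(users)
  users.map do |user|
    # Keep id inside the property map so SET does not drop it.
    props = user[:properties].merge(id: user[:id])
    [:execute_query, LABELLED_MERGE, { id: user[:id], props: props }]
  end
end

users = [
  { id: 1, properties: { name: "Ann" } },
  { id: 2, properties: { name: "Bob" } }
]
commands = labelled_user_commands(users)

# With a live server this would be sent the same way as in the question:
#   $neo4j.batch(*commands)
```

Nodes created this way should show up in `match (m:User) return count(*)`. Because `MERGE` alone does not guarantee uniqueness under concurrent writes, a constraint such as `CREATE CONSTRAINT ON (u:User) ASSERT u.id IS UNIQUE` is probably worth creating before the import.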