
I've been playing with a neo4j 2.0 db for a few months and I plan to install the db on a dedicated server. I've already tried several neo4j configurations (JVM, caches, ...) but I'm still not sure I have found the best one. So it seems better to ask the experts :)

Context

Db primitives:

Nodes = 224,114,478

Relationships = 417,681,104

Properties = 224,342,951

Db files:

  • nodestore.db = 3.064 GB

  • relationshipstore.db = 13.460 GB

  • propertystore.db = 8.982 GB

  • propertystore.db.string = 5.998 GB

  • propertystore.db.arrays = 1 KB

OS server:

Windows server 2012 (64b)

Db usage:

Mostly graph traversals using cypher queries.

Performance is not too bad on my dev laptop, even though some queries lag badly (I suspect the main cause is swapping due to lack of RAM).

Graph specificities:

I suspect that some nodes may be huge hubs (up to 1M relationships), but this should remain exceptional.


What would you advise regarding:

  • hardware sizing,

  • neo4j configuration:

    • heap size,

    • use_memory_mapped_buffers (is there any reason to keep the value at false on Windows?)

    • cache type,

    • recommended jvm settings for windows,

    • ...

Thanks in advance!

laurent


2 Answers


Please take a look at these two guides for performance tuning and hardware sizing calculations:

Kenny Bastani
  • I've read the performance tuning guide (several times) but was still not sure about potential specificities of a Windows environment. But you're right, they're very good resources on the subject. – laurent Nov 08 '14 at 16:09

The total on-disk size of your graph is ~32 GB. Neo4j has a two-layered cache architecture. The first layer is the file buffer cache. Ideally it should be the same size as the on-disk graph, so ~32 GB in your case.
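
As a quick check on the ~32 GB figure, the store file sizes listed in the question can simply be summed. This is only a sketch: the dictionary and function names are made up for illustration, and the 1 kb arrays file is rounded to zero.

```python
# Store file sizes in GB, copied from the question.
STORE_SIZES_GB = {
    "nodestore.db": 3.064,
    "relationshipstore.db": 13.460,
    "propertystore.db": 8.982,
    "propertystore.db.string": 5.998,
    "propertystore.db.arrays": 0.0,  # 1 kb in the question, negligible here
}


def total_store_size_gb(sizes):
    """Return the summed on-disk size of all store files, in GB."""
    return sum(sizes.values())


total_gb = total_store_size_gb(STORE_SIZES_GB)
print(f"Total on-disk size: {total_gb:.1f} GB")
```

The sum comes out slightly below 32 GB, so sizing the file buffer cache at 32 GB leaves a small margin for growth.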

IMPORTANT: When running Neo4j on Windows, the file buffer cache is part of the Java heap (due to the suckiness of Windows itself). On Linux/Mac it is off heap. That is the reason why I generally do not recommend Windows for Neo4j production environments.

cache_type should be hpc when using the enterprise edition and soft for the community edition.

To leave a reasonable amount for the second cache layer (the object cache), I'd suggest a machine with at least 64 GB RAM. Since the file buffer cache and the object cache are both on heap, make the heap large and consider using G1: -Xmx60G -XX:+UseG1GC. Observe GC behaviour by uncommenting GC logging in neo4j-wrapper.conf and tweak the settings step by step.
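
The sizing reasoning above can be written out as simple arithmetic. This is a rough sketch, not a Neo4j tool: the function name and the result keys are hypothetical, and the 4 GB of operating system headroom is just what remains of 64 GB RAM after the suggested 60 GB heap.

```python
def heap_budget_gb(ram_gb, heap_gb, file_buffer_gb):
    """Split RAM into OS headroom, file buffer cache and the rest of the heap.

    Assumes the Windows case described above, where the file buffer
    cache lives on the Java heap alongside the object cache.
    """
    if heap_gb > ram_gb:
        raise ValueError("heap cannot exceed physical RAM")
    if file_buffer_gb > heap_gb:
        raise ValueError("file buffer cache does not fit in the heap")
    return {
        "os_headroom": ram_gb - heap_gb,
        "file_buffer_cache": file_buffer_gb,
        # What is left of the heap for the object cache and normal JVM use.
        "object_cache_and_jvm": heap_gb - file_buffer_gb,
    }


budget = heap_budget_gb(ram_gb=64, heap_gb=60, file_buffer_gb=32)
for name, size in budget.items():
    print(f"{name}: {size} GB")
```

With these numbers, roughly 28 GB of heap remains for the object cache and everything else the JVM needs, which is why a machine much smaller than 64 GB would get tight.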

Please note that Neo4j 2.2 might come with a different file buffer cache implementation that works off heap on Windows as well.

Stefan Armbruster