I am currently using Spring Data Neo4j (SDN) 4.1.0.M1 for a data set containing about 1.2 million nodes and 12 million relationships. The data structure behind the graph is very complex and hierarchical. In total there are 79 NodeEntities, and some of them contain more than 20 @Relationship attributes.
Below is an example of the hierarchy in my domain:
@NodeEntity
public abstract class DatabaseObject implements java.io.Serializable {
    @GraphId
    private Long id;
    private Long dbId;
    private String stableIdentifier;
    private String displayName;
    @Relationship(type = "created")
    private InstanceEdit created;
    @Relationship(type = "modified")
    private List<InstanceEdit> modified;
    ...
@NodeEntity
public abstract class Event extends DatabaseObject {
    private String definition;
    private List<String> names;
    private Boolean isInferred;
    @Relationship(type = "authored", direction = "OUTGOING")
    private List<InstanceEdit> authored;
    @Relationship(type = "precedingEvent")
    private List<Event> precedingEvent;
    @Relationship(type = "literatureReference")
    private List<Publication> literatureReference;
    @Relationship(type = "regulatedBy")
    private List<Regulation> regulatedBy;
    ...
@NodeEntity
public class ReactionLikeEvent extends Event {
    private Boolean isChimeric;
    private String systematicName;
    @Relationship(type = "input")
    private List<Input> input;
    @Relationship(type = "output")
    private List<Output> output;
    ...
An example of one of my repositories:
@Repository
public interface DatabaseObjectRepository extends GraphRepository<DatabaseObject> {
    DatabaseObject findByDbId(Long dbId);
    ...
Queries sent to the Neo4j REST service or run in the remote web admin (specifically, queries for objects that have many relationships) perform as expected (10-100 ms at most), but performance drops drastically when I retrieve the same entries through SDN (100-500 ms). The drop only occurs when the query depth is set to 1. With a query depth of 0, no relationships are returned and the response is fast. Indexes are in place, and response times do not change when I query by the native Neo4j id instead.
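For anyone who wants to reproduce the depth 0 versus depth 1 comparison, a timing harness along the following lines could be used. This is only a minimal sketch: `LoadTiming` and `medianMillis` are hypothetical names, not part of SDN, and the two suppliers are stand-ins so the code runs without a database. In the real application each supplier would wrap the corresponding repository call.

```java
import java.util.Arrays;
import java.util.function.Supplier;

public class LoadTiming {

    // Hypothetical helper (not part of SDN): calls the load `runs` times and
    // returns the median wall-clock time in milliseconds (the upper middle
    // value when `runs` is even).
    public static <T> double medianMillis(Supplier<T> load, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be at least 1");
        }
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            load.get();
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos[runs / 2] / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Stand-in suppliers so the sketch runs without a database. In the
        // real application each one would wrap a load at depth 0 or depth 1.
        Supplier<String> depthZero = () -> "node only";
        Supplier<String> depthOne = () -> "node plus related nodes";

        System.out.printf("depth 0: %.3f ms%n", medianMillis(depthZero, 20));
        System.out.printf("depth 1: %.3f ms%n", medianMillis(depthOne, 20));
    }
}
```

Taking the median over several runs rather than timing a single call keeps one-off effects such as JVM warm-up or a cold cache from dominating the numbers.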
For other use cases (for example, specific queries for smaller objects, @QueryResult objects, or collections of objects) SDN performs well. My problem is specific to queries that retrieve objects with many relationships, or queries with a depth greater than one. Is the poor performance a result of the complex domain hierarchy and overly rich NodeEntities? Do I need to simplify my hierarchy to get better performance?
Thanks for your help