I am designing a brand new application that relies heavily on dates. Basically, every query I make starts with a range of dates. I built a chain of date nodes like this:

(:Date)-[:NEXT_DAY]->(:Date)-[:NEXT_DAY]-> ....

I found that using [:NEXT_DAY] relationships is very efficient for querying ranges and ordering results.

I have many documents linked to these days:

(:Document)-[:PUBLISHED_ON]->(:Date)

The most basic query is to match all the documents published along the date path. This is my current query:

MATCH DatePath = (b:Date)-[:NEXT*30]->(e:Date {day:20150101})
UNWIND nodes(DatePath) as date

WITH date
MATCH (doc:Document)-[:PUBLISHED_ON]->(date)

RETURN count(doc)

The query above takes almost a second to count fewer than 30K nodes. So my question is: is this normal behaviour? Or is there a better way to match relationships along a path?

GuCier

1 Answer

There are three optimizations you can make:

First of all, make sure your day property is indexed:

CREATE INDEX ON :Date(day);
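
One way to check that the lookup really goes through the index is to inspect the query plan. The sketch below assumes a Neo4j version that supports EXPLAIN and the same :Date(day) index as above; the exact plan output varies between versions. On Neo4j 4.x and later the index creation syntax differs, as noted in the comment, and the index name date_day is only illustrative.

```cypher
// Newer syntax for the same index (Neo4j 4.x and later); the name is hypothetical:
// CREATE INDEX date_day FOR (d:Date) ON (d.day);

// Show the plan without running the query.
// An index-backed lookup appears as a NodeIndexSeek operator, whereas a
// NodeByLabelScan followed by a Filter suggests the index is not being used.
EXPLAIN
MATCH (e:Date {day: 20150101})
RETURN e;
```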

Secondly, instead of matching the whole pattern at once, which will result in a global graph lookup, split your query and start from the end node found through the day index:

MATCH (e:Date {day:20150101})
WITH e
MATCH DatePath = (b:Date)-[:NEXT*30]->(e)
UNWIND nodes(DatePath) as date
WITH date
MATCH (doc:Document)-[:PUBLISHED_ON]->(date)
RETURN count(doc)

Thirdly, if you are sure that the b nodes in (b:Date)-[:NEXT*30]->(e) will always have the Date label, and likewise that the doc nodes in the last MATCH will always have the Document label, omitting the labels will be more performant. You can look at my answer here for the details:

Neo4j: label vs. indexed property?

MATCH (e:Date {day:20150101})
WITH e
MATCH DatePath = (e)<-[:NEXT*30]-(b)
UNWIND nodes(DatePath) as date
WITH date
MATCH (doc)-[:PUBLISHED_ON]->(date)
RETURN count(doc)
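
To compare these variants on your own data, you can prefix each one with PROFILE, which runs the query and reports rows and db hits per operator. This is a minimal sketch, assuming the same :NEXT and :PUBLISHED_ON relationship types as in the queries above. Note that [:NEXT*30] matches exactly 30 hops, so each matched path contains 31 Date nodes including e.

```cypher
// Runs the query and shows the executed plan with db hits per operator.
// Run it once with the labels on b and doc and once without,
// then compare the total db hits of the two plans.
PROFILE
MATCH (e:Date {day: 20150101})
WITH e
MATCH DatePath = (e)<-[:NEXT*30]-(b)
UNWIND nodes(DatePath) AS date
WITH date
MATCH (doc)-[:PUBLISHED_ON]->(date)
RETURN count(doc);
```

The timings will depend on the dataset and the Neo4j version, so the db hits are usually a more stable basis for comparison than wall-clock time.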
Christophe Willemsen
  • My `:Date(day)` index was OK, but you are absolutely right: splitting the query first and omitting the labels made a great improvement! I went from 1800 ms to 80 ms! Many thanks for your help – GuCier Jan 16 '15 at 14:34
  • Have a look at GraphAware TimeTree, might be of interest for your use case: https://github.com/graphaware/neo4j-timetree – Michal Bachman Jan 17 '15 at 08:37