
I'm having difficulty getting all nodes in a specific time range. I have two types of node attached to the timetree: Tweet nodes and News nodes.

I want all the Tweet nodes. I'm using this query (I stopped it after more than 10 minutes):

CALL ga.timetree.events.range({start: 148029120000, end: 1480896000000, relationshipType: "LAST_UPDATE", resolution: 'DAY'}) 
YIELD node
MATCH (a:TwitterUser)-[:POSTS]->(:Tweet)-[r:RETWEETS]->(:Tweet)<-[:POSTS]-(m:TwitterUser) 
RETURN id(a), id(m), count(r) AS NumRetweets 
ORDER BY NumRetweets DESC

But this takes far longer than the plain query (8 seconds):

MATCH (a:TwitterUser)-[:POSTS]->(:Tweet)-[r:RETWEETS]->(:Tweet)<-[:POSTS]-(m:TwitterUser) 
RETURN id(a), id(m), count(r) AS NumRetweets 
ORDER BY NumRetweets DESC

With my data, the two queries should return the same nodes, so I don't understand the large difference in running time.

InverseFalcon
Cezar Sas
  • I'm confused. You make the timetree range call to get events, but you don't use the returned nodes at all. Did you omit something in the query? – InverseFalcon Feb 05 '17 at 10:55
  • The problem is that I don't know exactly how to use the timetree. I need all the tweets in a specific range that match the query pattern. – Cezar Sas Feb 05 '17 at 11:03
  • Your timetree range is from 9/10/1974 - 12/5/2016. That's a very long range. Is that really the range of tweets you want to get? Typically a timetree is used to get events in some window of time, usually narrower than decades. Depending on the size of your graph, this could be a mountain of data. – InverseFalcon Feb 05 '17 at 11:30
  • @InverseFalcon Yeah, the interval is wrong, thanks for the note. My interval is a week; I calculated the wrong timestamp. – Cezar Sas Feb 05 '17 at 11:53
  • If you aren't using it already, you may want to take a look at APOC Procedures' [date/time functions](https://neo4j-contrib.github.io/neo4j-apoc-procedures/#_date_time_support), which can help you get the timestamps you need from date strings more easily. – InverseFalcon Feb 05 '17 at 11:55
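
As a cross-check on the timestamps discussed in these comments, here is a minimal Python sketch (not part of the original thread) that derives a one-week window in epoch milliseconds. It assumes the timetree expects UTC milliseconds and that the intended window is the week ending at the `end` value used in the question; the helper name `to_epoch_millis` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_millis(dt: datetime) -> int:
    """Convert a timezone-aware datetime to whole milliseconds since the Unix epoch."""
    return (dt - EPOCH) // timedelta(milliseconds=1)

# Assumption: the window is the seven days ending at 2016-12-05 00:00 UTC,
# which corresponds to the `end` value in the question's query.
end = datetime(2016, 12, 5, tzinfo=timezone.utc)
start = end - timedelta(days=7)

start_ms = to_epoch_millis(start)
end_ms = to_epoch_millis(end)
print(start_ms, end_ms)

# The `start` value from the question, decoded back into a date.
question_start = datetime.fromtimestamp(148029120000 / 1000, tz=timezone.utc)
print(question_start.date().isoformat())
```

Under these assumptions the computed start is 1480291200000, which is the question's value with one more trailing zero, while the value as posted decodes to a date in September 1974.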

1 Answer


The problem with your first query is that you're not doing anything with the results of the timetree call. It just wastes cycles and bloats the intermediate rows with data that is never used: the MATCH that follows is evaluated once for every row the call yields.
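
To make that row growth concrete, here is a rough Python model (a simplification of Cypher's row-by-row evaluation, not the actual Neo4j planner). The node ids, user names, and row lists are invented purely for illustration.

```python
from collections import Counter

# Hypothetical rows yielded by the timetree CALL (one per event node).
timetree_rows = ["t1", "t2", "t3"]

# Hypothetical matches of the retweet pattern: (a, m, retweeted tweet).
pattern_rows = [("alice", "bob", "t1"), ("alice", "bob", "t2"), ("carol", "bob", "t9")]

# Plain query: the pattern is matched once, then grouped by (a, m).
plain = Counter((a, m) for a, m, _ in pattern_rows)

# Unused CALL: the MATCH is repeated for every yielded row, so every
# pattern row appears once per timetree row and the counts are inflated.
unused = Counter((a, m) for _node in timetree_rows for a, m, _ in pattern_rows)

# Bound CALL: the yielded node is reused in the pattern, so only matches
# whose retweeted tweet was yielded by the timetree survive.
bound = Counter((a, m) for node in timetree_rows for a, m, t in pattern_rows if t == node)

print(plain)
print(unused)
print(bound)
```

In this toy model the unused call triples both the row count and every `count(r)`, while binding the yielded node keeps only the in-range retweets. On a real graph the multiplier would be the number of events in the range.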

You need to take the :Tweet nodes returned from your timetree call and use them in the next part of your query.

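// start is set to one week before end, per the comments on the question; the original 148029120000 falls in 1974
CALL ga.timetree.events.range({start: 1480291200000, end: 1480896000000, relationshipType: "LAST_UPDATE", resolution: 'DAY'}) 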
YIELD node
WITH node as tweet
WHERE tweet:Tweet
MATCH (a:TwitterUser)-[:POSTS]->(:Tweet)-[r:RETWEETS]->(tweet)<-[:POSTS]-(m:TwitterUser) 
RETURN id(a), id(m), count(r) AS NumRetweets 
ORDER BY NumRetweets DESC
InverseFalcon
  • Thanks a lot. May I ask one more question? If I also want the Tweets that do not have a timestamp, how should I edit the query? Thanks – Cezar Sas Feb 05 '17 at 12:12
  • So you want tweets that occur during that time range (where the timestamp exists) as well as tweets without a timestamp at all? Any particular reason they're lacking a timestamp, and any way to give them one? – InverseFalcon Feb 05 '17 at 12:17
  • The ones without a date are extracted from the data that I use. I create them from the retweet URL, so I don't have this information. To be more precise, I would like the ones without a date that are linked to the tweets returned by the query. – Cezar Sas Feb 05 '17 at 12:20