
I need to show the latest posts. In the future, there will be billions of posts.

So what is the most efficient way to show the list of latest posts?

  1. By storing every post's month as a property (e.g. 201506) and indexing it, or

  2. By creating one label per month (201506 .. 201508) and giving each post the label of its month.

Then retrieve the posts in descending order, month by month. Or is there another way to do this?

Also, if I have many labels, will that affect performance?

Charlotte Skardon
Dheena
  • Have you considered using [GraphAware TimeTree](https://github.com/graphaware/neo4j-timetree)? Sounds like a perfect use case for it... – drew moore Jun 27 '15 at 08:42
  • Thanks. I am not using Java, but I will look at the node architecture anyway. – Dheena Jun 27 '15 at 09:19
  • Should be easy if you use a single property for year, month, day and index it and then retrieve the posts from today and potentially yesterday. – Michael Hunger Jun 27 '15 at 14:24

1 Answer

If you want to have an ordered list of all posts in your system (regardless of the author) you might organize it as a linked list representing your timeline:

(post1:Post) -[:PREV_POST]-> (post2:Post) -[:PREV_POST]-> ...

So each PREV_POST relationship connects a post to the one published immediately before it, with the most recent post at the head of the list.
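As a rough sketch of how such a list could be maintained and read, the statements below assume a single `(:Timeline)` node that points at the newest post through a `LATEST` relationship. The `Timeline` node, the `LATEST` relationship, and the `id` and `createdAt` properties are illustrative assumptions, not part of the model above. The `$param` syntax is the one used by Neo4j 3.0 and later; older versions write parameters as `{param}`.

```cypher
// Prepend a new post to the timeline (hypothetical Timeline/LATEST head pointer).
MATCH (t:Timeline)
OPTIONAL MATCH (t)-[old:LATEST]->(prev:Post)
CREATE (p:Post {id: $id, createdAt: $createdAt})
CREATE (t)-[:LATEST]->(p)
// Link to the previous head only if one exists (first post has no predecessor).
FOREACH (x IN CASE WHEN prev IS NULL THEN [] ELSE [prev] END |
  CREATE (p)-[:PREV_POST]->(x))
// Deleting a null relationship is a no-op, so this is safe for the first post.
DELETE old
RETURN p
```

```cypher
// Read the 20 most recent posts, newest first, by walking at most 19 hops from the head.
MATCH (:Timeline)-[:LATEST]->(head:Post)
MATCH path = (head)-[:PREV_POST*0..19]->(:Post)
WITH path ORDER BY length(path) DESC LIMIT 1
UNWIND nodes(path) AS post
RETURN post
```

The read touches only as many nodes as it returns, so its cost should depend on the page size rather than on the total number of posts. Concurrent writers would need to serialize on the `Timeline` node, which this sketch does not handle.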

Additionally, you might have a timetree (see http://graphaware.com/neo4j/2014/08/20/graphaware-neo4j-timetree.html for a sample implementation). Since the finest granularity you need is the month, the timetree only needs year and month levels.

Only the first post for every month is then connected to the month node in the time tree. See below for a sample model:

[Figure: sample model]
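One way this wiring might be done when a post is created is sketched below. It assumes posts carry an `id` property and that `$year` and `$month` use whatever encoding the `Year` and `Month` nodes use; those parameter and property names are illustrative, and the sketch does not use the GraphAware plugin.

```cypher
// Find the post, then find or create its year and month nodes in the timetree.
MATCH (p:Post {id: $postId})
MERGE (root:TimeRoot)
MERGE (root)-[:HAS_YEAR]->(y:Year {year: $year})
MERGE (y)-[:HAS_MONTH]->(m:Month {month: $month})
WITH p, m
// Attach the post only if the month has no first post yet.
WHERE NOT (m)<-[:FIRST_IN_MONTH]-(:Post)
CREATE (p)-[:FIRST_IN_MONTH]->(m)
```

Because only one post per month gets a `FIRST_IN_MONTH` relationship, the month nodes stay small no matter how many posts the month contains.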

To query, e.g., the posts in descending order for Dec 2014, we first find the respective month (Dec 2014) in the timetree and go to the next month (Jan 2015). From each of the two month nodes we go to the first post of that month and find everything in between:

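// Assumes consecutive Month nodes are linked by a NEXT relationship, which the model implies but does not name.
MATCH (:TimeRoot)-[:HAS_YEAR]->(:Year {year: 2014})-[:HAS_MONTH]->(startMonth:Month {month: 'Dec'}),
  (startMonth)-[:NEXT]->(endMonth:Month),
  (startMonth)<-[:FIRST_IN_MONTH]-(firstPost:Post),
  (endMonth)<-[:FIRST_IN_MONTH]-(:Post)-[:PREV_POST]->(lastPost:Post),
  // *0.. also covers a month that contains a single post
  path = (lastPost)-[:PREV_POST*0..]->(firstPost)
UNWIND nodes(path) AS post
RETURN post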

Please note that I've not actually tested the query, so there might be some typos. The intention was to demo the model, not the full solution.

Stefan Armbruster
  • Here we are connecting more nodes. How will this query perform with 1 billion posts? – Dheena Jun 27 '15 at 09:13
  • That depends on how many posts you fetch. In an ideal situation Neo4j can traverse several million relationships per second per core. If you want to organize the posts in a per-user context, see the Graphity model http://www.rene-pickhardt.de/graphity-an-efficient-graph-model-for-retrieving-the-top-k-news-feeds-for-users-in-social-networks/ – Stefan Armbruster Jun 28 '15 at 12:38