I'm writing a program that analyses posts in a forum.
After loading the forum threads into a Neo4j DB,
I'm trying to "rank" posts by the number of responses they got.
Responses include direct responses as well as the entire sub-tree under each direct response.
The idea is to count all descendants down the tree (the tree is a simple tree without any loops).
Every post is a Neo4j node:
    # Create MSG nodes:
    statement = "CREATE (c:MSG {id:{N}, title:{T}}) RETURN c"
    for msg in msgs:
        graph.cypher.execute(statement, {"N": msg[0], "T": msg[1]})
A node that represents a response to another post has an r:CHILD_OF relationship to its parent node.
Root nodes do not have an r:CHILD_OF relationship, but have "0" as their parent ID.
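For reference, the relationships can be created along these lines. This is only a minimal sketch under assumptions: `link_children`, `LINK_STATEMENT` and the `(parent id, msg id)` row layout are illustrative names, the parent ID of a root is taken to be the integer 0, and the relationship is assumed to point from the child to its parent. It reuses the same `graph.cypher.execute` call as the node creation above.

```python
# Assumed Cypher: look up both MSG nodes by id, then link child -> parent.
LINK_STATEMENT = (
    "MATCH (c:MSG {id:{C}}), (p:MSG {id:{P}}) "
    "CREATE (c)-[r:CHILD_OF]->(p)"
)


def link_children(graph, rows):
    """Create one CHILD_OF relationship per non-root post.

    rows is an iterable of (parent id, msg id) pairs (hypothetical layout).
    """
    for parent_id, msg_id in rows:
        if parent_id == 0:
            # Root posts have parent id 0 and get no CHILD_OF relationship.
            continue
        graph.cypher.execute(LINK_STATEMENT, {"C": msg_id, "P": parent_id})
```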
|parent id | msg id | Rank | List of all responses
+----------+--------+------+----------------------
|0         | 1051   | 3    | (1054, 1056, 1060)
|1051      | 1054   | 0    |
|1051      | 1056   | 1    | (1060)
|1056      | 1060   | 0    |
|0         | 1052   | 0    |
In this table:
- msg 1051 is the first post in a thread
- msg 1052 is the first post in another thread
- msg 1051 got 2 direct responses (1054, 1056) and one indirect response (1060)
- msg 1056 got 1 direct response (1060)
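To make the expected ranking concrete, here is a minimal plain-Python sketch that computes the same numbers from the `(parent id, msg id)` pairs in the table. It is only a reference for the desired output, not the Cypher query; `rows` and `rank_posts` are illustrative names, and the parent ID of a root is assumed to be the integer 0.

```python
from collections import defaultdict

# (parent id, msg id) pairs taken from the table; parent id 0 marks a root post.
rows = [(0, 1051), (1051, 1054), (1051, 1056), (1056, 1060), (0, 1052)]


def rank_posts(rows):
    """Return {msg id: sorted list of all direct and indirect responses}."""
    children = defaultdict(list)
    for parent_id, msg_id in rows:
        children[parent_id].append(msg_id)

    def descendants(msg_id):
        # Iterative walk down the sub-tree; safe because the tree has no loops.
        found = []
        stack = list(children.get(msg_id, []))
        while stack:
            current = stack.pop()
            found.append(current)
            stack.extend(children.get(current, []))
        return sorted(found)

    return {msg_id: descendants(msg_id) for _, msg_id in rows}


responses = rank_posts(rows)
for parent_id, msg_id in rows:
    # The rank is simply the number of responses in the sub-tree.
    print("%s %s %s %s" % (parent_id, msg_id, len(responses[msg_id]), responses[msg_id]))
```

The rank of a post is the length of its list, so 1051 gets rank 3 and 1056 gets rank 1, matching the table.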
I need a Cypher query that produces this ranking,
but I'm not sure how to write it.
The project is in Python; I'm using Python 2.7, py2neo 2.0.3 and Neo4j 2.1.6.