I'm writing a program that analyses posts in a forum.
After loading the forum threads into a Neo4j database,
I'm trying to "rank" posts by the number of responses they got.

Responses include direct responses as well as the entire sub-tree under each direct response.
The idea is to count all descendants down the tree (the tree is a simple tree without any loops).

Every post is a Neo4j node:

# Create MSG nodes:
statement = "CREATE (c:MSG {id:{N}, title:{T}}) RETURN c"
for msg in msgs:
    graph.cypher.execute(statement, {"N": msg[0], "T": msg[1]})

A node that represents a response to another post has a relationship r:CHILD_OF to its parent node.
Root nodes have no r:CHILD_OF relationship, but have "0" as their parent ID.

|parent id | msg id | Rank | List of all responses
+----------+--------+------+----------------------
|0         | 1051   | 3    | (1054, 1056, 1060)
|1051      | 1054   | 0    |
|1051      | 1056   | 1    | (1060)
|1056      | 1060   | 0    |
|0         | 1052   | 0    |

In this table:

  • msg 1051 is the first post in a thread
  • msg 1052 is the first post in another thread
  • msg 1051 got 2 direct responses (1054, 1056) and one indirect response (1060)
  • msg 1056 got 1 direct response (1060)
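
To make the expected output concrete, here is a minimal pure-Python sketch (not Cypher, and not how the data is actually stored) that reproduces the Rank column from the (parent id, msg id) pairs in the table. The names `ROWS` and `rank_posts` are made up for this illustration.

```python
from collections import defaultdict

# (parent id, msg id) pairs copied from the table; 0 marks a thread root.
ROWS = [(0, 1051), (1051, 1054), (1051, 1056), (1056, 1060), (0, 1052)]


def rank_posts(rows):
    """Return {msg id: sorted list of every response below it in the tree}."""
    children = defaultdict(list)
    for parent_id, msg_id in rows:
        children[parent_id].append(msg_id)

    def descendants(msg_id):
        # Iterative walk, so very deep threads cannot hit the recursion limit.
        found, stack = [], list(children.get(msg_id, []))
        while stack:
            node = stack.pop()
            found.append(node)
            stack.extend(children.get(node, []))
        return found

    return {msg_id: sorted(descendants(msg_id)) for _, msg_id in rows}


responses = rank_posts(ROWS)
for msg_id in sorted(responses):
    # Columns: msg id, Rank, list of all responses
    print("%s %s %s" % (msg_id, len(responses[msg_id]), responses[msg_id]))
```

The rank of a post is simply the length of its list, so 1051 gets rank 3 and 1056 gets rank 1, matching the table.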

I need a Cypher query that can produce this ranking, but I'm not sure how to write it.
The project is in Python; I'm using Python 2.7, py2neo 2.0.3 and Neo4j 2.1.6.

Izack

2 Answers

This query should return a result set similar to your table (but without the first column):

MATCH (m:MSG)
OPTIONAL MATCH (c:MSG)-[:CHILD_OF*1..]->(m)
WITH m, COLLECT(DISTINCT c.id) AS childMsgIds
RETURN m.id AS `msg id`, LENGTH(childMsgIds) AS Rank, childMsgIds AS `List of all responses`
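
As a rough sketch of how this could be run from Python, assuming the same `graph.cypher.execute` call used in the question (py2neo 2.0.x) and that each returned record can be indexed by column position: the helper name `fetch_ranks` is made up here, and `graph` is assumed to be an already connected py2neo `Graph`.

```python
# The query from this answer, kept as a single statement.
RANK_QUERY = """
MATCH (m:MSG)
OPTIONAL MATCH (c:MSG)-[:CHILD_OF*1..]->(m)
WITH m, COLLECT(DISTINCT c.id) AS childMsgIds
RETURN m.id AS `msg id`, LENGTH(childMsgIds) AS Rank, childMsgIds AS `List of all responses`
"""


def fetch_ranks(graph):
    """Run RANK_QUERY and return {msg id: (rank, sorted response ids)}.

    `graph` is assumed to expose `cypher.execute(statement)` and to yield
    records that support positional indexing, as py2neo 2.0.x does.
    """
    ranks = {}
    for record in graph.cypher.execute(RANK_QUERY):
        msg_id, rank, child_ids = record[0], record[1], record[2]
        ranks[msg_id] = (rank, sorted(child_ids))
    return ranks
```

Positional access is used because two of the column aliases contain spaces, which rules out attribute-style access.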

Does this suit your needs?

cybersam

This should return all distinct children in the tree:

MATCH (message:MSG {id: {message_id}})<-[:CHILD_OF*0..]-(child:MSG)
RETURN DISTINCT child

If you want a count instead, you can do RETURN COUNT(DISTINCT child)
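
A possible way to call the count version from Python, sketched under two assumptions: the `graph.cypher.execute(statement, parameters)` call from the question (py2neo 2.0.x), and a lower bound of 1 instead of 0 so that the post itself is not counted (a zero-length path also matches the starting node). The helper name `count_responses` is hypothetical.

```python
# Count variant of the query; *1.. is an assumption made here so that the
# message itself is excluded from its own response count.
COUNT_QUERY = (
    "MATCH (message:MSG {id: {message_id}})<-[:CHILD_OF*1..]-(child:MSG) "
    "RETURN COUNT(DISTINCT child)"
)


def count_responses(graph, message_id):
    """Return the number of direct and indirect responses to one message."""
    params = {"message_id": message_id}
    for record in graph.cypher.execute(COUNT_QUERY, params):
        return record[0]  # the aggregate comes back as a single row
    return 0  # defensive fallback if no row is returned
```

For example, `count_responses(graph, 1051)` would be called with the id of a thread's first post.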

Brian Underwood
  • Thanks, I upvoted this answer because it is in the right direction. Two minor issues: you have to count from 1 and not from 0, and this returns the answer for a specific node only. – Izack Jan 08 '15 at 12:56
  • Ah, good point about the 0/1. I put in the `id` though because that seems to have been what you wanted from your question. – Brian Underwood Jan 09 '15 at 11:53