Hi everyone at Stack Overflow,
I am trying to understand a Cypher query that uses Pearson correlation.
What do nom and denom stand for?
What do r1: r1 and r2: r2 mean inside COLLECT?
I also don't understand what r.r1.rating and r.r2.rating refer to.
The query is supposed to recommend movies that have been rated by other users:
MATCH (u1:User {id: 3})-[r:RATED]->(m:Movie)
WITH u1, avg(r.rating) AS u1_mean
MATCH (u1)-[r1:RATED]->(m:Movie)<-[r2:RATED]-(u2)
WITH u1, u1_mean, u2, COLLECT({r1: r1, r2: r2}) AS ratings WHERE size(ratings) > 10
MATCH (u2)-[r:RATED]->(m:Movie)
WITH u1, u1_mean, u2, avg(r.rating) AS u2_mean, ratings
UNWIND ratings AS r
WITH sum( (r.r1.rating-u1_mean) * (r.r2.rating-u2_mean) ) AS nom,
sqrt( sum( (r.r1.rating - u1_mean)^2) * sum( (r.r2.rating - u2_mean) ^2)) AS denom,
u1, u2 WHERE denom <> 0
WITH u1, u2, nom/denom AS pearson
ORDER BY pearson DESC LIMIT 10
MATCH (u2)-[r:RATED]->(m:Movie) WHERE NOT EXISTS( (u1)-[:RATED]->(m) )
RETURN m.name, SUM( pearson * r.rating) AS score
ORDER BY score DESC LIMIT 25
The output is as follows:
│"m.name"                    │"score"           │
│"Sleepless in Seattle"      │25.859451877376813│
│"The Tunnel"                │22.652532472101605│
│"Beetlejuice"               │22.21835919736008 │
│"Shriek If You Know What .."│21.935357890253528│
│"Dawn of the Dead"          │21.421377433824798│
│"The Prisoner of Zenda"     │21.225502683325033│
│"The Talented Mr. Ripley"   │20.83938743140176 │
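For reference, as far as I can tell the query is meant to compute the textbook Pearson correlation between two users. The notation below is my own assumption and does not appear in the query: M is the set of movies rated by both users, r_{u,m} is the rating user u gave movie m, and \bar{r}_u is the mean rating of user u.

```latex
\[
\mathrm{pearson}(u_1, u_2) =
\frac{\sum_{m \in M} (r_{u_1,m} - \bar{r}_{u_1})\,(r_{u_2,m} - \bar{r}_{u_2})}
     {\sqrt{\sum_{m \in M} (r_{u_1,m} - \bar{r}_{u_1})^2 \;\sum_{m \in M} (r_{u_2,m} - \bar{r}_{u_2})^2}}
\]
```

If that reading is right, the numerator would be nom and the denominator would be denom. Note that in the query u1_mean and u2_mean are averaged over all of each user's ratings, not only over the movies in M.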
Any explanation would be helpful.