
I'm running Neo4j Desktop v1.4.1; the database is Neo4j 4.2.1 Enterprise.

I have a simple graph of placements, campaigns and a placement-to-campaign "contains" relationship. This is a fresh dataset and every node is unique. Some placements "contain" thousands of campaigns, so I want to filter the returned campaigns by an inclusion list of campaign ids.

When I return all the matched nodes it works:

neo4j@neo4j> MATCH (:Placement {id: 5})-[:CONTAINS]->(c:Campaign)
             WHERE c.id IN [400,263,150470,25810,37578]
             RETURN *;
+--------------------------+
| c                        |
+--------------------------+
| (:Campaign {id: 37578})  |
| (:Campaign {id: 263})    |
| (:Campaign {id: 25810})  |
| (:Campaign {id: 150470}) |
+--------------------------+

When I return just the campaign id (c.id), I get duplicates:

neo4j@neo4j> MATCH (:Placement {id: 5})-[:CONTAINS]->(c:Campaign)
             WHERE c.id IN [400,263,150470,25810,37578]
             RETURN c.id;
+--------+
| c.id   |
+--------+
| 150470 |
| 150470 |
| 150470 |
| 150470 |
+--------+

There is only one CONTAINS relationship between placement 5 and campaign 150470:

neo4j@neo4j> MATCH (:Placement {id: 5})-[rel:CONTAINS]->(:Campaign {id:150470}) 
             RETURN count(rel);
+------------+
| count(rel) |
+------------+
| 1          |
+------------+

EXPLAIN returns the following query plan. Could cache[c.id] be the culprit?

+---------------------------+------------------------------------------------------------------------------------------------------+----------------+---------------------+
| Operator                  | Details                                                                                              | Estimated Rows | Other               |
+---------------------------+------------------------------------------------------------------------------------------------------+----------------+---------------------+
| +ProduceResults@neo4j     | `c.id`                                                                                               |              4 | Fused in Pipeline 1 |
| |                         +------------------------------------------------------------------------------------------------------+----------------+---------------------+
| +Projection@neo4j         | cache[c.id] AS `c.id`                                                                                |              4 | Fused in Pipeline 1 |
| |                         +------------------------------------------------------------------------------------------------------+----------------+---------------------+
| +Expand(Into)@neo4j       | (anon_7)-[anon_27:CONTAINS]->(c)                                                                     |              4 | Fused in Pipeline 1 |
| |                         +------------------------------------------------------------------------------------------------------+----------------+---------------------+
| +MultiNodeIndexSeek@neo4j | UNIQUE anon_7:Placement(id) WHERE id = $autoint_0, cache[c.id], UNIQUE c:Campaign(id) WHERE id IN $a |             25 | In Pipeline 0       |
|                           | utolist_1, cache[c.id]                                                                               |                |                     |
+---------------------------+------------------------------------------------------------------------------------------------------+----------------+---------------------+
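For reference, a plan like the one above can be requested by prefixing the query with EXPLAIN, which shows the plan without executing the query. This is a sketch that assumes the plan was taken from the same c.id query shown earlier; the source does not show the exact statement that was run.

```cypher
// Assumed reconstruction: the failing query from above, prefixed with EXPLAIN.
// EXPLAIN only plans the query; it returns the plan and no result rows.
EXPLAIN
MATCH (:Placement {id: 5})-[:CONTAINS]->(c:Campaign)
WHERE c.id IN [400,263,150470,25810,37578]
RETURN c.id;
```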

Edit: if I prepend the query with CYPHER runtime=SLOTTED I get the expected output:

+--------+
| c.id   |
+--------+
| 37578  |
| 263    |
| 25810  |
| 150470 |
+--------+
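The prefixed query would look roughly like this. It is a sketch that assumes the same c.id query as above with only the runtime option added; it sidesteps the default runtime rather than explaining why that runtime returns repeated values.

```cypher
// Same query as before, forced onto the slotted runtime for this statement only.
CYPHER runtime=SLOTTED
MATCH (:Placement {id: 5})-[:CONTAINS]->(c:Campaign)
WHERE c.id IN [400,263,150470,25810,37578]
RETURN c.id;
```

The option applies per query, so other statements keep using whatever runtime the planner selects for them.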

If I omit the WHERE clause I get unique campaign ids (but too many). I feel like I'm missing something obvious, but I've read the Neo4j docs and I'm not getting it. Thanks!

David Farrell
  • Is it possible that the Placement with id 5 has multiple relationships of type CONTAINS to campaign id 150470? – Luanne Feb 27 '21 at 03:44
  • I checked but there is only one relationship – David Farrell Feb 27 '21 at 15:58
  • Very strange, any chance you can share a script to recreate your graph? And which version of Neo4j is this? – Luanne Feb 28 '21 at 11:25
  • Can you let us know what version of Neo4j you are running? Also, can you let us know if you get the same duplicate results if you prefix your query with `CYPHER runtime=SLOTTED ` ? – InverseFalcon Mar 02 '21 at 01:17
  • @InverseFalcon thanks, `CYPHER runtime=SLOTTED ` returns the expected results! – David Farrell Mar 02 '21 at 15:01
  • @DavidFarrell Can you let us know the version of Neo4j you're using? Since SLOTTED runtime avoids the issue, this must be a bug in PIPELINED runtime, but depending on the version you are using, it might have already been found and fixed in a later patch. – InverseFalcon Mar 02 '21 at 20:22
  • Neo4j Desktop v1.4.1 @InverseFalcon – David Farrell Mar 03 '21 at 22:47
  • Thanks, that gives us the Desktop version, but that's separate from the Neo4j version. Could you check the version of the Neo4j database you're running within Desktop? – InverseFalcon Mar 03 '21 at 23:40
  • Gotcha @InverseFalcon it's 4.2.1 enterprise – David Farrell Mar 04 '21 at 16:14
  • @DavidFarrell Thanks. Any chance you could test with 4.2.3, the latest patch? It is possible the bug behind this may have been fixed. – InverseFalcon Mar 05 '21 at 04:11
