I have this simple graph:

create (a:ent {id:'a'})-[:rel]->(:ent {id:'b'})-[:rel]->(c:ent {id:'c'})-[:rel]->(d:ent {id:'d'})-[:rel]->(:ent {id:'e'})-[:rel]->(a),
   (d)-[:rel]->(c),
   (c)-[:rel]->(f:ent {id:'f'})-[:rel]->(g:ent {id:'g'})-[:rel]->(a),
   (g)-[:rel]->(f)

It looks like this:
[image of the graph]

Given 'a', that is, the node (:ent {id:'a'}), I want to write a query that returns exactly the two unique longest paths:

a->b->c->d->e
a->b->c->f->g

As you can see, I have to account for cycles in the graph. It seems that the query suggested here should be fine; I rewrote it as follows:

MATCH path=(:ent{id:'a'})-[:rel*]->(:ent)
WHERE ALL(n in nodes(path)
      WHERE 1=size(filter(m in nodes(path)WHERE m.id=n.id))
     )
RETURN path

I know the query does not give the exact result I want; however, if I understand the logic correctly, it should at least avoid cyclic paths. I felt this query would be a good starting point, but it gives me a strange error:

key not found:   UNNAMED26

What's wrong here? I am unable to pinpoint the problem in the Cypher from this unclear error message.

Update

I tried a new simpler query:

MATCH path=(s:ent{id:'a'})-[:rel*]->(d:ent)
WHERE not (d)-[:rel]->() OR (d)-[:rel]->(s)
RETURN extract(x IN nodes(path)| x.id) as result

It returns:

╒═════════════════════╕
│result               │
╞═════════════════════╡
│[a, b, c, d, e]      │
├─────────────────────┤
│[a, b, c, d, c, f, g]│
├─────────────────────┤
│[a, b, c, f, g]      │
└─────────────────────┘

As you can see, it has one redundant path, [a, b, c, d, c, f, g], caused by the (d)<->(c) cycle. I honestly feel the first query in this post should eliminate it. Can someone please tell me how I can make it work?
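To pin down the result I am after, here is a minimal sketch of the intended logic in plain Python rather than Cypher. It assumes the graph is given as an adjacency dictionary; `EDGES` and `maximal_simple_paths` are hypothetical names used only for illustration. It walks outward from the start node, never revisits a node already on the current path, and keeps a path only when it cannot be extended any further.

```python
# Directed :rel edges of the example graph, as an adjacency dictionary.
EDGES = {
    "a": ["b"],
    "b": ["c"],
    "c": ["d", "f"],
    "d": ["e", "c"],
    "e": ["a"],
    "f": ["g"],
    "g": ["a", "f"],
}


def maximal_simple_paths(graph, start):
    """Return every node-unique path from start that cannot be extended."""
    results = []

    def walk(path):
        # Only step to nodes that are not already on the current path.
        next_nodes = [n for n in graph.get(path[-1], []) if n not in path]
        if not next_nodes and len(path) > 1:
            # Dead end: every neighbour is already visited or there is none.
            results.append(path)
        for n in next_nodes:
            walk(path + [n])

    walk([start])
    return results


for p in maximal_simple_paths(EDGES, "a"):
    print("->".join(p))
```

On this graph the sketch yields a->b->c->d->e and a->b->c->f->g, and the path through c, d, c is never produced because c is already on the path when d is expanded.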

Mahesha999

1 Answer

You can use the apoc.path.expandConfig procedure to find the paths and then filter out the shorter ones, leaving only the longest.

// start with the a node
MATCH (a:ent {id :"a"})

// use expandConfig with the relationship type and direction
// use uniqueness of NODE_PATH to ensure that you don't backtrack
// optionally include a minlevel, maxlevel
CALL apoc.path.expandConfig(a, 
{
  relationshipFilter: 'rel>',
  uniqueness: 'NODE_PATH',
  minLevel: 2,
  maxLevel: 10
} ) yield path

// collect the paths and then get the max length
WITH COLLECT(path) AS paths, MAX(length(path)) AS longest_length

// remove any of the paths that are not the max length
WITH [path IN paths WHERE length(path) = longest_length] AS longest_paths

// return each path matching the max length
UNWIND longest_paths AS path
RETURN path

Updated: Alternate answer based on clarification from OP.

// alternate set of test data based on OP's comments
MERGE (a:ent {id:'a'})-[:rel]->(b:ent {id:'b'})-[:rel]->(c:ent {id:'c'})
MERGE (b)-[:rel]->(d:ent {id:'d'})
MERGE (b)-[:rel]->(e:ent {id:'e'})-[:rel]->(f:ent {id:'f'})
RETURN *

And the updated query:

// start with the a node
MATCH (a:ent {id :"a"})

// use expandConfig with the relationship type and direction
// use uniqueness of NODE_PATH to ensure that you don't backtrack
// optionally include a minlevel, maxlevel
CALL apoc.path.expandConfig(a, 
{
  relationshipFilter: 'rel>',
  uniqueness: 'NODE_PATH',
  minLevel: 1,
  maxLevel: 10
} ) yield path

// create two collections: one of full paths and the other of the nodes of each path shortened by one
WITH COLLECT(path) AS paths,
     COLLECT(DISTINCT nodes(path)[0..size(nodes(path))-1]) AS all_but_last_nodes_of_paths

// filter out the paths whose nodes match an entry in the shortened list
// i.e. the paths that are already included in another, longer path
WITH [p IN paths WHERE NOT nodes(p) IN all_but_last_nodes_of_paths] AS paths_to_keep

// return each path
UNWIND paths_to_keep AS path
RETURN path
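
The filtering step in the updated query can be illustrated outside Cypher. The following is a minimal sketch in Python, assuming the expanded paths are already available as lists of node ids for the alternate test data; `drop_prefix_paths` is a hypothetical helper name, not part of APOC or Neo4j. A path is dropped when its node list equals some other path with its last node removed, which mirrors the `all_but_last_nodes_of_paths` check.

```python
# Node lists of the paths found from 'a' in the alternate test data.
paths = [
    ["a", "b"],
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["a", "b", "e"],
    ["a", "b", "e", "f"],
]


def drop_prefix_paths(paths):
    """Keep only the paths that are not another path minus its last node."""
    # Same role as all_but_last_nodes_of_paths in the query above.
    all_but_last = {tuple(p[:-1]) for p in paths}
    return [p for p in paths if tuple(p) not in all_but_last]


for p in drop_prefix_paths(paths):
    print("->".join(p))
```

Here a->b and a->b->e are dropped because they are the shortened forms of longer paths, leaving a->b->c, a->b->d and a->b->e->f.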
Dave Bennett
  • Getting `There is no procedure with the name `apoc.path.expandConfig` registered for this database instance. Please ensure you've spelled the procedure name correctly and that the procedure is properly deployed.` Am on Neo4j CE 3.0.4. Is this procedure added in recent version? – Mahesha999 Jun 29 '17 at 12:57
  • It worked for ensuring the same node is not revisited, but it returned the two desired paths only by accident, because the graph I gave above contains exactly two paths that equal the max length, that is 5. This may be because of a misinterpretation of the term "longest". If the graph has paths `a->b`,`a->b->c`,`a->b->d` and `a->b->e->f`, then the above cypher only returns the `a->b->e->f` path, right? However I want `a->b->c`,`a->b->d`,`a->b->e->f`. So out of `a->b`,`a->b->c`, it should omit `a->b` since it's not the longest and is contained in `a->b->c`, which is a unique longest path. Sorry for not being clear at first with "longest". – Mahesha999 Jun 29 '17 at 14:04
  • I feel this should be doable if `uniqueness` is configurable in `apoc.path.subgraphNodes()`. Am I right? Is getting the output stated in the above comment not possible currently? – Mahesha999 Jun 29 '17 at 14:16
  • np. so to make sure I fully understand, if the graph is `a->b`, `a->b->c`, `a->b->d` and `a->b->e->f`. You don't want `a->b->e` because it is already included in `a->b->e->f` - correct? – Dave Bennett Jun 29 '17 at 14:25
  • ohh yeah `a->b->e` too will not be there. Also I realised the issue with that first query having `ALL (n in nodes(path) WHERE size(filter(m in nodes(path)WHERE m.id=n.id))=1)`. If I change `ALL` to `ANY` the query works. – Mahesha999 Jun 29 '17 at 14:31
  • Clever logic. I was just wondering how it clicked for you... Is it due to practice in solving such problems? I was also wondering if there is any way to return path objects instead of node lists... Anyway, thanks a billion... – Mahesha999 Jun 30 '17 at 09:27
  • np. I changed it to return paths. Sorry about that, it was an artifact of iterating on it yesterday that I forgot to remove. Yes, I think practice with paths is essential in writing cypher queries. – Dave Bennett Jun 30 '17 at 11:54