I've recently had to change from using Cypher to Gremlin and I'm trying to convert a query that allowed a user to 'delete' a node and all of the subgraph nodes that would be affected by this. It wasn't actually removing nodes but just adding a 'DELETED' label to the affected nodes.

I can get a subgraph in Gremlin using:

g.V(nodeId).repeat(__.inE('memberOf').subgraph('subGraph').outV()).cap('subGraph')

but this doesn't take into account any nodes in the subgraph that might have a route back past the originally 'deleted' node and therefore shouldn't be orphaned.

[Image: example graph]

In the graph above, B is the node being deleted. Its subgraph would include D, E, G and H. However, since E still has a route back to A through C, we don't want to 'delete' it. D, G and H will be left without a route back to A and should therefore also be deleted.
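To make the intended rule concrete, here is a minimal plain-Python sketch of it, independent of any graph database. The edge list is an assumption chosen to match the description above (the image is not reproduced here), and the helper names `descendants`, `has_route` and `nodes_to_delete` are made up for illustration. It also leaves out the "ignore previously deleted nodes" part of the Cypher query.

```python
from collections import deque

# child -> parents; each entry models 'memberOf' edges pointing towards A.
# Assumed edges, picked to be consistent with the description of the graph.
MEMBER_OF = {
    "B": ["A"],
    "C": ["A"],
    "D": ["B"],
    "E": ["B", "C"],
    "G": ["D"],
    "H": ["D"],
}


def descendants(start, member_of):
    """Return every node that can reach `start` by following memberOf edges."""
    children = {}
    for child, parents in member_of.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def has_route(node, target, blocked, member_of):
    """True if `node` reaches `target` via memberOf without passing `blocked`."""
    seen, stack = {node}, [node]
    while stack:
        current = stack.pop()
        if current == target:
            return True
        for parent in member_of.get(current, []):
            if parent != blocked and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return False


def nodes_to_delete(deleted, central, member_of):
    """The deleted node plus every descendant left with no route to `central`."""
    orphans = {
        n for n in descendants(deleted, member_of)
        if not has_route(n, central, deleted, member_of)
    }
    return {deleted} | orphans


print(sorted(nodes_to_delete("B", "A", MEMBER_OF)))  # ['B', 'D', 'G', 'H']
```

E is kept because its route through C avoids B, while D, G and H only reach A through B.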

My Cypher query worked like this (using Neo4jClient.Cypher in C#):

// Find the node to be deleted i.e. B
.Match("(b {Id: 'B'})")  
// Set a DELETED label to B   
.Set("b:DELETED")     
.With("b")
// Find the central node i.e A
.Match("(a {Id: 'A'})") 
// Find the subgraph of B ignoring previously deleted nodes
.Call("apoc.path.subgraphAll(b, { relationshipFilter: '<memberOf', labelFilter: '-DELETED'})")     
.Yield("nodes AS subgraph1")
// Get each node in subgraph1 as sg1n
.Unwind("subgraph1", "sg1n") 
// Check if each sg1n node has a route back to A ignoring DELETED routes    
.Call("apoc.path.expandConfig(sg1n, { optional: true, relationshipFilter: 'memberOf>', labelFilter: '-DELETED', blacklistNodes:[b],terminatorNodes:[a]})")     
.Yield("path")
// If there is a path then store the nodes as n
.Unwind("CASE WHEN path IS NULL THEN [null] ELSE nodes(path) END", "n")     
// Remove the nodes in n from the original subgraph (This should leave the nodes without a route back)
.With("apoc.coll.subtract(subgraph1, collect(n)) AS subgraph2") 
// Set the DELETED label on the remaining nodes     
.ForEach("(n IN(subgraph2) | SET n:DELETED)")  

Is there any way I can get similar functionality in Gremlin?

UPDATE

Thanks to sel-fish's help in this question and in this one, I now have this working using:

g.V(itemId)                                            // Find the item to delete.
  .union(                                              // Start a union to return
    g.V(itemId),                                       // both the item 
    g.V(itemId)                                        // and its descendants.
      .repeat(__.inE('memberOf').outV().store('x'))    // Find all of its descendants.
      .cap('x').unfold()                               // Unfold them.
      .where(repeat(out('memberOf')                    // Check each descendant
        .where(hasId(neq(itemId))).simplePath())       // to see if it has a path back that doesn't go through the original vertex
        .until(hasId(centralId)))                      // that ends at the central vertex.
      .aggregate('exception')                          // Aggregate these together.
      .cap('x').unfold()                               // Get all the descendants again.
      .where(without('exception')))                    // Remove the exceptions.
  .property('deleted', true)                           // Set the deleted property.
  .valueMap(true)                                      // Return the results.
ghertyish

1 Answer

First, save the vertices in the subgraph as candidates:

candidates = g.V().has('Id', 'B').repeat(__.inE('memberOf').subgraph('subGraph').outV()).cap('subGraph').next().traversal().V().toList()

Then filter the candidates, keeping only those that have no path to Vertex('A') that avoids Vertex('B'):

g.V(candidates).where(repeat(out('memberOf').where(has('Id', neq('B'))).simplePath()).until(has('Id','A'))).has('Id', neq('B')).aggregate('exception').V(candidates).where(without('exception'))
sel-fish
  • Awesome! Thanks for the response. My only worry about this approach is that 'A' could grow to have thousands of descendants, so finding all of them first and then filtering them could be an issue. Is there any way I can look at the subgraph of 'B' first and work off its descendants instead? – ghertyish Mar 21 '19 at 11:41
  • I've accepted this as correct because this is the right approach and I'm pretty sure this would work with a database that fully supports Gremlin. Unfortunately, I'm using CosmosDB which doesn't support the next().traversal() and toList() steps. Gremlin.NET also doesn't seem to support assigning queries to variables like you've assigned the first query to 'candidates'. It might just not be possible but can you think of any way to do this with those restrictions? – ghertyish Mar 22 '19 at 15:19
  • So I've reached a point where I can call a where function on the results of the first query using: `g.V().has('Id', 'B').repeat(__.inE('memberOf').subgraph('subGraph').outV().store('x')).cap('x').unfold().unfold().where(has('Id', 'G'))` (I'm not sure why I need to unfold it twice, but that's the only thing I've found that seems to work). This works and just returns G; however, when I put in the repeat bit it doesn't return anything. – ghertyish Mar 22 '19 at 16:13
  • For example, this doesn't return anything: `g.V().has('Id', 'B').repeat(__.inE('memberOf').subgraph('subGraph').outV().store('x')).cap('x').unfold().unfold().where(repeat(out('memberOf').simplePath()).until(has('Id', 'B')))`. Shouldn't this be returning everything because the whole subGraph should have a route back to B? – ghertyish Mar 22 '19 at 16:23
  • @ghertyish How about `g.V().has('Id', 'B').repeat(__.inE('memberOf').subgraph('subGraph').outV().store('x')).cap('x').unfold().where(repeat(out('memberOf').simplePath()).until(has('Id','B')))`? Unlike in your last comment, I only need to unfold once to get G. I tested this against TinkerGraph, as I don't have a CosmosDB instance. – sel-fish Mar 26 '19 at 06:20