Problem
I am running a query that finds duplicate vertices by the name
property. I would like to get the IDs of all the corresponding vertices.
At the moment, only the ID of the vertex matched in the where clause is returned.
Example graph
Here is a toy example graph. There are two vertices with the same name, ex.
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> g.addV('X').property('name', 'ex')
==>v[0]
gremlin> g.addV('Y').property('name', 'why')
==>v[2]
gremlin> g.addV('Y').property('name', 'ex')
==>v[4]
gremlin> g.V().elementMap()
==>[id:0,label:X,name:ex]
==>[id:2,label:Y,name:why]
==>[id:4,label:Y,name:ex]
Detecting duplicates
When I find the duplicates and call elementMap(), the IDs returned are only for the vertex matched in the where clause.
gremlin> g.V().hasLabel('X').as('x').V().hasLabel('Y').as('y').where('x', P.eq('y')).by('name').elementMap()
==>[id:4,label:Y,name:ex]
Whereas I would like to see the id for both vertices, which would be id:0 and id:4.
I would like something like:
==>[[id:0,label:X,name:ex], [id:4,label:Y,name:ex]]
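One direction that might get close to this, offered as a sketch rather than a verified answer: elementMap() is applied to the current traverser, which at the end of the query above is the vertex labelled 'y', so the 'x' vertex never reaches it. Selecting both step labels and applying elementMap() through a by() modulator should return both vertices per match. The snippet assumes a fresh TinkerGraph in the Gremlin Console and rebuilds the toy graph from above; the result is expected to be a map keyed by x and y rather than the plain list shown in the desired output.

```groovy
// Assumption: run in the Gremlin Console, where TinkerGraph and the
// anonymous-traversal steps (elementMap, etc.) are already imported.
graph = TinkerGraph.open()
g = graph.traversal()

// Rebuild the toy graph from the question.
g.addV('X').property('name', 'ex').iterate()
g.addV('Y').property('name', 'why').iterate()
g.addV('Y').property('name', 'ex').iterate()

// Same duplicate detection as before, but instead of calling elementMap()
// on the current traverser (the 'y' vertex), select both labelled
// vertices and apply elementMap() to each of them.
g.V().hasLabel('X').as('x').
  V().hasLabel('Y').as('y').
  where('x', P.eq('y')).by('name').
  select('x', 'y').by(elementMap())
```

If a plain list is preferred over a map keyed by the step labels, appending select(values) to the traversal may be worth trying.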