Problem

I am running a query which finds duplicate vertices by the name property. I would like to know the IDs for all the corresponding vertices.

Currently, only the ID of the vertex matched in the where clause is returned.

Example graph

Here is a toy example graph. There are two vertices with the same name ex.

gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> g.addV('X').property('name', 'ex')
==>v[0]
gremlin> g.addV('Y').property('name', 'why')
==>v[2]
gremlin> g.addV('Y').property('name', 'ex')
==>v[4]
gremlin> g.V().elementMap()
==>[id:0,label:X,name:ex]
==>[id:2,label:Y,name:why]
==>[id:4,label:Y,name:ex]

Detecting duplicates

When I find the duplicates and get the elementMap(), the IDs are only for the vertex matched in the where clause.

gremlin> g.V().hasLabel('X').as('x').V().hasLabel('Y').as('y').where('x', P.eq('y')).by('name').elementMap()
==>[id:4,label:Y,name:ex]

Instead, I would like to see the IDs of both vertices, which would be id:0 and id:4.

I would like something like:

==>[[id:0,label:X,name:ex], [id:4,label:Y,name:ex]]
Nathan McCoy

1 Answer

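You were actually very close. After the where() filter, select both labeled steps ('x' and 'y') so that the properties of each vertex are returned, rather than only those of the current one: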

gremlin> g.V().hasLabel('X').as('x').
......1>   V().hasLabel('Y').as('y').
......2>   where(eq('x')).by('name').
......3>   select('x','y').
......4>     by(valueMap().by(unfold()).
......5>        with(WithOptions.tokens)).
......6>   select(values)

==>[[id:0,label:X,name:ex],[id:4,label:Y,name:ex]]     
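
As a rough sketch of two related variations (not tested against your real data): since the question already uses elementMap(), it should be possible to use it in the by() modulator instead of valueMap() with tokens. If the duplicates are not always split across the X and Y labels, grouping every vertex by name and keeping only the groups with more than one member may be a better fit. The snippet assumes the Gremlin Console with a fresh in-memory TinkerGraph and a TinkerPop version that has elementMap(); the graph setup just rebuilds the toy example from the question.

```groovy
// Sketch for the Gremlin Console, assuming a fresh in-memory TinkerGraph.
graph = TinkerGraph.open()
g = graph.traversal()

// Rebuild the toy graph from the question.
g.addV('X').property('name', 'ex').iterate()
g.addV('Y').property('name', 'why').iterate()
g.addV('Y').property('name', 'ex').iterate()

// Variant 1: same X/Y pairing as above, using elementMap() for each selected vertex.
g.V().hasLabel('X').as('x').
  V().hasLabel('Y').as('y').
  where('x', eq('y')).by('name').
  select('x', 'y').
    by(elementMap()).
  select(values)

// Variant 2: group all vertices by name, then keep only groups with more than one member.
g.V().
  group().
    by('name').
    by(elementMap().fold()).
  select(values).unfold().
  where(count(local).is(gt(1)))
```

Variant 2 does not depend on the labels, so it would also report duplicates that share a label. How vertices without a name property are handled by group().by('name') depends on the TinkerPop version, so you may want to add has('name') after V() if some vertices lack it.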
Kelvin Lawrence