I have projects with a "name" property and I want to generate a list of the names that are duplicated. I tried grouping the projects by name and then using a where clause to keep only the names whose count is greater than 1.

The following generates a map of the project names with the count of each:
g.V().hasLabel('project').groupCount().by('name')

So I added a filter to keep only the duplicate values, but it does not work:
g.V().hasLabel('project').groupCount().by('name').where(select(values).is(gt(1))).values('name')

Veronica

2 Answers

You need to unfold() the count Map, like this:

g.V().hasLabel('project').
  groupCount().
    by('name').
  unfold().
  where(select(values).is(gt(1))).
  select(keys)

If you don't unfold(), you have a Map in the pipeline and it tries to apply your where() to that object as a whole when you really want to apply it to each individual key/value pair in the Map.
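
To make the Map versus entry distinction concrete, here is a minimal sketch for the Gremlin Console. The in-memory TinkerGraph and the project names 'alpha' and 'beta' are made-up sample data, not taken from the question. Note that this sketch ends with select(keys) rather than values('name'): after unfold() each traverser is a Map entry, not a vertex, so the name has to be read from the entry's key.

```groovy
// Hypothetical sample data: two projects named 'alpha', one named 'beta'
graph = TinkerGraph.open()
g = graph.traversal()
g.addV('project').property('name', 'alpha').
  addV('project').property('name', 'alpha').
  addV('project').property('name', 'beta').iterate()

// Without unfold(): a single traverser that holds the whole Map of counts
counts = g.V().hasLabel('project').groupCount().by('name').next()

// With unfold(): one traverser per key/value entry, so the filter
// is applied to each entry's count individually
dupes = g.V().hasLabel('project').
  groupCount().
    by('name').
  unfold().
  where(select(values).is(gt(1))).
  select(keys).
  toList()
```

With this sample data, counts should hold one entry per distinct name, and dupes should contain only 'alpha'.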

stephen mallette
  • Thank you, Stephen. I ran this and it did not work. This is the error I am getting: java.lang.ClassCastException – Veronica Mar 21 '19 at 16:11
  • please provide some sample data so that i can recreate the problem specifically with your query in the Gremlin Console - here's an example of a script: https://stackoverflow.com/questions/51388315/gremlin-choose-one-item-at-random fwiw, i did try the essence of the traversal above on the modern toy graph and it worked fine. – stephen mallette Mar 21 '19 at 18:43

This worked for me:

g.V().hasLabel('project').
  group().
    by(values('name').fold()).
  unfold().
  filter(select(values).count(local).is(gt(1))).
  select(keys)
Veronica