0

I have two counts, calculated as follows:

1) g.V().hasLabel('brand').where(__.inE('client_brand').count().is(gt(0))).count()

2) g.V().hasLabel('brand').count()

and I want to get one line of code that results in the first count divided by the second.

Veronica

2 Answers

1

Here's one way to do it:

g.V().hasLabel('brand').
  fold().as('a','b').
  math('a/b').
    by(unfold().where(inE('client_brand')).count()).
    by(unfold().count())

Note that I simplified the first traversal to just .where(inE('client_brand')).count(). Since you only care that there is at least one edge, there's no need to count them all and then compare.
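To check that simplification yourself, here is a minimal sketch for the Gremlin Console against an in-memory TinkerGraph. The sample data is hypothetical (the vertex names 'b1' to 'b4' and 'c1', 'c2' are made up for illustration; only the 'brand', 'client' and 'client_brand' labels come from the question):

```groovy
// Hypothetical sample data: four 'brand' vertices, two of which
// have at least one incoming 'client_brand' edge (b1 has two).
graph = TinkerGraph.open()
g = graph.traversal()
g.addV('client').property('name', 'c1').as('c1').
  addV('client').property('name', 'c2').as('c2').
  addV('brand').property('name', 'b1').as('b1').
  addV('brand').property('name', 'b2').as('b2').
  addV('brand').property('name', 'b3').
  addV('brand').property('name', 'b4').
  addE('client_brand').from('c1').to('b1').
  addE('client_brand').from('c2').to('b1').
  addE('client_brand').from('c1').to('b2').
  iterate()

// Original form: counts every incoming edge, then compares to zero
g.V().hasLabel('brand').where(__.inE('client_brand').count().is(gt(0))).count()

// Simplified form: where() passes as soon as one incoming edge exists
g.V().hasLabel('brand').where(inE('client_brand')).count()
```

Both counts should come back as 2 on this data, since b1 is counted once even though it has two incoming edges. With 4 brands in total, the ratio you are after is 2/4.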

You could also union() like:

g.V().hasLabel('brand').
  union(where(inE('client_brand')).count(),
        count()).
  fold().as('a','b').
  math('a/b').
    by(limit(local,1))
    by(tail(local))

The first one is a bit easier to read and follow, but the second is probably nicer because it only stores a list of the two counts, whereas the first stores a list of all the "brand" vertices, which would be more memory intensive.

Yet another way, provided by Daniel Kuppitz, that uses groupCount() in an interesting way:

g.V().hasLabel('brand').
  groupCount().
    by(choose(inE('client_brand'),
                constant('a'),
                constant('b'))).
  math('a/(a+b)')

The following solution, which uses the sack() step instead, shows why we have the math() step:

g.V().hasLabel('brand').
  groupCount().
    by(choose(inE('client_brand'),
                constant('a'),
                constant('b'))).
  sack(assign).
    by(coalesce(select('a'), constant(0))).
  sack(mult).
    by(constant(1.0)). /* we need a double */
  sack(div).
    by(select(values).sum(local)).
  sack()

If you can use lambdas then:

g.V().hasLabel('brand').
  union(where(inE('client_brand')).count(),
        count()).
  fold().
  map{ it.get()[0]/it.get()[1]} 
stephen mallette
  • Hi Stephen, thank you! Unfortunately I am getting this error when I try to run the first one: groovy.lang.MissingMethodException: No signature of method: org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.DefaultGraphTraversal.math() is applicable for argument types: (java.lang.String) values: [a/b] Possible solutions: max(), max(groovy.lang.Closure), with(groovy.lang.Closure), max(java.util.Comparator), each(groovy.lang.Closure), each(groovy.lang.Closure) – Veronica Mar 21 '19 at 19:23
  • 'brand' has properties name, id, description, and entityNameType, and incoming edge 'client_brand', and 'client' has the same properties. – Veronica Mar 21 '19 at 19:24
  • `math()` step was introduced in TinkerPop 3.3.0 - perhaps you are on an old version? daniel kuppitz mentioned yet another way to do this with `math()` step which i just added to my answer as he didn't feel like adding another one. if things still don't work once you sorted out the issue with `math()` i suggest providing some sample data if you have other issues to address: - here is an example https://stackoverflow.com/questions/51388315/gremlin-choose-one-item-at-random – stephen mallette Mar 21 '19 at 19:34
-3

This is what worked for me:

g.V().limit(1).project('client_brand_count','total_brands')
.by(g.V().hasLabel('brand')
.where(__.inE('client_brand').count().is(gt(0))).count())
.by(g.V().hasLabel('brand').count())
.map{it.get().values()[0] / it.get().values()[1]}
.project('brand_client_pct')
Veronica
  • well, you can use closures, but they should be a last resort, which is why they were not part of my suggestions or kuppitz's suggestions. a closure reduces the portability of your code, as closures do not work in all environments. i also think that if you still intend to use lambdas you can massively reduce the complexity of your proposed solution. i updated my answer again with a lambda solution that should be much more direct. consider doing a `profile()` on both to see the difference in what's happening. – stephen mallette Mar 21 '19 at 21:56