
I have a graph similar to the one provided here. I have simplified it so that airports are vertices and each edge is a person travelling through those airports. I want to find the number of people who have travelled between two airports, from b to f, without passing through airport d. I also want to order the results from highest to lowest traffic.

Sample Graph: https://gremlify.com/bgdnijf9xs6

If the question above is not clear, here it is in a simpler form:

  • Find the paths between two vertices that avoid a given intermediate vertex (any vertex between them can be chosen). Sort the paths from highest to lowest traffic, based on an edge property (each edge carries a property whose value is unique to one person).

To identify a person, each edge has a uniquename property. If the uniquename is the same on several edges, it is the same person travelling towards a destination. So edges with the same uniquename along a -> b -> c represent one person travelling that route.
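
To make the uniquename rule concrete, here is a minimal plain-Python sketch (not Gremlin) that chains each person's edges into one itinerary. The edges are copied from the sample graph for three people only; the names `EDGES` and `itineraries` are made up for this illustration, and the sketch assumes every person makes a single linear trip that visits each airport at most once.

```python
from collections import defaultdict

# A few (from, to, uniquename) edges copied from the sample graph in the question.
EDGES = [
    ("a", "b", "p1"), ("b", "c", "p1"), ("c", "d", "p1"),
    ("d", "e", "p1"), ("e", "f", "p1"), ("f", "g", "p1"),
    ("b", "c", "p21"), ("c", "b1", "p21"), ("b1", "e", "p21"),
    ("e", "f", "p21"), ("f", "g", "p21"),
    ("b", "c", "p3"), ("c", "b2", "p3"), ("b2", "e", "p3"),
]


def itineraries(edges):
    """Chain each person's legs into one ordered list of airports.

    Assumption (not stated in the question): one linear trip per person,
    with no airport visited twice.
    """
    legs = defaultdict(dict)  # person -> {from_airport: to_airport}
    for src, dst, person in edges:
        legs[person][src] = dst

    result = {}
    for person, hops in legs.items():
        # The start is the only departure airport that is never a destination.
        start = (set(hops) - set(hops.values())).pop()
        route = [start]
        while route[-1] in hops:
            route.append(hops[route[-1]])
        result[person] = route
    return result


routes = itineraries(EDGES)
for person, route in sorted(routes.items()):
    print(person, " -> ".join(route))
```

Here p1 is the only one of the three who passes through d, and p3 stops at e, so neither would count as b-to-f traffic that avoids d.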

For the path query I have

g.V()
 .has("name", 'b')
 .repeat(
    out('person').not(__.has('name', 'd'))
 )
 .until(has('name', 'f'))
 .path()
 .dedup()
 .fold()

The output should be the following:

b -> c -> b1 -> e -> f   count(3) // 3 people travelled the full path
b -> c -> b2 -> e -> f   count(2) // 2 people travelled the full path
b -> c -> b3 -> e -> f   count(1) // 1 ...

Or, if you want to go from a to g:

a -> b -> c -> b1 -> e -> f -> g   count(3) // 3 people travelled the full path
a -> b -> c -> b2 -> e -> f -> g   count(2) // 2 people travelled the full path
a -> b -> c -> b3 -> e -> f -> g   count(1) // 1 ...

What I have tried so far: https://gremlify.com/fz54u5jiszo

Edit: the latest query I have come up with:

g.V().has('name', 'c').as('c')
    .sideEffect(
        V().has('name', 'a').aggregate('a')
        .V().has('name', 'b').aggregate('b')
        .V().has('name', 'e').aggregate('e')
        .V().has('name', 'f').aggregate('f')
        .V().has('name', 'g').aggregate('g')
    )
    .barrier()
    
    // Get All users From Start To Finish
    .sideEffect(
        select('a').unfold().outE().where(inV().has('name', 'b')).dedup().aggregate('before_users')
    )
    .sideEffect(
        select('b').unfold().outE().where(inV().has('name', 'c')).dedup().aggregate('before_users')
    )
    .sideEffect(
        select('before_users').unfold().fold().unfold()
        .groupCount()
        .by(values('uniquename').fold())
        .unfold()
        .where(select(values).is(eq(2)))
        .select(keys)
        .unfold()
        .aggregate('unique_before_users')
    )

    .sideEffect(
        select('e').unfold().outE().where(inV().has('name', 'f')).dedup().aggregate('after_users')
    )
    .sideEffect(
        select('f').unfold().outE().where(inV().has('name', 'g')).dedup().aggregate('after_users')
    )
    .sideEffect(
        select('after_users').unfold().fold().unfold()
        .groupCount()
        .by(values('uniquename').fold())
        .unfold()
        .where(select(values).is(eq(2)))
        .select(keys)
        .unfold()
        .aggregate('unique_after_users')
    )
    
    .sideEffect(
        project('').
        union(select('unique_after_users').unfold(), select('unique_before_users').unfold())
        .groupCount()
        .unfold()
        .where(select(values).is(eq(2)))
        .select(keys)
        .unfold()
        .aggregate('unique_users')
    )
    .barrier()
    
    // Start to analyze traffic based on our criteria
    // not through d
    .sideEffect(
        identity()
        .repeat(
          outE()
          .where(within('unique_users')).by('uniquename').by()
          .inV()
          .not(__.has('name', 'd'))
        )
         .until(has('name', 'e'))
         .path()
         .aggregate('allpath')
         
    )
    .select('allpath')
    .unfold()
    .map(
        project('path', 'count')
        .by(
            identity()
        )
        .by(
            identity().unfold().filter(where(hasLabel('airport'))).fold()
        )
    )
    .groupCount()
    .by('count')
        

Statements to recreate the sample graph:

g.addV('airport').as('1').property(single, 'name', 'a').
  addV('airport').as('2').property(single, 'name', 'b').
  addV('airport').as('3').property(single, 'name', 'c').
  addV('airport').as('4').property(single, 'name', 'd').
  addV('airport').as('5').property(single, 'name', 'e').
  addV('airport').as('6').property(single, 'name', 'f').
  addV('airport').as('7').property(single, 'name', 'g').
  addV('airport').as('8').property(single, 'name', 'b1').
  addV('airport').as('9').property(single, 'name', 'b2').
  addV('airport').as('10').property(single, 'name', 'b3').
  addE('person').from('1').to('2').property('uniquename', 'p1').
  addE('person').from('1').to('2').property('uniquename', 'p2').
  addE('person').from('2').to('3').property('uniquename', 'p3').
  addE('person').from('2').to('3').property('uniquename', 'p1').
  addE('person').from('2').to('3').property('uniquename', 'p4').
  addE('person').from('2').to('3').property('uniquename', 'p21').
  addE('person').from('2').to('3').property('uniquename', 'p2').
  addE('person').from('2').to('3').property('uniquename', 'p22').
  addE('person').from('2').to('3').property('uniquename', 'p31').
  addE('person').from('3').to('4').property('uniquename', 'p1').
  addE('person').from('3').to('8').property('uniquename', 'p21').
  addE('person').from('3').to('8').property('uniquename', 'p2').
  addE('person').from('3').to('8').property('uniquename', 'p22').
  addE('person').from('3').to('9').property('uniquename', 'p3').
  addE('person').from('3').to('10').property('uniquename', 'p4').
  addE('person').from('3').to('9').property('uniquename', 'p31').
  addE('person').from('4').to('5').property('uniquename', 'p1').
  addE('person').from('5').to('6').property('uniquename', 'p1').
  addE('person').from('5').to('6').property('uniquename', 'p21').
  addE('person').from('5').to('6').property('uniquename', 'p2').
  addE('person').from('5').to('6').property('uniquename', 'p22').
  addE('person').from('6').to('7').property('uniquename', 'p1').
  addE('person').from('6').to('7').property('uniquename', 'p21').
  addE('person').from('6').to('7').property('uniquename', 'p2').
  addE('person').from('6').to('7').property('uniquename', 'p22').
  addE('person').from('8').to('5').property('uniquename', 'p21').
  addE('person').from('8').to('5').property('uniquename', 'p2').
  addE('person').from('8').to('5').property('uniquename', 'p22').
  addE('person').from('9').to('5').property('uniquename', 'p3').
  addE('person').from('10').to('5').property('uniquename', 'p4')
Rajesh Paudel
  • While Gremlify is nice, it helps people provide tested answers if you can include the `addV` and `addE` steps to create the test graph as part of the question. You should be able to just export that from Gremlify. This also means that the entire question is self contained should the Gremlify graph ever get deleted. Also, in your sample output you include `d` - but the question suggests you want to avoid `d`. Can you please clarify that point? – Kelvin Lawrence Aug 17 '22 at 13:57
  • Thank you @KelvinLawrence. I have added the sample graph statements in the question – Rajesh Paudel Aug 17 '22 at 13:58
  • Did you see my other question above about `d` being in the results? I'm not clear on what you are wanting to do there. – Kelvin Lawrence Aug 18 '22 at 20:51
  • @KelvinLawrence sorry, that was my mistake. I just updated the question to a more proper format – Rajesh Paudel Aug 19 '22 at 08:24
  • I think it should be possible to greatly simplify the query using `sack`. I will try to add an answer when I get a few spare minutes. – Kelvin Lawrence Aug 21 '22 at 15:16

1 Answer

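Using the Gremlin console, here is a query that uses sack to collect the uniquename values as the query progresses. The sack manipulation is a little odd looking as you cannot sack(sum) when dealing with strings. The where step then keeps only the walks whose sack holds a single distinct uniquename, that is, walks made entirely by one person.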

gremlin> g.withSack([]).V().
......1>   has("name", 'b').
......2>   repeat(outE('person').
......3>          sack(assign).
......4>            by(union(sack().unfold(),values('uniquename')).fold()).
......5>          inV().has('name', neq('d'))).
......6>   until(has('name', 'f')).
......7>   where(sack().unfold().dedup().count().is(1)).
......8>   path().
......9>     by('name').
.....10>     by('uniquename')  

When run against the sample graph, this yields one path per person who travelled the whole route:

==>[b,p21,c,p21,b1,p21,e,p21,f]
==>[b,p2,c,p2,b1,p2,e,p2,f]
==>[b,p22,c,p22,b1,p22,e,p22,f]
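
The results above are one row per person, so the per-route count and the highest-to-lowest ordering asked for in the question still need a final grouping step. As a cross-check outside Gremlin, here is a plain-Python sketch that replays the sample edges and applies the same rule as the sack filter (one uniquename for the whole walk), then counts people per airport route. The names `LEGS` and `route_traffic` are made up for this illustration. Unlike the query, the sketch drops mismatched walks as it goes rather than filtering at the end, and it adds a guard against revisiting an airport.

```python
from collections import Counter

# Sample graph from the question, written as {(from, to): [uniquename, ...]}.
LEGS = {
    ("a", "b"): ["p1", "p2"],
    ("b", "c"): ["p3", "p1", "p4", "p21", "p2", "p22", "p31"],
    ("c", "d"): ["p1"],
    ("c", "b1"): ["p21", "p2", "p22"],
    ("c", "b2"): ["p3", "p31"],
    ("c", "b3"): ["p4"],
    ("d", "e"): ["p1"],
    ("b1", "e"): ["p21", "p2", "p22"],
    ("b2", "e"): ["p3"],
    ("b3", "e"): ["p4"],
    ("e", "f"): ["p1", "p21", "p2", "p22"],
    ("f", "g"): ["p1", "p21", "p2", "p22"],
}


def route_traffic(start, end, avoid=None):
    """Count, per airport route, the people who flew every leg of it."""
    counts = Counter()
    # Each stack entry is (airports visited so far, person on this walk or None).
    stack = [([start], None)]
    while stack:
        route, person = stack.pop()
        here = route[-1]
        if here == end:
            counts[tuple(route)] += 1
            continue
        for (src, dst), people in LEGS.items():
            # Skip legs that do not leave from here, that enter the avoided
            # airport, or that revisit an airport (cycle guard, an addition).
            if src != here or dst == avoid or dst in route:
                continue
            for name in people:
                # Same rule as the sack filter: one uniquename for the whole walk.
                if person is None or name == person:
                    stack.append((route + [dst], name))
    return counts.most_common()  # highest traffic first


for route, n in route_traffic("b", "f", avoid="d"):
    print(" -> ".join(route), f"count({n})")
```

On the sample data this prints a single line, `b -> c -> b1 -> e -> f count(3)`, which matches the three paths returned by the query. The travellers through b2 and b3 have no edge from e to f, so those routes do not appear.
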
Kelvin Lawrence