I have a simple graph structure as follows:

Event-----HappenedAt------>Location

Event and Location vertices both have a name and an id property. The HappenedAt edge has a date property (in ticks) called 'on'.

I am having trouble writing a query that, starting from a known location, returns a list of all distinct events that happened at that location, a count of the number of times each event happened (a group count of the happenedAt edges), and the most recent (max) date on which each event occurred.

Ideally the output would look as follows for a given location: Event Id, count, most recent/max date

Event A, 2, 5/27/2018

Event B, 2, 7/1/2018

I can get the group count and the max date just fine by themselves, but I can't seem to combine them into a single query that produces that output.

Probably simple, but I am new to Gremlin and Graph databases in general. Any help would be greatly appreciated.

g.addV('event').property('id','e1').property('name','Event A').as('e1').
  addV('event').property('id','e2').property('name','Event B').as('e2').
  addV('location').property('id','l1').property('name','Location 1').as('l1').
  addV('location').property('id','l2').property('name','Location 2').as('l2').
  addE('happenedAt').from('e1').to('l1').property('on','5/27/2018').
  addE('happenedAt').from('e1').to('l1').property('on','4/1/2018').
  addE('happenedAt').from('e2').to('l1').property('on','6/5/2018').
  addE('happenedAt').from('e2').to('l1').property('on', '7/1/2018').iterate()
stephen mallette
xinunix
  • When asking questions about Gremlin it is always helpful to people trying to answer your question if you provide a script that will create a sample graph - see here for an example: https://stackoverflow.com/questions/51388315/gremlin-choose-one-item-at-random – stephen mallette Sep 24 '18 at 15:41
  • updated to include script – xinunix Sep 24 '18 at 15:51

1 Answer

Here's the first thing that came to mind for me - there may be other solutions:

gremlin> g.V().has('id','l1').
......1>   inE('happenedAt').
......2>   group().
......3>     by(outV().values('name')).
......4>   unfold().
......5>   project('event','count','mostRecent').
......6>     by(select(keys)).
......7>     by(select(values).count(local)).
......8>     by(select(values).unfold().values('on').max())
==>[event:Event B,count:2,mostRecent:7/1/2018]
==>[event:Event A,count:2,mostRecent:5/27/2018]

I group() the edges at lines 2-3 to get the unique events, so at the end of line 3 we have a Map where the key is the event name and the value is a list of the "happenedAt" edges for that event. That's the raw data needed to calculate everything else.
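For readers new to group(), the shape of that Map can be modelled in plain Python. This is only a rough sketch of the semantics, not Gremlin: the tuples standing in for edges and the names `edges` and `grouped` are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical stand-ins for the four happenedAt edges pointing at location l1,
# written as (event name, 'on' value) to mirror the sample graph in the question.
edges = [
    ("Event A", "5/27/2018"),
    ("Event A", "4/1/2018"),
    ("Event B", "6/5/2018"),
    ("Event B", "7/1/2018"),
]

# Rough equivalent of group().by(outV().values('name')):
# the key is the event name, the value is the list of edges for that event.
grouped = defaultdict(list)
for edge in edges:
    event_name, _on = edge
    grouped[event_name].append(edge)

for name, edge_list in grouped.items():
    print(name, edge_list)
```

Each key ends up holding the whole edges rather than a count, which is why the later steps can still read the 'on' property from them.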

At line 4, I unfold() the map into its entries, and at line 5 I project() each entry into a new map with the data structure you requested. Note that in lines 7-8, select(values) returns a List of edges. That's why we use local in count(local): we want to count the items in the List, not the List itself. Similarly, we unfold() the List in line 8 to pull the "on" property values off the edges themselves and find the max().
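The unfold() and project() steps can be modelled the same way in plain Python. Again this is a hedged sketch rather than Gremlin: the `grouped` dict is a hypothetical stand-in for the output of the group() step (keeping only the 'on' values), and `parse` is a helper introduced here that assumes month/day/year strings as in the sample script.

```python
from datetime import datetime

# Hypothetical result of the group() step: event name -> 'on' values of its edges.
grouped = {
    "Event A": ["5/27/2018", "4/1/2018"],
    "Event B": ["6/5/2018", "7/1/2018"],
}

def parse(on):
    # Assumption: 'on' is a month/day/year string, as in the sample script.
    return datetime.strptime(on, "%m/%d/%Y")

rows = []
for event, ons in grouped.items():           # unfold(): one map entry at a time
    rows.append({
        "event": event,                      # by(select(keys))
        "count": len(ons),                   # by(select(values).count(local))
        "mostRecent": max(ons, key=parse),   # by(select(values).unfold().values('on').max())
    })

for row in rows:
    print(row)

# Comparing dates as text orders them character by character, not by date.
print(max(["5/27/2018", "10/1/2018"]))             # 5/27/2018
print(max(["5/27/2018", "10/1/2018"], key=parse))  # 10/1/2018
```

The last two lines show why the stored type of 'on' matters: as text, "5/27/2018" sorts above "10/1/2018". With numeric tick values, as described in the question, a maximum compares numbers and this concern should not arise.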

stephen mallette
  • Outstanding, thank you so much. Definitely would have floundered around for a long time before I would have come to that on my own. Now I need to really dive in and understand fold/unfold and project so that I can put a query like this together on my own in the future. Thanks again for the quick response! – xinunix Sep 24 '18 at 18:16