
Background: I'm trying to implement a time-series versioned DB using this approach, with Gremlin (TinkerPop v3).

[Image: graph model with identity nodes (blue) linked to state nodes (red) by 'state' edges]

I want to get the latest state node (in red) for a given identity node (in blue). The two are linked by a 'state' edge that carries a timestamp range. I want to return a single aggregated object containing the id (cid) from the identity node and all the properties from the state node, without having to list those properties explicitly. (8640000000000000 is my way of indicating that there is no 'to' date, i.e. the edge is current, which differs slightly from the image shown.)

I've got this far:

:> g.V().hasLabel('product').
     as('cid').
     outE('state').
     has('to', 8640000000000000).
     inV().
     as('name').
     as('price').
     select('cid', 'name','price').
     by('cid').
     by('name').
     by('price')

=>{cid=1, name="Cheese", price=2.50}
=>{cid=2, name="Ham", price=5.00}

As you can see, I have to list the properties of the 'state' node, in this example the name and price of a product. The same pattern will apply to any domain object, so I don't want to list the properties every time. I could run a query beforehand to fetch the property keys, but I shouldn't need two queries and the overhead of two round trips. I've looked at 'aggregate', 'union', 'fold' etc., but nothing seems to do this.

Any ideas?

===================

Edit: Based on Daniel's answer (which doesn't quite do what I want at the moment), I'm going to use his example graph. In the 'modern' graph, person vertices are linked to software vertices by 'created' edges. If I run:

> g.V().hasLabel('person').valueMap()
==>[name:[marko], age:[29]]
==>[name:[vadas], age:[27]]
==>[name:[josh], age:[32]]
==>[name:[peter], age:[35]]

then the result is a list of entities with their properties. Assuming that a person can only ever create one piece of software (although hopefully we will see how this could later be opened up to lists of created software), I want to include the 'lang' property of the created software in the returned entity to get:

> <run some query here>
==>[name:[marko], age:[29], lang:[java]]
==>[name:[vadas], age:[27], lang:[java]]
==>[name:[josh], age:[32], lang:[java]]
==>[name:[peter], age:[35], lang:[java]]

The best suggestion so far produces the following:

> g.V().hasLabel('person').union(identity(), out("created")).valueMap().unfold().group().by {it.getKey()}.by {it.getValue()}
==>[name:[marko, lop, lop, lop, vadas, josh, ripple, peter], lang:[java, java, java, java], age:[29, 27, 32, 35]]

I hope that's clearer. If not please let me know.

John Stephenson

3 Answers


Since you didn't provide a sample graph, I'll use TinkerPop's toy graph to show how it's done.

Assume you want to merge marko and lop:

gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).valueMap()
==>[name:[marko],age:[29]]
gremlin> g.V(1).out("created").valueMap()
==>[name:[lop],lang:[java]]

Note that there are two name properties, and in theory you won't be able to predict which name makes it into your merged result; however, that doesn't seem to be an issue in your graph.

Get the properties for both vertices:

gremlin> g.V(1).union(identity(), out("created")).valueMap()
==>[name:[marko],age:[29]]
==>[name:[lop],lang:[java]]

Merge them:

gremlin> g.V(1).union(identity(), out("created")).valueMap().
           unfold().group().by(select(keys)).by(select(values))
==>[name:[lop],lang:[java],age:[29]]
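
For intuition only, the merge can be mimicked on plain Java maps, outside of TinkerPop. This is a rough sketch: `MergeSketch` and `mergeLastWins` are made-up names, and it assumes that on a key collision the later map wins. That matches the output above, but as noted it is not something Gremlin guarantees.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of TinkerPop: mimics merging the property
// maps of one identity vertex and one neighbouring vertex.
final class MergeSketch {

    // Copies every entry of every map into one result map.
    // Assumption: on a key collision the later map overwrites the earlier one.
    static Map<String, Object> mergeLastWins(List<Map<String, Object>> maps) {
        Map<String, Object> merged = new LinkedHashMap<>();
        for (Map<String, Object> m : maps) {
            merged.putAll(m);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Values are single-element lists, as valueMap() returns them.
        Map<String, Object> marko = new LinkedHashMap<>();
        marko.put("name", Arrays.asList("marko"));
        marko.put("age", Arrays.asList(29));

        Map<String, Object> lop = new LinkedHashMap<>();
        lop.put("name", Arrays.asList("lop"));
        lop.put("lang", Arrays.asList("java"));

        System.out.println(mergeLastWins(Arrays.asList(marko, lop)));
    }
}
```

If the colliding key matters in a real graph, it is probably safer to request only the keys needed from the second vertex, as the `valueMap("lang")` call in the update below does.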

UPDATE

Thank you for the added sample output. That makes it a lot easier to come up with a solution (although I think your output contains errors; vadas didn't create anything).

gremlin> g.V().hasLabel("person").
           filter(outE("created")).map(
             union(valueMap(),
                   outE("created").limit(1).inV().valueMap("lang")).
             unfold().group().by {it.getKey()}.by {it.getValue()})
==>[name:[marko], lang:[java], age:[29]]
==>[name:[josh], lang:[java], age:[32]]
==>[name:[peter], lang:[java], age:[35]]
Daniel Kuppitz
  • Thanks for this! I think the identity() call was the one I had missed in the docs. Unfortunately though, I tried this on both my graph and the 'modern' graph and I get 'No such property: keys for class: groovysh_evaluate' in both. Any ideas? All queries up to the last one work out as per your answer above. – John Stephenson Feb 14 '17 at 13:38
  • Which TinkerPop version are you using? It's probably older than the one I've used for testing. IIRC older versions had `.mapKeys()` and `.mapValues()`, try to use those instead. – Daniel Kuppitz Feb 14 '17 at 14:10
  • ah yes... Tinkerpop 3.0.1-incubating as I'm using [titanDb](https://github.com/awslabs/dynamodb-titan-storage-backend). You were right about mapKeys() etc but it's not a straight replace as I get: gremlin> g.V(1).union(identity(), out("created")).valueMap().unfold().group().by(select(mapKeys())).by(select(mapValues())) No signature of method: static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.select() is applicable for argument types: (org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.DefaultGraphTraversal) values: [[MapKeysStep]] – John Stephenson Feb 14 '17 at 14:56
  • Also tried 'by(mapKeys()).by(mapValues())' after reading the issue that led to the origins of 'select', but no joy. I'm not clear on the types that are expected or returned by these operators and I'm not sure where to look - the docs don't seem to go to this level of detail, or else I'm missing something. Any ideas? (Sorry for the hassle! :-( ) – John Stephenson Feb 14 '17 at 21:23
  • Yea, the 3.0.1 implementation was really weak in this area. I can't find a way w/o lambdas. However, w/ lambdas it's: `....group().by {it.getKey()}.by {it.getValue()}`. – Daniel Kuppitz Feb 15 '17 at 02:57
  • Thanks. That's running but I'm not getting the expected results from the original question. If I change 'g.V(1)' for g.V().hasLabel('person') I get '==>[name:[marko, lop, lop, lop, vadas, josh, ripple, peter], lang:[java, java, java, java], age:[29, 27, 32, 35]]' which is a list of values against each property key. This isn't what I want. I'll update the original question to hopefully make this clearer. – John Stephenson Feb 15 '17 at 10:38
  • Thanks for your continued help on this... I'm not in a position to try this ATM (hopefully later this week or next). When I do I will definitely make it as an answer/accepted. It certainly looks promising! :-) – John Stephenson Mar 01 '17 at 10:21
  • I've finally managed to revisit this! (My apologies). When I run your edited code I get a list of lists for the properties: ==>[name:[[marko]],lang:[[java]],age:[[29]]] ==>[name:[[josh]],lang:[[java]],age:[[32]]] ==>[name:[[peter]],lang:[[java]],age:[[35]]] I added [0] to the last getValue() and it's all golden! :-) MANY THANKS! – John Stephenson Oct 31 '17 at 12:12
  • Not working for multiple columns. require local function – Vinit Siriah Aug 12 '22 at 22:05

Merging edge and vertex properties using the Gremlin Java DSL:

 // userDbId, Edges.TWEETS and converter come from my own application code
 g.V().has("User", "id", userDbId).outE(Edges.TWEETS)
    .union(__.identity().valueMap(), __.inV().valueMap())
    .unfold().group().by(__.select(Column.keys)).by(__.select(Column.values))
    .map(v -> converter.toTweet((Map) v.get())).toList();
youhans

Thanks to Daniel Kuppitz and youhans for their answers, which gave me the basic idea for solving this. However, I later found that the solution does not work for multiple rows: a local step is required to handle them. The modified Gremlin query looks like this:

g.V()
.local(
        __.union(__.valueMap(), __.outE().inV().valueMap())
        .unfold().group().by(__.select(Column.keys)).by(__.select(Column.values))
)

    

This limits the scope of the union and group steps to a single row.
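
To illustrate why the scope matters, here is a small plain-Java sketch that does not use TinkerPop at all. The names `LocalScopeSketch`, `groupGlobally` and `groupPerRow` are invented for this example, and property values are simplified to scalars. Grouping over the whole stream piles values from every row under the same key (the effect shown in the question's edit), while grouping per row yields one map per start vertex, which is roughly what wrapping the steps in local(...) achieves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not TinkerPop code: each "row" is the list of
// property maps produced for one start vertex (its own map plus a neighbour's).
final class LocalScopeSketch {

    // Global grouping: every entry of every row lands in one shared map.
    static Map<String, List<Object>> groupGlobally(List<List<Map<String, Object>>> rows) {
        Map<String, List<Object>> grouped = new LinkedHashMap<>();
        for (List<Map<String, Object>> row : rows) {
            collect(row, grouped);
        }
        return grouped;
    }

    // Per-row grouping: a fresh map for each row.
    static List<Map<String, List<Object>>> groupPerRow(List<List<Map<String, Object>>> rows) {
        List<Map<String, List<Object>>> result = new ArrayList<>();
        for (List<Map<String, Object>> row : rows) {
            Map<String, List<Object>> grouped = new LinkedHashMap<>();
            collect(row, grouped);
            result.add(grouped);
        }
        return result;
    }

    // Appends each value of each map in the row to the list stored under its key.
    static void collect(List<Map<String, Object>> row, Map<String, List<Object>> target) {
        for (Map<String, Object> m : row) {
            for (Map.Entry<String, Object> e : m.entrySet()) {
                target.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
            }
        }
    }

    // Builds a two-entry property map for the demo.
    static Map<String, Object> props(String k1, Object v1, String k2, Object v2) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(k1, v1);
        m.put(k2, v2);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Object> lop = props("name", "lop", "lang", "java");
        List<Map<String, Object>> row1 = Arrays.asList(props("name", "marko", "age", 29), lop);
        List<Map<String, Object>> row2 = Arrays.asList(props("name", "peter", "age", 35), lop);
        List<List<Map<String, Object>>> rows = Arrays.asList(row1, row2);

        System.out.println(groupGlobally(rows)); // one map for all rows
        System.out.println(groupPerRow(rows));   // one map per row
    }
}
```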

If you can work with a custom DSL, you can add a step in Java like this one:

public default GraphTraversal<S, LinkedHashMap> unpackMaps() {
    // Flattens one level of nesting: an entry whose value is itself a map is
    // replaced by that map's entries; every other entry is copied unchanged.
    return map(x -> {
        LinkedHashMap mapSource = (LinkedHashMap) x.get();
        LinkedHashMap mapDest = new LinkedHashMap();

        for (Object key : mapSource.keySet()) {
            Object obj = mapSource.get(key);
            if (obj instanceof LinkedHashMap) {
                // Nested map: lift its entries into the result.
                mapDest.putAll((LinkedHashMap) obj);
            } else {
                // Plain value: keep it under its original key.
                mapDest.put(key, obj);
            }
        }

        return mapDest;
    });
}

and use it like this:

g.V().as("s")

.valueMap().as("value_map_0")
.select("s").outE("INFO1").inV().valueMap().as("value_map_1")
.select("s").outE("INFO2").inV().valueMap().as("value_map_2")
.select("s").outE("INFO3").inV().valueMap().as("value_map_3")

.select("s").local(__.outE("INFO1").count()).as("value_1")
.select("s").outE("INFO1").inV().values("name").as("value_2")

.project("val_map1","val_map2","val_map3","val1","val2")
.by(__.select("value_map_1"))
.by(__.select("value_map_2"))
.by(__.select("value_map_3"))
.by(__.select("value_1"))
.by(__.select("value_2"))
.unpackMaps()

This results in rows of the form:

 map1_val1, map1_val2, ..., map2_val1, map2_val2, ..., value1, value2

This can handle a mix of values and valueMaps in a natural Gremlin way.
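
The flattening itself can be tried without a graph. The sketch below is a stand-alone approximation of what unpackMaps() does to each row, written against plain maps; `UnpackSketch` and `unpack` are hypothetical names, and unlike the step above it treats any `Map` value (not only `LinkedHashMap`) as nested.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone version of the flattening done inside unpackMaps(),
// so it runs without TinkerPop on the classpath.
final class UnpackSketch {

    // Replaces every entry whose value is a Map with that map's own entries;
    // all other entries are copied unchanged. Only one level is flattened.
    static Map<Object, Object> unpack(Map<?, ?> source) {
        Map<Object, Object> dest = new LinkedHashMap<>();
        for (Map.Entry<?, ?> e : source.entrySet()) {
            if (e.getValue() instanceof Map) {
                dest.putAll((Map<?, ?>) e.getValue());
            } else {
                dest.put(e.getKey(), e.getValue());
            }
        }
        return dest;
    }

    public static void main(String[] args) {
        // Stand-in for one valueMap() result.
        Map<String, Object> inner = new LinkedHashMap<>();
        inner.put("name", Arrays.asList("lop"));
        inner.put("lang", Arrays.asList("java"));

        // Stand-in for one project() row mixing a map and a plain value.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("val_map1", inner);
        row.put("val1", 3L);

        System.out.println(unpack(row)); // prints {name=[lop], lang=[java], val1=3}
    }
}
```

Note that keys from different nested maps can collide after flattening, in which case the later entry replaces the earlier one.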

Vinit Siriah