
I am trying to return something like

{
  "label1" : ["prop1","prop2"],
  "label2" : ["prop3","prop4"],
  "label2" : ["prop1","prop3"]
}

etc., where the keys are vertex label values and each array holds the property keys for vertices with that label.

I can get a list of labels and I can get a list of properties, but I can't combine them into a single object. I could potentially run the two queries and merge the two arrays afterwards, but something like

g.V().valueMap().select(keys).dedup();

only returns properties where they exist, so if a vertex label has no properties at all, the array returned by this is a different size than the one from

g.V().label().dedup();

This is using Gremlin syntax (TinkerPop 3). Thanks.
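The two-query fallback mentioned above can be sketched in plain Python (the function name and sample data are hypothetical, just illustrating the merge): start every label with an empty list, then fill in whatever properties the second query found, so labels with no properties still appear with `[]` and the two result sets stay the same size.

```python
def merge_schema(labels, props_by_label):
    # Seed every label with an empty property list, e.g. from g.V().label().dedup()
    schema = {label: [] for label in labels}
    # Overlay the property keys found per label, e.g. from a valueMap/keys query;
    # labels missing from props_by_label keep their empty list.
    for label, props in props_by_label.items():
        schema[label] = sorted(set(props))
    return schema

labels = ["label1", "label2", "label3"]
props_by_label = {"label1": ["prop1", "prop2"], "label2": ["prop3", "prop4"]}
print(merge_schema(labels, props_by_label))
# label3 appears with an empty list even though it has no properties
```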

radder5

1 Answer


I'm assuming that you're trying to get sort of a schema definition. Note that this will be a fairly expensive traversal as you have to iterate all vertices to do this:

gremlin> g.V().
......1>   group().
......2>     by(label).
......3>     by(properties().
......4>        label().
......5>        dedup().
......6>        fold())
==>[software:[name,lang],person:[name,age]]
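To make the traversal's semantics concrete, here is a plain-Python illustration of what `group().by(label).by(properties().label().dedup().fold())` computes, run over a toy in-memory stand-in for the graph (the vertex data below is invented, loosely echoing TinkerPop's "modern" sample graph):

```python
# Group vertex property keys by vertex label, deduplicating keys within
# each label -- the same shape as the Gremlin group() traversal's result.
vertices = [
    {"label": "person",   "properties": {"name": "marko", "age": 29}},
    {"label": "person",   "properties": {"name": "vadas", "age": 27}},
    {"label": "software", "properties": {"name": "lop", "lang": "java"}},
]

schema = {}
for v in vertices:
    keys = schema.setdefault(v["label"], [])
    for k in v["properties"]:
        if k not in keys:   # dedup(), preserving first-seen order
            keys.append(k)

print(schema)  # {'person': ['name', 'age'], 'software': ['name', 'lang']}
```

Note that a vertex label whose vertices carry no properties would still get an entry here (with an empty list), which is why the grouped traversal avoids the size-mismatch problem in the question.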
stephen mallette
  • Perfect thanks. Yes, does seem rather expensive, but only going to be run once when I spin up an API and cache it thereafter. Any suggested optimisations I could do with indexes? – radder5 Feb 25 '18 at 08:23
  • Indices won't help in this case. No matter what you do, you can't gather the whole schema without iterating the entire graph, so you're stuck. I think that this traversal would work in OLAP, so you could execute it over Spark if your graph is large. You don't say what graph database you use, but you might check whether its APIs enable more direct access to the elements of the schema, so that you didn't have to iterate through the whole graph to get it. – stephen mallette Feb 25 '18 at 16:03
  • I'm using JanusGraph with an HBase storage backend and a Lucene index (for PoC). I may well swap out the latter for Solr in production. Thanks – radder5 Feb 26 '18 at 19:57
  • In case there are millions of records, is there any way to speed it up? – Chitresh goyal Nov 23 '22 at 05:46
  • As mentioned in my comment above, you would use spark-gremlin/OLAP if you had a really large graph. – stephen mallette Nov 29 '22 at 11:20