I am using gremlinpython to connect to a CosmosDB graph and would like to be able to directly add a vertex using a GraphSON formatted dictionary. Specifically, I would like to avoid having to dynamically build a gremlin query such as:

"g.addV('person').property(...)..."

and instead run something like:

my_dict = {'id':'something', 'label':'person', 'outE':{}, 'properties':{}}
_gremlin_insert_vertex = "g.addV('person').use_my_graphson_dict({})".format(my_dict)
callback = client.submitAsync(_gremlin_insert_vertex)

Or something to that effect. The Azure portal has a JSON representation of vertices from a query I run (e.g. "g.V()"), but I would like to be able to get that into Python using gremlinpython, make updates, then send the JSON dict back to update or add a vertex. I can't seem to find any documentation on how to convert between a GraphSON dict and gremlin objects or queries.

Thursdays Coming

2 Answers

There is simply no such API in Gremlin. It has no step that accepts GraphSON or a dictionary (a Map in Java) and natively converts it to property() steps. There has been considerable discussion of this topic in the TinkerPop community over the years, as the user convenience of such a step is arguably high, especially in the context that you describe. Unfortunately, a Map doesn't fit into the API as nicely as it would initially appear: it cannot properly set multi-properties unless the step signature accepts a Map<Object,List<Object>> (i.e. in Python, a dictionary where the key is a string or T and the value is a list of arbitrary objects), which is more complex to construct and reason about. Moreover, that API doesn't account well for meta-properties when taken in the general context of how those are set. There are other arguments against it as well, but those are the ones that tend to stick out in my mind.

As for a step accepting GraphSON itself (which I suppose would mitigate some of the issues I mentioned above with multi/meta-properties), I don't think that has ever been proposed. I'm not sure how that would work, though, as GraphSON is a function of IO operations and the Gremlin language itself has simply never had any knowledge of it. IO is an abstraction far away from Gremlin, and I don't know that it would fit in well there. Most users have also complained about GraphSON's complexity (dictionaries with embedded lists, and so on), and manually constructing GraphSON is non-trivial, so I doubt that many would find such an API appealing. Multi/meta-properties strike again! :)

I'd also say that TinkerPop is very much against constructing strings of Gremlin. You're forced to do that now in CosmosDB as they don't yet support the bytecode API. With that support (something they are working on), you will no longer submit Gremlin as a String value but instead write Gremlin in your favorite native language (in your case Python). So, developing paths that further encourage users to "construct strings", of any sort, GraphSON or Gremlin, will probably be discouraged.

Now, in Python, you could build this method yourself as part of a custom Gremlin DSL which would basically take a Dictionary and convert it to property() calls. As the logic would be specific to your application, you could account for whatever meta/multi-property issues you may or may not have. You can read more about how to build DSLs here and learn more about patterns for implementing in this blog post series: Part I, Part II, and Part III.
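As a rough illustration of the dictionary-to-property() idea, here is a minimal sketch. The helper name `add_properties` and the `RecordingTraversal` stand-in are hypothetical and not part of gremlinpython; the helper only assumes that the object it receives has a chainable `property()` method, as gremlinpython traversals do. List values are treated as multi-properties, and meta-properties are not handled at all.

```python
def add_properties(traversal, props, list_cardinality=None):
    """Chain one property() call per value in `props` onto `traversal`.

    A list or tuple value is treated as a multi-property: each element gets
    its own property() call, prefixed with `list_cardinality` when given.
    """
    for key, value in props.items():
        if isinstance(value, (list, tuple)):
            for item in value:
                if list_cardinality is None:
                    traversal = traversal.property(key, item)
                else:
                    traversal = traversal.property(list_cardinality, key, item)
        else:
            traversal = traversal.property(key, value)
    return traversal


class RecordingTraversal:
    """Hypothetical stand-in for a traversal; it only records the calls."""

    def __init__(self):
        self.calls = []

    def property(self, *args):
        self.calls.append(args)
        return self  # chainable, like a real traversal step


demo = add_properties(
    RecordingTraversal(),
    {"firstName": "Thomas", "nickname": ["Tom", "Neo"]},
    list_cardinality="list",  # placeholder; a real traversal would take a Cardinality value
)
for call in demo.calls:
    print(call)
```

With a provider that accepts bytecode, the same helper could presumably be called as `add_properties(g.addV('person'), my_dict, Cardinality.list_)`, with `Cardinality` imported from `gremlin_python.process.traversal`. That usage is an assumption and has not been run against CosmosDB.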

I think we might see this kind of API native to Gremlin in 4.x, as there is growing support for dropping multi/meta-properties, but until then there haven't been many good ideas.

stephen mallette
  • Very interesting and informative. My initial thought without being able to use the GraphSON dict itself was to construct Gremlin strings as you indicated, as that seems to be the only option discussed in the various CosmosDB documentation and sources. However, it quickly becomes obvious why TinkerPop wouldn't like that, and from a Python perspective, it feels very un-pythonic. – Thursdays Coming Jun 05 '18 at 22:30
  • Additionally as an aside, I've been trying to conceptualize graph based data models in python and how best to pass graph based data between various components of my code. I have created custom Vertex and Edge classes, in which I ended up including methods to dump and load them to/from GraphSON format because that is what CosmosDB seems to indicate in their documentation (e.g. https://learn.microsoft.com/en-us/azure/cosmos-db/gremlin-support#gremlin-wire-format-graphson ) – Thursdays Coming Jun 05 '18 at 22:32
  • It sounds as if I should instead start having these objects use a Gremlin DSL directly for interacting with CosmosDB rather than converting to GraphSON first (once they support this functionality)? I'm going to read through your blog series now, but generally it seems to make sense. Thanks for the help. – Thursdays Coming Jun 05 '18 at 22:32
  • When I mentioned the DSL approach, I'd forgotten that you are using CosmosDB. Note that in that scenario you can't use native Gremlin, because at this time CosmosDB does not support the bytecode request format that Gremlin language variants use. That support is coming, as far as I know. – stephen mallette Jun 06 '18 at 01:31

This is a year late, so by now you've either solved it or no longer care, but for posterity...

You could use the SQL Python client against your graph collection and use its insert-document method to send JSON that has a valid GraphSON structure for a vertex, something like this:

{
    "label": "person",
    "firstName": [{
        "_value": "Thomas",
        "id": "5267ec4b-a39e-4d77-8dea-668cb36307bc"
    }],
    "lastName": [{
        "_value": "Andersen",
        "id": "2e5271a6-ddd8-48b9-8ff6-be41e19f82f8"
    }],
    "age": [{
        "_value": 44,
        "id": "1c9a57cc-3324-4a0c-b4c3-d494fbb3fb81"
    }],
    "PartitionKey": "123",
    "id": "a9b57684-16bf-47d9-8761-570bab43ca7b"
}

I blogged about it a while ago; I've only tested it with the .NET SDK, though.
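A minimal sketch of building such a document in Python from a flat dictionary follows. The function name `to_vertex_document` and its parameters are hypothetical; the layout simply mirrors the JSON example, and the `PartitionKey` field name is taken from it (your collection may use a different one). The returned dictionary is what you would hand to the SQL client's insert call, which is left out here because it is untested from Python.

```python
import json
import uuid


def to_vertex_document(label, properties, partition_key, vertex_id=None,
                       partition_key_name="PartitionKey"):
    """Build a vertex document shaped like the JSON example in this answer."""
    doc = {"label": label}
    for key, value in properties.items():
        # Each property is stored as a list of {"_value", "id"} entries.
        doc[key] = [{"_value": value, "id": str(uuid.uuid4())}]
    doc[partition_key_name] = partition_key
    # Generate a vertex id when the caller does not supply one.
    doc["id"] = vertex_id or str(uuid.uuid4())
    return doc


doc = to_vertex_document(
    "person",
    {"firstName": "Thomas", "lastName": "Andersen", "age": 44},
    partition_key="123",
)
print(json.dumps(doc, indent=4))
```

Property names that collide with the reserved fields (`label`, `id`, or the partition key name) are not guarded against in this sketch.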

AlexDrenea
  • Interesting. I have indeed found a different way of doing this, mostly by trying to follow the intended use of Gremlin more closely. But I like the idea of having this as an option, and it hadn't occurred to me that SQL would even be a possibility. – Thursdays Coming Oct 12 '19 at 21:40