
I am using the Gremlin-Python client to query Gremlin Server with a JanusGraph backend.

Running the following query:

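from gremlin_python.structure.graph import Graph
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

graph = Graph()
g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
sg = g.E().subgraph('a').cap('a').next()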

The query returns a subgraph containing a list of edges and vertices.

I have the following serializers configured on the server:

serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }}
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.janusgraph.graphdb.tinkerpop.JanusGraphIoRegistry] }} 

Does anyone know how to configure Gremlin Server so that it returns a fully populated subgraph, and have sample client code for it?

Updated test case based on Stephen's feedback

# DB: Janusgraph with Opensource Cassandra storage backend
# Data: v[41427160]--reports_to-->v[36712472]--reports_to-->v[147841048]
# Objective: get subgraph detached to python client with all properties of the vertex and edges

(py365)$ pip list | grep gremlinpython
gremlinpython   3.3.4
(py365)$ python
Python 3.6.5 (default, Apr 25 2018, 14:26:36)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from gremlin_python.driver import client
>>> from gremlin_python.driver.serializer import GraphSONSerializersV3d0
>>> session = client.Client('ws://localhost:8182/gremlin', 'g', message_serializer=GraphSONSerializersV3d0())
>>> query_parameters = {"vids": [41427160, 36712472]}
>>> query = "g.V(vids).outE('reports_to').subgraph('1').otherV().cap('1').next()"
>>> results = session.submit(query, query_parameters)
>>> for r in results:
...     r_vertices = r[0]['@value'].get('vertices')
...     r_edges = r[0]['@value'].get('edges')
...     print(r)
...     print(r_vertices)
...     print(r_edges)
...
[{'@type': 'tinker:graph', '@value': {'vertices': [v[41427160], v[147841048], v[36712472]], 'edges': [e[{'@type': 'janusgraph:RelationIdentifier', '@value': {'relationId': '21y8ez-onxeg-f11-luviw'}}][41427160-reports_to->36712472], e[{'@type': 'janusgraph:RelationIdentifier', '@value': {'relationId': '225dz7-luviw-f11-2g0qvs'}}][36712472-reports_to->147841048]]}}]
[v[41427160], v[147841048], v[36712472]]
[e[{'@type': 'janusgraph:RelationIdentifier', '@value': {'relationId': '21y8ez-onxeg-f11-luviw'}}][41427160-reports_to->36712472], e[{'@type': 'janusgraph:RelationIdentifier', '@value': {'relationId': '225dz7-luviw-f11-2g0qvs'}}][36712472-reports_to->147841048]]
>>>

Is it true that gremlinpython is so lightweight that, even with the script-based approach, only the essential elements (id and label) are detached as "reference elements" in the GraphSON?

Wolfgang Fahl
Moses

2 Answers


You can't fully return the result of the subgraph() step as a Graph with Gremlin Python (or any other language variant for that matter). The problem is that Gremlin Python is meant to be a lightweight implementation of Gremlin and thus does not have a graph data structure instance to deserialize the returned data into.

At this time, the simplest workaround is to return the data that forms the graph and then store that data in something graph-like in Python. So perhaps you would do:

g.E().project('edgeId','label','inId','outId').
        by(id).
        by(label).
        by(inV().id()).
        by(outV().id())

That would return the minimum data required for the structure of the subgraph as a Map and then you could do something with that data in Python.
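
As a rough sketch of the "do something with that data in Python" step: assuming the traversal above has been iterated into a list of dicts with the keys edgeId, label, inId and outId, the rows can be folded into a plain adjacency structure. The helper name build_adjacency and the hard-coded sample rows are illustrative assumptions, not part of gremlin-python; in practice the rows would come from the server.

```python
from collections import defaultdict


def build_adjacency(rows):
    """Fold projected edge rows into (adjacency, vertex_ids).

    adjacency maps each outgoing vertex id to a list of
    (edge_label, incoming_vertex_id, edge_id) tuples.
    """
    adjacency = defaultdict(list)
    vertex_ids = set()
    for row in rows:
        adjacency[row['outId']].append((row['label'], row['inId'], row['edgeId']))
        vertex_ids.update((row['outId'], row['inId']))
    return dict(adjacency), vertex_ids


# Hypothetical rows, shaped like the Maps the project() traversal returns.
rows = [
    {'edgeId': 7, 'label': 'knows', 'inId': 2, 'outId': 1},
    {'edgeId': 8, 'label': 'knows', 'inId': 4, 'outId': 1},
]

adjacency, vertex_ids = build_adjacency(rows)
print(adjacency)
print(sorted(vertex_ids))
```

This only captures structure. If properties are needed as well, the projection would have to carry them too (for example an extra by() modulator returning a value map), which is left out here.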

The other option which I think is less recommended would be to submit a script with Python rather than use a bytecode based request. With a script you would get a GraphSON representation of the subgraph and then you could parse it as necessary to some data structure in Python. Here is the equivalent of the script you would need to send:

gremlin> graph = g.E().hasLabel('knows').subgraph('sg').cap('sg').next()
==>tinkergraph[vertices:3 edges:2]
gremlin> mapper = GraphSONMapper.build().addRegistry(TinkerIoRegistryV3d0.instance()).create().createMapper()
==>org.apache.tinkerpop.shaded.jackson.databind.ObjectMapper@f6de586
gremlin> mapper.writeValueAsString(graph)
==>{"@type":"tinker:graph","@value":{"vertices":[{"@type":"g:Vertex","@value":{"id":{"@type":"g:Int32","@value":1},"label":"person","properties":{"name":[{"@type":"g:VertexProperty","@value":{"id":{"@type":"g:Int64","@value":0},"value":"marko","label":"name"}}],"age":[{"@type":"g:VertexProperty","@value":{"id":{"@type":"g:Int64","@value":1},"value":{"@type":"g:Int32","@value":29},"label":"age"}}]}}},{"@type":"g:Vertex","@value":{"id":{"@type":"g:Int32","@value":2},"label":"person","properties":{"name":[{"@type":"g:VertexProperty","@value":{"id":{"@type":"g:Int64","@value":2},"value":"vadas","label":"name"}}],"age":[{"@type":"g:VertexProperty","@value":{"id":{"@type":"g:Int64","@value":3},"value":{"@type":"g:Int32","@value":27},"label":"age"}}]}}},{"@type":"g:Vertex","@value":{"id":{"@type":"g:Int32","@value":4},"label":"person","properties":{"name":[{"@type":"g:VertexProperty","@value":{"id":{"@type":"g:Int64","@value":6},"value":"josh","label":"name"}}],"age":[{"@type":"g:VertexProperty","@value":{"id":{"@type":"g:Int64","@value":7},"value":{"@type":"g:Int32","@value":32},"label":"age"}}]}}}],"edges":[{"@type":"g:Edge","@value":{"id":{"@type":"g:Int32","@value":7},"label":"knows","inVLabel":"person","outVLabel":"person","inV":{"@type":"g:Int32","@value":2},"outV":{"@type":"g:Int32","@value":1},"properties":{"weight":{"@type":"g:Property","@value":{"key":"weight","value":{"@type":"g:Double","@value":0.5}}}}}},{"@type":"g:Edge","@value":{"id":{"@type":"g:Int32","@value":8},"label":"knows","inVLabel":"person","outVLabel":"person","inV":{"@type":"g:Int32","@value":4},"outV":{"@type":"g:Int32","@value":1},"properties":{"weight":{"@type":"g:Property","@value":{"key":"weight","value":{"@type":"g:Double","@value":1.0}}}}}}]}}
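
On the Python side, a string like the one above could be parsed with the standard json module. The following is a minimal sketch, assuming the server returns exactly this GraphSON 3.0 shape as a string; untype and parse_subgraph are hypothetical helper names, and the sample payload is a shortened copy of the output above rebuilt as a Python dict.

```python
import json


def untype(node):
    """Recursively strip GraphSON '@type'/'@value' wrappers.

    Caveat: typed collections such as g:Map are encoded as flat lists
    and are not rebuilt here; the sample payload contains none.
    """
    if isinstance(node, dict):
        if set(node) == {'@type', '@value'}:
            return untype(node['@value'])
        return {key: untype(value) for key, value in node.items()}
    if isinstance(node, list):
        return [untype(item) for item in node]
    return node


def parse_subgraph(graphson_text):
    """Return (vertices_by_id, edges) with properties as plain values."""
    graph = untype(json.loads(graphson_text))
    vertices = {
        v['id']: {
            'label': v['label'],
            'properties': {k: [vp['value'] for vp in vps]
                           for k, vps in v.get('properties', {}).items()},
        }
        for v in graph['vertices']
    }
    edges = [
        {
            'id': e['id'], 'label': e['label'],
            'outV': e['outV'], 'inV': e['inV'],
            'properties': {k: p['value']
                           for k, p in e.get('properties', {}).items()},
        }
        for e in graph['edges']
    ]
    return vertices, edges


def _int32(n):
    return {'@type': 'g:Int32', '@value': n}


def _vertex(vid, name, prop_id):
    vp = {'@type': 'g:VertexProperty',
          '@value': {'id': {'@type': 'g:Int64', '@value': prop_id},
                     'value': name, 'label': 'name'}}
    return {'@type': 'g:Vertex',
            '@value': {'id': _int32(vid), 'label': 'person',
                       'properties': {'name': [vp]}}}


# Shortened sample: two vertices and one edge from the output above.
sample = json.dumps({'@type': 'tinker:graph', '@value': {
    'vertices': [_vertex(1, 'marko', 0), _vertex(2, 'vadas', 2)],
    'edges': [{'@type': 'g:Edge', '@value': {
        'id': _int32(7), 'label': 'knows',
        'inVLabel': 'person', 'outVLabel': 'person',
        'inV': _int32(2), 'outV': _int32(1),
        'properties': {'weight': {'@type': 'g:Property', '@value': {
            'key': 'weight',
            'value': {'@type': 'g:Double', '@value': 0.5}}}}}}],
}})

vertices, edges = parse_subgraph(sample)
print(vertices)
print(edges)
```

Dropping the type tags loses the distinction between, say, Int32 and Int64, which is usually acceptable for inspection in Python but would matter if the data were written back to a graph.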

We'll be reconsidering how subgraphing works for different language variants in future versions of TinkerPop but for now these are the only solutions that we have.

stephen mallette
  • Thanks Stephen for the very useful pointers. Is it true that gremlinpython is lightweight that, even when using script based approach, only necessary elements(id and label) are detached as "reference elements" part of the graphson? I added some sample code of what I am seeing as I could not get all the elements detached. Many thanks again! – Moses Jan 08 '19 at 16:55
  • hmm - i didn't expect that. i guess you would need to go one extra step and force serialize it yourself and return a string value. i've updated my answer to include that information – stephen mallette Jan 08 '19 at 17:06
  • Sorry if this is a stupid question/comment but how come that *The other option which I think is less recommended would be to submit a script with Python rather than use a bytecode based request.*? when there are virtually zero examples/tutorials on how to do the `DriverRemoteConnection` bytecode-based connection with a real remote database and not something on localhost? – Sascha Feb 16 '19 at 10:29
  • the entire section of gremlin-python in the reference documentation is about bytecode based requests (i.e. remote traversals via DriverRemoteConnection). http://tinkerpop.apache.org/docs/current/reference/#gremlin-python there is only one little subsection there about "Submitting Scripts". and what more do you have to do to connect to a "real remote database" other than to change the URL from "localhost" to an IP address or hostname? sorry, if i'm not really following what you mean. – stephen mallette Feb 16 '19 at 11:40

The above solution doesn't work for me. A better way to do it is to use SubgraphStrategy in Gremlin Python:

sg = g.withStrategies(SubgraphStrategy())

sg will be a GraphTraversalSource; check the official documentation.

For example, to restrict the traversal source to vertices with the label 'person':

sg = g.withStrategies(SubgraphStrategy(vertices=hasLabel('person')))
Ming
  • The original question was looking for ways to deserialize a subgraph using the Python Client. In your example `sg` will not be a graph object it will be a GraphTraversalSource. You would still need to do `sg.V().limit(10)` to get some vertices back for example. – Kelvin Lawrence Mar 29 '20 at 18:23
  • Thanks for pointing it out. I have fixed the typo. But I have tried `sg = g.E().subgraph('a').cap('a').next()`, and it doesn't return the subgraph; instead it returns an empty `{}`. I am quite confused about this. – Ming Mar 30 '20 at 09:35
  • Here is an example that works from Python. I ran this using the Python console: >>> g.E().limit(10).subgraph('a').cap('a').toList() [{'@type': 'tinker:graph', '@value': {'vertices': [v[22], v[45], v[34], v[46], v[26], v[15], v[28], v[29], v[6], v[8], v[30], v[20], v[31], v[43]], 'edges': [e[4639][28-route->8], e[4607][26-route->22], e[4719][30-route->34], e[4671][29-route->20], e[4751][31-route->22], e[4687][29-route->43], e[4655][28-route->46], e[4623][26-route->45], e[4735][31-route->6], e[4703][30-route->15]]}}] – Kelvin Lawrence Mar 30 '20 at 12:33