
With reference to this link:

https://github.com/tinkerpop/blueprints/wiki/GraphSON-Reader-and-Writer-Library

How do I label the vertices in the JSON file so that graph traversals are faster? I am using the Titan graph database, so GraphSON will be used to convert the JSON into a graph instance, and I query it with the Gremlin query language. To speed up the retrieval of vertices, I need to label them so that they are categorised. How can I add a label?

{
"mode":"EXTENDED",
"vertices": [
    {
        "name": {
            "type": "string",
            "value": "lop"
        },
        "lang": {
            "type": "string",
            "value": "java"
        },
        "_id": "3",
        "_type": "vertex"
    },
    {
        "name": {
            "type": "string",
            "value": "vadas"
        },
        "age": {
            "type": "integer",
            "value": 27
        },
        "_id": "2",
        "_type": "vertex"
    },
    {
        "name": {
            "type": "string",
            "value": "marko"
        },
        "age": {
            "type": "integer",
            "value": 29
        },
        "_id": "1",
        "_type": "vertex"
    },
    {
        "name": {
            "type": "string",
            "value": "peter"
        },
        "age": {
            "type": "integer",
            "value": 35
        },
        "_id": "6",
        "_type": "vertex"
    },
    {
        "name": {
            "type": "string",
            "value": "ripple"
        },
        "lang": {
            "type": "string",
            "value": "java"
        },
        "_id": "5",
        "_type": "vertex"
    },
    {
        "name": {
            "type": "string",
            "value": "josh"
        },
        "age": {
            "type": "integer",
            "value": 32
        },
        "_id": "4",
        "_type": "vertex"
    }
],
"edges": [
    {
        "weight": {
            "type": "float",
            "value": 1
        },
        "_id": "10",
        "_type": "edge",
        "_outV": "4",
        "_inV": "5",
        "_label": "created"
    },
    {
        "weight": {
            "type": "float",
            "value": 0.5
        },
        "_id": "7",
        "_type": "edge",
        "_outV": "1",
        "_inV": "2",
        "_label": "knows"
    },
    {
        "weight": {
            "type": "float",
            "value": 0.4000000059604645
        },
        "_id": "9",
        "_type": "edge",
        "_outV": "1",
        "_inV": "3",
        "_label": "created"
    },
    {
        "weight": {
            "type": "float",
            "value": 1
        },
        "_id": "8",
        "_type": "edge",
        "_outV": "1",
        "_inV": "4",
        "_label": "knows"
    },
    {
        "weight": {
            "type": "float",
            "value": 0.4000000059604645
        },
        "_id": "11",
        "_type": "edge",
        "_outV": "4",
        "_inV": "3",
        "_label": "created"
    },
    {
        "weight": {
            "type": "float",
            "value": 0.20000000298023224
        },
        "_id": "12",
        "_type": "edge",
        "_outV": "6",
        "_inV": "3",
        "_label": "created"
    }
]

}

user3646858

1 Answer


TinkerPop 2.x did not have native support for vertex labels, so GraphSON in TinkerPop 2.x didn't support them either. TinkerPop 3.x (and thus Titan 1.0) has this support:

http://tinkerpop.incubator.apache.org/docs/3.0.1-incubating/#graphson-reader-writer

You'd either need to upgrade to Titan 1.0/TP3 or write a custom GraphSON processor using the Titan API in earlier versions of Titan/TP2.
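
As a rough illustration of what a custom conversion step might look like, here is a minimal Python sketch that turns the TinkerPop 2.x GraphSON from the question into labelled, one-vertex-per-line records modelled on the TinkerPop 3 example file linked in the comments below. Several things here are assumptions rather than documented behaviour: the input is in `EXTENDED` mode, `guess_label` is a hypothetical rule that only fits the sample data, ids are copied as the strings found in the input, and vertex property ids are simply numbered sequentially. Whether the TP3 reader and Titan accept the output unchanged has not been verified.

```python
import json
from collections import defaultdict

# Keys that TinkerPop 2.x GraphSON reserves for structure, not properties.
RESERVED = {"_id", "_type", "_label", "_outV", "_inV"}


def plain_properties(element):
    """Drop reserved keys and unwrap EXTENDED-mode {"type", "value"} pairs."""
    return {k: v["value"] for k, v in element.items() if k not in RESERVED}


def guess_label(props):
    """Hypothetical labelling rule that only fits the sample data."""
    return "software" if "lang" in props else "person"


def legacy_to_adjacency(legacy, label_for=guess_label):
    """Convert a legacy GraphSON dict into a list of TP3-style vertex dicts."""
    out_edges = defaultdict(lambda: defaultdict(list))
    in_edges = defaultdict(lambda: defaultdict(list))
    for edge in legacy.get("edges", []):
        props = plain_properties(edge)
        # Each edge is recorded on both of its endpoints.
        out_edges[edge["_outV"]][edge["_label"]].append(
            {"id": edge["_id"], "inV": edge["_inV"], "properties": props})
        in_edges[edge["_inV"]][edge["_label"]].append(
            {"id": edge["_id"], "outV": edge["_outV"], "properties": props})

    vertices = []
    next_property_id = 0  # assumption: sequential ids for vertex properties
    for vertex in legacy.get("vertices", []):
        props = plain_properties(vertex)
        converted = {"id": vertex["_id"], "label": label_for(props)}
        if vertex["_id"] in out_edges:
            converted["outE"] = dict(out_edges[vertex["_id"]])
        if vertex["_id"] in in_edges:
            converted["inE"] = dict(in_edges[vertex["_id"]])
        converted["properties"] = {}
        for key, value in props.items():
            converted["properties"][key] = [
                {"id": next_property_id, "value": value}]
            next_property_id += 1
        vertices.append(converted)
    return vertices


def to_lines(vertices):
    """Serialise one vertex per line (adjacency list layout)."""
    return [json.dumps(v) for v in vertices]
```

To use it, load the question's file with `json.load`, pass the result to `legacy_to_adjacency`, and write the strings returned by `to_lines` to a new file, one per line. Passing a different `label_for` function replaces the sample labelling rule.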

stephen mallette
  • What changes do you suggest to the above code so that a vertex label can be added? – user3646858 Oct 01 '15 at 10:58
  • As far as structure goes, I guess I would modify the GraphSON to have a `_label` key on each vertex. – stephen mallette Oct 01 '15 at 11:11
  • And how should I traverse the vertices? Previously I traversed using `g.V().has("name","lop").next();`. What's the new format? OK, I will try `_label` now. – user3646858 Oct 01 '15 at 11:16
  • Using `_label`, it still iterates over all vertices rather than one specific category of vertices. I want to traverse only the vertices that belong to that label. – user3646858 Oct 01 '15 at 11:23
  • Just to be clear, adding `_label` doesn't automatically load it into Titan. You still have to write some form of custom GraphSON reader that will parse all of that into Titan. Again, TP2 does not have native support for setting the vertex label. That said, the manner in which you traverse vertices is up to Titan. If you want to save yourself some extra work, I suggest you upgrade to Titan 1.0. – stephen mallette Oct 01 '15 at 11:31
  • Yes, I am using the Titan 1.0 dependency in Eclipse. – user3646858 Oct 01 '15 at 11:33
  • Can you show how to write that GraphSON reader that will parse all the labels into Titan? Some examples would help. – user3646858 Oct 01 '15 at 11:38
  • This is very confusing. If you are using Titan 1.0 then "label" is directly supported and you don't need to add a `_label` key. Please refer to the link I provided above for how to read/write GraphSON. Note that the old version of GraphSON, for which you provided a code sample in your question, cannot be read by the current TP3 readers. It can be read by the [legacy reader](http://tinkerpop.incubator.apache.org/docs/3.0.1-incubating/#_tinkerpop2_data_migration) but again, that will not support vertex labels. In that case you would have to write a custom parser. – stephen mallette Oct 01 '15 at 11:43
  • { "id": 1, "label": "person", "outE": { "created": [ { "id": 9, "inV": 3, "properties": { "weight": 0.4 } } ], "knows": [ { "id": 7, "inV": 2, "properties": { "weight": 0.5 } }, { "id": 8, "inV": 4, "properties": { "weight": 1 } } }, – user3646858 Oct 01 '15 at 12:05
  • Here is an example for TinkerPop3: https://github.com/apache/incubator-tinkerpop/blob/master/data/tinkerpop-modern.json – stephen mallette Oct 01 '15 at 12:26
  • This is not even a single JSON document but many JSON objects embedded in one file, and creating this format is a very tedious task. I am creating a graph of 10,000 nodes and millions of relationships. Can you suggest an easier way of creating this JSON format? – user3646858 Oct 02 '15 at 08:10
  • GraphSON files needed to be easily split for usage in Spark/Hadoop, so one line per vertex in adjacency list format makes sense for that use case. For smaller graphs, where you don't care about such things, you can wrap that format into a JSON object with a key of "vertices" which has an array value of those vertices. The format for vertices itself is complicated by the need for Spark/Hadoop/OLAP support and multi/meta properties, so there isn't much that can be done there to simplify it. – stephen mallette Oct 02 '15 at 12:45
  • If you'd like to see a simplified edge list representation for GraphSON in TP3, please create an issue at https://issues.apache.org/jira/browse/TINKERPOP3 or raise the idea on the gremlin-users mailing list: https://groups.google.com/forum/#!forum/gremlin-users – stephen mallette Oct 02 '15 at 12:46
  • I forgot to mention that if you do "wrap" your vertices in an array to get valid JSON, you will need to set this option to true in your `GraphSONReader`: http://tinkerpop.incubator.apache.org/javadocs/3.0.1-incubating/full/org/apache/tinkerpop/gremlin/structure/io/graphson/GraphSONReader.Builder.html#unwrapAdjacencyList-boolean- – stephen mallette Oct 02 '15 at 12:49
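
The wrapping described in the last few comments can be sketched in a few lines of Python. This is only an illustration of the file layout: it takes the one-vertex-per-line form and puts the vertices into an array under a "vertices" key so that the whole file is a single valid JSON document. The function names are hypothetical helpers, not part of TinkerPop, and a file wrapped this way would still need the `unwrapAdjacencyList` option mentioned above when it is read.

```python
import json


def wrap_adjacency_list(lines):
    """Wrap one-vertex-per-line GraphSON into one {"vertices": [...]} document."""
    # Blank lines are skipped; every other line must be a complete JSON object.
    vertices = [json.loads(line) for line in lines if line.strip()]
    return json.dumps({"vertices": vertices}, indent=2)


def unwrap_adjacency_list(document):
    """Inverse of wrap_adjacency_list: return one compact JSON line per vertex."""
    return [json.dumps(v) for v in json.loads(document)["vertices"]]
```

For a graph of the size mentioned in the comments, the line-per-vertex form is probably the more practical one to generate, since each line can be written independently; the wrapped form is mainly convenient for small files that other JSON tools need to parse.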