How does folding affect the JSON output from Gremlin Server? I get a different data structure when I unfold and fold the path content: the edge and vertex properties are added. Getting the properties in the path is actually my goal, but this seems like odd behaviour and I could not find anything about it in the docs.

So why does this happen?

g.V('1').out().path()

g.V('1').out().path().by(unfold().fold())

When I run g.V('1').out().path(), I get:

{
...
    {
      "@type": "g:Path",
      "@value": {
        "labels": {
          "@type": "g:List",
          "@value": [
            {
              "@type": "g:Set",
              "@value": []
            },
            {
              "@type": "g:Set",
              "@value": []
            }
          ]
        },
        "objects": {
          "@type": "g:List",
          "@value": [
            {
              "@type": "g:Vertex",
              "@value": {
                "id": "1",
                "label": "USER"
              }
            },
            {
              "@type": "g:Vertex",
              "@value": {
                "id": "2",
                "label": "USER"
              }
            }
          ]
        }
      }
    }
...
}

But when I run g.V('1').out().path().by(unfold().fold()), I get:

{
...
  {
    "@type": "g:Path",
    "@value": {
      "labels": {
        "@type": "g:List",
        "@value": [
          {
            "@type": "g:Set",
            "@value": []
          },
          {
            "@type": "g:Set",
            "@value": []
          }
        ]
      },
      "objects": {
        "@type": "g:List",
        "@value": [
          {
            "@type": "g:List",
            "@value": [
              {
                "@type": "g:Vertex",
                "@value": {
                  "id": "1",
                  "label": "USER",
                  "properties": {
                    "prop1": [
                      {
                        "@type": "g:VertexProperty",
                        "@value": {
                          "id": {
                            "@type": "g:Int32",
                            "@value": 101839172
                          },
                          "value": {
                            "@type": "g:Int32",
                            "@value": 1
                          },
                          "label": "prop1"
                        }
                      }
                    ],
                    "created_at": [
                      {
                        "@type": "g:VertexProperty",
                        "@value": {
                          "id": {
                            "@type": "g:Int32",
                            "@value": 589742877
                          },
                          "value": {
                            "@type": "g:Date",
                            "@value": 1557226436119
                          },
                          "label": "created_at"
                        }
                      }
                    ]
                  }
                }
              }
            ]
          },
          {
            "@type": "g:List",
            "@value": [
              {
                "@type": "g:Vertex",
                "@value": {
                  "id": "2",
                  "label": "USER",
                  "properties": {
                    "prop1": [
                      {
                        "@type": "g:VertexProperty",
                        "@value": {
                          "id": {
                            "@type": "g:Int32",
                            "@value": -1354828672
                          },
                          "value": {
                            "@type": "g:Date",
                            "@value": 1557225020168
                          },
                          "label": "prop1"
                        }
                      }
                    ],
                    "created_at": [
                      {
                        "@type": "g:VertexProperty",
                        "@value": {
                          "id": {
                            "@type": "g:Int32",
                            "@value": 589742878
                          },
                          "value": {
                            "@type": "g:Date",
                            "@value": 1557226436119
                          },
                          "label": "created_at"
                        }
                      }
                    ]
                  }
                }
              }
            ]
          }
        ]
      }
    }
  }
...
}

EDIT: Additional information: I discovered that, in addition to fold(), I can get the whole entity with properties by using project() and identity().

So when I run g.V('1').out().path().by(identity()), I get the following contents of a Path, the same as for the first query:

      "objects": {
        "@type": "g:List",
        "@value": [
          {
            "@type": "g:Vertex",
            "@value": {
              "id": "1",
              "label": "USER"
            }
          },
          {
            "@type": "g:Vertex",
            "@value": {
              "id": "2",
              "label": "USER"
            }
          }
        ]
      }

But when I run g.V('1').out().path().by(project('identity').by(identity())), this is what I get in the path (note the properties object):

"objects": {
    "@type": "g:List",
    "@value": [
        {
            "@type": "g:Map",
            "@value": [
                "identity",
                {
                    "@type": "g:Vertex",
                    "@value": {
                        "id": "1",
                        "label": "USER",
                        "properties": {
                            "prop1": [
                                {
                                    "@type": "g:VertexProperty",
                                    "@value": {
                                        "id": {
                                            "@type": "g:Int32",
                                            "@value": 101839172
                                        },
                                        "value": {
                                            "@type": "g:Int32",
                                            "@value": 1
                                        },
                                        "label": "prop1"
                                    }
                                }
                            ],
                            "created_at": [
                                {
                                    "@type": "g:VertexProperty",
                                    "@value": {
                                        "id": {
                                            "@type": "g:Int32",
                                            "@value": 589742877
                                        },
                                        "value": {
                                            "@type": "g:Date",
                                            "@value": 1557226436119
                                        },
                                        "label": "created_at"
                                    }
                                }
                            ]
                        }
                    }
                }
            ]
        }
Vasar
  • interesting - what graph database are you using? – stephen mallette May 08 '19 at 13:50
  • I am using Neptune, so I guess it is kind of black box how exactly they have implemented gremlin. – Vasar May 09 '19 at 06:03
  • Actually the `unfold()` step is unnecessary; it is the `fold()` step that adds properties to the edges and vertices. I guess this is not normal Gremlin behaviour? Is there a way to get the whole vertex/edge without doing `fold()`? – Vasar May 09 '19 at 10:32

1 Answer

You should never get properties on any graph element (i.e. Vertex, Edge, or VertexProperty) returned from the server, only a "reference", which is composed of id and label. So what you see in your first traversal is correct, and what you see in the second, which uses by(unfold().fold()), is wrong.
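To check which shape a response actually has, a client can walk the GraphSON payload and look at each vertex. The following is only a rough Python sketch using the standard library: the helper names `vertices` and `is_reference` are made up for illustration, and the sample payload is a trimmed-down combination of the two outputs shown in the question, not a real server response.

```python
import json

def vertices(node):
    """Yield the @value dict of every g:Vertex found in a GraphSON 3 tree."""
    if isinstance(node, dict):
        if node.get("@type") == "g:Vertex":
            yield node["@value"]
        for child in node.values():
            yield from vertices(child)
    elif isinstance(node, list):
        for child in node:
            yield from vertices(child)

def is_reference(vertex_value):
    """A reference element carries id and label but no properties."""
    return "properties" not in vertex_value

# Hypothetical payload: vertex 1 is a reference, vertex 2 carries properties.
payload = json.loads("""
{
  "@type": "g:List",
  "@value": [
    {"@type": "g:Vertex", "@value": {"id": "1", "label": "USER"}},
    {"@type": "g:Vertex", "@value": {"id": "2", "label": "USER",
      "properties": {"prop1": [{"@type": "g:VertexProperty",
        "@value": {"id": {"@type": "g:Int32", "@value": 101839172},
                   "value": {"@type": "g:Int32", "@value": 1},
                   "label": "prop1"}}]}}}
  ]
}
""")

# Map each vertex id to whether it came back as a plain reference.
report = {v["id"]: is_reference(v) for v in vertices(payload)}
print(report)
```

A check like this could be useful in tests, to notice early if code starts relying on properties that should not be there.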

It is actually a bug in TinkerPop for which I've created TINKERPOP-2212.

The correct way to get what you want is to do something along the lines of:

gremlin> g.V(1).out().path().by(valueMap())
==>[[name:[marko],age:[29]],[name:[lop],lang:[java]]]
==>[[name:[marko],age:[29]],[name:[vadas],age:[27]]]
==>[[name:[marko],age:[29]],[name:[josh],age:[32]]]
gremlin> g.V(1).out().path().by(valueMap(true).by(unfold()))
==>[[id:1,label:person,name:marko,age:29],[id:3,label:software,name:lop,lang:java]]
==>[[id:1,label:person,name:marko,age:29],[id:2,label:person,name:vadas,age:27]]
==>[[id:1,label:person,name:marko,age:29],[id:4,label:person,name:josh,age:32]]

or perhaps, in the latest versions of TinkerPop, replace valueMap(true) with:

gremlin> g.V(1).out().path().by(valueMap().by(unfold()).with(WithOptions.tokens))
==>[[id:1,label:person,name:marko,age:29],[id:3,label:software,name:lop,lang:java]]
==>[[id:1,label:person,name:marko,age:29],[id:2,label:person,name:vadas,age:27]]
==>[[id:1,label:person,name:marko,age:29],[id:4,label:person,name:josh,age:32]]
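The effect of by(unfold()) on the value map can be pictured on the client side as well. The Python sketch below is only an approximation of that behaviour, under the assumption that each property list is reduced to its first value; the function name `unfold_values` is hypothetical, and the sample map is the marko entry from the output above.

```python
def unfold_values(value_map):
    """Approximate valueMap().by(unfold()): keep the first value of each list.

    Assumption: for a multi-valued property only the first value survives,
    so this is lossy for multi-properties.
    """
    return {
        key: values[0] if isinstance(values, list) and values else values
        for key, values in value_map.items()
    }

# Shape returned by valueMap(): every property value is wrapped in a list.
marko = {"name": ["marko"], "age": [29]}

flat = unfold_values(marko)
print(flat)
```

If a vertex really does have multi-properties, it is probably safer to keep the plain valueMap() output and handle the lists explicitly.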
stephen mallette
  • Alright, I was wondering if it was intended this way. Though why not provide the properties, for example with some argument which would also load the properties. I am just thinking if I have a client for Gremlin, I could define a data structure that would have id, label and properties so interacting with the data would be more consistent? – Vasar May 09 '19 at 12:16
  • There are a number of reasons for returning a reference element without properties, but one of the main ones is that there is risk of a "heavy vertex" - a Vertex with a ton of property data which would chew up a massive amount of server resources to serialize. The potential for this situation is driven by TinkerPop's support of multi-properties - imagine a vertex with 1+ million properties on it and you do `g.V()` and end up hitting that (or worse, several of those). By not returning properties and forcing users to be explicit about what they return we mitigate that problem – stephen mallette May 09 '19 at 12:43
  • I always liken this problem to SQL where you would never do `SELECT * FROM table` - you would specify the exact fields to return, because if your schema was ever modified and added, for example, a blob field you would suddenly be selecting that data too which may not be so good for your application. – stephen mallette May 09 '19 at 12:44
  • Yeah, I get why this is the default behavior, but it would be nice to have the choice, like in SQL. – Vasar May 09 '19 at 14:07
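
The client-side data structure mentioned in the comments (id, label and properties) can still be built from an explicit result. Here is a minimal Python sketch, assuming the traversal returned a flat map like the valueMap(true).by(unfold()) output in the answer and that the id and label arrive under the plain string keys "id" and "label" (a real driver may deliver them as token objects that need normalizing first). The `Element` class and `to_element` helper are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

@dataclass
class Element:
    """Hypothetical client-side view of a vertex or edge."""
    id: Any
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)

def to_element(flat_map):
    """Split a flat result map into id, label and the remaining properties.

    Assumption: "id" and "label" are string keys and no user property
    uses either of those names.
    """
    props = {k: v for k, v in flat_map.items() if k not in ("id", "label")}
    return Element(flat_map["id"], flat_map["label"], props)

# Sample map taken from the valueMap(true).by(unfold()) output in the answer.
lop = to_element({"id": 3, "label": "software", "name": "lop", "lang": "java"})
print(lop)
```

This keeps the traversal explicit about what it returns while still giving the application one consistent structure to work with.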