I have a vertex which has a list property, and I want to replace the values in said property and project out the result in a specific format. For context, let's assume the following data:

g.addV("Post").property("id", "1")
  .property(list, "Tags", "gremlin")
  .property(list, "Tags", "new")

I want to be able to set the Tags property. What I've tried so far:

g.V("1")
  .sideEffect( properties("Tags").drop() )
  .property(list, "Tags", "gremlin")
  .property(list, "Tags", "closed")
  .property(list, "Tags", "solved")
  .project("Tags").by(values("Tags"))

What I would expect is the following:

{
  "Tags": [
    "gremlin",
    "closed",
    "solved"
  ]
}

But instead I get this error: Project By: Next: The provided traverser of key "Tags" maps to nothing. It looks as if the Tags property was deleted entirely. Yet if I run a separate query afterwards

g.V("1").project("Tags").by(values("Tags"))

I get the expected result:

{
  "Tags": [
    "gremlin",
    "closed",
    "solved"
  ]
}

So the data must have been changed. If I try without projecting, the result contains the new values.

g.V("1")
  .sideEffect( properties("Tags").drop() )
  .property(list, "Tags", "gremlin")
  .property(list, "Tags", "closed")
  .property(list, "Tags", "solved")

Resulting in:

{
  "id": "1",
  "label": "Post",
  "type": "vertex",
  "properties": {
    "Tags": [
      {
        "id": "4eaf5599-511c-4245-aaf8-15c828073fac",
        "value": "gremlin"
      },
      {
        "id": "75e3ad96-a503-4608-a675-e28f3ffc2ab4",
        "value": "closed"
      },
      {
        "id": "aea1a33c-bd8e-47bb-b294-f01db8642db5",
        "value": "solved"
      }
    ]
  }
}

But this leaves me unable to project the result.

How can I both update the data and project it?

Other things I've tried:

  • Adding a barrier() step after the drop() step, didn't work
  • Adding a barrier() step after the sideEffect() step, didn't work
  • Adding a barrier() step before the project() step, didn't work
  • Doing the same as the above three, but with .fold().unfold() instead, didn't work
  • Replacing the project() step with optional(g.V("1").project("Tags").by(values("Tags"))) - this one works by refetching the vertex but is expensive.
Benjamin

1 Answer

Something specific to CosmosDB may be going on here, in which case you may need to get help from them directly. Note that your traversal works almost as written with TinkerGraph, which is the reference implementation for how Gremlin should work:

gremlin> g = TinkerFactory.createTheCrew().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:14], standard]
gremlin> g.V().has('person','name','marko').
......1>   sideEffect(properties("location").drop()).
......2>   property(list,'location','bombay').
......3>   property(list,'location','calcutta').
......4>   project('location').
......5>     by(values('location').fold())
==>[location:[bombay,calcutta]]

Perhaps you should try your query with my by() modulator on project() to see if the added fold() makes any difference. By the way, I wouldn't expect your traversal without fold() to return what you say it returns: a by() modulator only takes the first item from its child traversal, so fold() is needed to collect all the values into a list. Note what happens with TinkerGraph if I try that:

gremlin> g.V().has('person','name','marko').
......1>   sideEffect(properties("location").drop()).
......2>   property(list,'location','bombay').
......3>   property(list,'location','calcutta').
......4>   project('location').
......5>     by(values('location'))
==>[location:bombay]

> Replacing the project() step with optional(g.V("1").project("Tags").by(values("Tags"))) - this one works by refetching the vertex but is expensive.

The above is an interesting point, though I personally dislike using g to spawn child traversals and would prefer the anonymous traversal, written as optional(V("1").project("Tags").by(values("Tags"))). It makes me wonder whether CosmosDB fails to reflect mutations in the current traverser, which would explain why you get the result you are looking for when you refresh by re-querying. I'm surprised that the mid-traversal lookup is expensive, as it is a lookup by element id and should be the fastest way to find things in the graph. That said, it's not so nice that you have to look up the Vertex more than once in the same traversal.
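
Applied to the traversal from the question, the anonymous-traversal form of that workaround might look like the sketch below. It reuses the question's vertex id "1" and Tags values and mirrors the by() from the question; it has not been verified against CosmosDB, so treat it as an assumption about what the provider accepts. On TinkerGraph you would add fold() inside by() to get the full list.

```groovy
// Sketch only: same mutation as in the question, then re-read the vertex
// mid-traversal with an anonymous V() instead of spawning a child from g.
g.V("1").
  sideEffect(properties("Tags").drop()).
  property(list, "Tags", "gremlin").
  property(list, "Tags", "closed").
  property(list, "Tags", "solved").
  optional(V("1").project("Tags").by(values("Tags")))
```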

Out of curiosity, you might try valueMap(true) rather than project() as another test just to see if that triggers any different behavior.
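
For example, a minimal sketch of that test against the same TinkerGraph sample data used above (whether CosmosDB behaves the same way is exactly what this would check):

```groovy
// Same drop-and-replace as before, but read the result with valueMap(true),
// which returns the id and label along with each property's values as a list.
g.V().has('person','name','marko').
  sideEffect(properties('location').drop()).
  property(list,'location','bombay').
  property(list,'location','calcutta').
  valueMap(true)
```

In TinkerPop 3.4 and later, valueMap(true) is deprecated in favor of valueMap().with(WithOptions.tokens); which form a given provider accepts may differ.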

Another thing to try might be to use union() instead of sideEffect():

gremlin> g.V().has('person','name','marko').
......1>   union(properties("location").drop(), identity()).
......2>   property(list,'location','bombay').
......3>   property(list,'location','calcutta').
......4>   project('location').
......5>     by(values('location').fold())
==>[location:[bombay,calcutta]]

to see if that makes any difference.

stephen mallette
  • Using the `by(values('Tags'))` yields the following result: `Unsupported Error: Gremlin op does not support by(traversal)` I have the same suspicions about CosmosDB not reflecting mutations, however if I do `g.V('1').property(list, 'Tags', 'foo').values('Tags')` the result includes the newly added tag. I tried `valueMap(true)` and interestingly enough it included the tags. The issue seems to only occur when a `drop()` is involved. Another interesting point about the projection format and the `fold()`. – Benjamin Mar 27 '20 at 09:08
  • Well, it definitely seems like you're bumping up against inconsistencies in their implementation of Gremlin then. I added another option to try, but I sense you will be stuck either (1) with two separate traversals or (2) requerying with mid-traversal `V()`. It's also hard to think of workarounds when I'm not completely sure of what CosmosDB is truly supporting and what it isn't. I'd again suggest you bring this issue up with them. – stephen mallette Mar 27 '20 at 10:32
  • Unfortunately that yields a `Gremlin Query Compilation Error`. Thanks for your time though. I will continue to investigate and try to contact MS. If I find a solution I will update. – Benjamin Mar 27 '20 at 11:12