Given a set of User vertices, I need to find all Chat vertices that are connected to exactly those users and no others, for instance all chats in which only Alice and Bob participate. The query should order the results so that the chat containing the most recent message is returned first.

Query Example

In my initial attempt I tried to start from one of the users, visit all chats they participate in, and keep only those chats whose participants are exactly the users in the given set.

traversal().V(user.id()) //random user of the set
            .out("participates")
            .hasLabel("Chat")
            .where(__.in("participates")
                    .hasLabel("User")
                    .fold()) //how could I match this collection against my set?

Would this be a suitable approach? How can I match against sets and order by the timestamps of linked Messages? Thanks for any pointers.

Edit: Here's the new query, thanks to Daniel's answer:

The edge, vertex and property labels are slightly different in the actual product (Chat is CONVERSATION, timestamp is CREATED_AT, participates is MEMBER_OF) and part of custom enums. The vertex array contains all user vertices that should be part of the conversation.

traversal().V(vertices[0].id())
            .out(EdgeLabel.MEMBER_OF.name())
            .hasLabel(VertexLabel.CONVERSATION.name())
            .filter(__.in(EdgeLabel.MEMBER_OF.name())
                    .hasLabel(VertexLabel.USER.name())
                    .is(P.within(vertices)).count().is(vertices.length))
            .order().by(__.out(EdgeLabel.CONTAINS.name())
                        .values(PropertyKey.CREATED_AT.name())
                        .order().by(Order.decr).limit(1), Order.decr)
Double M

1 Answer

Let's start with a sample graph:

g = TinkerGraph.open().traversal()
g.addV("user").property("name", "alice").as("a").
  addV("user").property("name", "bob").as("b").
  addV("user").property("name", "caesar").as("c").
  addV("chat").property("name", "A").as("A").
  addV("chat").property("name", "B").as("B").
  addV("chat").property("name", "C").as("C").
  addV("message").property("timestamp", 1).property("text", "Sed mollis velit.").as("m1").
  addV("message").property("timestamp", 2).property("text", "Aenean aliquet dapibus.").as("m2").
  addV("message").property("timestamp", 3).property("text", "Nunc vel dignissim.").as("m3").
  addV("message").property("timestamp", 4).property("text", "Aliquam in auctor.").as("m4").
  addV("message").property("timestamp", 5).property("text", "Nulla dignissim et.").as("m5").
  addV("message").property("timestamp", 6).property("text", "Pellentesque semper dignissim.").as("m6").
  addE("participates").from("a").to("A").
  addE("participates").from("a").to("B").
  addE("participates").from("a").to("C").
  addE("participates").from("b").to("B").
  addE("participates").from("b").to("C").
  addE("participates").from("c").to("C").
  addE("contains").from("A").to("m1").
  addE("contains").from("A").to("m2").
  addE("contains").from("B").to("m3").
  addE("contains").from("B").to("m4").
  addE("contains").from("C").to("m5").
  addE("contains").from("C").to("m6").iterate()

  • Alice participates in chats A, B and C.
  • Bob participates in chats B and C.
  • Caesar participates in chat C only.

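Thus, if we now look for chats in which only Alice and Bob participate, we should find only chat B. Assuming `users` holds the Alice and Bob vertices (for example, `users = g.V().has("user", "name", within("alice", "bob")).toList()`), sure enough: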

gremlin> g.V(users.head()).
           out("participates").
           not(__.in("participates").is(without(users))).
           filter(__.in("participates").is(within(users)).count().is(users.size())).
           order().by(out("contains").values("timestamp").order().by(decr).limit(1), decr).
           valueMap()
==>[name:[B]]

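It would be a bit easier and faster if the timestamps were stored as a property on the contains edges, since the ordering could then read them from the edges without visiting each message vertex.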
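
To see why the query needs both the `not(...)` step and the `count()` step, here is a minimal plain-Java model of the sample graph. It is only a sketch: the class `ChatMatch` and its methods are hypothetical names made up for illustration, it uses ordinary collections rather than any TinkerPop API, and it assumes Java 9+ for `Map.of` and `Set.of`.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Plain-Java model of the sample graph above; hypothetical names, not TinkerPop code.
final class ChatMatch {
    // chat -> participants (mirrors the "participates" edges)
    static final Map<String, Set<String>> PARTICIPANTS = Map.of(
            "A", Set.of("alice"),
            "B", Set.of("alice", "bob"),
            "C", Set.of("alice", "bob", "caesar"));

    // chat -> message timestamps (mirrors the "contains" edges)
    static final Map<String, List<Integer>> TIMESTAMPS = Map.of(
            "A", List.of(1, 2),
            "B", List.of(3, 4),
            "C", List.of(5, 6));

    // Models filter(in("participates").is(within(users)).count().is(users.size())):
    // every requested user participates, but extra participants are ignored.
    static boolean containsAll(String chat, Set<String> users) {
        long matched = PARTICIPANTS.get(chat).stream().filter(users::contains).count();
        return matched == users.size();
    }

    // Models not(in("participates").is(without(users))):
    // no participant falls outside the requested set.
    static boolean hasNoOutsider(String chat, Set<String> users) {
        return PARTICIPANTS.get(chat).stream().allMatch(users::contains);
    }

    // Timestamp of the most recent message in the chat.
    static int latest(String chat) {
        return Collections.max(TIMESTAMPS.get(chat));
    }

    // Matching chats, most recent message first.
    // With exact == false only the count-based filter is applied.
    static List<String> find(Set<String> users, boolean exact) {
        return PARTICIPANTS.keySet().stream()
                .filter(c -> containsAll(c, users))
                .filter(c -> !exact || hasNoOutsider(c, users))
                .sorted((a, b) -> Integer.compare(latest(b), latest(a)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<String> users = Set.of("alice", "bob");
        System.out.println(find(users, false)); // prints [C, B]
        System.out.println(find(users, true));  // prints [B]
    }
}
```

With only the count-based filter, chat C slips through because Caesar is simply not counted; adding the "no outsider" check leaves chat B alone, which matches the result of the Gremlin query.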

Daniel Kuppitz
  • Thanks a ton for this elegant query. I wrote various test cases to ensure it does what it should. It still returns those `Chat` vertices that _also_ have the given users as participants, but not only those. Like if A, B and C are in a chat, and I ask for chats with only A and B in it, I get the one with all three, too. Do you suppose it has something to do with the concatenation of `.is(within(...)).count().is(...)`? Please see my edited question for the current (Java) query. – Double M Aug 03 '17 at 14:04
  • Are you saying you want an exact match? E.g. if you ask for A and B, then A and B should be the only participants in this chat? – Daniel Kuppitz Aug 03 '17 at 14:25
  • That's right. I think `.is(within(users))` filters out all participants that are not in the given set, which means that the subsequent `.count().is(users.size())` will always evaluate to `true` if all of the given users are part of the chat, regardless of whether there are more participants. What do you think? – Double M Aug 03 '17 at 14:34
  • Spot-on! Thanks so much, a very neat solution. – Double M Aug 03 '17 at 15:08