Suppose I have the following graph:
scala> v.show()
+---+---------------+
| id|downstreamEdges|
+---+---------------+
|CCC|           null|
|BBB|           null|
|QQQ|           null|
|DDD|           null|
|FFF|           null|
|EEE|           null|
|AAA|           null|
|GGG|           null|
+---+---------------+
scala> e.show()
+---+---+---+
| iD|src|dst|
+---+---+---+
|  1|CCC|AAA|
|  2|CCC|BBB|
...
+---+---+---+
I would like to run an aggregation that collects all of the messages (not just the sum, first, last, etc.) that are sent from the destination vertices to the source vertices. The command I would like to run is something like:
g.aggregateMessages.sendToSrc(AM.edge("id")).agg(all(AM.msg).as("downstreamEdges")).show()
except that the function all does not exist (as far as I'm aware). The output would be something like:
+---+---------------+
| id|downstreamEdges|
+---+---------------+
|CCC|         [1, 2]|
...
+---+---------------+
I can run the command above with first or last instead of the (non-existent) all, but those return only
+---+---------------+
| id|downstreamEdges|
+---+---------------+
|CCC|              1|
...
+---+---------------+
or
+---+---------------+
| id|downstreamEdges|
+---+---------------+
|CCC|              2|
...
+---+---------------+
respectively. How can I keep all of the entries? (There could be many, not just 1 and 2, but 1, 2, 23, 45, etc.) Thanks.
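For reference, one direction that may fit: Spark's built-in collect_list aggregate (org.apache.spark.sql.functions.collect_list) gathers every value in a group into an array, so it might stand in for the hypothetical all, assuming agg accepts any Spark aggregate column. Below is a minimal sketch on a cut-down version of the graph. The three-vertex data, the local SparkSession settings, and the name downstream are assumptions made for illustration, not part of my actual setup.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.collect_list
import org.graphframes.GraphFrame
import org.graphframes.lib.AggregateMessages

// Assumed local session, only so the sketch is self-contained
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("collect-all-messages")
  .getOrCreate()
import spark.implicits._

// Cut-down graph: only the two edges shown in e.show() above
val v = Seq("AAA", "BBB", "CCC").toDF("id")
val e = Seq((1, "CCC", "AAA"), (2, "CCC", "BBB")).toDF("id", "src", "dst")
val g = GraphFrame(v, e)

val AM = AggregateMessages

// Send each edge id to the edge's source vertex, then collect every
// message per vertex into an array instead of reducing to one value
val downstream = g.aggregateMessages
  .sendToSrc(AM.edge("id"))
  .agg(collect_list(AM.msg).as("downstreamEdges"))

downstream.show()
```

If this works as assumed, CCC should end up with an array holding both edge ids. The order of elements inside the array is not guaranteed, and collect_set could be swapped in if duplicates should be dropped.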