I have two Pub/Sub topics:

player_info_topic example message:

{"id": 1, "name": "Sandy"}

seating_arrangement_topic example message:

{"id": 1, "seat": 2}

Is there a way to match these messages in GCP, perhaps with Cloud Dataflow, and then publish the result to another Pub/Sub topic?

emurmotol

1 Answer

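In streaming, you can only join two PCollections with CoGroupByKey after applying a window to both of them, because the default global window never closes on an unbounded source unless you add a trigger. FixedWindows or SlidingWindows is the usual choice, and both PCollections must use the same windowing.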

You could read from the two topics, then key each element by its id so that you end up with two PCollections of (id, message) pairs. Apply the same window to both, join them with CoGroupByKey, build the output in whatever format you want, and write it to the destination Pub/Sub topic.
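As a rough illustration of those steps, here is a minimal sketch using the Beam Python SDK. The output topic name (matched_topic), the project parameter, the 60-second fixed window, and the helper names key_by_id and merge_match are assumptions made for the example, not something from the question. It treats the join as an inner join, so an id seen on only one topic within a window produces no output.

```python
import json


def key_by_id(raw: bytes):
    """Decode a Pub/Sub payload and return an (id, record) pair for the join."""
    record = json.loads(raw.decode("utf-8"))
    return record["id"], record


def merge_match(element):
    """Turn one CoGroupByKey result into zero or more merged payloads (bytes)."""
    player_id, grouped = element
    seats = list(grouped["seats"])
    # Inner join: nothing is emitted if either side is empty for this id.
    for player in grouped["players"]:
        for seating in seats:
            merged = {"id": player_id, "name": player["name"], "seat": seating["seat"]}
            yield json.dumps(merged).encode("utf-8")


def run(project: str, window_seconds: int = 60):
    # Imported here so the helpers above can be exercised without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions
    from apache_beam.transforms.window import FixedWindows

    prefix = f"projects/{project}/topics/"
    options = PipelineOptions(streaming=True)

    with beam.Pipeline(options=options) as p:
        players = (
            p
            | "ReadPlayers" >> beam.io.ReadFromPubSub(topic=prefix + "player_info_topic")
            | "KeyPlayers" >> beam.Map(key_by_id)
            | "WindowPlayers" >> beam.WindowInto(FixedWindows(window_seconds))
        )
        seats = (
            p
            | "ReadSeats" >> beam.io.ReadFromPubSub(topic=prefix + "seating_arrangement_topic")
            | "KeySeats" >> beam.Map(key_by_id)
            | "WindowSeats" >> beam.WindowInto(FixedWindows(window_seconds))
        )
        (
            {"players": players, "seats": seats}
            | "Join" >> beam.CoGroupByKey()
            | "Merge" >> beam.FlatMap(merge_match)
            # "matched_topic" is a hypothetical output topic name.
            | "Publish" >> beam.io.WriteToPubSub(topic=prefix + "matched_topic")
        )
```

Keep in mind that with fixed windows, two messages with the same id are only matched if they land in the same window, so the window size should reflect how far apart you expect the two messages to arrive.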

Have a look at the examples for Python and for Java, and at the section about CoGroupByKey in the Beam Programming Guide.

Israel Herraiz