I'm running a Dataflow streaming pipeline that reads from PubSub and writes to BigQuery. When I check that all the messages have been written to BigQuery, I find that some of them are missing.

How can I verify that every message I publish to PubSub has actually been sent correctly?

bsmarcosj
    If you are getting OK responses on your PubSub publisher (https://cloud.google.com/pubsub/publisher), it should mean they have been sent correctly. You can try logging all published message IDs at your publisher, and all received message IDs in your pipeline (e.g. put them into a BigQuery column). What fraction of messages is missing? – jkff Aug 28 '15 at 16:07

1 Answer

If your PubSub publisher (cloud.google.com/pubsub/publisher) is getting OK responses, the messages should have been sent correctly. To find out where messages go missing, try logging every published message ID at your publisher and every received message ID in your pipeline (e.g. put them into a BigQuery column), then compare the two sets of IDs.
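As a rough illustration of the comparison step, here is a minimal Python sketch. It assumes you have already collected the two lists of message IDs yourself (for example from your publisher's logs and from a query over the BigQuery column); the function name `reconcile_message_ids` and the sample IDs are hypothetical and not part of any PubSub or Dataflow API.

```python
from collections import Counter


def reconcile_message_ids(published_ids, received_ids):
    """Compare IDs logged at the publisher with IDs seen by the pipeline.

    Both arguments are plain iterables of message-ID strings that the
    caller has gathered beforehand (hypothetical inputs for this sketch).
    """
    published = set(published_ids)
    received_counts = Counter(received_ids)
    received = set(received_counts)

    missing = published - received
    return {
        # Published but never seen by the pipeline.
        "missing": sorted(missing),
        # Seen by the pipeline but not in the publisher log.
        "unexpected": sorted(received - published),
        # Seen more than once by the pipeline.
        "duplicated": sorted(i for i, n in received_counts.items() if n > 1),
        # Share of published messages that never arrived.
        "missing_fraction": len(missing) / len(published) if published else 0.0,
    }


if __name__ == "__main__":
    # Made-up IDs, purely for illustration.
    report = reconcile_message_ids(
        ["m1", "m2", "m3", "m4"],
        ["m1", "m3", "m3", "m9"],
    )
    print(report)
```

The `missing_fraction` value gives a quick answer to how much is being lost, and the `duplicated` list is worth a look too, since a message can be delivered to the pipeline more than once.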

Sam McVeety