I see Java examples at https://cloud.google.com/dataflow/model/pubsub-io#reading-with-pubsubio, but when I look at https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/pubsub.py it says:

def reader(self):
    raise NotImplementedError(
        'PubSubSource is not supported in local execution.')

What does that mean? Is the Cloud Dataflow Python SDK's PubSub source/sink not quite ready yet?

David L

3 Answers

It means that reading from PubSub is currently not supported when executing the pipeline locally (on your machine, i.e. not in the cloud). Local execution is mainly used for testing.

PubSub is supported when you run using the Dataflow runner.
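
As a rough sketch of what running on the Dataflow runner can look like in the Python SDK, assuming a Beam release that provides `beam.io.ReadFromPubSub`. The project, region, bucket, and topic names are placeholders, and `dataflow_flags` and `run` are hypothetical helpers written for this example, not part of Beam.

```python
def dataflow_flags(project, region, temp_location):
    """Build command-line style flags for a streaming job on the Dataflow runner."""
    return [
        "--runner=DataflowRunner",
        "--streaming",  # Pub/Sub is an unbounded source, so the pipeline must stream
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
    ]


def run(topic, flags):
    """Read messages from a Pub/Sub topic and print each decoded payload."""
    # Imported lazily so the flag helper above works without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(flags)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Read" >> beam.io.ReadFromPubSub(topic=topic)  # payloads arrive as bytes
            | "Decode" >> beam.Map(lambda payload: payload.decode("utf-8"))
            | "Print" >> beam.Map(print)
        )


# Example call with placeholder values (substitute your own):
# run("projects/my-project/topics/my-topic",
#     dataflow_flags("my-project", "us-central1", "gs://my-bucket/tmp"))
```

Swapping the first flag for `--runner=DirectRunner` is how the same pipeline would be pointed at local execution.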

jkff

It would appear that it's not ready yet: I was able to run it locally with the Java SDK and the Pub/Sub emulator but, as you've encountered, not with the Python SDK.

rezn8

For anyone visiting this question in 2019, I can confirm that PubSub does work with DirectRunner as long as proper Google Cloud authentication is provided.
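
One way to sanity-check the authentication part before a DirectRunner launch, sketched under the assumption that credentials come from a service-account key file referenced by the `GOOGLE_APPLICATION_CREDENTIALS` environment variable. Credentials set up another way, for example with `gcloud auth application-default login`, would not be detected by this check. `has_key_file` and `DIRECT_FLAGS` are hypothetical names for this example.

```python
import os


def has_key_file(env=None):
    """Return True if GOOGLE_APPLICATION_CREDENTIALS names an existing file."""
    env = os.environ if env is None else env
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    return bool(path) and os.path.isfile(path)


# Flags for a local streaming run; these would be passed to PipelineOptions.
DIRECT_FLAGS = ["--runner=DirectRunner", "--streaming"]
```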

andreimarinescu