I would appreciate it if someone could help me understand the code example in the question below. I am trying to implement something similar using Apache Beam 2.13.0 with Python 3.7.3.
Why does custom Python object cannot be used with ParDo Fn?
I understand that network sockets are not serializable, since they are not objects that can return a string or a tuple when they are serialized.
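To check my understanding of the serialization part, here is a minimal sketch using only the standard library. It assumes that, as far as I can tell, Beam pickles the DoFn instance before shipping it to the workers. The names `EagerHolder`, `LazyHolder`, and `can_pickle` are hypothetical and only serve the illustration; they are not part of Beam.

```python
import pickle
import socket


class EagerHolder:
    """Creates its socket in __init__, so the socket is part of the pickled state."""

    def __init__(self):
        self.sock = socket.socket()


class LazyHolder:
    """Stores only plain configuration; the socket is created on first use."""

    def __init__(self, host):
        self.host = host

    def get_socket(self):
        # Mirrors the hasattr check in process(): create the resource once, lazily.
        if not hasattr(self, "sock"):
            self.sock = socket.socket()
        return self.sock


def can_pickle(obj):
    """Return True if pickle.dumps succeeds, False if it raises TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False
```

If I read this correctly, an `EagerHolder` cannot be pickled at all, while a `LazyHolder` can be pickled as long as `get_socket()` has not been called yet, which seems to be why the Pub/Sub client is created inside `process` rather than in `__init__`.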
What I do not understand is why the superclass needs to be called inside __init__.
import apache_beam as beam


class PublishFn(beam.DoFn):
    def __init__(self, topic_path):
        self.topic_path = topic_path
        super(self.__class__, self).__init__()

    def process(self, element, **kwargs):
        # Create the client lazily on the worker, once per DoFn instance.
        # The check must use the same attribute name that is assigned below.
        if not hasattr(self, 'publisher'):
            from google.cloud import pubsub_v1
            self.publisher = pubsub_v1.PublisherClient()
        future = self.publisher.publish(self.topic_path, data=element.encode("utf-8"))
        return future.result()
Thanks.