I am using GCP Cloud Functions to execute web scrapers on a frequent basis. Locally, my script runs without any problems.
I have a setup.py file in which I initialize the connection to a Kafka producer. It looks like this:
import os

from confluent_kafka import Producer

p = Producer(
    {
        "bootstrap.servers": os.environ.get("BOOTSTRAP.SERVERS"),
        "security.protocol": os.environ.get("SECURITY.PROTOCOL"),
        "sasl.mechanisms": os.environ.get("SASL.MECHANISMS"),
        "sasl.username": os.environ.get("SASL.USERNAME"),
        "sasl.password": os.environ.get("SASL.PASSWORD"),
        "session.timeout.ms": os.environ.get("SESSION.TIMEOUT.MS"),
    }
)
def delivery_report(err, msg):
    """Called once for each message produced to indicate delivery result.
    Triggered by poll() or flush()."""
    print("Got here!")
    if err is not None:
        print("Message delivery failed: {}".format(err))
    else:
        print("Message delivered to {} [{}]".format(msg.topic(), msg.partition()))
    return "DONE."
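Since all of the connection settings come from environment variables (and os.environ.get silently returns None for unset keys), one thing I could add to setup.py is a guard that fails loudly if any of them are missing. This is only a sketch and not part of my current code; REQUIRED_VARS is a hypothetical name:

# Sketch only: verify the Kafka settings are actually present in the runtime
# environment before building the producer. Not part of my current setup.py.
REQUIRED_VARS = [
    "BOOTSTRAP.SERVERS",
    "SECURITY.PROTOCOL",
    "SASL.MECHANISMS",
    "SASL.USERNAME",
    "SASL.PASSWORD",
    "SESSION.TIMEOUT.MS",
]

missing = [name for name in REQUIRED_VARS if os.environ.get(name) is None]
if missing:
    raise RuntimeError("Missing Kafka environment variables: {}".format(missing))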
I import this setup in main.py, where my scraping functions are defined. It looks similar to this:
import json

from setup import p, delivery_report

def scraper():
    try:
        # I won't insert my whole scraper here since it's working fine ...
        print(scraped_data_as_dict)
        p.produce(topic, json.dumps(scraped_data_as_dict), callback=delivery_report)
        p.poll(0)
    except Exception as e:
        # Do sth else
        pass
The point here is: I am printing my scraped data to the console, but nothing happens with the producer. It doesn't even log a failed delivery (delivery_report) to the console. It's as if my script is ignoring the produce call. There are also no error reports in the Cloud Function's logs. What am I doing wrong, given that the function does everything except the important part? What do I have to be aware of when connecting Kafka with Cloud Functions?
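To make the question concrete, here is a stripped-down version of the entry point as I imagine it would be deployed. The names scrape_handler and TOPIC are placeholders, and the scraped data is a stand-in. My understanding is that produce() only enqueues the message and that the delivery callback is only served by poll() or flush(), so I have also sketched a flush() before returning, though I have not verified whether that is the missing piece:

# Sketch only: hypothetical Cloud Function entry point (names are placeholders).
import json

from setup import p, delivery_report

TOPIC = "scraper-results"  # placeholder topic name

def scrape_handler(request):
    scraped_data_as_dict = {"example": "data"}  # stand-in for the real scraper output
    p.produce(TOPIC, json.dumps(scraped_data_as_dict), callback=delivery_report)
    # produce() is asynchronous; poll(0) serves any delivery callbacks already due.
    p.poll(0)
    # flush() blocks until all outstanding messages are delivered (or fail),
    # so delivery_report should run before the function returns.
    p.flush()
    return "OK"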