
I am using GCP Cloud Functions to run web scrapers on a frequent basis. Locally, my script works without any problems. I have a setup.py file in which I initialize the connection to a Kafka producer. It looks like this:

import os

from confluent_kafka import Producer

p = Producer(
    {
        "bootstrap.servers": os.environ.get("BOOTSTRAP.SERVERS"),
        "security.protocol": os.environ.get("SECURITY.PROTOCOL"),
        "sasl.mechanisms": os.environ.get("SASL.MECHANISMS"),
        "sasl.username": os.environ.get("SASL.USERNAME"),
        "sasl.password": os.environ.get("SASL.PASSWORD"),
        "session.timeout.ms": os.environ.get("SESSION.TIMEOUT.MS")
    }
)


def delivery_report(err, msg):
    """Called once for each message produced to indicate delivery result.
    Triggered by poll() or flush()."""
    print("Got here!")
    if err is not None:
        print("Message delivery failed: {}".format(err))
    else:
        print("Message delivered to {} [{}]".format(msg.topic(), msg.partition()))

    return "DONE."

I import this setup in main.py, where my scraping functions are defined. It looks similar to this:

import json

from setup import p, delivery_report


def scraper():
    try:
        # I won't insert my whole scraper here since it's working fine ...
        print(scraped_data_as_dict)
        p.produce(topic, json.dumps(scraped_data_as_dict), callback=delivery_report)
        p.poll(0)
    except Exception as e:
        pass  # Do sth else

The point here is: I am printing my scraped data to the console, but nothing happens with the producer. It doesn't even log a failed delivery (delivery_report) to the console. It's as if my script ignores the produce call. There are also no error reports in the log of the Cloud Function. What am I doing wrong, since the function does everything except the important part? What do I have to be aware of when connecting Kafka with Cloud Functions?

ku11
    are you able to see output of `print(scraped_data_as_dict)` ? What about `print("Got here!")` can you see that in the log ? Also do you have any log for the `scraper()` error block ? Also check if you have any `egress rule` set for cloud function. – Naveen Kulkarni Jun 11 '22 at 14:52
  • You'll also want to try flushing the producer, not poll(0) – OneCricketeer Jun 12 '22 at 00:11
  • @NaveenKulkarni Yes, I am able to see the output of scraped_data_as_dict and this is confusing me, because that says the script works fine, except for the producer part. AND No, there are no Error Logs for the error block. Works just fine. BUT I don't have any egress rules. Do I need them? – ku11 Jun 12 '22 at 12:26
  • @ku11 thanks for confirming. You probably don't need an egress rule, just wanted to confirm if anything was set. Can you please try using Functions framework emulator https://cloud.google.com/functions/docs/functions-framework and see if you are able to publish message from local so that we can omit the if it's happening due to something in cloud function or not. – Naveen Kulkarni Jun 12 '22 at 17:08
  • @NaveenKulkarni thanks for this tip! It seems that my scraper is working: ```%7|1655060197.781|MSGSET|rdkafka#producer-1| [thrd:sasl_ssl://$my_bootstrap.servers]: sasl_ssl://$my_bootstrap.servers: scraper[3]: MessageSet with 1 message(s) (MsgId 0, BaseSeq -1) delivered``` is the output (around 10 console logs like this came per second) . Where should I look at to find the error now? – ku11 Jun 12 '22 at 19:03
  • + I am also able to see the produced messages in my topic on Confluent Kafka. So I think it's due to something in the cloud function! – ku11 Jun 12 '22 at 19:14
  • Thanks @ku11. Where is Kafka running. Is it outside gcp ? Can you also try the below ```result = p.produce(topic, json.dumps(scraped_data_as_dict)) \n result.get(timeout=60)``` – Naveen Kulkarni Jun 13 '22 at 04:10
  • @NaveenKulkarni Kafka is running on Confluent Cloud. It's deployed on GCP in Europe-West3. I've initialized my producer with python-package ```confluent-kafka```. Locally, this setup is working without problems. Did you mean ```requests.get(result, timeout=60)```? Because get won't be defined as of now. – ku11 Jun 13 '22 at 05:22
  • @NaveenKulkarni this is the log entry I get: ```result.get(timeout=60) AttributeError: 'NoneType' object has no attribute 'get'```As I assumed, this won't work. Do you have any other ideas? – ku11 Jun 13 '22 at 05:48
  • @ku11 if you look at https://docs.confluent.io/kafka-clients/python/current/overview.html, for async writes p.poll() should have a non-zero value for wait. Can you please try with that? – Naveen Kulkarni Jun 13 '22 at 06:22
  • @ku11 for the `result.get()` can you please look at https://kafka-python.readthedocs.io/en/master/ – Naveen Kulkarni Jun 13 '22 at 06:24
  • @NaveenKulkarni I am using ```confluent-kafka```. The docs you provided me are meant for ```kafka-python```. This is a difference because Confluent is a fully managed kafka cloud service. – ku11 Jun 13 '22 at 06:42
  • @ku11 got it. Can you please look at the above comment for `poll()` in confluent client – Naveen Kulkarni Jun 13 '22 at 06:48
  • @NaveenKulkarni I've tried using ```p.poll()```, but this doesn't work either. The messages are just not sent to Confluent Platform (https://docs.confluent.io/kafka-clients/python/current/overview.html#:~:text=Confluent%20develops%20and%20maintains%20confluent,Confluent%20Cloud%20and%20Confluent%20Platform). It feels like the script is skipping the p.produce() part, since there is no callback if it fails or similar. – ku11 Jun 13 '22 at 07:04
  • Can you also try synchronous writes ? – Naveen Kulkarni Jun 13 '22 at 09:13
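Picking up the flush suggestion from the comments: in confluent-kafka, produce() returns None (hence the AttributeError above; a future-returning produce() is kafka-python's API, not Confluent's). A synchronous-style write would look roughly like this sketch, reusing topic and scraped_data_as_dict from the question:

p.produce(topic, json.dumps(scraped_data_as_dict), callback=delivery_report)
# flush() blocks until outstanding messages are delivered (or the timeout
# expires) and fires the delivery callbacks. It returns the number of
# messages still queued, so a non-zero result means delivery did not finish.
remaining = p.flush(10)
if remaining:
    print("{} message(s) not delivered before timeout".format(remaining))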

1 Answer


I ran into the same problem. I was able to reproduce it with the functions-framework (rewritten to reflect the original question):

import os

from functions_framework import create_app
from functions_framework._http import create_server

PORT = os.getenv('PORT', 8980)
app = create_app('scraper', 'function.py', 'http')
create_server(app, False).run('0.0.0.0', PORT)
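(The same setup can normally also be started via the framework's CLI, which should behave identically: `functions-framework --target=scraper --source=function.py --port=8980`.)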

Weirdly enough, when I run vanilla Flask and Gunicorn the timeout doesn't happen, but the functions-framework (based on Flask and Gunicorn) does something weird.

The solution is stupidly simple: create the producer object within the entry-point function:

def scraper():
    p = Producer({...})  # <<< create the producer here, instead of a global constant
    try:
        # I won't insert my whole scraper here since it's working fine ...
        print(scraped_data_as_dict)
        p.produce(...
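For completeness, here is a fuller sketch of what the fixed main.py could look like. The request parameter, the placeholder topic name, and the final flush() are assumptions layered on top of the answer, not part of it; the flush() makes sure delivery has actually finished before the function returns and the instance gets throttled:

import json
import os

from confluent_kafka import Producer

from setup import delivery_report  # the producer is no longer imported globally


def scraper(request):  # HTTP-triggered functions receive a Flask request object
    # Creating the producer per invocation keeps its background delivery
    # thread alive only while this request is being handled.
    p = Producer(
        {
            "bootstrap.servers": os.environ.get("BOOTSTRAP.SERVERS"),
            # ... remaining config exactly as in setup.py ...
        }
    )
    scraped_data_as_dict = {"example": "placeholder"}  # stands in for the real scraper
    p.produce("my-topic", json.dumps(scraped_data_as_dict), callback=delivery_report)
    # Block until the delivery report arrives (or 10s pass) so the message
    # actually leaves before the function returns.
    p.flush(10)
    return "DONE."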
Erwin Kooi