
I am new to Azure Databricks and Event Hubs. I have been struggling for days to stream data from Databricks to an event hub using Spark Structured Streaming with the Kafka API. The data I want to stream is in a .csv file. The stream starts, but the dashboard with the Input Rate is blank. Here is a code snippet:

from pyspark.sql import DataFrame

def write_to_event_hub(df: DataFrame, topic: str, bootstrap_servers: str, config: str, checkpoint_path: str):
    print("Producing to event hub via Kafka")
    (df.writeStream
        .format("kafka")
        .option("topic", topic)
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("kafka.sasl.mechanism", "PLAIN")
        .option("kafka.security.protocol", "SASL_SSL")
        .option("kafka.sasl.jaas.config", config)
        .option("checkpointLocation", checkpoint_path)
        .start())
    


write_to_event_hub(streaming_df, topic, bootstrap_servers, sasl_jaas_config, "./checkpoint")
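
One thing I am not sure about: the Structured Streaming Kafka sink expects each row's payload in a string or binary column named value, which my CSV data frame does not have. If that is the problem, I assume the fix would look roughly like this (a sketch, not my exact code):

from pyspark.sql import functions as F

# Pack all CSV columns into a JSON string in a "value" column, since the
# Kafka sink writes the "value" column as the message body.
payload_df = streaming_df.select(
    F.to_json(F.struct(*streaming_df.columns)).alias("value")
)

write_to_event_hub(payload_df, topic, bootstrap_servers, sasl_jaas_config, "./checkpoint")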

And the code used to generate the data:

streaming_df = spark.readStream.option("header", "true").schema(location_schema).csv(f"{path}").select("*")
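
where location_schema is defined along these lines (simplified here; the real CSV has three columns and these names are placeholders):

from pyspark.sql.types import StructType, StructField, StringType, DoubleType

# Placeholder schema for the three-column CSV; the actual names and types differ.
location_schema = StructType([
    StructField("id", StringType(), True),
    StructField("latitude", DoubleType(), True),
    StructField("longitude", DoubleType(), True),
])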

And I also attached a picture with the configuration. The topic is the name of the event hub, and the connection string has the format: <Endpoint=sb://XXXX.servicebus.windows.net/;SharedAccessKeyName=XXXX;SharedAccessKey=XXX=;EntityPath=XXXX>
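
For completeness, I build the Kafka options from the connection string roughly like this (all values are placeholders; on Databricks the Kafka classes are shaded, hence the kafkashaded prefix in the login module name):

# Placeholder values; the real ones come from the Event Hubs namespace's
# shared access policy.
connection_string = "Endpoint=sb://XXXX.servicebus.windows.net/;SharedAccessKeyName=XXXX;SharedAccessKey=XXX=;EntityPath=XXXX"

# The Event Hubs Kafka endpoint listens on port 9093.
bootstrap_servers = "XXXX.servicebus.windows.net:9093"

# Event Hubs authenticates over SASL PLAIN: the username is the literal
# string "$ConnectionString" and the password is the full connection string.
sasl_jaas_config = (
    'kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule required '
    f'username="$ConnectionString" password="{connection_string}";'
)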

(I want to connect to a single event hub in the namespace.) Probably something is wrong with the way I read the data I want to stream, or with the configuration. Any ideas?

[screenshots: Event Hubs configuration and the streaming dashboard with the blank Input Rate]

Thank you!

user14681827
  • You don't have input data to stream into EventHubs. Where do you get this data from? – Alex Ott Jun 28 '22 at 17:22
  • I have a piece of code to generate data: from a csv file with 3 columns I created a streaming data frame. If I call .describe() on that streaming data frame, the data can be seen. Also, the graph with input vs. processing rate has values. – user14681827 Jun 29 '22 at 06:29
  • I just realized that the format I used to read the csv may not be correct; I misunderstood it. I edited the question with the code that generates the data. – user14681827 Jun 29 '22 at 07:56

0 Answers