My idea is to use Spark Streaming + Kafka to get the events from the Kafka bus. After retrieving a batch of Avro-encoded events I would like to transform them with Spark Avro into Spark SQL DataFrames and then write the DataFrames to a Hive table.
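To make the idea more concrete, here is a rough sketch of what I have in mind, assuming the direct-stream API from spark-streaming-kafka and Spark 1.x's HiveContext; the schema, broker, topic and field names are just placeholders, and I decode the Avro bytes with the plain Avro library instead of spark-avro, since I am not sure spark-avro can be used on individual Kafka messages:

    import kafka.serializer.DefaultDecoder
    import org.apache.avro.Schema
    import org.apache.avro.generic.{GenericDatumReader, GenericRecord}
    import org.apache.avro.io.DecoderFactory
    import org.apache.spark.SparkConf
    import org.apache.spark.sql.Row
    import org.apache.spark.sql.hive.HiveContext
    import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}
    import org.apache.spark.streaming.kafka.KafkaUtils
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object KafkaAvroToHive {

      // Placeholder Avro schema -- the real events would use their own schema.
      val eventSchemaJson =
        """{"type":"record","name":"Event","fields":[
          |  {"name":"id","type":"string"},
          |  {"name":"value","type":"long"}
          |]}""".stripMargin

      // Matching Spark SQL schema for the DataFrame.
      val eventStructType = StructType(Seq(
        StructField("id", StringType),
        StructField("value", LongType)))

      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("kafka-avro-to-hive")
        val ssc  = new StreamingContext(conf, Seconds(10))
        val hiveContext = new HiveContext(ssc.sparkContext)

        // Read the raw message bytes from Kafka; decoding happens below
        // with the plain Avro library, not spark-avro.
        val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
        val stream = KafkaUtils.createDirectStream[
          Array[Byte], Array[Byte], DefaultDecoder, DefaultDecoder](
          ssc, kafkaParams, Set("events"))

        stream.foreachRDD { rdd =>
          // Decode each Avro-encoded message into a Spark SQL Row.
          val rows = rdd.mapPartitions { iter =>
            val schema = new Schema.Parser().parse(eventSchemaJson)
            val reader = new GenericDatumReader[GenericRecord](schema)
            iter.map { case (_, bytes) =>
              val decoder = DecoderFactory.get().binaryDecoder(bytes, null)
              val record  = reader.read(null, decoder)
              Row(record.get("id").toString, record.get("value").asInstanceOf[Long])
            }
          }

          val df = hiveContext.createDataFrame(rows, eventStructType)
          // ... write df to the Hive table here (see the question below)
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }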
Is this approach feasible? I am new to Spark and not totally sure whether I can use the Spark Avro package to decode the Kafka events, since the documentation only mentions Avro files. My understanding so far is that it should be possible.
The next question: assuming this is possible, my understanding is that I would then have a regular Spark SQL DataFrame, which I could write to a Hive table. Are these assumptions correct?
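For that last step I would expect something along these lines to work inside the foreachRDD, assuming the Hive table (here called events_table, a placeholder name) already exists with a matching schema:

    // Append the current micro-batch to an existing Hive table.
    df.write.mode("append").insertInto("events_table")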
Thanks in advance for any hints and tips.