I'm new to working with real-time applications. I'm currently using AWS Kinesis/Flink and Scala, and I have the following architecture:
As you can see, I consume a CSV file using CsvTableSource. Unfortunately, the CSV file has become too big for the Flink job. The file is updated daily, with new rows added each time. So I am now working on a new architecture in which I want to replace the CSV file with a DynamoDB table.
My question is: what approach do you recommend for consuming the DynamoDB table?
PS: I need to do a left outer join between the DynamoDB table and the Kinesis Data Stream data.
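
To make the join semantics concrete, here is a minimal plain-Scala sketch of the result I am after. It uses no Flink or DynamoDB API at all. The `Event` and `Reference` types, the `enrich` function, and the in-memory `Map` standing in for the DynamoDB table are all hypothetical names for illustration, and it assumes the stream is the left side of the join.

```scala
// Hypothetical record types, for illustration only.
final case class Event(key: String, payload: String)          // a record from the stream
final case class Reference(key: String, attribute: String)    // a row from the lookup table

object LeftOuterJoinSketch {
  // Left outer join with the stream on the left (an assumption):
  // every event is kept, and the reference side is None when no row matches.
  def enrich(events: Seq[Event],
             lookup: Map[String, Reference]): Seq[(Event, Option[Reference])] =
    events.map(e => (e, lookup.get(e.key)))

  def main(args: Array[String]): Unit = {
    // In-memory stand-in for the DynamoDB table, keyed by the join key.
    val lookup = Map("a" -> Reference("a", "alpha"))
    val events = Seq(Event("a", "first"), Event("b", "second"))
    enrich(events, lookup).foreach(println)
  }
}
```

So an event whose key has no matching row must still be emitted, just without the extra attributes.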