I need to build a service on AWS that aggregates data from several DynamoDB tables and stores it in a Redshift cluster. Each data stream also needs to be processed before it is written to Redshift.
My current idea is to feed each table's changes through DynamoDB Streams into Kinesis Data Analytics, with a separate Kinesis component per stream. Each component would process its data and write the result to the same Redshift cluster. Roughly, the per-table wiring looks like the sketch below.
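For concreteness, this is approximately how each table gets its own stream today (a sketch with placeholder table names; the Kinesis Data Analytics app consuming each stream is omitted):

```python
import boto3

dynamodb = boto3.client("dynamodb")
kinesis = boto3.client("kinesis")

# Today's setup: one Kinesis data stream per table, plus one analytics
# app per stream (not shown). Table names here are placeholders.
for table in ["OrdersTable", "UsersTable", "EventsTable"]:
    stream_name = f"{table}-cdc"
    kinesis.create_stream(StreamName=stream_name, ShardCount=4)
    kinesis.get_waiter("stream_exists").wait(StreamName=stream_name)
    stream_arn = kinesis.describe_stream(StreamName=stream_name)[
        "StreamDescription"
    ]["StreamARN"]
    # Route the table's change data capture into its dedicated stream.
    dynamodb.enable_kinesis_streaming_destination(
        TableName=table, StreamArn=stream_arn
    )
```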
I worry this won't scale. Is there a way to have a single service consume multiple input streams, process each one, and send the processed data to the Redshift cluster? That way we wouldn't need to create a whole new Kinesis Data Analytics component for every new DynamoDB table or S3 bucket. Something like the sketch below is what I have in mind.
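To make the question concrete, here is the kind of single consumer I'm imagining: one Lambda function with an event source mapping per DynamoDB stream, routing records by their source table and batching everything into one Firehose delivery stream that COPYs into Redshift. The table names, transforms, and delivery stream name are all placeholders, not a working design.

```python
import json
import boto3

firehose = boto3.client("firehose")

# Placeholder per-table transforms: each table stores differently shaped
# items, so each gets its own mapping into a Redshift-friendly row.
TRANSFORMS = {
    "OrdersTable": lambda img: {"order_id": img["pk"]["S"],
                                "total": float(img["total"]["N"])},
    "UsersTable": lambda img: {"user_id": img["pk"]["S"],
                               "email": img["email"]["S"]},
}

def table_from_arn(arn):
    # Stream ARNs look like arn:aws:dynamodb:...:table/<Name>/stream/<ts>
    return arn.split("/")[1]

def handler(event, context):
    """One Lambda with an event source mapping per DynamoDB stream.
    Records are routed to their table's transform, then batched into a
    single Firehose delivery stream that COPYs into Redshift."""
    rows = []
    for record in event["Records"]:
        new_image = record["dynamodb"].get("NewImage")
        if new_image is None:
            continue  # ignore deletes in this sketch
        transform = TRANSFORMS[table_from_arn(record["eventSourceARN"])]
        rows.append({"Data": (json.dumps(transform(new_image)) + "\n").encode()})
    if rows:
        # Real code would chunk to Firehose's 500-record batch limit.
        firehose.put_record_batch(
            DeliveryStreamName="redshift-delivery",  # placeholder
            Records=rows,
        )
```

Is this pattern viable at scale, or is there a better-suited AWS service for the fan-in?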
For reference, the data stored in each DynamoDB table has a different schema, and so does the processed output for each stream.
The data volume is very large, and updates need to be handled in real time.