Our application receives a large dataset (e.g. 100 GB) from a third-party tool every night, and we need to process it and feed it into our system by the end of the day. We are free to use any infrastructure, but the processing time should be as short as possible.
Can anyone please suggest an optimised approach?