I have 100k records to process. Instead of fetching all of them at once, I want to fetch 10k records at a time (my batch size), process them, then fetch the next 10k, and so on until all 100k records are processed. The goal is to reduce the overhead of loading and processing every record in one go.
Any suggestions on how to achieve this using Apache Beam?
I am using the Spark runner.
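
To make the behaviour I am after concrete, here is a minimal sketch in plain Python (not Beam code) of the fetch-process-repeat loop. The names `fetch_in_batches` and `process` are hypothetical, and `range(100_000)` is only a stand-in for my real data source.

```python
from typing import Iterable, Iterator, List

TOTAL_RECORDS = 100_000  # total number of records (from the question)
BATCH_SIZE = 10_000      # records to fetch per batch (from the question)


def fetch_in_batches(records: Iterable[int], batch_size: int) -> Iterator[List[int]]:
    """Yield consecutive lists of at most batch_size records (hypothetical helper)."""
    batch: List[int] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


def process(batch: List[int]) -> int:
    """Placeholder for the real per-batch work; here it just counts the records."""
    return len(batch)


# range() stands in for the real source; only one batch is held in memory at a time.
batch_sizes = [process(b) for b in fetch_in_batches(range(TOTAL_RECORDS), BATCH_SIZE)]
```

What I am looking for is the equivalent of this loop expressed as a Beam pipeline that runs on the Spark runner.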