I have a requirement to process a file as is, meaning the content must be processed in the same order in which it appears in the file.
For example, say I have a 700 MB file. How can we make sure the file is processed in the order it appears, given that this depends on DataNode availability? In some cases a DataNode may process its part of the file more slowly than the others (for example, because of a low hardware configuration).
One way to fix this is to add a unique ID/key in the file, but we don't want to add anything new to the file.
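To make the idea of an ordering key that is not stored in the file concrete, here is a rough sketch in plain Python (not Hadoop code). It assumes each record is one line and uses the line's byte offset as the key, similar in spirit to the byte-offset key that Hadoop's `TextInputFormat` passes to a mapper. The function names and the shuffle that stands in for slow or fast nodes are only illustrative assumptions.

```python
import random


def lines_with_offsets(data: bytes):
    """Return (byte_offset, line) pairs. The offset is derived while
    reading, so nothing is added to the file itself."""
    pairs = []
    offset = 0
    for line in data.splitlines(keepends=True):
        pairs.append((offset, line))
        offset += len(line)
    return pairs


def process(pair):
    """Stand-in for the real per-record work (hypothetical)."""
    offset, line = pair
    return offset, line.strip().upper()


def run(data: bytes, seed: int = 0):
    pairs = lines_with_offsets(data)
    # Simulate workers finishing in an arbitrary order.
    random.Random(seed).shuffle(pairs)
    results = [process(p) for p in pairs]
    # Restore the original file order using the byte offset as the key.
    results.sort(key=lambda r: r[0])
    return [line for _, line in results]


if __name__ == "__main__":
    print(run(b"first\nsecond\nthird\n"))
```

The records may be processed in any order, and the final sort on the offset puts the output back into file order. Whether this fits depends on the job: it only restores the order of the results and does not force the processing itself to happen sequentially.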
Any thoughts? :)