
I have a use case of reading Hadoop sequence files as a batch job with Flink's DataSet API. The files are stored in an S3 bucket and I have to consume them into a single Flink DataSet. I am not able to read the files by passing comma-separated file paths to the reader. Reading the files one by one in a loop is not practical either, as there are a lot of files in the bucket, and the union function for Flink DataSets seems to fail after a few iterations. Can someone help me create a custom sequence file reader that works for this case, similar to what Spark provides?
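For context, here is a minimal sketch of the kind of job I have in mind. It assumes the flink-hadoop-compatibility module is on the classpath, and the key/value types (LongWritable/Text) and the bucket path are placeholders, not my actual data:

```java
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.hadoopcompatibility.HadoopInputs;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

public class SequenceFileBatchJob {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // HadoopInputs.readSequenceFile wraps Hadoop's SequenceFileInputFormat;
        // pointing it at the bucket prefix lets Hadoop enumerate and split all
        // files in that path, so no per-file loop or union of DataSets is needed.
        // "s3://my-bucket/data/" and the key/value classes are placeholders.
        DataSet<Tuple2<LongWritable, Text>> records = env.createInput(
                HadoopInputs.readSequenceFile(LongWritable.class, Text.class,
                        "s3://my-bucket/data/"));

        records.first(10).print();
    }
}
```

This is only a sketch of the approach I would expect to work; whether it handles a very large number of S3 objects efficiently is exactly what I am unsure about.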
