
I have a large amount of bounded data that I need to filter after reading it from a DataSource (I can't filter it in the query itself because the filtering logic is complex). I also need to cap the number of records at the end of the pipeline, to implement a form of paging.

So I need something like Java's Stream.limit(): it should stop fetching rows from the DataSource once enough records have been accumulated.
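To make the behaviour I'm after concrete, here is a minimal plain-Java sketch (not Flink code). The class name `LimitSketch`, the `fetched` counter, and the "multiple of 3" predicate are all made up for illustration; the integer stream is only a stand-in for rows coming from the DataSource.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration of the desired semantics, using only java.util.stream.
public class LimitSketch {
    // Counts how many "rows" the stand-in source actually produced.
    static final AtomicInteger fetched = new AtomicInteger();

    static List<Integer> firstPage(int pageSize) {
        fetched.set(0);
        return Stream.iterate(1, i -> i + 1)          // stand-in for the DataSource
                .peek(i -> fetched.incrementAndGet()) // record every row pulled from the source
                .filter(i -> i % 3 == 0)              // stand-in for the complex filtering logic
                .limit(pageSize)                      // stop once the page is full
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> page = firstPage(5);
        System.out.println(page + " after fetching " + fetched.get() + " rows");
    }
}
```

The point is that `limit()` short-circuits: once the page is full, the sequential stream stops pulling from the source instead of reading everything and discarding the excess. That is the effect I would like to get at the end of a Flink pipeline.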

What is the best way to do that in Apache Flink? I'm looking at counters, but maybe there is a more suitable API?

viator
  • I'm not sure what you mean by "restrict maximum amount of data per request", if you have some DataSource that's providing the data, and you effectively want to limit it to some max number of records. – kkrugler Sep 18 '20 at 18:06
  • @kkrugler I need some sort of paging. I want to stop DataSource producing records after I have enough data to process at the end of pipeline. – viator Sep 19 '20 at 07:41
  • Normally "paging" would mean that you could start the DataSource again, and continue processing the next batch of records. I assume that's not what you need, right? – kkrugler Sep 20 '20 at 14:32

0 Answers