I have a large amount of bounded data that I need to filter after reading it from a DataSource (the filtering logic is too complex to push into the query itself). I also need to cap the amount of data at the end of the pipeline, to implement some sort of paging.
So I need something like Java's Stream.limit(), which would stop fetching rows from the DataSource once the required amount of data has been accumulated.
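To illustrate the behavior I'm after, here is a minimal sketch using plain Java streams rather than Flink. The `source` and `complexFilter` methods are hypothetical stand-ins for my real DataSource and filtering logic, and the `fetched` counter exists only to show how many rows were actually pulled:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class LimitSketch {
    // Counts how many rows were actually pulled from the stand-in source.
    static final AtomicInteger fetched = new AtomicInteger();

    // Hypothetical stand-in for the DataSource: lazily yields rows 0..total-1.
    static Stream<Integer> source(int total) {
        return IntStream.range(0, total)
                .peek(i -> fetched.incrementAndGet())
                .boxed();
    }

    // Hypothetical stand-in for filtering logic that cannot go into the query.
    static boolean complexFilter(int row) {
        return row % 3 == 0;
    }

    // Filter first, then limit: a sequential stream stops pulling from the
    // source once pageSize rows have passed the filter.
    static List<Integer> firstPage(int total, int pageSize) {
        return source(total)
                .filter(LimitSketch::complexFilter)
                .limit(pageSize)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> page = firstPage(1_000_000, 5);
        System.out.println("page=" + page + ", rows fetched=" + fetched.get());
    }
}
```

With a sequential stream, only a small prefix of the million rows is read before the page is full. That early termination is what I would like to reproduce in a Flink job.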
What is the best way of doing that in Apache Flink? I'm looking at counters, but maybe there is a more suitable API?