
I am using a Kinesis data stream as the source and Elasticsearch as the sink, running the Flink job in an AWS Kinesis Data Analytics application.

Sample event:

{"area":"sessions","userId":4450,"date":"2021-12-03T11:00:00","videoDuration":5} 

The front-end emits one of these video-watching events every 5 seconds per user while a video is playing. These events are used to calculate each user's watch time.
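For reference, the events are deserialized into a TrackingData POJO that looks roughly like this (field names match the sample event; the exact types are simplified here):

    public class TrackingData {
        private String area;
        private long userId;
        private String date;        // event time, e.g. "2021-12-03T11:00:00"
        private long videoDuration; // seconds watched since the previous event

        public TrackingData() {} // no-arg constructor so Flink treats this as a POJO

        public long getUserId() { return userId; }
        public long getVideoDuration() { return videoDuration; }
        public void setVideoDuration(long videoDuration) { this.videoDuration = videoDuration; }
        // area/date accessors omitted for brevity
    }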

If one user is watching a video, the front-end generates this event every 5 seconds, i.e. 12 events per user per minute. So with 10,000 users watching, 10,000 × 12 = 120,000 events are generated per minute and ingested into the Kinesis data stream.

My Flink job takes ~4 minutes to process those 120,000 events, which is far too long.

So how can I improve the performance of the job? I need to process each minute's events within 1 minute.

My job looks like this:

    stream
        .keyBy(e -> e.getUserId())
        .timeWindow(Time.seconds(60))
        .reduce(new MyReduceFunction()) // sum of videoDuration per user
        .map(<enrich event using some data from redis>)
        .addSink(<elasticsearch sink>);
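For context, the source is created roughly like this (stream name, region, and the JSON deserialization are simplified here):

    // FlinkKinesisConsumer comes from the flink-connector-kinesis dependency;
    // MAPPER is a plain Jackson ObjectMapper.
    Properties consumerConfig = new Properties();
    consumerConfig.setProperty(ConsumerConfigConstants.AWS_REGION, "us-east-1");
    consumerConfig.setProperty(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "LATEST");

    DataStream<TrackingData> stream = env
        .addSource(new FlinkKinesisConsumer<>("my-video-events", new SimpleStringSchema(), consumerConfig))
        .map(json -> MAPPER.readValue(json, TrackingData.class))
        .returns(TrackingData.class);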

The reduce function:

    private static class MyReduceFunction implements ReduceFunction<TrackingData> {
        @Override
        public TrackingData reduce(TrackingData acc, TrackingData next) throws Exception {
            // Sum the incoming event's watch time into the accumulated event.
            acc.setVideoDuration(acc.getVideoDuration() + next.getVideoDuration());
            return acc;
        }
    }

To summarize what the job does: it receives events from the Kinesis data stream, keys the stream by userId, sums videoDuration over 1-minute windows, passes each result to an enrichment function that reads some data from Redis and enriches the event, and finally sinks the enriched event to Elasticsearch.
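The enrichment step is essentially a blocking per-record lookup. Simplified, with a made-up Redis key layout (a `user:<id>` hash) and a hypothetical `EnrichedTrackingData` wrapper, it looks like this:

    // Simplified enrichment; the key layout and EnrichedTrackingData
    // are illustrative, not my real schema.
    private static class EnrichFromRedis extends RichMapFunction<TrackingData, EnrichedTrackingData> {
        private transient Jedis jedis;

        @Override
        public void open(Configuration parameters) {
            jedis = new Jedis("my-redis-host", 6379); // one blocking client per subtask
        }

        @Override
        public EnrichedTrackingData map(TrackingData data) {
            // One synchronous Redis round trip per reduced event.
            String userName = jedis.hget("user:" + data.getUserId(), "name");
            return new EnrichedTrackingData(data, userName);
        }

        @Override
        public void close() {
            if (jedis != null) {
                jedis.close();
            }
        }
    }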

I have tried increasing the parallelism of the job, but it performs best at parallelism 1, which is the ~4 minutes above. If I increase the parallelism, processing actually takes longer, which is quite strange. I tried 2, 4, 8, 16, etc. Shouldn't increasing parallelism make processing faster?

Can anyone tell me what I am missing or doing wrong in this Flink job? What do I need to do to process each minute's events within 1 minute?

Rohit
    As a guess, the bottleneck might be on the Redis side if you are reading data for each reduced event. Have you tried temporarily disabling the enrichment to check whether there is any performance boost? – Mikalai Lushchytski Dec 20 '21 at 12:38
