
I'm using Flink and FlinkCEP to detect complex events on data streams. For research purposes, I need to measure only the recognition time.

I am using Flink / FlinkCEP 1.7.1. I create the stream inside the Flink environment with env.fromCollection(), then apply FlinkCEP via CEP.pattern(....), followed by select and print calls.

The only relevant post I found was "Measure job execution time in flink", which helped a lot. It suggests a solution that returns the execution time of the whole streaming environment job, which is not precisely what I'm looking for.

I noticed that the returned value also includes the time spent in other operators, such as .assignAscendingTimestamps(x => x.TimeStamp()), so I couldn't use it.

Is there a way to measure only the time spent in the CEP.pattern step? I also couldn't find a metric that would help in this case, unless I missed something.

alextroupi

1 Answer


You could add a timestamp field to each record and use a MapFunction right before CEP to write the current time into that field. Then, in a RichMapFunction immediately after CEP, use that field to compute the time elapsed within CEP, which you can report via a custom metric or send to a sink. This will add a bit of overhead, but not much. So long as you can avoid any keyBy or rebalancing calls between those two functions, everything involved will be chained together via function calls, without any serialization or network overhead.
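A minimal sketch of that idea with the Scala API might look like the following. The `Event` and `Detected` types, the `cepEntryNanos` field, the `ElapsedInCep` class, the metric name, and the two-step pattern are all hypothetical placeholders rather than anything from the question. The sketch also assumes both map functions run in the same JVM, since `System.nanoTime()` values are not comparable across processes.

```scala
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.cep.scala.CEP
import org.apache.flink.cep.scala.pattern.Pattern
import org.apache.flink.configuration.Configuration
import org.apache.flink.metrics.Gauge
import org.apache.flink.streaming.api.scala._

// Hypothetical event type; cepEntryNanos is the extra timestamp field.
case class Event(id: Int, name: String, cepEntryNanos: Long = 0L)

// Hypothetical match type that carries the entry stamp of the event
// that completed the match.
case class Detected(startId: Int, endId: Int, cepEntryNanos: Long)

// Runs right after CEP: computes the elapsed time and exposes the most
// recent value as a custom gauge (the metric name is an assumption).
class ElapsedInCep extends RichMapFunction[Detected, Long] {
  @transient private var lastElapsedNanos: Long = 0L

  override def open(parameters: Configuration): Unit = {
    getRuntimeContext.getMetricGroup.gauge[Long, Gauge[Long]](
      "cepElapsedNanos",
      new Gauge[Long] { override def getValue: Long = lastElapsedNanos })
  }

  override def map(m: Detected): Long = {
    // Only meaningful if the stamping map ran in the same JVM.
    lastElapsedNanos = System.nanoTime() - m.cepEntryNanos
    lastElapsedNanos
  }
}

object CepTimingSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)

    val events = env.fromCollection(
      Seq(Event(1, "a"), Event(2, "b"), Event(3, "a"), Event(4, "b")))

    // Map right before CEP: stamp each record with the current time.
    val stamped = events.map(e => e.copy(cepEntryNanos = System.nanoTime()))

    // Placeholder pattern: an "a" event directly followed by a "b" event.
    val pattern = Pattern
      .begin[Event]("start").where(_.name == "a")
      .next("end").where(_.name == "b")

    // Carry the stamp of the completing event through the select.
    val detected = CEP.pattern(stamped, pattern).select { m =>
      val end = m("end").head
      Detected(m("start").head.id, end.id, end.cepEntryNanos)
    }

    // Map right after CEP: elapsed nanoseconds per match, printed here
    // in place of a real sink.
    detected.map(new ElapsedInCep).print()

    env.execute("cep-timing-sketch")
  }
}
```

Here the elapsed time is taken from the event that completes the match, so each value covers the span from that event entering CEP to the match being emitted. The gauge only holds the latest measurement. If every sample is needed for analysis, sending the mapped values to a sink, as the answer suggests, is likely the simpler option.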

David Anderson