I have implemented a pattern with Flink CEP that matches three Events such as A->B->C. After I have defined my pattern I generate a

PatternStream<Event> patternStream = CEP.pattern(eventStream, pattern);

with a PatternSelectFunction such that

patternStream.select(new MyPatternSelectFunction()).print();

This works like a charm, but I am interested in the event time of all matched events. I know that the traditional Flink streaming API offers rich functions, which let you register Flink's internal latency tracker, as described in this question. I have also seen that a new RichPatternSelectFunction has been added in Flink 1.8. Unfortunately, I cannot set up Flink 1.8 with Flink CEP.

Finally, is there a way to get the event-time of all matched events?

Robin Ellerkmann

1 Answer

You don't need Rich Functions to use Flink's latency tracking. You just need to enable it by setting latencyTrackingInterval to a positive number in either the Flink configuration or ExecutionConfig, e.g.,

env.getConfig().setLatencyTrackingInterval(1000);

and then you can observe the results in your metrics solution, or via the REST API (latency metrics are not reported in the Flink web UI).

Documentation

Update:

The latency statistics are job metrics, and are in the list returned by

http://<job_manager_rest_endpoint>/jobs/<job_id>/metrics

Latency metric values can be fetched from

http://<job_manager_rest_endpoint>/jobs/<job_id>/metrics?get=<metric_name>

These metrics have names like

latency.source_id.<ID>.operator_id.<ID>.operator_subtask_index.<SUBTASK>.<metric>

where the IDs identify the source and operator nodes in the job graph between which the latency is being measured.
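
As a rough illustration of the naming scheme above, here is a minimal Java sketch that assembles a latency metric name and the matching query URL. The class and method names (`LatencyMetricName`, `metricName`, `metricUrl`) are hypothetical helpers, not part of Flink; the IDs in `main` are the ones from the example request in this answer.

```java
public final class LatencyMetricName {
    private LatencyMetricName() {}

    /** Builds latency.source_id.<ID>.operator_id.<ID>.operator_subtask_index.<SUBTASK>.<metric>. */
    static String metricName(String sourceId, String operatorId, int subtaskIndex, String metric) {
        return "latency.source_id." + sourceId
                + ".operator_id." + operatorId
                + ".operator_subtask_index." + subtaskIndex
                + "." + metric;
    }

    /** Builds the URL that fetches a single job-level metric by name. */
    static String metricUrl(String restEndpoint, String jobId, String metricName) {
        return restEndpoint + "/jobs/" + jobId + "/metrics?get=" + metricName;
    }

    public static void main(String[] args) {
        String name = metricName(
                "bc764cd8ddf7a0cff126f51c16239658",
                "fd0ee602f2fa8d310d9bd9f694e185f5",
                0,
                "latency_p95");
        // Prints the full request URL for the 95th percentile latency metric.
        System.out.println(metricUrl("http://localhost:8081", "94b189a96b98b3aafaba6db6aa8b770b", name));
    }
}
```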

For example, I can determine the 95th percentile latency between the source and one of the sinks in a job I am running right now with this request:

http://localhost:8081/jobs/94b189a96b98b3aafaba6db6aa8b770b/metrics?get=latency.source_id.bc764cd8ddf7a0cff126f51c16239658.operator_id.fd0ee602f2fa8d310d9bd9f694e185f5.operator_subtask_index.0.latency_p95
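
If you would rather issue such a request from code than from a browser, something like the following sketch could work with the JDK's built-in HTTP client (Java 11+). It assumes the endpoint answers with a JSON array of the form `[{"id":"<metric_name>","value":"<number>"}]`; check the actual response of your Flink version before relying on that. `LatencyMetricFetcher`, `fetch`, and `extractValue` are hypothetical names, and the regex is a shortcut that stands in for a real JSON parser.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class LatencyMetricFetcher {
    // Assumed response shape: [{"id":"<metric_name>","value":"<number>"}]
    private static final Pattern VALUE = Pattern.compile("\"value\"\\s*:\\s*\"([^\"]*)\"");

    private LatencyMetricFetcher() {}

    /** Returns the first "value" field in the response body, if there is one. */
    static Optional<String> extractValue(String json) {
        Matcher m = VALUE.matcher(json);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** Performs a plain GET against the given metrics URL and returns the raw body. */
    static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the full metrics URL, including the ?get=<metric_name> part.
        String body = fetch(args[0]);
        System.out.println(extractValue(body).orElse("no value in response"));
    }
}
```

An empty `Optional` here means the response carried no value at all, which is worth distinguishing from a latency of zero.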

Alternatively, you could use a ProcessFunction to add processing time timestamps to your events before they enter the CEP part of your job, and then use another ProcessFunction afterwards to measure the elapsed time.
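
The core of that alternative can be sketched in plain Java, leaving out the Flink wiring. `ElapsedTimeSketch`, `Stamped`, `stamp`, and `elapsedMillis` are hypothetical names: in a real job, `stamp` would be called from the first ProcessFunction's `processElement` and `elapsedMillis` from the one placed after the CEP operator, with the clock being a processing-time source such as `System.currentTimeMillis()`.

```java
import java.util.function.LongSupplier;

public final class ElapsedTimeSketch {
    private ElapsedTimeSketch() {}

    /** An event together with the processing-time timestamp taken before it enters CEP. */
    static final class Stamped<T> {
        final T event;
        final long ingestMillis;

        Stamped(T event, long ingestMillis) {
            this.event = event;
            this.ingestMillis = ingestMillis;
        }
    }

    /** Step 1 (before CEP): attach the current processing time to the event. */
    static <T> Stamped<T> stamp(T event, LongSupplier clock) {
        return new Stamped<>(event, clock.getAsLong());
    }

    /** Step 2 (after CEP): time elapsed since the event was stamped. */
    static long elapsedMillis(Stamped<?> stamped, LongSupplier clock) {
        return clock.getAsLong() - stamped.ingestMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        LongSupplier clock = System::currentTimeMillis;
        Stamped<String> a = stamp("A", clock);
        Thread.sleep(50); // stands in for the time spent in the CEP operator
        System.out.println("elapsed ms: " + elapsedMillis(a, clock));
    }
}
```

Passing the clock in as a `LongSupplier` keeps the logic testable with fixed timestamps. For a match made of several events, you would compute the elapsed time for each matched event separately.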

David Anderson
  • Thanks for your answer. I have added the line of code to my Flink program but it still shows no metrics, neither for a job nor for a task manager. Do I have to enable metrics in general or do I miss something else? – Robin Ellerkmann Jan 21 '19 at 22:25
  • I've expanded my answer to provide details on inspecting latency metrics. – David Anderson Jan 22 '19 at 14:31
  • Thanks for expanding your answer. I have now fully understood how I can access a job's metrics. But unfortunately I only receive empty responses when I use these queries with my respective job and operator ids. I have set the 'latencyTrackingInterval' to 10. But even the basic 'http://localhost:8081/jobs//metrics/' endpoint returns empty responses only. Does it matter that I try to get metrics for completed jobs? – Robin Ellerkmann Jan 22 '19 at 23:07
  • Metrics reflect current state. You will need to set up a metrics reporter so that the metrics get persisted in an external metrics system -- see https://ci.apache.org/projects/flink/flink-docs-release-1.7/monitoring/metrics.html#reporter. – David Anderson Jan 23 '19 at 07:11
  • Also note that setting the latencyTrackingInterval to 10 could cause enormous overhead, especially as you scale up the job, since latency tracking markers are sent from every source instance to every operator instance. Sampling latency that often is not recommended. And most metrics systems can't handle that level of granularity. – David Anderson Jan 23 '19 at 07:19
  • Thanks for your help. I managed to retrieve some latency stats. But I am still confused by the ids of the individual operators. Where can I find out which id belongs to which operator? I found them neither in the web dashboard nor in the task/jobmanager logs. Is it possible to provide constant ids for operators so that I can retrieve the latency stats via a constant id across different program runs? – Robin Ellerkmann Jan 23 '19 at 17:12
  • The operators and their IDs are listed under vertices at http:///jobs// – David Anderson Jan 23 '19 at 18:29