We are using flink-cep as a standalone library to detect patterns in a list of events.

Given the following list of events:

val patientKey = "patient"
val hrKey = "hr"
// Events, listed in arrival order (interleaved across patients)
val p1e1 = Event("hr", mapOf(patientKey to 1, hrKey to 1))
val p1e2 = Event("hr", mapOf(patientKey to 1, hrKey to 2))
val p2e1 = Event("hr", mapOf(patientKey to 2, hrKey to 1))
val p1e3 = Event("hr", mapOf(patientKey to 1, hrKey to 3))
val p2e2 = Event("hr", mapOf(patientKey to 2, hrKey to 2))
val p3e1 = Event("hr", mapOf(patientKey to 3, hrKey to 1))
val p2e3 = Event("hr", mapOf(patientKey to 2, hrKey to 3))
val p3e2 = Event("hr", mapOf(patientKey to 3, hrKey to 2))
val p3e3 = Event("hr", mapOf(patientKey to 3, hrKey to 3))

We would like to write a pattern which returns as matches:

first match: p1e1, p1e2, p1e3

second match: p2e1, p2e2, p2e3

third match: p3e1, p3e2, p3e3

This seems doable when running CEP in a Flink environment with keyed streams, but how do we do it without keyed streams? We cannot deploy a full Flink environment because we are running on a constrained device.

We would like to get all the heart rates gathered for a patient within 5 seconds.

Thanks

1 Answer

You can get the effect of keying the stream by putting the key constraint into the pattern definition instead. Something like this, if you use SQL:

PATTERN (A B C) WITHIN INTERVAL '5' SECOND
DEFINE
  A AS A.hr = 1,
  B AS B.patientKey = A.patientKey AND B.hr = 2,
  C AS C.patientKey = B.patientKey AND C.hr = 3

If you don't use SQL, the same logic applies. (For the contiguity, you'll need to specify followedBy rather than next, since you won't have partitioned the stream by patientKey.)
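For the non-SQL case, here is a rough sketch of what that could look like with the CEP Pattern API in Kotlin. Treat it as an illustration under assumptions rather than a drop-in solution: the Event class below is a guess at the shape used in the question, the helper samePatientWithHr is a made-up name, and within(Time.seconds(5)) assumes a Flink 1.x release where within takes a Time argument.

```kotlin
import org.apache.flink.cep.pattern.Pattern
import org.apache.flink.cep.pattern.conditions.IterativeCondition
import org.apache.flink.cep.pattern.conditions.SimpleCondition
import org.apache.flink.streaming.api.windowing.time.Time

// Assumed shape of the question's Event type (hypothetical).
data class Event(val type: String, val attrs: Map<String, Int>)

val patientKey = "patient"
val hrKey = "hr"

// Hypothetical helper: accepts an event with the given hr value whose patient
// matches the event already captured under the pattern stage named `previous`.
fun samePatientWithHr(previous: String, hr: Int): IterativeCondition<Event> =
    object : IterativeCondition<Event>() {
        override fun filter(e: Event, ctx: IterativeCondition.Context<Event>): Boolean {
            val prev = ctx.getEventsForPattern(previous).firstOrNull() ?: return false
            return e.attrs[hrKey] == hr && e.attrs[patientKey] == prev.attrs[patientKey]
        }
    }

val pattern: Pattern<Event, Event> = Pattern.begin<Event>("A")
    .where(object : SimpleCondition<Event>() {
        override fun filter(e: Event): Boolean = e.attrs[hrKey] == 1
    })
    // followedBy (relaxed contiguity) lets events from other patients sit in between
    .followedBy("B").where(samePatientWithHr("A", 2))
    .followedBy("C").where(samePatientWithHr("B", 3))
    .within(Time.seconds(5))
```

The patient comparison lives in the conditions for B and C, which is what stands in for keying the stream here.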

For what it's worth, I can't think of any operational or performance benefit that will come from avoiding keyed streams. (In fact, CEP always uses keyed state, even if you don't explicitly use keyed streams.) The use of keyed streams makes it possible to use a larger Flink cluster and operate in parallel, but doesn't require it.

David Anderson