
I am using Hazelcast Jet 0.6.1 for real-time analysis. There are multiple streams (mostly from remote journals) coming from different sources.

I would like to know whether a full join is supported between multiple streams.

If so, could you please point me to some links or examples of a full join between multiple streams?

Jonathan Hall
  • Do you mean to just merge the streams? In 0.7 the [merge](https://docs.hazelcast.org/docs/jet/0.7.2/manual/#merge) operator was added. – Can Gencer Jan 07 '19 at 13:49

2 Answers


I think you need to elaborate a bit more on what you are trying to do. Streams are theoretically infinite, so the term "full join" has to mean something different than it does in a database.

There are several types of joins available in Jet. As Can said above, there is a merge operator, but you might be thinking more of a windowed join, where you bound the join to a period of time.

The merge operator is documented here: https://docs.hazelcast.org/docs/jet/0.7.2/manual/#merge

Window concepts are explained here: https://docs.hazelcast.org/docs/jet/0.7.2/manual/#unbounded-stream-processing
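To make the windowed-join idea more concrete, here is a rough sketch in plain Java. It is not the Jet API; `WindowedJoinSketch`, `Event` and `Joined` are hypothetical names made up for illustration, and it works on finite lists rather than real streams. It only shows what "full join bounded by a time window" could mean: events from two sides are grouped by (window start, user), and a group is kept even if only one side contributed to it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Conceptual sketch only: plain Java, not the Hazelcast Jet API. */
final class WindowedJoinSketch {

    /** A timestamped event for one user (hypothetical shape). */
    static final class Event {
        final String userId;
        final long timestampMs;
        final String payload;

        Event(String userId, long timestampMs, String payload) {
            this.userId = userId;
            this.timestampMs = timestampMs;
            this.payload = payload;
        }
    }

    /** Both sides of the join for one (window, user) key; either list may be empty. */
    static final class Joined {
        final List<String> left = new ArrayList<>();
        final List<String> right = new ArrayList<>();
    }

    /**
     * Groups both inputs by tumbling window and user id. A key appears in the
     * result if either side has an event for it, which gives full outer join
     * semantics within each window.
     */
    static Map<String, Joined> join(List<Event> left, List<Event> right, long windowMs) {
        Map<String, Joined> out = new TreeMap<>();
        for (Event e : left) {
            out.computeIfAbsent(key(e, windowMs), k -> new Joined()).left.add(e.payload);
        }
        for (Event e : right) {
            out.computeIfAbsent(key(e, windowMs), k -> new Joined()).right.add(e.payload);
        }
        return out;
    }

    /** Builds a "windowStart/userId" key; the window start is the timestamp rounded down. */
    private static String key(Event e, long windowMs) {
        long windowStart = Math.floorDiv(e.timestampMs, windowMs) * windowMs;
        return windowStart + "/" + e.userId;
    }
}
```

With a 1000 ms window, a left-side event for `alice` at 100 ms and a right-side event for `alice` at 900 ms both end up under the key `0/alice`, while an event at 1500 ms falls into the next window and gets its own key with the other side empty.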

Scott M
  • This is more like a fraud detection use case, where user activity data streams are coming from multiple nodes. The plan is to have a single Hazelcast Jet instance connect to multiple remote streams, collect and merge the user activity, and detect possible fraud by analyzing the stream details. – Harshad Murtekar Jan 08 '19 at 09:00

*This is in response to the comment on the first answer. It's too large for another comment, and I think the first answer is still relevant.*

Is this the same data and data type, just from different nodes? Like app servers in a microservices architecture? It seems to me that you have a few options here that really come down to your preferred overall architecture, especially how you want to transport the events. A few thoughts:

  1. You can simply merge streams from different data sources if that fits the use case.

See: https://docs.hazelcast.org/docs/jet/0.7.2/manual/#merge

  2. If this is homogeneous data that is just distributed across app servers, it might be a case where you use the Hazelcast client on each app server to put events into an IMap (shared by all the app servers) that has an Event Journal enabled on a Hazelcast cluster. Jet then simply receives all the events from the Event Journal.

See: https://docs.hazelcast.org/docs/latest/manual/html-single/#event-journal

  3. If you have Kafka available, you could create a topic for the events from the servers and have Jet receive the events from Kafka. With either this option or the previous one, the events are already merged when Jet gets them, so they are processed as one stream.

See: https://docs.hazelcast.org/docs/jet/0.7.2/manual/#kafka
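Whichever transport you pick, the end result is the same idea: homogeneous events from several nodes become one stream that is analyzed as a whole. A minimal plain-Java sketch of that idea follows. Again, this is not Jet code; `MergeSketch`, `Activity` and its fields are hypothetical, the inputs are finite lists, and the sort by timestamp is there only to make the output deterministic, not to suggest anything about ordering in a real engine.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Conceptual sketch only: plain Java, not the Hazelcast Jet API. */
final class MergeSketch {

    /** One user activity record as reported by a node (hypothetical shape). */
    static final class Activity {
        final String node;
        final String userId;
        final long timestampMs;

        Activity(String node, String userId, long timestampMs) {
            this.node = node;
            this.userId = userId;
            this.timestampMs = timestampMs;
        }
    }

    /** Combines the per-node lists into a single list, sorted by timestamp. */
    @SafeVarargs
    static List<Activity> merge(List<Activity>... perNode) {
        return Arrays.stream(perNode)
                .flatMap(List::stream)
                .sorted(Comparator.comparingLong((Activity a) -> a.timestampMs))
                .collect(Collectors.toList());
    }

    /** Downstream logic sees one stream: here it just counts events per user. */
    static Map<String, Long> countPerUser(List<Activity> merged) {
        return merged.stream()
                .collect(Collectors.groupingBy(a -> a.userId, TreeMap::new, Collectors.counting()));
    }
}
```

The point is that `countPerUser` does not care which node an event came from. Once the events are merged, any per-user analysis can treat them as one stream.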

Scott M