
I am new to the event processing domain. I am looking for a Java-based event processing framework for my requirements. I've been through a maze of documentation and tutorials on myriad frameworks - Apache Storm, Apache Kafka, as well as traditional message brokers such as RabbitMQ - and I am none the wiser.

My requirements are the following. I have a source of events (e.g. usage tracking) that are pushed to me. I want to do the following things with them:

  1. Bucketing: split them into different buckets, e.g. by customer.
  2. Insert all the bucketed events as batches into a database.
  3. Perform some kind of load balancing/event prioritization, e.g. a low-priority customer pushing a huge number of events should not starve a high-priority customer with only a few events.

I do not care too much about event ordering, but I would like to ensure high availability of these systems.

I am looking for a few pointers to start with. Technology infrastructure is no bar, but it should be Java-based.


1 Answer


There are great frameworks for doing real-time distributed data processing, such as Apache Storm, Apache Flink, and Apache Spark.

In your case, I think choosing one of those frameworks would be like using a sledgehammer to crack a nut: you would have to deploy and manage a cluster of master and worker nodes in addition to a Kafka cluster.

To keep your architecture simple, scalable, and highly available, you should have a look at Kafka Streams. Kafka Streams is a Java API (available since Kafka 0.10) for doing real-time computation on Kafka topics.

A Kafka Streams application is a plain Java application, so you can embed a job into an existing application.

Also, a Kafka Streams job can be deployed with a simple java -jar command - no dedicated processing cluster is required.
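
For illustration, here is a minimal sketch of how such a job could cover the bucketing requirement from the question: it reads raw events, re-keys them by customer id, and writes them to a per-customer-keyed topic. The topic names and the extractCustomerId helper are hypothetical, the events are assumed to be String-serialized, and the StreamsBuilder API shown comes from Kafka releases newer than 0.10.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;

public class UsageEventBucketing {

    public static void main(String[] args) {
        // Basic configuration. The application.id also acts as the consumer group,
        // so running several instances of this jar balances partitions across them,
        // which gives you scalability and availability without a separate cluster.
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "usage-event-bucketing");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();

        // Read raw usage events and re-key them by customer id, so all events
        // of one customer land in the same partition ("bucket") of the output topic.
        KStream<String, String> events = builder.stream("usage-events");
        events
            .selectKey((key, value) -> extractCustomerId(value))
            .to("usage-events-by-customer", Produced.with(Serdes.String(), Serdes.String()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Close the topology cleanly on shutdown.
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }

    // Hypothetical helper: pull the customer id out of the event payload.
    private static String extractCustomerId(String eventJson) {
        // Parse the payload with your JSON library of choice; stubbed here.
        return eventJson;
    }
}
```

From there, the batch inserts into the database and the per-customer prioritization would be handled by whatever consumes the re-keyed topic (for example, a consumer that polls high-priority customers' buckets more aggressively); that part is a design suggestion rather than something Kafka Streams provides out of the box.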
