
I am building a Spring Cloud-based microservice ML pipeline. I have a data ingestion service that (currently) takes in data from SQL; this data needs to be used by the prediction service.

The general consensus is that writes should use asynchronous message-based communication via Kafka/RabbitMQ.

What I am not sure about is how to orchestrate these services.

Should I use an API gateway that invokes the ingestion service to start the pipeline?

Chintan Shah
  • Spring Cloud Data Flow: http://cloud.spring.io/spring-cloud-dataflow/ ? – Artem Bilan Jun 06 '17 at 17:24
  • I am really new to this. If I use Spring Cloud Data Flow, will I be able to use the services individually – use one service independently of the Data Flow pipeline via, say, REST endpoints? – Chintan Shah Jun 07 '17 at 07:27

1 Answer


Typically you would build a service with REST endpoints (Spring Boot) to ingest the data. This service can then be deployed multiple times behind an API gateway (Zuul, Spring Cloud) that takes care of routing. This is the default Spring Cloud microservices setup. The ingestion service can then convert the data and produce it to RabbitMQ or Kafka. I recommend using Spring Cloud Stream for the interaction with the broker; it's an abstraction on top of RabbitMQ and Kafka that can be configured using starters/binders.
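To make the hand-off concrete, here is a minimal sketch of the async flow between the two services. An in-memory `BlockingQueue` stands in for the RabbitMQ/Kafka topic that a Spring Cloud Stream binder would provide in the real setup; all class and method names here are illustrative, not Spring APIs:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PipelineSketch {
    // Stand-in for the broker topic connecting ingestion and prediction.
    static final BlockingQueue<String> ingestTopic = new ArrayBlockingQueue<>(100);

    // Ingestion side: what the REST endpoint would do after reading from SQL.
    static void ingest(String rawRecord) {
        String converted = rawRecord.trim().toLowerCase(); // convert/clean the data
        ingestTopic.offer(converted);                      // publish; do not call the consumer directly
    }

    // Prediction side: consumes independently, at its own pace.
    static String predict() {
        String record = ingestTopic.poll();                // null if nothing ingested yet
        return "prediction-for:" + record;                 // placeholder for the model call
    }

    public static void main(String[] args) {
        ingest("  Sensor-42 READING  ");
        System.out.println(predict());
    }
}
```

The key property this models is decoupling: ingestion only publishes to the topic, so the prediction service can be scaled, restarted, or replaced without the ingestion service knowing about it.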

Spring Cloud Data Flow is a declarative approach to orchestrating your queues, and it also takes care of deployment on several cloud services/platforms. It can be used here as well, but might add extra complexity to your use case.
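For comparison, with Data Flow the pipeline would be declared as a stream definition in the SCDF shell. This is a config fragment only (it assumes a running SCDF server, and `ingest` and `predict` are hypothetical app names you would have registered beforehand):

```
stream create ml-pipeline --definition "ingest | predict" --deploy
```

The `|` in the DSL is exactly the queue-backed hand-off described above; SCDF provisions the RabbitMQ/Kafka binding between the two apps for you.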

Jeff