
I am building a Spring Cloud-based microservice ML pipeline. I have a data ingestion service that (currently) takes in data from SQL; this data needs to be used by the prediction service.

The general consensus is that writes should use asynchronous message-based communication via Kafka/RabbitMQ.

What I am not sure about is how to orchestrate these services.

Should I use an API gateway that invokes the ingestion service to start the pipeline?

Chintan Shah
  • Spring Cloud Data Flow: http://cloud.spring.io/spring-cloud-dataflow/ ? – Artem Bilan Jun 06 '17 at 17:24
  • I am really new to this. If I use Spring Cloud Data Flow, will I be able to use the services individually – use one service independently of the Data Flow pipeline via, say, REST endpoints? – Chintan Shah Jun 07 '17 at 07:27

1 Answer


Typically you would build a service with REST endpoints (Spring Boot) to ingest the data. This service can then be deployed multiple times behind an API gateway (Zuul, Spring Cloud) that takes care of routing. This is the default Spring Cloud microservices setup. The ingestion service can then convert the data and produce it to RabbitMQ or Kafka. I recommend using Spring Cloud Stream for the interaction with the broker; it's an abstraction on top of RabbitMQ and Kafka that can be configured using starters/binders.
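To make the hand-off concrete, here is a minimal sketch of the async flow between the two services. An in-memory `BlockingQueue` stands in for the RabbitMQ/Kafka topic that a Spring Cloud Stream binder would provide in the real setup; all class and method names here are illustrative, not Spring APIs:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PipelineSketch {
    // Stand-in for the broker topic connecting ingestion and prediction.
    static final BlockingQueue<String> ingestTopic = new ArrayBlockingQueue<>(100);

    // Ingestion side: what the REST endpoint would do after reading from SQL.
    static void ingest(String rawRecord) {
        String converted = rawRecord.trim().toLowerCase(); // convert/clean the data
        ingestTopic.offer(converted);                      // publish; do not call the consumer directly
    }

    // Prediction side: consumes independently, at its own pace.
    static String predict() {
        String record = ingestTopic.poll();                // null if nothing ingested yet
        return "prediction-for:" + record;                 // placeholder for the model call
    }

    public static void main(String[] args) {
        ingest("  Sensor-42 READING  ");
        System.out.println(predict());
    }
}
```

The key property this models is decoupling: ingestion only publishes to the topic, so the prediction service can be scaled, restarted, or replaced without the ingestion service knowing about it.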

Spring Cloud Data Flow is a declarative approach to orchestrating your queues, and it also takes care of deployment on several cloud services/platforms. It can be used here as well, but might add extra complexity to your use case.
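For comparison, with Data Flow the pipeline would be declared as a stream definition in the SCDF shell. This is a config fragment only (it assumes a running SCDF server, and `ingest` and `predict` are hypothetical app names you would have registered beforehand):

```
stream create ml-pipeline --definition "ingest | predict" --deploy
```

The `|` in the DSL is exactly the queue-backed hand-off described above; SCDF provisions the RabbitMQ/Kafka binding between the two apps for you.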

Jeff