
I am trying to work through a solution where the workflow is like this:

  • User hits a microservice to upload images
  • That microservice de-duplicates the image and if it really is new, queues it up for processing
  • The processing chain lives in Spring Cloud Dataflow

The microservice already exists, and we are trying to extend it to do the fancy processing. My initial cut was to use the Http Source from the sample starter pack, since that would be something I didn't have to create. The problem is that the source doesn't register itself with the Spring Discovery server, so there is no way to get an endpoint without making gross assumptions (e.g., that it lives on the Dataflow server at port XYZ).

We could create a queue endpoint and send the data directly to a queue source that receives the outside event and forwards it to an SCDF queue.

What would be awesome is if DataFlow could connect the start of the queue for me, without repackaging the microservice as a Source.

The major issue with Spring Data Flow is that it does not automatically start up deployed streams when the server starts up, and we need to be reasonably sure that microservice is always up.

Berin Loritsch

2 Answers


The lifecycle of the server is decoupled from the apps it deploys; that was intentional.

I'm not following your thoughts on how Dataflow could connect the start of the queue, but from your description there are a few things you could do.

You would need to modify the app in order to have it registered with Eureka, but this is a very simple operation, no more than a few lines of code:

  1. Start from a stream-app perspective: go to https://start-scs.cfapps.io/, select the http source and your binder, then add the spring-cloud-netflix library and annotate the main boot class with @EnableDiscoveryClient.

  2. Start with http://start.spring.io: select Stream Rabbit or Stream Kafka, add the Web and Netflix libraries, then add the @EnableDiscoveryClient and @EnableBinding annotations and create a simple HTTP endpoint for your use case.
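The second option might look roughly like this. This is a minimal sketch, not the actual starter code: it assumes the Spring Cloud Stream binder and Eureka client starters are on the classpath, and the class name and /images endpoint are illustrative:

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.messaging.Source;
import org.springframework.messaging.support.MessageBuilder;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@EnableDiscoveryClient            // registers this app with the Eureka server
@EnableBinding(Source.class)      // binds the "output" channel via the configured binder
@RestController
public class HttpImageSourceApplication {

    @Autowired
    private Source source;

    // Illustrative endpoint: accepts an upload and forwards it onto the stream
    @PostMapping("/images")
    public void accept(@RequestBody byte[] image) {
        source.output().send(MessageBuilder.withPayload(image).build());
    }

    public static void main(String[] args) {
        SpringApplication.run(HttpImageSourceApplication.class, args);
    }
}
```

Because the app is a regular Boot application, discovery registration is just the one extra annotation; the rest is the standard Source binding.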

In either case it should be a small addition.

You can also open an issue at https://github.com/spring-cloud-stream-app-starters/http/issues suggesting that we add @EnableDiscoveryClient to the http source app; we can take that into consideration for our next iteration as well.

Vinicius Carvalho

I'll try to clarify a few bits.

upload images -> if it really is new -> queues it up for processing

Upon a new upload event, you'd want to process the image. Here's a similar use-case, but more of a real-time streaming style solution. This is not what you're looking to do, but I thought it might be useful.

Porting the image processing code to a Spring Cloud Stream application is as simple as adding @EnableBinding(Processor.class). It is the same business logic: whether you're running it separately or orchestrating it via SCDF, it is still a standalone microservice. However, SCDF expects it to be one of the Source, Processor, Sink, or Task application types. We will be opening this up to support arbitrary "functions" (lambdas) in a future release.
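As a rough sketch of that porting step (the class name is illustrative and the body is a stand-in for your actual image-processing logic):

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Processor;
import org.springframework.messaging.handler.annotation.SendTo;

@SpringBootApplication
@EnableBinding(Processor.class)   // declares the input and output channels of a processor
public class ImageProcessorApplication {

    // Receives a message from the input channel, returns the result to the output channel
    @StreamListener(Processor.INPUT)
    @SendTo(Processor.OUTPUT)
    public byte[] process(byte[] image) {
        // your existing image-processing business logic goes here, unchanged
        return image;
    }

    public static void main(String[] args) {
        SpringApplication.run(ImageProcessorApplication.class, args);
    }
}
```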

We can create a Queue endpoint and send the data directly a Queue source that receives the outside event and forwards it to an SCDF queue.

This is one of the standard solutions. You can directly consume new events (images) from a queue/topic and process them in the image-processor created in the previous step. The named-channel support in the DSL facilitates just that.
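For instance, in the SCDF shell a named destination (here `:images`, a hypothetical name) can feed a stream directly, so your upload microservice only needs to publish de-duplicated images to that destination:

```
stream create image-pipeline --definition ":images > image-processor | log"
stream deploy image-pipeline
```

The `:images` prefix tells SCDF to bind the stream's input to an existing queue/topic rather than to a registered source app.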

What would be awesome is if DataFlow could connect the start of the queue for me, without repackaging the microservice as a Source.

I'm not sure I understand this. If I were to assume, you're looking for "named-channel" as source and that is supported.

The major issue with Spring Data Flow is that it does not automatically start up deployed streams when the server starts up, and we need to be reasonably sure that microservice is always up.

The moment you deploy a stream in SCDF, all the individual steps included in the DSL (i.e., the stream definition) are resolved and deployed as standalone apps in the target runtime (Cloud Foundry, Kubernetes, etc.). Once deployed, lifecycle management is left to the platform where the apps run; SCDF does not retain or track the app states.

Sabby Anandan