
How can I send HTTP requests in parallel with Spring WebFlux and Spring WebClient?

I have a very simple scenario. Step 1: while handling one incoming request, I make a first request to one external API.

Step 2: using the result of the request from step 1, I have to make N requests (let's say three) to external APIs.

Since the step 2 requests have no dependencies on each other, I would like them to be sent in parallel.

However, I do need to "wait" for all of the results, so that I can do a final computation.

How can I achieve this with Spring WebFlux and Spring WebClient, please?

I wrote the following, and I also contacted every external party I am sending requests to.

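import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

@RestController
public class QuestionController {

    private final WebClient webClient;

    // One shared client; every call below overrides the base URL via mutate().
    public QuestionController() {
        this.webClient = WebClient.create("http://some-service:8111/getPriceForFoo");
    }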

    @GetMapping("/question")
    public Mono<String> question(@RequestParam(value = "foo", required = false) String foo) {
        //Step 1 - get the price through an API to some service
        Mono<String> priceMustBeComputedFirst = firstComputePriceBySendingRequest(foo).cache();

        // this step can be computed in parallel with companyB and companyC
        Mono<String> interestToBuyFromCompanyA = areYouInterestedToBuyCompanyA(priceMustBeComputedFirst);
        // this step can be computed in parallel with companyA and companyC
        Mono<String> interestToBuyFromCompanyB = areYouInterestedToBuyCompanyB(priceMustBeComputedFirst);
        // this step can be computed in parallel with companyA and companyB
        Mono<String> interestToBuyFromCompanyC = areYouInterestedToBuyCompanyC(priceMustBeComputedFirst);

        return Mono.zip(interestToBuyFromCompanyA, interestToBuyFromCompanyB, interestToBuyFromCompanyC)
                .map(tuple3 -> computeSomethingAtTheEnd(tuple3.getT1(), tuple3.getT2(), tuple3.getT3()));
    }

    private Mono<String> firstComputePriceBySendingRequest(String foo) {
        return webClient.mutate().baseUrl("http://some-service:8111/getPriceForFoo").build()
                .get().uri("/{foo}", foo)
                .retrieve().bodyToMono(String.class);
    }

    private Mono<String> areYouInterestedToBuyCompanyA(Mono<String> priceMustBeComputedFirst) {
        return priceMustBeComputedFirst.flatMap(product ->
                webClient.mutate().baseUrl("http://companyA/areYouInterestedToBuy").build()
                        .get().uri("/{product}", product)
                        .retrieve().bodyToMono(String.class));
    }

    private Mono<String> areYouInterestedToBuyCompanyB(Mono<String> priceMustBeComputedFirst) {
        return priceMustBeComputedFirst.flatMap(product ->
                webClient.mutate().baseUrl("http://companyB/areYouInterestedToBuy").build()
                        .get().uri("/{product}", product)
                        .retrieve().bodyToMono(String.class));
    }

    private Mono<String> areYouInterestedToBuyCompanyC(Mono<String> priceMustBeComputedFirst) {
        return priceMustBeComputedFirst.flatMap(product ->
                webClient.mutate().baseUrl("http://companyC/areYouInterestedToBuy").build()
                        .get().uri("/{product}", product)
                        .retrieve().bodyToMono(String.class));
    }

    private String computeSomethingAtTheEnd(String t1, String t2, String t3) {
        return "some computation based on each company's response";
    }

}
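
As a side note on the `.cache()` in step 1: each of the three step 2 Monos subscribes to the price Mono separately, so without caching the price request would be repeated per subscriber. The sketch below is only an illustration of that behaviour, not part of the controller. `CacheSketch`, `priceCalls` and the fake price value are hypothetical names, and a counter stands in for the real HTTP call.

```java
import java.util.concurrent.atomic.AtomicInteger;
import reactor.core.publisher.Mono;

public class CacheSketch {

    // Returns how many times the (fake) step 1 call ran for three step 2 subscribers.
    static int priceCalls(boolean useCache) {
        AtomicInteger calls = new AtomicInteger();

        // Hypothetical stand-in for firstComputePriceBySendingRequest(foo).
        Mono<String> price = Mono.fromCallable(() -> {
            calls.incrementAndGet();
            return "42";
        });
        Mono<String> shared = useCache ? price.cache() : price;

        // Three dependents, mirroring the three areYouInterestedToBuy... methods.
        Mono<String> a = shared.map(p -> "A saw " + p);
        Mono<String> b = shared.map(p -> "B saw " + p);
        Mono<String> c = shared.map(p -> "C saw " + p);

        Mono.zip(a, b, c).block();
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("without cache: " + priceCalls(false)); // 3
        System.out.println("with cache:    " + priceCalls(true));  // 1
    }
}
```

With the cached Mono, the first subscriber triggers the call and the other two replay its result.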

For some reason, what I am currently doing seems to be sequential. I got the timestamps from each company, and indeed, company A receives the request before company B, company B before company C, and so on. I believe it is very sequential.

I also added logs on my end, and it seems it is also sequential.

Is this code indeed sequential?

Is it because of the Mono.zip?

How can I achieve this parallelism, please?
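
One way to check whether `Mono.zip` itself serializes the calls is to take the network out of the picture. The sketch below is a minimal experiment, assuming reactor-core is on the classpath. `ZipOverlapSketch`, `fakeCall` and the 400 ms delay are hypothetical stand-ins for the real company requests. It records when each source is subscribed to and when it emits, then measures the total time.

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import reactor.core.publisher.Mono;

public class ZipOverlapSketch {

    // Hypothetical stand-in for one external call: emits `name` after `millis` ms
    // and records when it was subscribed to and when it produced its value.
    static Mono<String> fakeCall(String name, long millis, List<String> events) {
        return Mono.delay(Duration.ofMillis(millis))
                .map(tick -> name)
                .doOnSubscribe(s -> events.add("start " + name))
                .doOnNext(v -> events.add("end " + name));
    }

    // Zips three 400 ms calls and returns the total wall-clock time in ms.
    static long elapsedMillis(List<String> events) {
        long begin = System.nanoTime();
        Mono.zip(fakeCall("A", 400, events),
                 fakeCall("B", 400, events),
                 fakeCall("C", 400, events))
            .block();
        return (System.nanoTime() - begin) / 1_000_000;
    }

    public static void main(String[] args) {
        List<String> events = new CopyOnWriteArrayList<>();
        long elapsed = elapsedMillis(events);
        System.out.println(events);
        System.out.println("elapsed ms: " + elapsed);
    }
}
```

If all three "start" events are recorded before any "end" event and the total is close to 400 ms rather than 1200 ms, the sources overlap. They are still started one after another, in argument order, but none waits for another to finish. That distinction matters when reading timestamps: "A is contacted slightly before B" is not the same as "B waits for A".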

PatPanda
  • Your thinking is right. Based on your code these are executed concurrently. Couple things to consider: 1. The price api is called three times which I think is not your intention. Consider using cache operator on that Mono so it is called only once. 2. Don't create a new WebClient for each request. Use a single instance created in the constructor. – Martin Tarjányi Nov 09 '20 at 16:58
  • If you do all of the above, you might get better results. Although, if the APIs you call concurrently are very fast, then you might not benefit as much and they might seem sequential. – Martin Tarjányi Nov 09 '20 at 17:01
  • Hello Martin, you are correct. I did use cache per your advice, and also created one single WebClient which I mutate. Thanks! However, I am still seeing a very sequential execution :'( – PatPanda Nov 09 '20 at 18:58
  • How much is the difference between the 'sequential' executions? I mean if they don't arrive at the exact same millisecond that doesn't mean they are sequential. – Martin Tarjányi Nov 09 '20 at 19:13
  • Also, by mutating the WebClient you still create a new WebClient for each request. You should either create as many WebClients in the controller as there are APIs you call, or create a single instance without a base URL and set the URL at request time without mutating the client. – Martin Tarjányi Nov 09 '20 at 19:18
  • Very good question. Actually, I pasted 3 in this example, but in reality I am sending to some 50ish. For all 50ish external services, if I compare timestamps from both the client (me) and the server (them), the time differences are within hundreds of milliseconds, and for all 50ish calls it is one after another. Over thousands of executions, I cannot be so "unlucky" as to see the same order on both sides! – PatPanda Nov 09 '20 at 19:20
  • The calls are indeed initiated sequentially but they don't wait for each other to be finished before the next one would be started and the difference should be negligible. Is this really a concern in your domain? – Martin Tarjányi Nov 09 '20 at 19:25
  • Yes Martin. For instance, before the WebFlux migration, I used to construct a thread pool and fire requests from it. I did still "wait" for all 50ish external services to respond, but "true parallelism" was seen! It is especially needed because, in terms of the business flow, all external parties should be asked the question at the same time, so that company A does not get the chance to be asked before company Z. Spring WebFlux offers a very nice reactive approach that we want to stick with. I would just like to achieve this parallelism, which I do not know how to do. – PatPanda Nov 09 '20 at 19:34
  • Makes sense. Can you check my comment above about not mutating webclient but using a single instance without base URL? – Martin Tarjányi Nov 09 '20 at 19:42
  • One other idea: instead of `priceMustBeComputedFirst.flatMap` you can try `priceMustBeComputedFirst.publishOn(Schedulers.parallel()).flatMap`. I don't expect a huge difference by this but it is worth a try. – Martin Tarjányi Nov 09 '20 at 19:48
  • Giving this a go with all your advice. Will update here with my findings! Many thanks again Martin – PatPanda Nov 09 '20 at 19:51
  • No problem. Looking forward to the results! – Martin Tarjányi Nov 09 '20 at 19:51
  • did you have a chance to check the improvement? – Martin Tarjányi Nov 11 '20 at 18:05
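
The suggestion in the comments to use a single WebClient without a base URL and to pass the full URL at request time could look roughly like the sketch below. It is only an illustration under assumptions. `SharedClientSketch`, `ask` and `askAll` are hypothetical names, the company URLs are the placeholders from the question, and `flatMapSequential` is my choice for fanning out over a list (it subscribes to the inner Monos eagerly and keeps the results in input order), not something the thread prescribes.

```java
import java.util.List;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class SharedClientSketch {

    // One shared instance, built once (for example with WebClient.create()), no base URL.
    private final WebClient webClient;

    public SharedClientSketch(WebClient webClient) {
        this.webClient = webClient;
    }

    // Asks one company; the absolute URL is supplied per request, so no mutate() is needed.
    Mono<String> ask(String companyUrl, String product) {
        return webClient.get()
                .uri(companyUrl + "/{product}", product)
                .retrieve()
                .bodyToMono(String.class);
    }

    // Asks every company and collects the answers once all of them have responded.
    Mono<List<String>> askAll(List<String> companyUrls, String product) {
        return Flux.fromIterable(companyUrls)
                .flatMapSequential(url -> ask(url, product))
                .collectList();
    }
}
```

In the controller this would be chained after step 1, for example `priceMustBeComputedFirst.flatMap(price -> askAll(urls, price))`. Note that the requests are still started one after another on the subscribing thread, so this removes the per-call client construction but does not by itself guarantee that every company is contacted at the same instant.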
