
I have a RestService running on 45 different machines across three datacenters (15 in each). I have a client library that uses RestTemplate to call these machines depending on where the call is coming from: if the call originates in DC1, my library calls the RestService instances running in DC1, and similarly for the other datacenters.

My client library runs on different machines (not the same 45 machines) in the same three datacenters.

I am using RestTemplate with HttpComponentsClientHttpRequestFactory as shown below:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.http.client.config.RequestConfig;
import org.apache.http.config.SocketConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;
import org.springframework.http.client.ClientHttpRequestFactory;
import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
import org.springframework.web.client.RestTemplate;

public class DataProcess {

    private RestTemplate restTemplate = new RestTemplate();
    private ExecutorService service = Executors.newFixedThreadPool(15);

    // singleton class so only one instance
    public DataProcess() {
        restTemplate.setRequestFactory(clientHttpRequestFactory());
    }

    public DataResponse getData(DataKey key) {
        // do some stuff here which will internally call our RestService
        // by using DataKey object and using RestTemplate which I am making below
    }   

    private ClientHttpRequestFactory clientHttpRequestFactory() {
        HttpComponentsClientHttpRequestFactory requestFactory = new HttpComponentsClientHttpRequestFactory();
        RequestConfig requestConfig = RequestConfig.custom().setConnectionRequestTimeout(1000).setConnectTimeout(1000)
                .setSocketTimeout(1000).setStaleConnectionCheckEnabled(false).build();
        SocketConfig socketConfig = SocketConfig.custom().setSoKeepAlive(true).setTcpNoDelay(true).build();

        PoolingHttpClientConnectionManager poolingHttpClientConnectionManager = new PoolingHttpClientConnectionManager();
        poolingHttpClientConnectionManager.setMaxTotal(800);
        poolingHttpClientConnectionManager.setDefaultMaxPerRoute(700);

        CloseableHttpClient httpClient = HttpClientBuilder.create()
                .setConnectionManager(poolingHttpClientConnectionManager).setDefaultRequestConfig(requestConfig)
                .setDefaultSocketConfig(socketConfig).build();

        requestFactory.setHttpClient(httpClient);
        return requestFactory;
    }

}

And this is how people will call our library, by passing a DataKey object:

DataResponse response = DataClientFactory.getInstance().getData(dataKey);
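
For reference, here is a minimal sketch of what a DataClientFactory singleton wrapping DataProcess could look like; the factory class itself is not shown in the question, so treat this as an assumption about its shape rather than the actual implementation:

public class DataClientFactory {

    // single shared DataProcess instance, created once when the class is loaded
    private static final DataProcess INSTANCE = new DataProcess();

    private DataClientFactory() {
        // no instances of the factory itself
    }

    public static DataProcess getInstance() {
        return INSTANCE;
    }
}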

Now my question is:

How do I decide what to choose for setMaxTotal and setDefaultMaxPerRoute on the PoolingHttpClientConnectionManager? As of now I am going with 800 for setMaxTotal and 700 for setDefaultMaxPerRoute. Are those reasonable numbers, or should I go with something else?

My client library will be used under very heavy load in a multithreaded project.

john
  • Is the same RestService application running on those 45 machines? – André Jun 25 '15 at 18:58
  • Yes, the RestService is running on those 45 machines. But the client library will be running on different machines, and that library will then call these RestService machines in each datacenter. For example, if a call is coming from DC1, the client library will call a RestService machine in DC1. – john Jun 25 '15 at 19:01
  • Can you explicitly explain how this `if call is coming from DC1, then client library will call RestService machine in DC1` would happen? I fail to see how the PoolingHttpClientConnectionManager would handle the load balancing between all these 45 machines. – André Jun 25 '15 at 19:05
  • PoolingHttpClientConnectionManager is not handling that part. Given a UserId in the DataKey object, my library will figure out which machine to call in the same datacenter, whether the call is coming from DC1, DC2, or DC3. I am using PoolingHttpClientConnectionManager for better performance. – john Jun 25 '15 at 19:13
  • Ah, thanks for the clarification :) – André Jun 25 '15 at 19:18

3 Answers


There is no formula or recipe that one can apply to all scenarios. Generally, with blocking I/O one should have approximately the same max-per-route setting as the number of worker threads contending for connections.

So having 15 worker threads and a 700 connection limit makes little sense to me.
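
As a minimal sketch of that advice, assuming the 15-thread pool from the question is the only source of concurrent requests (the class and method names here are purely illustrative, and the numbers are an example rather than a universal recommendation):

import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class PoolSizingSketch {

    static PoolingHttpClientConnectionManager sizedForWorkers(int workerThreads) {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        // The pool never needs more connections than the threads that can use them concurrently.
        cm.setMaxTotal(workerThreads);
        // Each call goes to a single route, so let one route use the whole pool if needed.
        cm.setDefaultMaxPerRoute(workerThreads);
        return cm;
    }
}

With the fixed thread pool of 15 from the question, this gives maxTotal = maxPerRoute = 15.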

ok2c
  • Thanks for the help. Can you add some explanation of why I should not have a 700 connection limit? This will help me understand better. And what settings should I use then for both of those properties? – john Jul 02 '15 at 16:59
  • How many worker threads do you have? 15. How many connections can those workers use concurrently in the best-case scenario? Right, 15 and no more. – ok2c Jul 02 '15 at 17:29
  • Ok, do you think we even need `PoolingHttpClientConnectionManager` with `RestTemplate` here at all? Maybe we need the other things that I have, but I am guessing maybe not `PoolingHttpClientConnectionManager`. – john Jul 02 '15 at 17:41
  • Of course you do. One generally should be reusing connections. – ok2c Jul 02 '15 at 17:42
  • Ok, got it. One last thing: should I explicitly set `setSoKeepAlive(true)` on `SocketConfig` and `setExpectContinueEnabled(true)` on `RequestConfig`? – john Jul 02 '15 at 17:47
  • In both cases it depends on what you want to gain and are willing to trade. – ok2c Jul 02 '15 at 19:58
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/82231/discussion-between-david-and-oleg). – john Jul 02 '15 at 20:18
  • I think maxTotal should be the same as the number of worker threads, and maxPerRoute should be, say, maxTotal/N if we want to support up to N routes in the worst case. If we do not care about heavy traffic from one route affecting others, then N=1 and maxPerRoute = maxTotal = number of worker threads. – Tushar B Oct 11 '21 at 18:03

Apparently there is no definitive formula that fits this situation. The relationship between pool size, throughput, and response time is not that simple. One particular case that comes to mind is that beyond a certain value, the response time starts increasing as the pool size increases.

Generally, with blocking I/O one should have approximately the same max-per-route setting as the number of worker threads contending for connections.

Himanshu Mishra

Let us try to come up with a formula for computing the pool size.

R: average response time of an HTTP call, in milliseconds
Q: required throughput, in requests per second

In order to achieve Q, you will need approximately t = Q*R/1000 threads to process your requests. For all these threads not to contend for an HTTP connection, you should have at least t connections in the pool at any point in time.

Example: I have a web server which fetches a result and returns it as a response.

Q = 700 rps
R = 50 ms
t = 700 * 50 / 1000 = 35

So you would need at least 35 connections per HTTP route, and your total connections would be 35 * number of routes (3) = 105.

PS: This is a very simple formula; the actual relationship between pool size, throughput, and response time is not straightforward. One particular case that comes to mind is that beyond a certain value, the response time starts increasing as the pool size increases.
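
A small sketch of this arithmetic and the resulting pool configuration; Q, R, and the three routes are the example values above, while the class and method names are made up for illustration:

import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class ThroughputPoolSizing {

    // t = Q * R / 1000, rounded up so the pool is not undersized
    static int connectionsNeeded(int requestsPerSecond, int avgResponseMillis) {
        return (int) Math.ceil(requestsPerSecond * avgResponseMillis / 1000.0);
    }

    public static void main(String[] args) {
        int perRoute = connectionsNeeded(700, 50);   // 35
        int routes = 3;                              // one route per datacenter

        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        cm.setDefaultMaxPerRoute(perRoute);          // 35
        cm.setMaxTotal(perRoute * routes);           // 105
    }
}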

Amm Sokun