I have to check thousands of proxy servers continuously.
To speed this up, I am thinking of creating batches of size N (say 50) and sending the requests in each batch concurrently. Each proxy server has a unique IP/port and its own username/password authentication.
Since I am checking proxies, I will configure each request to use the given proxy, send it to the target site, and measure the response.
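The batching idea could be sketched roughly as follows, using only the JDK. All names here (`ProxyBatchSketch`, `ProxyTarget`, `CheckResult`, `sender`) are hypothetical and not part of the Apache client API. The sketch assumes the actual request is hidden behind a `sender` function that returns immediately with a `CompletableFuture` of the HTTP status code, and that a check counts as passed only on status 200.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical sketch: batch the proxies, fire one batch without blocking, time each check.
final class ProxyBatchSketch {

    /** One proxy to test: its address plus its own credentials. */
    static final class ProxyTarget {
        final String host;
        final int port;
        final String username;
        final String password;

        ProxyTarget(String host, int port, String username, String password) {
            this.host = host;
            this.port = port;
            this.username = username;
            this.password = password;
        }
    }

    /** Outcome of one check: pass/fail and the elapsed time in milliseconds. */
    static final class CheckResult {
        final ProxyTarget proxy;
        final boolean ok;
        final long elapsedMillis;

        CheckResult(ProxyTarget proxy, boolean ok, long elapsedMillis) {
            this.proxy = proxy;
            this.ok = ok;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Splits the full list into consecutive batches of at most batchSize items. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            int end = Math.min(i + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    /**
     * Starts every check in the batch first, then waits for all of them.
     * Assumption: sender does not block and completes its future with the status code.
     */
    static List<CheckResult> checkBatch(
            List<ProxyTarget> batch,
            Function<ProxyTarget, CompletableFuture<Integer>> sender) {
        List<CompletableFuture<CheckResult>> pending = new ArrayList<>();
        for (ProxyTarget proxy : batch) {
            long start = System.nanoTime();
            pending.add(sender.apply(proxy)
                    // Assumed pass criterion: HTTP 200 from the target site.
                    .thenApply(status -> new CheckResult(proxy, status == 200, elapsedMillis(start)))
                    // Any failure (timeout, refused connection, auth error) counts as a failed check.
                    .exceptionally(err -> new CheckResult(proxy, false, elapsedMillis(start))));
        }
        List<CheckResult> results = new ArrayList<>();
        for (CompletableFuture<CheckResult> future : pending) {
            results.add(future.join());
        }
        return results;
    }

    private static long elapsedMillis(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000;
    }
}
```

Here `sender` would wrap whatever asynchronous client call ends up being used, for example by completing a `CompletableFuture` from the client's callback. That adapter is not shown, and how credentials are supplied per proxy is left open.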
Here is an example of using a proxy with authentication, taken from the Apache client docs:
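    import java.util.concurrent.Future;

    import org.apache.http.HttpHost;
    import org.apache.http.HttpResponse;
    import org.apache.http.auth.AuthScope;
    import org.apache.http.auth.UsernamePasswordCredentials;
    import org.apache.http.client.CredentialsProvider;
    import org.apache.http.client.config.RequestConfig;
    import org.apache.http.client.methods.HttpGet;
    import org.apache.http.impl.client.BasicCredentialsProvider;
    import org.apache.http.impl.nio.client.CloseableHttpAsyncClient;
    import org.apache.http.impl.nio.client.HttpAsyncClients;

    public static void main(String[] args) throws Exception {
        // The proxy credentials are registered on the client.
        CredentialsProvider credsProvider = new BasicCredentialsProvider();
        credsProvider.setCredentials(
                new AuthScope("localhost", 8889),
                new UsernamePasswordCredentials("squid", "nopassword"));
        CloseableHttpAsyncClient httpclient = HttpAsyncClients.custom()
                .setDefaultCredentialsProvider(credsProvider)
                .build();
        try {
            httpclient.start();
            // The proxy itself is set per request through RequestConfig.
            HttpHost proxy = new HttpHost("localhost", 8889);
            RequestConfig config = RequestConfig.custom()
                    .setProxy(proxy)
                    .build();
            HttpGet httpget = new HttpGet("https://httpbin.org/");
            httpget.setConfig(config);
            Future<HttpResponse> future = httpclient.execute(httpget, null);
            HttpResponse response = future.get();
            System.out.println("Response: " + response.getStatusLine());
            System.out.println("Shutting down");
        } finally {
            httpclient.close();
        }
    }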
As you can see, with an authenticated proxy the credentials have to be provided on the client itself. This means that if I am checking 50 proxy servers concurrently, I have to create a new client for each of them, which in turn means the requests will not really be concurrent, and I might as well use a multi-threaded solution.
The issue is that with multithreading I will put excessive load on the server, since most of the threads will just block on I/O. Concurrent non-blocking I/O is a much better fit for this kind of problem.
How can I check multiple authenticated proxy servers concurrently if I have to create a client for each of them?