I'm building a system similar to pingdom.com, where I have around 10k domains whose uptime needs to be checked every 5 minutes. I'm using EC2 micro instances to perform the checks. The check URLs and their last check times are stored in MongoDB. A Node process picks the top n checks that haven't been processed within the last 5 minutes (roughly the query sketched after the snippet below) and then issues the URL requests asynchronously. I'm using the node request library, and my URL check code looks like the following:
var request = require("request");

var options = {
    url: url,
    headers: {
        'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
        'Accept-Encoding': 'gzip,deflate,sdch',
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.17 Safari/537.36'
    },
    timeout: 10000,
    maxRedirects: 10,
    pool: false,
    strictSSL: false
};

request(options, function (error, response, body) {
    ...
});
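
For context, the scheduling side looks roughly like this; the connection string, collection and field names, the batch size of 50, and the checkUrl helper are simplified placeholders for my real code:

var MongoClient = require("mongodb").MongoClient;

// Placeholder connection string and schema, simplified from the real setup
MongoClient.connect("mongodb://localhost/uptime", function (err, db) {
    if (err) throw err;

    // Pick the checks that haven't been processed within the last 5 minutes
    var fiveMinutesAgo = new Date(Date.now() - 5 * 60 * 1000);

    db.collection("checks")
        .find({ lastCheckedAt: { $lt: fiveMinutesAgo } })
        .sort({ lastCheckedAt: 1 })      // oldest checks first ("top n")
        .limit(50)
        .toArray(function (err, checks) {
            if (err) return console.error(err);
            checks.forEach(function (check) {
                checkUrl(check.url);     // placeholder for the request() code above
            });
        });
});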
Now I've noticed that when I make more than about 10 simultaneous requests to different domains, the measured response times for those domains grow as the number of simultaneous requests increases. I thought the response times shouldn't increase, since Node is asynchronous. I'm considering trying the node-curl library as well, but before that I want to confirm whether I'm doing anything wrong here.
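
To make the slowdown concrete, this is roughly how I time each check when a batch is fired at once; the urls array here stands in for one batch pulled from MongoDB:

var request = require("request");

// Placeholder batch; in the real system these come from MongoDB
var urls = [
    "http://example.com",
    "http://example.org"
    // ... more domains
];

urls.forEach(function (url) {
    var started = Date.now();
    request({ url: url, timeout: 10000, maxRedirects: 10, pool: false, strictSSL: false },
        function (error, response) {
            var elapsed = Date.now() - started;
            if (error) {
                console.log(url, "DOWN", error.code, elapsed + "ms");
            } else {
                console.log(url, "UP", response.statusCode, elapsed + "ms");
            }
        });
});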
I've tried tweaking ulimit and the request pool's maxSockets limit without any luck. I know that if I increase the number of EC2 instances I can achieve the 10k checks per 5 minutes with acceptable response times, but I assume services like Pingdom have many more checks to deal with, and I'm curious what they do to scale their systems apart from adding more uptime-check instances.
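
For completeness, the pool/agent tweaks I experimented with were along these lines (the value 100 is arbitrary; request documents the option as pool.maxSockets):

var http = require("http");
var https = require("https");
var request = require("request");

// Raise the global agent limits (older Node versions default to 5 sockets per host)
http.globalAgent.maxSockets = 100;
https.globalAgent.maxSockets = 100;

// ...or hand request a dedicated pool instead of pool: false
var options = {
    url: url,
    timeout: 10000,
    maxRedirects: 10,
    pool: { maxSockets: 100 },
    strictSSL: false
};

request(options, function (error, response, body) {
    // handle the result as before
});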