I have a very simple Node.js application that accepts JSON data (approx. 1 KB) via the POST request body. The response is sent back to the client immediately, and the JSON is then posted asynchronously to an Apache Kafka topic. The number of simultaneous requests can go as high as 10,000 per second, which we are simulating with Apache JMeter running on three different machines. The target is an average response time of under one second with no failed requests.
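The handler is roughly equivalent to this sketch (simplified; Express and the kafkajs client are stand-ins for our actual setup, and the `events` topic name is just a placeholder):

```js
const express = require('express');
const { Kafka } = require('kafkajs');

const app = express();
app.use(express.json()); // payloads are ~1 KB of JSON

const kafka = new Kafka({ brokers: ['localhost:9092'] });
const producer = kafka.producer();

app.post('/events', (req, res) => {
  // Respond to the client immediately...
  res.sendStatus(200);

  // ...then push the JSON to Kafka asynchronously, without blocking the response.
  producer
    .send({ topic: 'events', messages: [{ value: JSON.stringify(req.body) }] })
    .catch((err) => console.error('Kafka produce failed', err));
});

producer.connect().then(() => app.listen(3000));
```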
On a 4-core machine, the app handles up to 4,015 requests per second without any failures. However, since the target is 10,000 requests per second, we deployed the Node app in a clustered environment.
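The single-machine clustering is essentially the standard cluster-module setup, something like this sketch (simplified; `./app` stands in for the server shown above):

```js
const cluster = require('cluster');
const os = require('os');

if (cluster.isMaster) {
  // Fork one worker per CPU core; the master process only supervises.
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
  // Replace any worker that dies.
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} died, restarting`);
    cluster.fork();
  });
} else {
  // Each worker runs the same HTTP server and shares the listening port.
  require('./app');
}
```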
We implemented both clustering on the same machine and clustering across two different machines (as described here). Nginx was used as a load balancer to round-robin the incoming requests between the two Node instances. We expected a significant improvement in throughput (as documented here), but the results were the opposite: the number of successful requests dropped to around 3,100 per second.
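The Nginx configuration is roughly the following (simplified; the upstream addresses are placeholders):

```nginx
upstream node_app {
    # Default load-balancing method is round robin.
    server 192.168.1.10:3000;
    server 192.168.1.11:3000;
}

server {
    listen 80;

    location / {
        proxy_pass http://node_app;
    }
}
```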
My questions are:
- What could have gone wrong in the clustered approach?
- Is this even the right way to increase the throughput of a Node application?
- We also ran a similar exercise with a Java web application in a Tomcat container, and it performed as expected: about 4,000 requests per second with a single instance and around 5,000 successful requests per second in a two-instance cluster. This contradicts our assumption that Node.js performs better than Tomcat. Is Tomcat generally better because of its thread-per-request model?
Thanks a lot in advance.