
I am doing a quick performance test of Node.js vs. Java. The simple use case chosen is querying a single table in a MySQL database. The initial results were as follows:

Platform                      | DB Connections | CPU Usage | Memory Usage  | Requests/second
==============================|================|===========|===============|================
Node 0.10/MySQL               | 20             |  34%      |  57M          | 1295
JBoss EAP 6.2/JPA             | 20             | 100%      | 525M          | 4622
Spring 3.2.6/JDBC/Tomcat 7.0  | 20             | 100%      | 860M          | 4275

Note that Node's CPU and memory usage are far lower than Java's, but its throughput is also only about a third! Then I realized that Java was utilizing all four cores of my CPU, whereas Node was running on only one. So I changed the Node code to use the cluster module, and it now utilizes all four cores. Here are the new results:

Platform                      | DB Connections | CPU Usage | Memory Usage  | Requests/second
==============================|================|===========|===============|================
Node 0.10/MySQL (quad core)   | 20 (5 x 4)     | 100%      | 228M (57 x 4) | 2213

Note that the CPU and memory usage have now gone up proportionately, but the throughput has only gone up by about 70%. I was expecting a fourfold increase, exceeding the Java throughput. How can I account for the discrepancy? What can I do to make the throughput scale linearly?
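For reference, the scaling figures can be checked with a few lines of arithmetic. This is only a sketch using the requests/second numbers from the two tables above; the variable names are illustrative.

```javascript
// Requests/second taken from the two tables above.
var singleCore = 1295;
var fourCores = 2213;
var workers = 4;

// Speedup: how many times faster the clustered run is.
var speedup = fourCores / singleCore;

// Efficiency: fraction of ideal linear scaling actually achieved.
var efficiency = speedup / workers;

console.log('speedup:    ' + speedup.toFixed(2) + 'x');
console.log('efficiency: ' + (efficiency * 100).toFixed(0) + '%');
```

That works out to roughly a 1.71x speedup, or about 43% of the ideal fourfold scaling.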

Here's the code for utilizing multiple cores:

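var Cluster = require("cluster");
var Express = require("express");
var Http = require("http");
// OrderResource and enableCORS are the application's own modules (not shown)

if (Cluster.isMaster) {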
    var numCPUs = require("os").cpus().length;
    for (var i = 0; i < numCPUs; i++) {
        Cluster.fork();
    }

    Cluster.on("exit", function(worker, code, signal) {
        Cluster.fork();
    });
}
else {
    // Create an express app
    var app = Express();
    app.use(Express.json());
    app.use(enableCORS);
    app.use(Express.urlencoded());

    // Add routes

    // GET /orders
    app.get('/orders', OrderResource.findAll);

    // Create an http server and give it the
    // express app to handle http requests
    var server = Http.createServer(app);
    server.listen(8080, function() {
        console.log('Listening on port 8080');
    });
}

I am using the node-mysql driver for querying the database. The connection pool is set to 5 connections per core; however, that makes no difference. If I set this number to 1 or 20, I get approximately the same throughput!

var Mysql = require('mysql');

var pool = Mysql.createPool({
    host: 'localhost',
    user: 'bfoms_javaee',
    password: 'bfoms_javaee',
    database: 'bfoms_javaee',
    connectionLimit: 5
});

exports.findAll = function(req, res) {
    pool.query('SELECT * FROM orders WHERE symbol="GOOG"', function(err, rows, fields) {
        if (err) throw err;
        res.send(rows);
    });
};
Naresh
  • You could try `NODE_ENV=production` https://groups.google.com/forum/#!topic/express-js/fqtr1Carr0E – KeepCalmAndCarryOn Jan 20 '14 at 22:04
  • Also, are you pooling connections correctly? this is the suggested way `var mysql = require('mysql'); var pool = mysql.createPool(...); pool.getConnection(function(err, connection) { // Use the connection connection.query( 'SELECT something FROM sometable', function(err, rows) { // And done with the connection. connection.release(); // Don't use the connection here, it has been returned to the pool. }); });` https://github.com/felixge/node-mysql – KeepCalmAndCarryOn Jan 20 '14 at 22:18
  • Yes, I am using the pool correctly. The code I am showing is simply a short cut of what you have (I have tried it both ways). Clarified this extensively on this node-mysql issue: https://github.com/felixge/node-mysql/issues/712. – Naresh Jan 20 '14 at 23:26
  • Setting NODE_ENV=production makes no difference - all numbers remain the same. – Naresh Jan 20 '14 at 23:46
  • I'd be interested in seeing if you've found a way to improve performance. Please keep us updated. – u84six May 29 '14 at 23:01
  • Did you manage to increase the performance fourfold? – Jas May 03 '16 at 17:58

2 Answers

From what I can see, you aren't just comparing platforms but also frameworks. You probably want to remove the framework effect and implement a plain HTTP server; for instance, every middleware in the Express app adds to the latency. Also, did you make sure the Java libraries aren't caching frequently requested data? That would significantly improve their performance.
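A minimal sketch of what a framework-free endpoint might look like, using only Node's built-in http module. The names createPlainServer and findOrders are hypothetical: findOrders stands in for the real database call and is assumed to call back with (err, rows), the way pool.query does.

```javascript
var http = require('http');

// findOrders is a hypothetical stand-in for the real database call;
// it must call back with (err, rows), like pool.query does.
function createPlainServer(findOrders) {
    return http.createServer(function (req, res) {
        // Only GET /orders is served; everything else is a 404.
        if (req.method !== 'GET' || req.url !== '/orders') {
            res.writeHead(404);
            return res.end();
        }
        findOrders(function (err, rows) {
            if (err) {
                res.writeHead(500);
                return res.end();
            }
            var body = JSON.stringify(rows);
            res.writeHead(200, {
                'Content-Type': 'application/json',
                'Content-Length': Buffer.byteLength(body)
            });
            res.end(body);
        });
    });
}
```

In each worker this could replace the Express app, for example by passing a function that runs the existing pool.query call and then calling .listen(8080) on the returned server. Benchmarking it next to the Express version should show how much of the gap is framework overhead.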

Another thing to consider is that the built-in http module in Node maintains an internal connection pool via the Agent class (not to be confused with the MySQL connection pool) so that it can utilize HTTP keep-alives. This helps when you make many outgoing HTTP requests to the same server: instead of opening a TCP connection, making an HTTP request, getting a response, closing the TCP connection, and repeating, the TCP connections can be reused. Note that this pool only governs outgoing requests made through the http module; node-mysql talks to MySQL over its own TCP sockets, so the Agent should not limit it directly.

By default in Node 0.10, the HTTP Agent opens at most 5 simultaneous connections to a single host. If your app makes outgoing HTTP requests, you can raise the limit as follows:

var http = require('http');
http.globalAgent.maxSockets = 20;

Considering these changes, see what improvement you can get.

Another idea is to verify that the MySQL connection pool is used properly by checking the MySQL logs for when connections get opened and closed. If they get opened often, you may need to increase the idle timeout value in node-mysql.
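As a complement to reading the MySQL logs, the same check can be sketched from the Node side. The node-mysql README documents a 'connection' event that the pool emits when it opens a new connection; the sketch below assumes the installed driver version emits it. trackNewConnections is a hypothetical helper name.

```javascript
// Attach to a node-mysql pool (or any emitter of 'connection' events)
// and count how many physical connections it opens over time.
function trackNewConnections(pool) {
    var stats = { opened: 0 };
    pool.on('connection', function () {
        stats.opened += 1;
    });
    return stats;
}
```

Call it once right after Mysql.createPool(...) and log stats.opened periodically during the benchmark. If the count stays at or below connectionLimit, the pool is reusing connections. If it keeps climbing under steady load, connections are being re-created.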

esengineer
  • Thanks for the ideas. I am currently focused on other things but have a todo to try these suggestions out. – Naresh Jul 18 '14 at 12:44

Try setting the environment variable NODE_CLUSTER_SCHED_POLICY to "rr" (export NODE_CLUSTER_SCHED_POLICY="rr"), as per this blog post.
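The same policy can also be selected in code rather than through the environment. This is a sketch based on the cluster.schedulingPolicy property and cluster.SCHED_RR constant documented for Node 0.12 and later; whether a particular 0.10 build exposes them is an assumption to verify.

```javascript
var cluster = require('cluster');

// Select round-robin distribution of incoming connections.
// This must be set before the first fork(); it has no effect afterwards.
cluster.schedulingPolicy = cluster.SCHED_RR;

console.log('round-robin enabled:',
    cluster.schedulingPolicy === cluster.SCHED_RR);
```

Place the assignment at the top of the master branch, before the loop that calls fork().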

Peter Lyons
  • That article is talking about NodeJS version 0.12 which is not yet available at http://nodejs.org/. Also it says that the round-robin algorithm does not affect performance on Windows, which is where I am doing my testing. – Naresh Jan 21 '14 at 01:41
  • Yes, but that does actually apply to recent versions of v0.10 as well. However, it is compensating for an aspect of the Linux kernel's scheduler in particular, so no, it is not necessary on Windows. – Peter Lyons Jan 21 '14 at 03:50