How to properly use a database when scaling a Node.js app?

Question

I am wondering how to properly use MySQL when scaling my Node.js app with the cluster module. So far, I've only come up with two solutions:

Solution 1:

Create a database connection on every "worker".

Solution 2:

Have the database connection on the master process, and whenever one of the workers requests some data, the master process returns it. However, with this solution, I do not know how the worker would retrieve the data from the master process.

I think I made a "hacky" workaround: the worker sends a message tagged with a unique number, then waits for the master process to send a message back, with that unique number used as the event name.

If you don't understand what I mean by this, here's some code:

// Worker process

return new Promise (function (resolve, reject) {
    process.send({
        // Other data here
        identifier: <unique number>
    })

    // having a custom event emitter on the worker

    worker.once(<unique number>, function (data) {
        // data being the data for the request with the unique number

        // resolving the promise with returned data
        resolve(data)
    })
})

//////////////////////////

// Master process

// Custom event emitter on the master process

master.on(<eventName>, function (data) {
    // logic

    // Sending data back to worker
    master.send(<other args>, data.identifier)
})

What would be the best approach to this problem?

Thank you for reading.

I asked someone on Discord and he said the following: "Definitely have db access on main thread, which will control the data flow, or else you'll have fun data races and other undefined behaviour" — Martin, Oct 20 '18 at 18:29

score 0 · Answer 1 · answered Oct 20 '18 at 19:18

When you cluster in NodeJS, you should assume each process is completely independent. You really shouldn't be relaying messages like this to/from the master process. If you need multiple threads to access the same data, I don't think NodeJS is what you should be using. However, if you're just doing basic CRUD operations with your database, clustering (solution 1) is certainly the way to go.

For example, if you're trying to scale write ops to your database (assuming your database is properly scaled), each write op is independent of the others. When you cluster, a single write request is load balanced to one of your workers. Then, in the worker, you delegate the write op to your database asynchronously. In this scenario, the master process does not need to be involved in database access at all.
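
A minimal sketch of solution 1, under stated assumptions: `runClustered`, `perWorkerLimit`, and the `createPool(limit)` callback are hypothetical names, not part of any library. The pool is injected by the caller (for instance a thin wrapper around the `mysql` package's `createPool({ connectionLimit })`) and is assumed to expose a callback-style `query(sql, callback)`. Splitting the connection budget matters because every worker opens its own pool, so the total is roughly the worker count times the per-worker limit.

```javascript
const cluster = require('node:cluster');
const http = require('node:http');

// Hypothetical helper: divide a total connection budget between workers.
// Returns at least 1, so the budget is exceeded if there are more workers
// than allowed connections.
function perWorkerLimit(totalConnections, workerCount) {
  if (!Number.isInteger(workerCount) || workerCount < 1) {
    throw new RangeError('workerCount must be a positive integer');
  }
  return Math.max(1, Math.floor(totalConnections / workerCount));
}

// Hypothetical entry point: the primary only forks, each worker owns a pool.
function runClustered({ createPool, workerCount, totalConnections, port }) {
  if (cluster.isPrimary) { // `isPrimary` needs Node 16+; older versions use `isMaster`
    for (let i = 0; i < workerCount; i++) cluster.fork();
    return;
  }

  // Nothing is shared between processes: this pool belongs to this worker.
  const pool = createPool(perWorkerLimit(totalConnections, workerCount));

  http.createServer((req, res) => {
    // Assumed pool API: query(sql, callback), as in the `mysql` package.
    pool.query('SELECT 1 AS ok', (err, rows) => {
      res.statusCode = err ? 500 : 200;
      res.end(err ? 'database error' : JSON.stringify(rows));
    });
  }).listen(port);
}
```

Calling `runClustered({ createPool, workerCount: 4, totalConnections: 100, port: 3000 })` from the entry script would fork four workers, each with a pool capped at 25 connections; the figures are illustrative only.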

score 0 · Answer 2 · answered Oct 20 '18 at 19:23

If you're not planning to use a proper microservice architecture where each process would actually have its own database (or perhaps just in-memory storage), your best bet IMO is to use a connection pool created by the main process and have each child request a connection out of that pool. That's probably the safest approach to avoid issues in the neighborhood of thread-safety errors.
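
A rough sketch of how a child could ask the main process to run a query on its pool, using the unique-identifier idea from the question. Since IPC messages are serialized, the sketch assumes the worker sends the query and receives the rows back rather than receiving a connection object. `createRequester`, `attachResponder`, the message shape `{ id, payload, result, error }`, and the timeout are all hypothetical choices, not an established API.

```javascript
const { randomUUID } = require('node:crypto');

// Worker side. `channel` is anything with send(msg) and on('message', fn),
// for example `process` inside a worker created by cluster.fork().
function createRequester(channel, timeoutMs = 5000) {
  const pending = new Map(); // id -> { resolve, reject, timer }

  // One shared listener instead of one listener per request.
  channel.on('message', (msg) => {
    const entry = msg && pending.get(msg.id);
    if (!entry) return; // not a reply to one of our requests
    pending.delete(msg.id);
    clearTimeout(entry.timer);
    if (msg.error) entry.reject(new Error(msg.error));
    else entry.resolve(msg.result);
  });

  return function request(payload) {
    return new Promise((resolve, reject) => {
      const id = randomUUID();
      const timer = setTimeout(() => {
        pending.delete(id);
        reject(new Error('request timed out'));
      }, timeoutMs);
      pending.set(id, { resolve, reject, timer });
      channel.send({ id, payload });
    });
  };
}

// Primary side. `worker` is the object returned by cluster.fork();
// `handler` is where the primary would query its pool.
function attachResponder(worker, handler) {
  worker.on('message', async (msg) => {
    if (!msg || msg.id === undefined) return;
    try {
      worker.send({ id: msg.id, result: await handler(msg.payload) });
    } catch (err) {
      worker.send({ id: msg.id, error: err.message });
    }
  });
}
```

In a worker, `const request = createRequester(process)` followed by `await request({ sql, values })` would wait for the matching reply; in the primary, `attachResponder(cluster.fork(), handler)` would answer it. Every query then passes through a single process, which is the trade-off the first answer warns about.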
