
Right now I am struggling with debugging a Node.js application that is clustered and running on Docker. I found this link, and this information in it:

> Remember, Node.js is still single-threaded in most cases, so even on a single server you’ll likely want to spin up multiple container replicas to take advantage of multiple CPU’s

So does this mean that clustering a Node.js app is pointless when it is meant to be deployed on Kubernetes?

EDIT: I should also say that by clustering I mean forking workers with `cluster.fork()`, and the goal of the application is to build a simple REST API that handles high traffic load.
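To make it concrete, this is roughly the shape of what I mean, not my actual code, just a stripped-down sketch:

```js
// Primary process forks one worker per CPU; each worker runs the same Express
// REST API, and the cluster module shares the listening socket between them.
const cluster = require('cluster');
const os = require('os');

if (cluster.isMaster) {
  // Fork one worker per available CPU core.
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
  // Replace a worker if it dies.
  cluster.on('exit', () => cluster.fork());
} else {
  const express = require('express');
  const app = express();

  app.get('/health', (req, res) => res.json({ pid: process.pid }));

  app.listen(3000);
}
```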
Juraj Zovinec
  • Depends on what you technically mean by "clustering", some library in particular? The short answer is no, it is not pointless. Same with Kubernetes or bare metal. Scaling out should be preferred over scaling up. Distributing it over several processes gets you past that single-thread limitation. – SYN Jul 12 '21 at 17:09
  • Aside from NodeJS native clustering, IPC libraries, or PM2, all of which assume the workers run on the same host, you should look into network-based "clustering": messaging or job libraries such as bee-queue, bull, kue, ... using redis, rabbitmq, ... some kind of queue for your processes to exchange messages/instructions and schedule jobs for each other. – SYN Jul 12 '21 at 17:20
  • Hello SYN, my project is a simple **REST API** done with **express**, and by clustering I mean forking workers with the `cluster.fork()` method. Using a queue system would be kind of overkill. – Juraj Zovinec Jul 12 '21 at 17:39
  • Well, NodeJS native clustering could still make sense if you're looking to take advantage of multiple CPUs; that's one of its main selling points (https://nodejs.org/api/cluster.html#cluster_cluster). You may not be able to scale it across multiple containers though, depending on how stateless your server is; that's where messages/queues would kick in. – SYN Jul 12 '21 at 17:49

1 Answer


The short answer is yes.

Containers are, roughly speaking, mini VMs, and Kubernetes is the orchestration tool that manages all the running containers, checking their health, resource allocation, load, etc.

So, if you are running your Node application in a container with an orchestration tool like Kubernetes, then clustering is moot: each container will use one CPU, or a fraction of one, depending on how you have it configured. Multiple containers essentially just place another "VM" in rotation, and Kubernetes directs traffic to each of them.
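As a rough sketch of what that configuration looks like (the names, image tag, and numbers below are just placeholders), the scaling is expressed as a replica count and per-container CPU limits on the Deployment, rather than as forked workers inside the process:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-api                 # placeholder name
spec:
  replicas: 4                    # scale out with more pods instead of forking workers
  selector:
    matchLabels:
      app: node-api
  template:
    metadata:
      labels:
        app: node-api
    spec:
      containers:
        - name: node-api
          image: my-registry/node-api:latest   # placeholder image
          resources:
            requests:
              cpu: "500m"        # half a CPU reserved per pod
            limits:
              cpu: "1"           # each pod capped at one CPU, matching one Node process
```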

Now, clustering Node really comes into play when using tools like PM2. Say you have a beefy server with 8 CPUs: Node can only use one per instance, so tools like PM2 set up a cluster and route traffic across each of the running instances.
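As an illustration, with PM2 that kind of cluster is usually declared in an ecosystem file; the app name and script path here are just placeholders:

```js
// ecosystem.config.js - PM2 cluster-mode configuration (placeholder names)
module.exports = {
  apps: [
    {
      name: 'rest-api',          // placeholder app name
      script: './server.js',     // placeholder entry point
      exec_mode: 'cluster',      // let PM2 drive Node's cluster module
      instances: 'max',          // one worker per available CPU core
    },
  ],
};
```

Starting it with `pm2 start ecosystem.config.js` forks one worker per core and load-balances incoming connections across them.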

One thing to keep in mind, though, is that your application needs to be cluster- or container-ready. Nothing should be stored on the ephemeral disk: with each container restart that data is lost, and in a cluster situation there is no guarantee the same folders will be available to every running instance. If you cluster across multiple servers you are really asking for trouble :D (this is where an object store like S3 comes into play).
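A minimal sketch of that idea, assuming the AWS SDK v3 and a placeholder bucket name: instead of writing files to the container's local disk, push them to the object store so every replica (and every restarted container) sees the same data.

```js
// Sketch: persist data to an object store instead of the container's ephemeral disk.
// Bucket name and region are placeholders; credentials come from the environment.
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');

const s3 = new S3Client({ region: 'us-east-1' });

async function saveUpload(key, body) {
  // Any replica, or a freshly restarted container, can read this back later,
  // which would not be true of a file written to the local filesystem.
  await s3.send(new PutObjectCommand({
    Bucket: 'my-uploads',   // placeholder bucket
    Key: key,
    Body: body,
  }));
}
```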

proxim0