Questions tagged [cluster-computing]

A computer cluster is a set of connected systems that work together so that in many respects they can be viewed as a single system.

A computer cluster consists of a set of loosely connected computers that work together so that in many respects they can be viewed as a single system. Cluster management is centralized, as opposed to a grid's decentralized approach (Wikipedia).

5527 questions
47
votes
4 answers

Difference between Clustering and Load balancing?

What is the difference between clustering and load balancing? I know it is a simple question, but I have asked several people and no one gave a reliable answer. I also googled a lot and couldn't find an exact answer. Hope our Stack users…
Human Being
  • 8,269
  • 28
  • 93
  • 136
46
votes
1 answer

Easy way to use parallel options of scikit-learn functions on HPC

Many scikit-learn functions implement user-friendly parallelization. For example, in sklearn.cross_validation.cross_val_score you just pass the desired number of computational jobs in the n_jobs argument. And for a PC with a multi-core processor it…
46
votes
4 answers

What are the differences between a node, a cluster and a datacenter in a cassandra nosql database?

I am trying to duplicate data in a Cassandra NoSQL database for a school project using DataStax OpsCenter. From what I have read, there are three keywords: cluster, node, and datacenter, and from what I have understood, the data in a node can be…
enjazweb
  • 473
  • 1
  • 5
  • 6
46
votes
4 answers

How to set amount of Spark executors?

How can I configure the number of executors from Java (or Scala) code, given a SparkConfig and SparkContext? I constantly see 2 executors. It looks like spark.default.parallelism does not work and is about something different. I just need to set the number of…
Roman Nikitchenko
  • 12,800
  • 7
  • 74
  • 110
45
votes
2 answers

What's the difference between Cluster and Instance in AWS Aurora RDS

I guess the title is pretty objective, but just to clarify: when you create an Aurora Database Instance, you are asked to give a name for a Database Instance, a Database Cluster and a Database (where the name of the Database is optional, and no…
44
votes
1 answer

How to fix symbol lookup error: undefined symbol errors in a cluster environment

I'm working on some Python code that extracts image data from an ECW file using GDAL (http://www.gdal.org/) and its Python bindings. GDAL was built from source to have ECW support. The program is run on a cluster server that I ssh into. I have…
agnussmcferguss
  • 485
  • 1
  • 4
  • 10
44
votes
9 answers

Singleton in Cluster environment

What is the best strategy to refactor a Singleton object for a cluster environment? We use a Singleton to cache some custom information from the database. It's mostly read-only but gets refreshed when a particular event occurs. Now our application needs…
lud0h
  • 2,370
  • 6
  • 33
  • 41
42
votes
1 answer

What does Apache Mesos actually do?

I am trying to wrap my head around Apache Mesos and need clarification on a few items. My understanding of Mesos is that it is an executable that gets installed on every physical/VM server ("node") in a cluster, and then provides a Java API…
smeeb
  • 27,777
  • 57
  • 250
  • 447
37
votes
5 answers

How to add a new node to my Elasticsearch cluster

My cluster has yellow health because it has only a single node, so the replicas remain unassigned simply because no other node is available to hold them. So I want to create/add another node so Elasticsearch can begin allocating replicas to it.…
Avión
  • 7,963
  • 11
  • 64
  • 105
37
votes
3 answers

How to submit a job to any [subset] of nodes from nodelist in SLURM?

I have a couple of thousand jobs to run on a SLURM cluster with 16 nodes. These jobs should run only on a subset of the available nodes of size 7. Some of the tasks are parallelized, hence use all the CPU power of a single node while others are…
Faber
  • 1,504
  • 2
  • 13
  • 21
35
votes
2 answers

Node.JS built in cluster or PM2 clustering?

Which one is better? I have activated Node.js clustering mode with workers, but now I have discovered PM2, which does the same thing. I'm using keymetrics to see the stats from my webserver and I have noticed that when I launch my NodeJS node (with a built…
llf
  • 483
  • 1
  • 4
  • 9
35
votes
5 answers

How to run Cron Job in Node.js application that uses cluster module?

I'm using the node-cron module for scheduling tasks in a Node.js application. I also want to run the application in several processes using the core cluster module. Running the application in several processes results in the scheduled tasks being executed in each process…
epidemiya30
  • 1,203
  • 1
  • 9
  • 12
34
votes
7 answers

NodeJS|Cluster: How to send data from master to all or single child/workers?

I have a working (stock) script from Node: var cluster = require('cluster'); var http = require('http'); var numReqs = 0; if (cluster.isMaster) { // Fork workers. for (var i = 0; i < 2; i++) { var worker = cluster.fork(); …
htonus
  • 629
  • 1
  • 9
  • 19
34
votes
3 answers

Error in SLURM cluster - Detected 1 oom-kill event(s): how to improve running jobs

I'm working on a SLURM cluster and I was running several processes at the same time (on several input files), using the same bash script. At the end of the job, the process was killed and this is the error I obtained. slurmstepd: error: Detected…
CafféSospeso
  • 1,101
  • 3
  • 11
  • 28
33
votes
3 answers

how to specify error log file and output file in qsub

I have a qsub script as #####----submit_job.sh---##### #!/bin/sh #$ -N job1 #$ -t 1-100 #$ -cwd SEEDFILE=/home/user1/data1 SEED=$(sed -n -e "$SGE_TASK_ID p" $SEEDFILE) /home/user1/run.sh $SEED The problem is-- it puts…
d.putto
  • 7,185
  • 11
  • 39
  • 45