
I am running a cluster (on CentOS 6) whose compute nodes are connected to the head node over a slow network. Each spawned job needs to pull a ~1 GB file from the head node to its compute node and then process it locally. The head node's filesystem is served to the compute nodes via NFS.

Allowing each spawned job to (simultaneously) cp the file it needs obviously bogs down the NFS server.

What is the recommended way to queue up copy / file transfer processes on Linux?

NFS does not have to be in the picture. If there were, for example, an (S)FTP server that accepted multiple requests but served them one (or N) at a time, that would be perfect. The "client" component should be able to wait for a long time without timing out. The cluster manager I am using is SLURM, but the issue is general.
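
One low-tech way to get that queueing behaviour is to have every job grab an advisory lock on one of a few "slot" files on the shared mount before it starts its copy; a job that finds all slots busy simply blocks in flock() until one frees up, so it waits indefinitely rather than timing out. The sketch below is only an illustration of the idea, not something from the question: the lock directory, the slot count, and the use of shutil.copy for the transfer are all assumptions.

    #!/usr/bin/env python
    """Sketch: limit cluster-wide concurrent copies from the head node by
    flock()-ing one of MAX_SLOTS "slot" files on the shared NFS mount.
    LOCK_DIR, MAX_SLOTS and the copy itself are illustrative assumptions."""
    import fcntl
    import os
    import shutil
    import sys

    LOCK_DIR = "/shared/locks"   # assumed: a directory every node sees via NFS
    MAX_SLOTS = 2                # assumed: allow at most 2 simultaneous copies

    def pull(src, dst):
        for slot in range(MAX_SLOTS):
            lock_file = open(os.path.join(LOCK_DIR, "copy-slot-%d" % slot), "w")
            flags = fcntl.LOCK_EX
            if slot < MAX_SLOTS - 1:
                flags |= fcntl.LOCK_NB          # don't block on the first N-1 slots
            try:
                # flock() across NFS relies on the NFS lock manager (lockd / NFSv4)
                fcntl.flock(lock_file, flags)
            except IOError:                     # slot busy, try the next one
                lock_file.close()
                continue
            try:
                shutil.copy(src, dst)           # the actual ~1 GB transfer
            finally:
                fcntl.flock(lock_file, fcntl.LOCK_UN)
                lock_file.close()
            return

    if __name__ == "__main__":
        pull(sys.argv[1], sys.argv[2])          # e.g. pull /nfs/data/job42.dat /local/scratch/

Each job script would then call something like this throttled copy (script name and paths made up here) before its processing step. The same throttling could also be pushed into the scheduler, but the flock approach keeps the change entirely inside the job script.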

EDIT

This is not a matter of syncing the dataset across all nodes; each node needs its own set of files.

Dmitri
  • Perhaps running such things over a slow network isn't an optimal setup and you should re-think things. – GregL Dec 01 '15 at 12:58

1 Answer


Frankly, by the sound of it, you really should be using BitTorrent to sync those files around. That way there's no hotspot on any one machine.
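
Purely as an illustration of that suggestion (the answer names no tool, so this is an assumption), a compute node could join a swarm with the python-libtorrent bindings, download the file it needs, and then keep seeding for a while so later jobs can pull from peers instead of only from the head node:

    import time
    import libtorrent as lt   # assumed available (often packaged as python-libtorrent)

    def fetch_and_seed(torrent_path, save_path, linger=600):
        ses = lt.session()
        handle = ses.add_torrent({"ti": lt.torrent_info(torrent_path),
                                  "save_path": save_path})
        while not handle.status().is_seeding:   # wait until the download completes
            time.sleep(5)
        time.sleep(linger)                      # keep seeding so other nodes can pull from us

    # hypothetical paths; the head node would have to create and seed the .torrent files
    fetch_and_seed("/nfs/torrents/job42.torrent", "/local/scratch")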

womble
  • Each compute node needs its own set of files. The entire dataset is too big to fit on the compute nodes, and would be mostly useless (each compute node needs only a small fraction of it). Moreover, I don't know in advance which node will be assigned which job, so I cannot pre-populate the node with just the needed files. – Dmitri Dec 01 '15 at 01:03
  • You should clarify your question, then. – womble Dec 01 '15 at 01:05