I am running a cluster (on CentOS 6) whose compute nodes are connected to the head node over a slow network. Each spawned job needs to pull a ~1 GB file from the head node to its compute node and then process it locally. The head node's filesystem is exported via NFS.
Allowing each spawned job to (simultaneously) `cp` the file it needs obviously bogs down the NFS server.
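For concreteness, each job's copy step currently looks roughly like this (the paths and the use of a SLURM array variable are placeholders, not my exact setup):

    #!/bin/bash
    #SBATCH --job-name=process_input
    # Hypothetical paths; the real dataset lives on the NFS-exported head filesystem.
    SRC=/nfs/head/data/input_${SLURM_ARRAY_TASK_ID}.dat   # ~1 GB per job
    DST=/local/scratch/input_${SLURM_ARRAY_TASK_ID}.dat

    # Every concurrently started job does this at the same time,
    # which saturates the NFS server.
    cp "$SRC" "$DST"

    # ...the job then processes $DST locally.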
What is the recommended way to queue up copy / file transfer processes on Linux?
NFS does not have to be in the picture. If there is, for example, an (s)ftp server that accepts multiple requests but serves them one (or N) at a time, that would be perfect. The "client" side should be able to wait for a long time without timing out. The cluster manager I am using is SLURM, but the issue is a general one.
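To make the "wait without timing out" requirement concrete, here is a minimal sketch of the client behavior I have in mind, assuming some transfer server on the head node that refuses connections beyond its concurrency limit (the hostname and paths are placeholders):

    #!/bin/bash
    # Desired client behavior: if the head node's transfer server is busy,
    # keep waiting and retrying rather than failing the job.
    SRC="head:/data/input_${SLURM_ARRAY_TASK_ID}.dat"
    DST="/local/scratch/"

    until scp -o ConnectTimeout=30 "$SRC" "$DST"; do
        echo "head node busy, retrying in 60s" >&2
        sleep 60
    done

I am looking for the recommended / standard tool that provides this kind of server-side throttling and patient client, rather than rolling my own retry loops.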
EDIT
This is not a matter of syncing the same dataset across all nodes; each job needs its own distinct files.