I am having a problem with PBS: of all the jobs I submit, a fraction do not produce the output they should. I have to resubmit them several times until every job has produced its output. I have also noticed that this is especially bad when other users submit large numbers of jobs; in that case, ALL of my jobs fail to produce the expected output files.

I'm only a user of PBS, so I don't understand what is going on. If anyone can give some suggestions, that'd be great. Thanks.

qAp
  • Please log on to the cluster node allocated to the job and use `top` to check what is running. You could also attach `gdb` to the process. You can identify the node allocated to the job with `tracejob JOBID` or `qstat -f JOBID` – Dima Chubarov Dec 13 '12 at 07:28
  • Could you clarify 'do not produce any output as they should'? Are these interactive jobs? Should they be writing to a directory? – spuder Aug 05 '13 at 17:45
  • I later posted a revised version of this question, in case you care to read it (it's quite long): http://stackoverflow.com/questions/13849604/how-fast-can-one-submit-consecutive-and-independent-jobs-with-qsub?lq=1. The 'do not produce any output as they should' behaviour was due to a file processing error on some nodes, which I ended up avoiding when submitting jobs. I never really fixed the file processing error. The jobs are not interactive and should write files to a directory. – qAp Aug 05 '13 at 18:16
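
The first comment's suggestion (find the node a job ran on, then inspect it) could be sketched roughly as below. This is only a sketch under assumptions: the helper name `exec_hosts` is hypothetical, and it assumes the `exec_host` attribute printed by `qstat -f` fits on a single line in the usual `node/slot+node/slot` form. The attribute is only present once the job has been assigned to nodes.

```shell
#!/bin/sh
# Hypothetical helper (not part of PBS): read `qstat -f JOBID` output on
# stdin and print the unique node names found in the exec_host attribute.
# Assumption: the exec_host value is not wrapped across several lines.
exec_hosts() {
    # Keep only the value of the exec_host line, e.g. "node01/0+node02/0"
    sed -n 's/^[[:space:]]*exec_host = //p' |
        # One node/slot entry per line
        tr '+' '\n' |
        # Drop the slot part after the slash, leaving the node name
        sed 's|/.*$||' |
        # Report each node once
        sort -u
}

# Example usage on a cluster (JOBID is a placeholder for a real job id):
#   qstat -f "$JOBID" | exec_hosts
```

Collecting these node names for the jobs that produced no output may show whether the failures cluster on a few nodes, which is what turned out to be the case here.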

0 Answers