
I am trying to submit batch jobs to SLURM, but I keep getting JobState=FAILED Reason=NonZeroExitCode. The code compiles and runs fine with plain g++, but I have to use SLURM for a school assignment. I thought I was running the jobs properly, but I got a nasty-gram from root telling me to quit running scripts on the login node. Any help would be appreciated. Here are my batch file and my Makefile:

#!/bin/bash
#SBATCH -N1 -n1 --mem-per-cpu=100m -t00:05:00
echo "#SBATCH -N1 -n1 --mem-per-cpu=100m -t00:05:00 --qos=test"
cd /home/<username>/AFS/cse_430/Project1/Parallel/
module load gcc/4.9.1
make clean
make all
echo "Running single threaded code..."
./run "SeqCA(57;4,10).txt"
echo "Done experiment. Check log.txt"

Makefile:

EXEC=run    # name of executable is run
CC=g++      # compile with g++
CFLAGS=-std=c++11 -fopenmp -c -Wall 

all: $(EXEC)

$(EXEC): main.o threeSeq.o fourSeq.o fiveSeq.o
    $(CC) -fopenmp -o $(EXEC) main.o threeSeq.o fourSeq.o fiveSeq.o 

main.o: main.cpp
    $(CC) $(CFLAGS) main.cpp

threeSeq.o: threeSeq.cpp threeSeq.hpp
    $(CC) $(CFLAGS) threeSeq.cpp

fourSeq.o: fourSeq.cpp fourSeq.hpp
    $(CC) $(CFLAGS) fourSeq.cpp

fiveSeq.o: fiveSeq.cpp fiveSeq.hpp
    $(CC) $(CFLAGS) fiveSeq.cpp

clean: 
    rm -f *.o
    rm -f $(EXEC)
    rm -f *log.txt
Johnny

1 Answer


You should submit your jobs with: sbatch jobscript.sh

Also check the Slurm output files for errors from the job's execution. By default, Slurm writes both stdout and stderr to the file slurm-<jobid>.out.
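
As a rough sketch of the two steps above (submit, then look at the log), something like the following could work. The script name job2.sh and the helper default_log are placeholders for illustration, not anything Slurm provides; the snippet only tries to submit when sbatch and the script are actually present.

```shell
#!/bin/bash
# Sketch: submit a job script and report which log file to inspect.
# job2.sh and default_log are illustrative names (assumptions).

# Slurm's default log name for a given job ID is slurm-<jobid>.out.
default_log() {
    echo "slurm-$1.out"
}

# Only attempt a submission where sbatch and the script exist.
if command -v sbatch >/dev/null 2>&1 && [ -f job2.sh ]; then
    # --parsable prints just the job ID (plus ";cluster" on
    # multi-cluster setups, which cut strips off).
    jobid=$(sbatch --parsable job2.sh | cut -d';' -f1)
    if [ -n "$jobid" ]; then
        echo "Submitted job $jobid; check $(default_log "$jobid") for errors"
    fi
fi
```

Once the job has finished, reading that file (for example with cat) should show whatever the script and the program printed, including any compile or runtime errors.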

Carles Fenoy
  • That's what's weird... it says there is a slurm output file, but it doesn't exist. My script above is named job2.sh and I submit it with sbatch job2.sh. My program takes in an input file and computes things on it. Maybe that is causing an issue? I've seen example scripts and mine looks correct. – Johnny Sep 30 '14 at 10:03
  • Where are you submitting your job from? Which path are you in? The path must be accessible from every compute node for slurm to be able to create the output file. – Carles Fenoy Sep 30 '14 at 10:06
  • The path is in a folder I have on the server: http://hphn1.a2c2.asu.edu/home//AFS/cse_430/Project1/Parallel – Johnny Sep 30 '14 at 10:11
  • If the path is accessible from all the nodes everything looks fine. Can you see the job in the queue when you submit it? Try specifying an output file with the "-o" flag – Carles Fenoy Sep 30 '14 at 10:18
  • I can see it queued, but it usually just shows as failed when I call `scontrol show job <jobid>` – Johnny Sep 30 '14 at 10:28
  • How would I specify an output file with the "-o" flag? – Johnny Sep 30 '14 at 10:29
  • You should look at the sbatch man page, but simply specify "-o FILENAME" – Carles Fenoy Sep 30 '14 at 13:32
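
To illustrate the "-o" suggestion from the comments, here is a minimal sketch of a job script that names its own output files. The file names job2-%j.out and job2-%j.err are assumptions chosen for the example (%j is the sbatch filename pattern for the job ID), and the module load step from the original script is left out for brevity. The script falls back to the current directory when run outside Slurm.

```shell
#!/bin/bash
# Resource request copied from the question's script.
#SBATCH -N1 -n1 --mem-per-cpu=100m -t00:05:00
# Explicit stdout and stderr files; %j is replaced by the job ID.
#SBATCH -o job2-%j.out
#SBATCH -e job2-%j.err

# Work in the directory sbatch was called from; outside Slurm this
# variable is unset, so fall back to the current directory.
workdir="${SLURM_SUBMIT_DIR:-$PWD}"
cd "$workdir" || exit 1

# Log where and as which job this ran, to make failures easier to trace.
echo "Job ${SLURM_JOB_ID:-none} running in $workdir on ${HOSTNAME:-unknown}"

# Run the program only if the build succeeds, so a compile error shows
# up in the log instead of a confusing failure from ./run.
if [ -f Makefile ]; then
    make all && ./run "SeqCA(57;4,10).txt"
    echo "Exit status: $?"
fi
```

Submitted with sbatch job2.sh, this should leave job2-<jobid>.out and job2-<jobid>.err in the submit directory, provided that directory is writable from the compute nodes.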