
I have just started learning bioinformatics in my lab and I am a complete newbie. I am using KofamScan, a genome annotation tool from KEGG, and I am getting an error that may be caused by multiple processes storing their results in the same temp directory, so that their files collide. I therefore want to create a separate temp directory for each process (temp1 for process 1, temp2 for process 2, etc.), but I don't know how to write the code for that.

files=(`cat kofam_files`) #input files 

TASK_ID=`expr ${SGE_TASK_ID} - 1`

~/kofamscan/bin/exec_annotation -o marine_kofam.txt --tmp-dir **** ${files[$TASK_ID]}

I probably need to write something in place of the **** in the code above, but I don't know what to write.

Thank you in advance.

Ryohei

Compo
Ryohei

1 Answer


This is exactly why the mktemp command exists :)

The following snippet shows the minimal changes you would have to make to yours:

files=(`cat kofam_files`) #input files 

TASK_ID=`expr ${SGE_TASK_ID} - 1`

~/kofamscan/bin/exec_annotation -o marine_kofam.txt --tmp-dir `mktemp -d` ${files[$TASK_ID]}

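Note that the temp directory would be created under /tmp by default, though (or under $TMPDIR if it is set). You could use mktemp's options to create the temp directory elsewhere: either pass a template that contains a path (for example, mktemp -d ./tmp.XXXXXX) or, with GNU mktemp, use -p DIR.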
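As a minimal sketch of creating the temp directory in the current directory instead, the snippet below passes mktemp a template that includes a path. The prefix kofam_tmp is only an illustrative name, and the cleanup trap is an addition that is not part of the original command:

```shell
#!/usr/bin/env bash
# Create a uniquely named temp directory inside the current directory.
# "kofam_tmp" is an illustrative prefix; mktemp replaces the X's with random characters.
tmp_dir=$(mktemp -d ./kofam_tmp.XXXXXX)

# Remove the directory automatically when the script exits.
trap 'rm -rf "$tmp_dir"' EXIT

echo "Using temp directory: $tmp_dir"

# A second call yields a different directory, so concurrent tasks do not collide.
other_dir=$(mktemp -d ./kofam_tmp.XXXXXX)
echo "Another call gives:   $other_dir"
rmdir "$other_dir"
```

In the annotation command, "$tmp_dir" would then be the value passed to --tmp-dir.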

EDIT: Following the best practices for bash, one would also:

  1. Use the newer mapfile or readarray commands (in bash 4+) instead of using cat to create arrays in bash
  2. Use $(...) instead of `...` since it supports nesting
  3. Use $((...)) instead of the archaic expr syntax (see this thread)

The final snippet would then look like:

readarray -t files < kofam_files #input files 

TASK_ID=$((SGE_TASK_ID - 1))

~/kofamscan/bin/exec_annotation -o marine_kofam.txt --tmp-dir "$(mktemp -d)" "${files[$TASK_ID]}"
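
To see how the array indexing behaves without submitting a job, here is a small self-contained sketch. The file names, the temporary list file standing in for kofam_files, and the hard-coded SGE_TASK_ID are all assumptions made for illustration. Because every task in the snippet above writes to the same marine_kofam.txt, the sketch also derives a per-task output name, which may or may not be what you want:

```shell
#!/usr/bin/env bash
# Hypothetical input list standing in for kofam_files (one path per line).
list=$(mktemp)
printf '%s\n' sample_A.faa sample_B.faa sample_C.faa > "$list"

# Read the list into an array, one element per line (requires bash 4+).
readarray -t files < "$list"

# In a real array job the scheduler sets SGE_TASK_ID; it is hard-coded here
# only so that the sketch runs on its own.
SGE_TASK_ID=2

# Task IDs are assumed to start at 1, while bash arrays start at 0.
TASK_ID=$((SGE_TASK_ID - 1))
input=${files[$TASK_ID]}

# Illustrative per-task output name, so tasks do not overwrite each other.
out="marine_kofam.${SGE_TASK_ID}.txt"

echo "task ${SGE_TASK_ID}: input=${input} output=${out}"
rm -f "$list"
```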
Saswat Padhi
  • sorry for the late reply, I was busy with something else that I had to get rid of straight away... Thanks for your advice, I will try it out! – Ryohei Jul 31 '19 at 09:14