I have a multi-FASTA file in txt format that contains 1333 individual FASTA records:

>header1
ACGATGCACAAGGT.....
>header2
CCAAACGCAGGGGT.....
>header3
CCAATAAGTAGCCC.....
>header4
AAAGTCGGATTTAG.....

and so on, up to >header1333.

I want to split the multi-FASTA file into separate single-record FASTA files so that they can be used with my R code for a biological analysis, which was written to handle only a single FASTA record per file.

I want the output to look like this: file1.txt will contain

>header1
ACGATGCACAAGGT.....

file2.txt will contain

>header2
CCAAACGCAGGGGT.....

And so on. Is there any possible way to do this?

08BKS09
  • Is this the same as your previous question? [Split multi-fasta file into separate single fasta files in R](https://stackoverflow.com/questions/72410613/split-multi-fasta-file-into-separate-single-fasta-files-in-r) (my advice is to do it outside of R, e.g. with awk: `awk '{if(substr($0, 1, 1)==">"){filename=(substr($0,2)".fa")} print $0 > filename}' multi.fasta`, then read in the files individually) – jared_mamrot May 28 '22 at 09:55

1 Answer

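You can read the file into a data.frame and then write each record to a folder named fastafolder using a for loop. This assumes every sequence sits on a single line directly after its header: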

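# Read one line per row; assumes every record is exactly two lines:
# a ">" header followed by a single sequence line
fasta <- read.table("~/fasta", sep = "\n", quote = "", comment.char = "",
                    stringsAsFactors = FALSE)

dir.create("~/fastafolder", showWarnings = FALSE)

# Record number `line` occupies rows 2 * line - 1 (header) and 2 * line (sequence)
for (line in seq_len(nrow(fasta) %/% 2)) {
  rows <- 2 * line - 1
  write(fasta[rows:(rows + 1), 1], paste0("~/fastafolder/fasta", line))
}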

Created on 2022-05-28 by the reprex package (v2.0.1)
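
If some sequences are wrapped across several lines, pairing lines two at a time will no longer line up with the records. A rough base-R sketch of one way to handle that is to start a new record at every line beginning with `>`. The helper name `split_fasta`, its arguments, and the toy input below are made up for illustration and are not part of the original answer:

```r
# Hypothetical helper: write each record of a multi-FASTA file to its own file.
# A record is a ">" header line plus every line up to the next header.
split_fasta <- function(infile, outdir, prefix = "file", ext = ".txt") {
  lines <- readLines(infile)
  lines <- lines[nzchar(lines)]                 # drop empty lines
  is_header <- startsWith(lines, ">")
  stopifnot(length(lines) > 0, is_header[1])    # input must begin with a header

  # cumsum() over the header flags assigns each line its record number
  records <- split(lines, cumsum(is_header))

  dir.create(outdir, showWarnings = FALSE, recursive = TRUE)
  paths <- file.path(outdir, paste0(prefix, seq_along(records), ext))
  for (i in seq_along(records)) {
    writeLines(records[[i]], paths[i])
  }
  invisible(paths)
}

# Toy input (invented data): the first sequence is wrapped over two lines
demo_in <- tempfile(fileext = ".txt")
writeLines(c(">header1", "ACGATGCACAAGGT", "TTGACC",
             ">header2", "CCAAACGCAGGGGT"), demo_in)

demo_out <- file.path(tempdir(), "fastafolder")
paths <- split_fasta(demo_in, demo_out)
print(basename(paths))
```

With the defaults this names the outputs file1.txt, file2.txt, and so on, matching the naming asked for in the question; pass a different `prefix` or `ext` if another pattern is needed.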

Mohamed Desouky
  • Thanks for the help! But the files are being saved as file.txt1, file.txt2, file.txt3 and so on which is changing their original txt format, would you suggest a way that I could change the name of all the files in the new folder to maintain the original txt format and rename them to file1.txt, file2.txt, file3.txt...? – 08BKS09 May 31 '22 at 07:57
  • Try changing the `write` call to `write(fasta[rows:(rows + 1), 1], paste0("~/fastafolder/fasta", line, ".txt"))` – Mohamed Desouky May 31 '22 at 11:24
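
If the files have already been written with the number after the extension (file.txt1, file.txt2, ...), they could also be renamed in place instead of being regenerated. A minimal sketch using base R's `list.files()` and `file.rename()`; the helper name `fix_extension` and the demo folder are hypothetical, and the pattern assumes the names end in `.txt` followed by digits:

```r
# Hypothetical helper: move a trailing number in front of the ".txt" extension,
# e.g. "file.txt12" becomes "file12.txt".
fix_extension <- function(dir) {
  old <- list.files(dir, pattern = "\\.txt[0-9]+$")
  new <- sub("\\.txt([0-9]+)$", "\\1.txt", old)
  ok <- file.rename(file.path(dir, old), file.path(dir, new))
  invisible(new[ok])   # names that were renamed successfully
}

# Demo on empty placeholder files in a temporary folder
demo_dir <- file.path(tempdir(), "rename_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("file.txt1", "file.txt2", "file.txt10")))

renamed <- fix_extension(demo_dir)
print(list.files(demo_dir))
```

Trying this on a copy of the folder first is a sensible precaution, since `file.rename()` may overwrite an existing file with the same target name.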