
I'm new to bioinformatics and am trying to analyze a dataset with BWA. However, as soon as I reach bwa mem, I keep running into the same error:

input --> mirues-macbook:sra ipmiruek$ bwa mem -t 8 Homo_sapiens.GRCh38.dna.chromosome.17.fa ERR3841737/ERR3841737_trimmed.fq.gz > ERR3841737/ERR3841737_mapped.sam

output --> [E::bwa_idx_load_from_disk] fail to locate the index files

I've already indexed the reference chromosome as follows:

bwa index Homo_sapiens.GRCh38.dna.chromosome.17.fa.gz

Is there anything I could do to fix this problem? Thank you.

I tried changing the dataset that I was using along with the corresponding reference chromosome but it still yielded the same result. Is this an issue with the code or with the dataset I'm working with?

Timur Shtatland
Mirue Kang
    What is the output of `bwa index Homo_sapiens.GRCh38.dna.chromosome.17.fa.gz`? Did you run this command in the same directory as the mapping command? Can BWA index from a gzipped file or maybe you need to unzip first? – Pallie Jan 19 '23 at 11:40

1 Answer


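It looks like you indexed a gzip-compressed FASTA file, but are supplying an index base (idxbase) without the .gz extension. By default, bwa index names the index files after the exact filename it was given, so the idxbase passed to bwa mem has to match that name, including the .gz. What you want is: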

$ bwa mem \
    -t 8 \
    Homo_sapiens.GRCh38.dna.chromosome.17.fa.gz \
    ERR3841737/ERR3841737_trimmed.fq.gz \
    > ERR3841737/ERR3841737_mapped.sam

Alternatively, gunzip the reference FASTA file and index it. For example:

$ gunzip Homo_sapiens.GRCh38.dna.chromosome.17.fa.gz
$ bwa index Homo_sapiens.GRCh38.dna.chromosome.17.fa

Note that BWA packs the reference sequences (into the .pac file), so you don't even need the FASTA file to run BWA MEM after it's been indexed.
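
If you hit this error again, a quick way to see what went wrong is to check whether the index files exist for the exact idxbase you are about to pass to bwa mem. The function below is a minimal sketch, not part of BWA: the name check_bwa_index is made up for illustration, and it assumes the index was built with plain bwa index, which writes five files (.amb, .ann, .bwt, .pac, .sa) next to the idxbase.

```shell
# Hypothetical helper (not a BWA command): check that all five index files
# written by `bwa index` exist for a given idxbase.
check_bwa_index() {
    prefix="$1"
    missing=0
    for ext in amb ann bwt pac sa; do
        if [ ! -e "${prefix}.${ext}" ]; then
            echo "missing: ${prefix}.${ext}" >&2
            missing=1
        fi
    done
    # Exit status 0 if every file was found, 1 otherwise.
    return "$missing"
}

# Usage: pass exactly the string you intend to give bwa mem as the reference.
# check_bwa_index Homo_sapiens.GRCh38.dna.chromosome.17.fa.gz
```

In your case this would report missing files for Homo_sapiens.GRCh38.dna.chromosome.17.fa but not for Homo_sapiens.GRCh38.dna.chromosome.17.fa.gz, which points straight at the mismatched name.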

Steve
    Thank you for your help! I just tried out the code above and it worked! I wasn't really sure where I had gone wrong and this kept occurring several times, but your explanation was incredibly helpful. Thanks, again. – Mirue Kang Jan 19 '23 at 15:05