I have a fasta file with several sequences, but the first line of all the sequences start with the same string (ABI) and I want to change and replace it with the names of the species stored in a different text file.
My fasta file looks like
>ABI
AGCTAGTCCCGGGTTTATCGGCTATAC
>ABI
ACCCCTTGACTGACATGGTACGATGAC
>ABI
ATTTCGACTGGTGTCGATAGGCAGCAT
>ABI
ACGTGGCTGACATGTATGTAGCGATGA
The list of spp looks like this:
Alsophila cuspidata
Bunchosia argentea
Miconia cf.gracilis
Meliosma frondosa
How I can change those ABI headers of my sequences and replace them with the name of my species using that exact order.
Required output:
>Alsophila cuspidata
AGCTAGTCCCGGGTTTATCGGCTATAC
>Bunchosia argentea
ACCCCTTGACTGACATGGTACGATGAC
>Miconia cf.gracilis
ATTTCGACTGGTGTCGATAGGCAGCAT
>Meliosma frondosa
ACGTGGCTGACATGTATGTAGCGATGA
I was using something like:
awk '
FNR==NR{
a[$1]=$2
next
}
($2 in a) && /^>/{
print ">"a[$2]
next
}
1
' spp_list.txt FS="[> ]" all_spp.fasta
This is not working, could someone guide me please.