I have a fasta file including about 1,000 sequences
each sequence have head and tail with missing replaced with 'N' like this
>CHR1
NNNNNAAAGAGAGAGNNTTTAGAGAGGGACNNNNNN
I want to get the start and end position of the target sequence (if there are N in the middle of target sequence, is OK)
for my example with 36 nucleotide, the start and the end position of the target sequence is CHR1:6-30 (because there are 5 N in the head and 6 N in the tail)
which means I want to know the position of the first nucleotide which is not N from both ends of the full sequence
does anyone know how to make it for all of the sequences in the same fasta file?
thanks a lot