I am learning Python and wrote the function below to calculate the GC percentage of a DNA sequence that contains the undefined character N or n (for example, NAAATTTGGGCCCN), but it runs into a problem. Is there a way to overcome this?
def gc(sequence):
    "This function computes the GC percentage of a DNA sequence"
    nbases = sequence.count('n') + sequence.count('N')
    gc_count = sequence.count('c') + sequence.count('C') + sequence.count('g') + sequence.count('G')  # total GC count
    gc_percent = float(gc_count) / (len(sequence-nbases))  # total GC count divided by (sequence length minus number of N)
    return 100 * gc_percent
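
One likely cause is the expression `len(sequence-nbases)`: it subtracts an integer from a string inside `len()`, which Python rejects with a `TypeError`. The subtraction probably belongs outside the call, as `len(sequence) - nbases`. Below is a minimal sketch of that fix, not a definitive version. The name `called` and the choice to return 0.0 when the sequence has no bases other than N are assumptions added here for illustration.

```python
def gc(sequence):
    """Return the GC percentage of a DNA sequence, ignoring N/n bases."""
    seq = sequence.upper()                # treat lower and upper case alike
    nbases = seq.count("N")               # undefined bases
    gc_count = seq.count("G") + seq.count("C")
    called = len(seq) - nbases            # subtract AFTER taking the length
    if called == 0:
        return 0.0                        # assumption: avoid dividing by zero
    return 100.0 * gc_count / called


print(gc("NAAATTTGGGCCCN"))  # 6 G/C out of 12 defined bases -> 50.0
```

Upper-casing the sequence once keeps the counting to three `count` calls, and the guard covers sequences made up entirely of N.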