I have written the following code in Perl. I want to iterate through a string 3 positions (characters) at a time. If TAA, TAG, or TGA (stop codons) appears, I want to print the sequence up to and including the stop codon and remove the rest of the characters.
Example:
data.txt
ATGGGTAATCCCTAGAAATTT
ATGCCATTCAAGTAACCCTTT
Expected output:
ATGGGTAATCCCTAG (last 6 characters removed)
ATGCCATTCAAGTAA (last 6 characters removed)
(Each sequence begins with ATG).
Code:
#!/usr/bin/perl -w
open FH, "data.txt";
@a=<FH>;
foreach $tmp(@a)
{
    for (my $i=0; $i<(length($tmp)-2); $i+=3)
    {
        if ($tmp=~/(ATG)(\w+)(TAA|TAG|TGA)\w+/)
        {
            print "$1$2$3\n";
        }
        else
        {
            print "$tmp\n";
        }
        $tmp++;
    }
}
exit;
However, my code is not giving the correct result. The codons must not overlap: the string should be read in consecutive groups of 3 characters, so a stop codon only counts if it starts at position 0, 3, 6, and so on.
Can someone suggest how to fix this?
Thanks!
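
One possible approach, as a minimal sketch rather than a definitive fix: the regex in the code above is not tied to the reading frame, so it can match a stop codon that straddles two triplets, and `$tmp++` changes the sequence itself instead of advancing a position. The sketch below instead walks the string with `substr` in steps of 3 and stops at the first in-frame stop codon. The helper name `trim_at_stop` is hypothetical (it is not from the original code), and the two sample sequences are hard-coded here in place of reading data.txt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the sequence up to and including the first
# in-frame stop codon, or the whole sequence if none is found.
sub trim_at_stop {
    my ($seq) = @_;
    my %is_stop = map { $_ => 1 } qw(TAA TAG TGA);

    # Read non-overlapping triplets starting at offsets 0, 3, 6, ...
    for (my $i = 0; $i + 3 <= length $seq; $i += 3) {
        my $codon = substr($seq, $i, 3);
        return substr($seq, 0, $i + 3) if $is_stop{$codon};
    }
    return $seq;    # no in-frame stop codon
}

# Sample sequences from the example above. When reading lines from a file,
# chomp each line first so the trailing newline is not part of the sequence.
for my $seq (qw(ATGGGTAATCCCTAGAAATTT ATGCCATTCAAGTAACCCTTT)) {
    print trim_at_stop($seq), "\n";
}
```

For the two sample lines this prints ATGGGTAATCCCTAG and ATGCCATTCAAGTAA. Note that the TAA inside GGTAAT in the first sequence is skipped because it does not start on a multiple of 3. A regex that stays in frame, such as /^((?:\w{3})*?(?:TAA|TAG|TGA))/, should behave the same way, but the explicit loop makes the 3-character stepping easier to see.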