I have the following vector v:
c("tactagcaatacgcttgcgttcggtggttaagtatgtataatgcgcgggcttgtcgt",
"tgctatcctgacagttgtcacgctgattggtgtcgttacaatctaacgcatcgccaa",
"gtactagagaactagtgcattagcttatttttttgttatcatgctaaccacccggcg")
I'm facing a very frustrating issue here. Each element of this vector is a DNA sequence. What I want to do is count how often each two-letter combination (dinucleotide) occurs in each element. The desired output for the first element would be exactly this:
AA AC AG AT CA CC CG CT GA GC GG GT TA TC TG TT
 3  2  2  4  1  0  6  3  0  6  4  7  7  2  5  4
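To make the counting rule concrete, here is a minimal base R sketch of what I understand the operation to be. It assumes the two-letter windows overlap and slide one letter at a time, which is consistent with the counts above (56 pairs for a 57-letter sequence). The helper name count_dinucleotides is hypothetical and is not part of Biostrings; it is only meant to illustrate the result I am after, not to replace oligonucleotideFrequency.

```r
# Hypothetical helper (not part of Biostrings): count overlapping
# two-letter windows in a single sequence using only base R.
count_dinucleotides <- function(s) {
  s <- toupper(s)
  n <- nchar(s)
  # All 16 possible pairs in the order AA, AC, AG, AT, CA, ..., TT.
  bases <- c("A", "C", "G", "T")
  pairs <- paste0(rep(bases, each = 4), rep(bases, times = 4))
  # Overlapping windows: letters 1-2, 2-3, ..., (n-1)-n.
  windows <- if (n >= 2) substring(s, 1:(n - 1), 2:n) else character(0)
  # factor() with fixed levels keeps a zero for pairs that never occur.
  counts <- table(factor(windows, levels = pairs))
  setNames(as.integer(counts), pairs)
}

v <- c("tactagcaatacgcttgcgttcggtggttaagtatgtataatgcgcgggcttgtcgt",
       "tgctatcctgacagttgtcacgctgattggtgtcgttacaatctaacgcatcgccaa",
       "gtactagagaactagtgcattagcttatttttttgttatcatgctaaccacccggcg")

# One row per sequence, one column per pair.
result <- t(sapply(v, count_dinucleotides, USE.NAMES = FALSE))
print(result)
```

The first row of result matches the counts shown above for the first element. What I am looking for is the same kind of table, one row per sequence, but produced with oligonucleotideFrequency.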
This result is easily achieved using the function oligonucleotideFrequency. The problem is that this function will not apply over a list or a vector using sapply or lapply, and I don't understand where the problem is or how to fix it.
If I do:
oligonucleotideFrequency(DNAString(v[1]), width = 2)
It works and I get this output:
AA AC AG AT CA CC CG CT GA GC GG GT TA TC TG TT
 3  2  2  4  1  0  6  3  0  6  4  7  7  2  5  4
But if I do:
v <- DNAString(v)
lapply(v, oligonucleotideFrequency(v, width = 2))
This is what I get:
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘oligonucleotideFrequency’ for signature ‘"list"’
The same occurs with sapply.
If I check the class of v after applying the DNAString function, it returns "list", so I don't get where the problem is.
Even if I do:
oligonucleotideFrequency(v[1], width = 2)
it returns:
Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function ‘oligonucleotideFrequency’ for signature ‘"list"’
I'm totally lost here. Please help, I've spent hours racking my brain over this. How can I fix this problem? I want to apply this function to the whole vector at once.
PS: The R package containing these functions is called Biostrings, and it can be downloaded and installed from Bioconductor.
Thanks in advance.