-1

I have a list of DNA sequences and I want to shuffle the letters within each one.
Let's say dna_lst is:

[1] AATTAATTCC
[2] ATCGATCG
[3] TTTAACCCCCGG

I want to generate mixed DNA content such as dna_mix:

[1] TACAATTACT
[2] CATGCTAG
[3] CCTGATCTCGAC

How can I do this in R?
Thanks.

miken32
Cina

2 Answers

1

If what you want is a random shuffle of each sequence, something like this can work:

dna_mix <- sapply(dna_lst, function(dna) paste(sample(strsplit(dna, "")[[1]]), collapse = ""))

> dna_mix
    AATTAATTCC       ATCGATCG   TTTAACCCCCGG 
  "TTTATCAAAC"     "TAACCGGT" "CTACTCACGGTC"

With a list of factors (if each sequence is an element of the list), this should work:

lapply(dna_lst, function(dna) paste(sample(strsplit(as.character(dna), "")[[1]]), collapse = ""))
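Since the shape of `dna_lst` is not known for certain (character vector, factor, or list of factors), the two variants above can be folded into one function. The following is only a sketch under that assumption; `shuffle_seqs` is a hypothetical helper name, not something from the original answers, and it relies only on base R.

```r
# Hypothetical helper: shuffle the letters of each sequence.
# Assumes x is a character vector, a factor, or a list whose
# elements are single strings or length-one factors.
shuffle_seqs <- function(x) {
  # Convert element by element so that factors yield their labels
  # rather than their integer codes.
  seqs <- unlist(lapply(x, as.character), use.names = FALSE)
  vapply(
    strsplit(seqs, ""),                                   # split into single letters
    function(chars) paste(sample(chars), collapse = ""),  # permute and rejoin
    character(1)
  )
}

# Example input stored as a factor, as described in the comments.
dna_lst <- factor(c("AATTAATTCC", "ATCGATCG", "TTTAACCCCCGG"))
set.seed(42)  # only to make the shuffle reproducible
dna_mix <- shuffle_seqs(dna_lst)
print(dna_mix)
```

Each result has the same letters as its input in a random order; `vapply` is used instead of `sapply` so the return type is always a character vector.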

Cath
  • Receiving error: `Error in strsplit(dna, ""): non-character argument`. My dataset is a factor. – Cina Nov 25 '14 at 09:58
  • @Cina, could you edit your question and show your data exactly as it is? That would make it easier to answer correctly. As it stands, it looks like a character vector to me... (the output of dput(dna_lst) would be fine) – Cath Nov 25 '14 at 10:05
  • @Cina, does the `lapply` version work? It would really help to know how your list is built (one sequence per element? one element with all 3 sequences? ...) – Cath Nov 25 '14 at 10:10
1

One possibility:

sapply(strsplit(as.character(dna_lst), ""), function(x) paste(sample(x), collapse = ""))
Sven Hohenstein
  • I receive the error `Error in sapply(strsplit(dna_lst, ""), function(x) paste(sample(x), collapse = "")) : error in evaluating the argument 'X' in selecting a method for function 'sapply': Error in strsplit(dna, "") : non-character argument`. I am using a list of factors, and my dataset is huge. I converted it to character and got the same error again. – Cina Nov 25 '14 at 09:53