Questions tagged [dna-sequence]

A string representing the nucleotide sequence of deoxyribonucleic acid (DNA), the molecule that carries an organism's genes.

Deoxyribonucleic acid (DNA) contains the genetic instructions specifying the biological development of all cellular life. DNA consists of two long polymers of simple units called nucleotides.

Single-strand DNA sequences are commonly represented as strings of uppercase letters that correspond to the nucleotide units in the sequence (A, G, C, T). Less commonly, ambiguity codes are used to specify that several alternative nucleotides are possible at a given position (R - A or G, Y - C or T; see complete table).
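As a rough illustration of how ambiguity codes work, the sketch below expands a sequence containing them into every plain A/C/G/T sequence it can stand for. The mapping follows the standard IUPAC nucleotide codes; the function name `expand_ambiguous` is a hypothetical helper chosen for this example, not part of any library.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def expand_ambiguous(seq):
    """Return every unambiguous sequence that matches seq (hypothetical helper)."""
    # One string of candidate bases per position in the input.
    choices = [IUPAC[base] for base in seq.upper()]
    # The Cartesian product picks one candidate per position.
    return ["".join(combo) for combo in product(*choices)]

print(expand_ambiguous("ARY"))  # ['AAC', 'AAT', 'AGC', 'AGT']
```

The number of results grows multiplicatively with each ambiguous position, so for long sequences it is usually better to iterate over `product` lazily than to build the full list.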

A great deal of work in bioinformatics involves the analysis and comparison of these strings. Individual DNA sequences may be very long, and sets of them may get very large (gigabytes).
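One common way to compare two sequence strings is the Levenshtein (edit) distance: the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. The following is a minimal sketch of the usual dynamic-programming approach, keeping only two rows of the table; the function name `levenshtein` is an assumption for this example, and for gigabyte-scale data a specialised library or a different algorithm would likely be needed.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (illustrative sketch)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]

print(levenshtein("GATTACA", "GACTATA"))  # 2
```

Running time is proportional to the product of the two lengths, while memory stays proportional to the length of the second string.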

Related tags:

levenshtein-distance

475 questions

votes

3 answers

minimum length window in string1 where string2 is subsequence

The main DNA sequence (a string) is given (say, string1), along with another string to search for (say, string2). You have to find the minimum-length window in string1 in which string2 is a subsequence. string1 = "abcdefababaef" string2 = "abf" Approaches that i…

algorithm window dynamic-programming dna-sequence subsequence

asked Aug 28 '14 at 09:25

Shweta

1,111
3
15
30

votes

4 answers

How can I reverse complement a multiple sequence fasta file with python?

I am new to python and I am trying to figure out how to read a fasta file with multiple sequences and then create a new fasta file containing the reverse complement of the sequences. The file will look something…

python while-loop fasta dna-sequence

asked Mar 03 '14 at 22:22

scooterdude32

votes

2 answers

Increase string overlap matrix building efficiency

I have a huge list (N = ~1million) of strings 100 characters long that I'm trying to find the overlaps between. For instance, one string might be XXXXXXXXXXXXXXXXXXAACTGCXAACTGGAAXA (and so on) I need to build an N by N matrix that contains the…

c++ performance dna-sequence

asked Mar 02 '14 at 21:15

Dustin

6,783
4
36
53

votes

2 answers

Commercial databases adept in storing biological sequences

Which commercial databases are adept in storing biological sequences like Protein/DNA sequence? Are there any which were designed specifically to store such sequences? cheers

database dna-sequence protein-database

asked Feb 04 '10 at 17:47

Arnkrishn

29,828
40
114
128

votes

5 answers

Translating a cDNA to amino acids using Perl

So I am trying to translate a complementary strand of DNA to its respective amino acids. So far I have this code: #!/usr/bin/perl open (INFILE, "sumaira2.out"); open (OUTFILE3, ">>sumaira3.out"); %aacode = ( TTT => "F", TTC => "F", TTA => "L",…

arrays perl hashtable dna-sequence

asked Feb 04 '14 at 02:48

user3268152

votes

1 answer

How to use as.DNAbin{ape} with DNA sequences stored in a dataframe?

I have a dataframe with loci names in one column and DNA sequences in the other. I'm trying to use as.DNAbin{ape} or similar to create a DNAbin object. Here is some example data: x <- structure(c("55548", "43297", "35309", "34468",…

r dna-sequence ape-phylo

asked Jan 14 '14 at 12:30

A.Mstt

votes

4 answers

Codon alignment via Python?

I have pairs of coding DNA sequences on which I wish to perform pairwise codon alignments via Python. I have "half completed" the process. So far: I retrieve pairs of orthologous DNA sequences from genbank using the Biopython package. I translate the…

python bioinformatics biopython dna-sequence sequence-alignment

asked Dec 30 '13 at 16:52

hello_there_andy

2,039
2
21
51

votes

1 answer

Generate all possible dna sequences from a few given sets

I have been trying to wrap my head around this for a while now but have not been able to come up with a good solution. Here goes: Given a number of sets: set1: A, T set2: C set3: A, C, G set4: T set5: G I want to generate all possible sequences…

ruby set cartesian-product dna-sequence

asked Nov 24 '09 at 15:58

reprazent74

votes

2 answers

Compute transitive closure

awk bioinformatics dna-sequence transitive-closure

asked Jan 10 '13 at 20:52

bala

votes

1 answer

Concatenation in C with 2D char array

I am reading in a textfile line by line into a 2D array. I want to concatenate the char arrays so I have one long char array. I am having trouble with this, I can get it to work with two char arrays but when I try to do a lot of them I go…

c char concatenation dna-sequence

asked Dec 09 '12 at 01:24

Ben Fossen

votes

5 answers

Looking for elegant glob-like DNA string expansion

I'm trying to make a glob-like expansion of a set of DNA strings that have multiple possible bases. The base of my DNA strings contains the letters A, C, G, and T. However, I can have special characters like M which could be an A or a C. For…

python permutation glob dna-sequence

asked Jul 08 '09 at 14:28

Rich

12,068
9
62
94

vote

4 answers

Using Perl to iterate through a string 3 positions at a time

I have written the following code in Perl. I want to iterate through a string 3 positions (characters) at a time. If TAA, TAG, or TGA (stop codons) appear, I want to print till the stop codons and remove the rest of the…

regex string perl dna-sequence

asked Apr 01 '12 at 15:12

zock

vote

1 answer

Multiple mismatches in DNA search sequence regex

I have written this barbaric script to create permutations of a string of characters that contain n (up to n=4) $'s in all possible combinations of positions within the string. I will eventually .replace('$','(\\w)') to use for mismatches in a dna…

python regex biopython dna-sequence

asked Dec 14 '11 at 17:51

jhjudd

vote

2 answers

Regex: extracting DNA info between 2 markers

I'm trying to extract some DNA info from a file. Before the DNA data consisting of bases GCAT there is the word ORIGIN, and after there is a //. How do I write a regular expression to get these bases between these markers? I have tried the following…

java regex dna-sequence

asked Dec 07 '11 at 15:13

user1044585

vote

0 answers

Using msa package in R and it is crashing

I am running the msa package to create a DNA alignment for the phangorn package and it crashes with this error I am running RStudio with R v4.3.1 on an M1 Mac Book Pro mult <- msa(seqs, method="Muscle", type="dna", order="input") That results…

r dna-sequence

asked Jul 11 '23 at 15:55

Julian-marchesi
