Questions tagged [dna-sequence]

A string representing the nucleotide sequence of deoxyribonucleic acid (DNA), the molecule that carries an organism's genes.

Deoxyribonucleic acid (DNA) contains the genetic instructions specifying the biological development of all cellular life. DNA consists of two long polymers of simple units called nucleotides.

Single-strand DNA sequences are commonly represented as a string of uppercase letters that correspond to the nucleotide units in the sequence (A, G, C, T). Less often, ambiguity codes are also used to specify that several alternative nucleotides are possible at a given position (R - A or G, Y - C or T, see complete table).
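As a rough illustration of the representation described above, here is a minimal Python sketch that checks whether a string is a valid sequence and turns a pattern containing ambiguity codes into a regular expression. The names `IUPAC`, `is_valid_sequence` and `ambiguity_to_regex` are made up for this example, and the mapping assumes the standard IUPAC nucleotide codes; treat it as a sketch rather than a reference implementation.

```python
import re

# Assumed mapping: standard IUPAC nucleotide codes to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def is_valid_sequence(seq, allow_ambiguity=True):
    """Return True if seq is non-empty and uses only allowed letters."""
    alphabet = IUPAC.keys() if allow_ambiguity else "ACGT"
    return bool(seq) and all(base in alphabet for base in seq.upper())


def ambiguity_to_regex(pattern):
    """Compile a pattern with ambiguity codes into a regex over A, C, G, T."""
    parts = []
    for code in pattern.upper():
        bases = IUPAC[code]  # raises KeyError for letters outside the table
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("".join(parts))
```

For example, `ambiguity_to_regex("GAY")` builds the pattern `GA[CT]`, which matches both `GAC` and `GAT` but not `GAA`.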

A great deal of work in bioinformatics is related to the analysis and comparison of these strings. DNA sequences may be very long, and sets of them may get very large (gigabytes).

475 questions
-3
votes
1 answer

C++ Incompatible types: calculating allele frequencies

Here is what the input file looks like: 1-1_Sample 1 GCCCATGGCT 2-1_Sample 1 GAGTGTATGT 3-1_Sample 1 TGTTCTATCT 1-1_Sample 2 GCTTAGCCAT 2-1_Sample 2 TGTAGTCAGT 3-1_Sample 2 GGGAACCAAG 1-1_Sample 3 TGGAAGCGGT 2-1_Sample…
-3
votes
4 answers

How do I extract DNA sequences from a text file without reading line by line?

I'm trying to extract a DNA sequence from a text file and store it. I can do it using the following code, but it's not the best way because I'm reading the text file line by line. I'm wondering if there's an easier way to find each of the DNA…
Conor C
  • 5
  • 3
-4
votes
5 answers

Finding regular expression with at least one repetition of each letter

From any *.fasta DNA sequence (only 'ACTG' characters) I must find all sequences which contain at least one repetition of each letter. For example from sequence 'AAGTCCTAG' I should be able to find: 'AAGTC', 'AGTC', 'GTCCTA', 'TCCTAG', 'CCTAG' and…
-4
votes
2 answers

How to represent DNA sequences for neural networks?

I want to build a neural network to classify splice junctions in DNA sequences in Python. Right now, I just have my data in strings (for example, "GTAACTGC"). I am wondering about the best way to encode this in a way that I can process with a neural…
-4
votes
1 answer

How can I find the complement of a subset of a DNA sequence using a logical index?

I have a DNA sequence, its length for example is m*4n: B = 'GATTAACTACACTTGAGGCT...'; I have also a vector of real numbers X = {xi, i = 1..m*4n}, and use mod(X,1) to keep them in the range [0,1]. For example: X = [0.223 0.33 0.71 0.44 0.91 0.32…
M.A.Fathy
  • 23
  • 3
-4
votes
1 answer

I cannot process chars

I am writing code that translates a DNA sequence! The program imports a string called shortDNA (for example ATCGGA) and has to translate it (specifically to TAGCCT), but for some reason it gives the shortDNA string that it imports (in this case…
Gamio
  • 1
  • 5
-5
votes
3 answers

How can I identify a valid DNA sequence?

I'm just starting to learn how to code in python, applying it to the bioinformatics field. Nevertheless, I'm having troubles with the next program: First you introduce a dna sequence (made from g, c, t, a, and n), with the command dna=input("enter…
-5
votes
1 answer

How to count how many times a character occurs in a string?

I have a DNA sequence as my argument. sequence<-c("ATGAATTTTGATTTA") I want to find how many times ATG and the other codons occur. The 64 codons, which code for specific amino acids, are codon <- list(ATA = "I", ATC = "I", ATT = "I", ATG = "M", ACA =…
-6
votes
1 answer

How to extract unique string of characters from line of text file?

I have a big text file whose lines are composed in this format: Query: 1586 cccaagatgagctgcagccccccagagagagctctgcacgtcaccaagtaaccaggcccc 1645 Sbjct: 27455708 cccaagatgagctgcagccccccagagagagctctgcacgtcaccaagtaaccaggcccc 27455649 Query: 1646 …
Peter
  • 15
  • 4
-8
votes
1 answer

Regular expression matching the DNA sequence of a viable life form

I'm looking for a regular expression matching the DNA sequence of a viable life form. I'm not very particular about the definition of "viable life form", as long as it's able to hatch out of an egg and survive for a couple of minutes, I'm fine with…
PiRK
  • 893
  • 3
  • 9
  • 24