Questions tagged [dna-sequence]

A string representing the nucleotide sequence of deoxyribonucleic acid (DNA), the molecule that carries an organism's genes.

Deoxyribonucleic acid (DNA) contains the genetic instructions specifying the biological development of all cellular life. DNA consists of two long polymers of simple units called nucleotides.

Single-strand DNA sequences are commonly represented as a string of uppercase letters that correspond to the nucleotide units in the sequence (A, G, C, T). Less commonly, ambiguity codes are also used to specify that several alternative nucleotides are possible at a given position (R - A or G, Y - C or T; see the complete table).
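
As a rough illustration of the ambiguity codes described above, the following Python sketch expands a sequence written with IUPAC nucleotide codes into the plain A/C/G/T sequences it can stand for. The names `IUPAC` and `expand` are illustrative only and do not come from any particular library.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases each one may represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

def expand(sequence):
    """Return every unambiguous sequence matching an IUPAC-coded string."""
    choices = [IUPAC[base] for base in sequence.upper()]
    return ["".join(combo) for combo in product(*choices)]

print(expand("ARY"))  # ['AAC', 'AAT', 'AGC', 'AGT']
```

The number of expansions grows multiplicatively with each ambiguous position, so this is only practical for short sequences.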

A great deal of work in bioinformatics is related to the analysis and comparison of these strings. DNA sequences may be very long, and sets of them may become very large (gigabytes).

475 questions
-1
votes
1 answer

Traceback (most recent call last): IndexError: list index out of range

Traceback (most recent call last): IndexError: list index out of range from sys import argv, exit import csv import sys def main (): if len(sys.argv) < 3: print("Usage: python dna.py data.csv sequence.txt") exit(1) #…
-1
votes
1 answer

Traceback (most recent call last): ValueError: I/O operation on closed file

Traceback (most recent call last): File "dna.py", line 42, in main() ValueError: I/O operation on closed file
-1
votes
2 answers

python and user defined functions

I was learning python coding and was using a function for calculating the gc percentage in a DNA sequence with undefined character N or n (NAAATTTGGGCCCN) and this created the following problem. is there a way to overcome this ? def gc(sequence) : …
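
The excerpt above is truncated, so the exact problem is not shown. One possible approach, assuming the goal is to leave undefined characters such as N or n out of both the GC count and the total, might look like the sketch below. The function name `gc_percent` is hypothetical and is not the asker's code.

```python
def gc_percent(sequence):
    """GC percentage over A/C/G/T only; N, n and any other symbols are skipped."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        # No unambiguous bases at all: avoid dividing by zero.
        return 0.0
    return 100.0 * gc / total

print(gc_percent("NAAATTTGGGCCCN"))  # 50.0
```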
-1
votes
1 answer

How to remove lines that start with the same characters (but are random) in python?

I am trying to remove lines in a file that start with the same 5 characters, however, the first 5 characters are random (I don't know what they will be)? I have a code that reads the last 5 characters of the first line of a file and matches them to…
Alpa Luca
  • 13
  • 5
-1
votes
3 answers
-1
votes
2 answers

encoding binary to DNA sequence in C#

I would like to encode a binary sequence to a DNA sequence using this truth table: 00=A 01=C 10=G 11=T. For example: 11000110 = TACG. I am using C#. My issue is that the DNA sequence is not correctly converted. Can someone help me, please? The code I wrote is this…
safaa
  • 57
  • 7
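
The question above is about C#, and its code is not shown in the excerpt. Purely as an illustration of the truth table it quotes (00=A, 01=C, 10=G, 11=T), here is a minimal Python sketch; `BITS_TO_BASE` and `bits_to_dna` are hypothetical names.

```python
# Truth table from the question: each pair of bits maps to one base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}

def bits_to_dna(bits):
    """Convert a string of '0'/'1' characters to DNA, two bits per base."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    # Walk the string in steps of two and look each pair up in the table.
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))

print(bits_to_dna("11000110"))  # TACG
```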
-1
votes
2 answers

Perl: Assigning a variable one of 3 possible values

I have a DNA sequence. Let's call it "ATCG". I have 2 small databases of DNA sequences in 2 separate files, which we will call "db1.txt" and "db2.txt". Both databases are formatted as follows: >name of sequence EXAMPLESEQUENCEATCGATCG >name of…
Aditya J.
  • 131
  • 2
  • 11
-1
votes
2 answers

splitting up the contents of a single line

I just went through a problem where the input is a string that is a single word. This line is not readable; for example, "I want to leave" is written as "Iwanttoleave". The problem is separating out each of the tokens (words, numbers, abbreviations, …
Geek_To_Learn
  • 1,816
  • 5
  • 28
  • 48
-1
votes
1 answer

Are unplaced genomic scaffolds unique compared to actual chromosomes?

I used UCSC blat to search for a horse genomic sequence. Three results were returned, two were unplaced scaffolds, and the other was chr1. All had 100% identity to my query…
Patrickc01
  • 145
  • 1
  • 1
  • 6
-1
votes
3 answers

Function to store strings as ints

I have a fixed 32 bits in which to store as much DNA as possible. The amount of space required to store 1 character of DNA ('A', 'C', 'G' or 'T') is 2 bits (00, 01, 10, 11, as there are only 4 combinations). To store up to 2 characters, (so, A, C,…
J.J
  • 3,459
  • 1
  • 29
  • 35
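
The excerpt above is cut off, so the asker's exact requirements are unknown. One way to keep the length recoverable, assuming it is acceptable to spend one bit on a marker, is to put a sentinel 1 bit ahead of the 2-bit codes; that leaves room for up to 15 bases in 32 bits. The names `pack`, `unpack` and `MAX_BASES` below are illustrative.

```python
BASE_TO_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_TO_BASE = "ACGT"
MAX_BASES = 15  # 1 sentinel bit + 15 * 2 data bits = 31 bits

def pack(dna):
    """Pack up to 15 bases into an integer that fits in 32 bits."""
    if len(dna) > MAX_BASES:
        raise ValueError("at most 15 bases fit alongside the sentinel bit")
    value = 1  # sentinel bit marks where the sequence starts
    for base in dna:
        value = (value << 2) | BASE_TO_CODE[base]
    return value

def unpack(value):
    """Recover the sequence by reading 2-bit codes until only the sentinel is left."""
    bases = []
    while value > 1:
        bases.append(CODE_TO_BASE[value & 3])
        value >>= 2
    return "".join(reversed(bases))

print(pack("ACGT"))          # 283
print(unpack(pack("ACGT")))  # ACGT
```

Without the sentinel, "A" and "AA" would both encode to 0, which is why a fixed-width scheme without a length marker cannot distinguish sequences of different lengths.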
-1
votes
2 answers

How to handle large text file in spark?

I have a large text file (3 GB) containing a DNA reference. I would like to slice it into parts so that I can handle it. So I want to know how to slice the file with Spark. I currently have only one node with 4 GB of memory.
BePhant
  • 51
  • 1
  • 7
-1
votes
2 answers

cannot find symbol...?

I have to write code that takes in a string of 3 letters and converts it to give the complement DNA (A==T, C==G AND REVERSE) string. Although I think the code is okay, it keeps giving me the same error "cannot find symbol" At the string dna…
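
The "cannot find symbol" error in the question above is a Java compiler message, and the code that triggers it is not shown. As a language-neutral illustration of the operation being described (complement A with T and C with G, then reverse), here is a short Python sketch; `reverse_complement` is a hypothetical name.

```python
# Translation table that swaps each base for its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Complement every base, then reverse the resulting string."""
    return dna.upper().translate(COMPLEMENT)[::-1]

print(reverse_complement("ATG"))  # CAT
```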
-1
votes
4 answers

How to Identify Repetitive Characters in a String Using Python?

I am new to python and I want to write a program that determines if a string consists of repetitive characters. The list of strings that I want to test are: Str1 = "AAAA" Str2 = "AGAGAG" Str3 = "AAA" The pseudo-code that I come up with: WHEN…
MEhsan
  • 2,184
  • 9
  • 27
  • 41
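
The pseudo-code in the excerpt above is truncated, so what counts as "repetitive" is an assumption here: a string is treated as repetitive if it is two or more copies of a shorter unit, which covers all three example strings. A compact sketch using the doubled-string trick follows; `is_repetitive` is an illustrative name.

```python
def is_repetitive(s):
    """True if s is two or more copies of a shorter unit (e.g. 'AGAGAG' = 'AG' * 3)."""
    # If s is built from a repeated unit, s also occurs inside s + s
    # at a position other than the very start or the midpoint.
    return len(s) > 1 and s in (s + s)[1:-1]

for text in ("AAAA", "AGAGAG", "AAA", "ACGT"):
    print(text, is_repetitive(text))
```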
-1
votes
1 answer

TypeError: object of type 'function' has no len()

I'm writing a program that is supposed to take a DNA chain and then change it into an RNA chain, after doing this it's supposed to take the RNA chain and find the amino acids. It seems that my code has a problem on line 30 but I can't find the…
-1
votes
1 answer

aligning DNA sequences and marking a SNP

I have two fasta files. Each file contains sequences of short genomic regions in Rat or Mouse with a species-specific known SNP. File_1…
mbk0asis
  • 99
  • 8