Questions tagged [dna-sequence]

A string representing the nucleotide sequence of deoxyribonucleic acid (DNA), the molecule that carries an organism's genes.

Deoxyribonucleic acid (DNA) contains the genetic instructions specifying the biological development of all cellular life. DNA consists of two long polymers of simple units called nucleotides.

Single-stranded DNA sequences are commonly represented as strings of uppercase letters that correspond to the nucleotide units in the sequence (A, G, C, T). Less commonly, ambiguity codes are also used to specify that several alternative nucleotides are possible at a given position (R - A or G, Y - C or T; see the complete table).
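
As a rough illustration of this representation, the Python sketch below stores the standard IUPAC nucleotide codes in a dictionary, checks whether a string uses only those symbols, and expands an ambiguous sequence into every concrete sequence it could stand for. The names `IUPAC_CODES`, `is_valid_dna`, and `expand_ambiguous` are made up for this example and do not come from any particular library.

```python
from itertools import product

# IUPAC nucleotide codes: each symbol maps to the bases it may represent.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def is_valid_dna(seq):
    """Return True if every character is a plain base or an IUPAC ambiguity code."""
    return all(ch in IUPAC_CODES for ch in seq.upper())


def expand_ambiguous(seq):
    """List every unambiguous sequence that the (possibly ambiguous) input can represent."""
    choices = [IUPAC_CODES[ch] for ch in seq.upper()]
    return ["".join(bases) for bases in product(*choices)]


print(expand_ambiguous("ARY"))  # ['AAC', 'AAT', 'AGC', 'AGT']
```

The number of expansions grows multiplicatively with each ambiguous position, so full expansion is only practical for short sequences such as primers or motifs.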

A great deal of work in bioinformatics involves the analysis and comparison of these strings. Individual DNA sequences may be very long, and sets of sequences may grow very large (gigabytes).

475 questions
4
votes
2 answers

Analyze tandem repeat motifs in DNA sequences

Hy Py-guys :). Since I am new in the coding world and as well in Python, I don’t have much experience with coding and thus any help would be appreciated. I am working with short tandem repeats in DNA sequences and I would like to have a code that…
Majkl
  • 75
  • 7
4
votes
1 answer

Find all repeated 4-mers in a DNA Sequence - Perl

Hello, I try to write a program that reads in a FASTA-formatted file containing multiple DNA sequences, identifies all repeated 4-mers (i.e., all 4-mers that occur more than once) in a sequence, and prints out the repeated 4-mer and the header of…
ic23oluk
  • 125
  • 1
  • 9
4
votes
4 answers

Converting nucleotides to amino acids using JavaScript

I'm creating a Chrome Extension that converts a string of nucleotides of length nlen into the corresponding amino acids. I've done something similar to this before in Python but as I'm still very new to JavaScript I'm struggling to translate that…
happy
  • 61
  • 12
4
votes
3 answers

sequence logos in matplotlib: aligning xticks

I am trying to draw sequence logos using matplotlib. The entire code is available on gist The relevant portion is: class Scale(matplotlib.patheffects.RendererBase): def __init__(self, sx, sy=None): self._sx = sx self._sy = sy …
rightskewed
  • 624
  • 2
  • 11
  • 24
4
votes
1 answer

Computing edit distance of DNA sequence python

So I am given the task of aligning the lowest cost between 2 DNA sequences. One of the failing inputs is: CGCAATTCTGAAGCGCTGGGGAAGACGGGT & TATCCCATCGAACGCCTATTCTAGGAT The proper alignment costs 24, but I am getting a cost of 23. I have to read…
Dringo
  • 255
  • 1
  • 2
  • 13
4
votes
1 answer

Heatmap.2: add row/column labels on left/top without hard coding coordinates

I'm trying to recreate a heatmap, using heatmap.2, similar to this(1): I'm able to add the "A C G T" labels to the bottom column and right row labels. I'm trying to add "group" names to the top and left axis ("1012T3" etc. and "G>A" etc). I've…
clfougner
  • 173
  • 1
  • 11
4
votes
2 answers

replace partial of character string in a data frame by conditions in r

I have a data frame like this: df = read.table(text="REF Alt S00001 S00002 S00003 S00004 S00005 TAAGAAG TAAG TAAGAAG/TAAGAAG TAAGAAG/TAAG TAAG/TAAG TAAGAAG/TAAGAAG TAAGAAG/TAAGAAG T TG T/T -/- TG/TG T/T T/T CAAAA CAAA …
user3354212
  • 1,048
  • 8
  • 19
4
votes
3 answers

creating complement of DNA sequence and reversing it C++

So I am trying to create the complement of the sequence TGAGACTTCAGGCTCCTGGGCAACGTGCTGGTCTGTGTGC however my output didn't work as expected. The complements for each letter in the sequence are A -> T G -> C C -> G T -> A I've been programming…
Juan Battini
  • 115
  • 3
  • 14
4
votes
1 answer

Multiple Sequence Alignment with Unequal String Length

I need a methodology for creating a consensus sequence out of 3 - 1000 short (10-20bp) nucleotide ("ATCG") reads of varying lengths. A simplified example: "AGGGGC" "AGGGC" "AGGGGGC" "AGGAGC" "AGGGGG" Should result in a consensus sequence of…
4
votes
2 answers

Collapse a list of DNAStringSets into a single DNAStringSet in order to apply writeXStringSet() and turn it into a FASTA file in R

Using R for bioinformatics here: I have a list of DNAStringSets (seen below) and want to use the writeXStringSet() function, which takes a DNAStringSet object as an argument, in order to save it as a FASTA file. Does anyone know how it is possible to collapse…
NEWSCIENT
  • 57
  • 1
  • 3
4
votes
3 answers

python script for robust multi-array average on microarray data

I have tried google with no luck. I have seen some weak references to robust multi-array averaging done with python but no code. I am not so interested in reinventing the wheel. Any suggestions on a python module, script .... If I could find a nice…
Vincent
  • 1,579
  • 4
  • 23
  • 38
3
votes
5 answers

separate the abnormal reads of DNA (A,T,C,G) templates

I have millions of DNA clone reads and few of them are misreads or error. I want to separate the clean reads only. For non biological background: DNA clone consist of only four characters (A,T,C,G) in various permutation/combination. Any character,…
shivam
  • 596
  • 2
  • 9
3
votes
0 answers

How to efficiently find almost identical substrings of a specified length in a collection of strings?

My question is similar to How to efficiently find identical substrings of a specified length in a collection of strings Let's assume that I have t strings, each one is at length n and I need to find a substring at length k that has at most one index…
3
votes
0 answers

"Killed" message while using Velvet to assemble SRA reads

I'm having some trouble using Velveth to assemble reads downloaded from the NCBI SRA. The command I used was: velveth velvet 27 -fastq -shortPaired -interleaved /home/bilalm/H_glaber_quality_filtering/AfterQC/good_reads/SRR530529.good.fq (velvet -…
Billy
  • 69
  • 5
3
votes
3 answers

How to create a bar graph of microbiota data with one color for higher taxonomic rank and gradient color

I have a Phyloseq object with my OTU table and TAX table. I would like to create a bar plot, at for instance family level, but families belonging to the same Phylum will be displayed with the same colour and be distinguished by a gradient of this…
Thibault
  • 31
  • 1
  • 4