Questions tagged [biopython]

Biopython is a set of freely available tools for biological computation written in Python. Please only use this tag for issues relating to the Biopython suite of tools.

Biopython is developed by the Biopython Project, an international association of developers of Python tools for computational molecular biology. It provides a range of bioinformatics functionality, including:

  • Parsing bioinformatics files into data structures usable by Python

  • Interfaces to commonly used bioinformatics programs (BLAST, ClustalW, and EMBOSS, among others)

  • Classes for dealing with DNA, RNA, and protein sequences, including feature annotations

  • Tools for performing common operations on sequences, such as translation, transcription and weight calculations

among many others.
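
As a rough illustration of the sequence operations listed above, here is a minimal plain-Python sketch of what transcription and reverse complement mean for a DNA string. It deliberately does not use Biopython: `transcribe_dna` and `revcomp` are hypothetical helper names for this sketch, not the library's API or implementation, and the input is assumed to contain only the letters A, C, G, and T.

```python
# Plain-Python sketch of two sequence operations; this is NOT Biopython code.
# Assumption: the input contains only unambiguous DNA letters (A, C, G, T).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def transcribe_dna(coding_strand: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_strand.upper().replace("T", "U")


def revcomp(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


dna = "ATGGCCATTGTA"
mrna = transcribe_dna(dna)
reverse_strand = revcomp(dna)
print(mrna)            # AUGGCCAUUGUA
print(reverse_strand)  # TACAATGGCCAT
```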

The biopython tag

Questions with this tag should relate to issues involving the Biopython package of tools.

Learning More

The website http://www.biopython.org provides an online resource for modules, scripts, and web links for developers of Python-based software for life science research. It also hosts a useful wiki.

The Biopython Cookbook provides many examples of Biopython being used as well as installation instructions and a FAQ section.

1345 questions
-1
votes
2 answers

Never-ending loop? Can't get python to stop running

When I try to run this code, it never finishes. I think it's stuck somewhere, but I'm not sure since I am new to Python. import re codon = [] rcodon = [] dataset =…
Lauren
  • 15
  • 3
-1
votes
1 answer

Running BLAST in Python with Biopython for SARS Virus. My output is simply not showing up! Someone check my code?

Here is my code: from Bio.Blast import NCBIWWW result = NCBIWWW.qblast("blastn","nt",r"C:\Users\video\Documents\sars.fasta") save_file = open("blast4.xml", "w") save_file.write(result.read()) save_file.close() result.close() result =…
Johnn-1231
  • 85
  • 1
  • 1
  • 5
-1
votes
1 answer

Counting specific lines that don't contain specific word

I have a question. I have a file like this…
Reda
  • 449
  • 1
  • 4
  • 17
-1
votes
1 answer

How To Convert CSV to GFF3?

I want to convert a .csv file to a GFF3 file. The .csv file contains annotation data. I know that I should parse the .csv file and then write the .gff file, but I don't know the complete code.
eli bio67
  • 1
  • 1
-1
votes
1 answer

How to share Anaconda packages with the user of an HTTP server

I use Ubuntu and run many scripts written in Python 3, which was installed through Anaconda. All the modules I need, such as Biopython, have already been installed there. However, I can't import Biopython in one of my scripts when I try to…
-1
votes
1 answer

How to fix "'generator' object is not subscriptable" error when reading fasta file with BioPython

I am trying to open and read a fasta file and use only the first line from the input. Currently, I'm calling the first line and appending it to a list to use in a later function. However, I'm getting an error that a generator object is not…
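
The error in the title is what Python raises when a generator is indexed with `[0]`. A minimal sketch of two common workarounds follows. It uses a hypothetical stand-in generator, `read_records`, rather than the asker's actual code, which is not shown in this excerpt.

```python
def read_records():
    """Hypothetical stand-in for any function that yields records lazily."""
    yield "seq1"
    yield "seq2"
    yield "seq3"


# Indexing a generator directly raises TypeError.
try:
    read_records()[0]
except TypeError as err:
    message = str(err)

# Workaround 1: take only the first item without reading the rest.
first = next(read_records())

# Workaround 2: materialise everything into a list, then index as usual.
all_records = list(read_records())
also_first = all_records[0]
```

`next()` is the cheaper choice when only the first record is needed, because it stops after one item; `list()` reads the whole input into memory.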
-1
votes
1 answer

Searching for nearest sequence in string

I need to convert contigs into their respective protein sequences given a reference genome (i.e. I need to take a substring, whose position is already known on the string, and I need to locate the nearest start and stop codons - a specific 3 letter…
Thomas
  • 25
  • 3
-1
votes
1 answer

#WatsonStudio and #Biopython and #fasta file saved on #S3 #Objectstorage

I need to read a fasta file uploaded to Cloud Object Storage using Biopython. I have a notebook in Python 2.7 in Watson Studio. Has anyone tried this?
-1
votes
2 answers

How to extract DNA sequences from a fasta file with gene ids in another location

I have created a little program to extract selected ids + sequences from a fasta file. The ids of interest are file names that contain several sequences for that gene. This is the program: import glob, sys, os from Bio import SeqIO, SearchIO from…
Ana
  • 131
  • 1
  • 14
-1
votes
3 answers

Dictionary key with more than one value not printing all values

I have a dictionary with repeated keys but different values for those keys, and I want to pull all values for a specific key. Here is the abbreviated version of what I mean: x_table = {'A':'GCT','A':'GCC','A':'GCA','A':'GCG'} AA_list = [{'A'}] for…
dk09
  • 87
  • 1
  • 1
  • 7
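
A Python dict keeps only one value per key, so a literal with repeated keys silently retains just the last one. A minimal sketch of this behaviour, plus one common alternative that maps each key to a list of values; the names `codons_by_aa` and `pairs` are hypothetical and not taken from the question:

```python
from collections import defaultdict

# Repeated keys in a dict literal: later entries overwrite earlier ones.
x_table = {'A': 'GCT', 'A': 'GCC', 'A': 'GCA', 'A': 'GCG'}
print(len(x_table))   # 1
print(x_table['A'])   # GCG

# One alternative: collect every value for a key in a list.
pairs = [('A', 'GCT'), ('A', 'GCC'), ('A', 'GCA'), ('A', 'GCG')]
codons_by_aa = defaultdict(list)
for amino_acid, codon in pairs:
    codons_by_aa[amino_acid].append(codon)

print(codons_by_aa['A'])  # ['GCT', 'GCC', 'GCA', 'GCG']
```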
-1
votes
2 answers

How to get an alignment score from DNA sequences?

I'm somewhat familiar with Biopython's pairwise2 function but I noticed that it adds dashes within the sequence in order to obtain the best possible alignment score. For example, for a in pairwise2.align.globalxx("ACCGT", "ACG"): …
superasiantomtom95
  • 521
  • 1
  • 7
  • 25
-1
votes
1 answer

Getting protein FASTA sequence based on keyword with python

I would like to gather protein FASTA sequences from Entrez with Python 2.7. I am looking for any proteins that have the keywords "terminase" and "large" in their name. So far I have this code: from Bio import Entrez Entrez.email =…
tahunami
  • 141
  • 1
  • 7
-1
votes
2 answers

Align with Muscle (BioPython)

I'm trying to align some example sequences from opuntia.fasta (from the Biopython manual) with Muscle: from Bio.Align.Applications import MuscleCommandline in_file = "C:/Try/opuntia.fasta" out_file = "C:/Try/aligned.fasta" muscle_exe = "C:/Program…
lizaveta
  • 353
  • 1
  • 13
-1
votes
1 answer

Making a function to turn quality strings into a list of Phred scores

I'm new to Python coding, and I am having trouble making a function that turns a quality string into a list of PHRED-scaled quality scores. Hoping to get some assistance. Here is a FASTQ…
john.doe
  • 11
  • 1
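
A minimal sketch of such a conversion, assuming the quality string uses the Phred+33 (Sanger) ASCII encoding; the question excerpt does not say which encoding the file uses, so the offset is an assumption, and `quality_to_phred` is a hypothetical name:

```python
def quality_to_phred(quality: str, offset: int = 33) -> list[int]:
    """Convert a FASTQ quality string to a list of Phred scores.

    Assumes each character encodes (score + offset) as its ASCII code;
    offset=33 corresponds to the Phred+33 convention.
    """
    return [ord(char) - offset for char in quality]


scores = quality_to_phred("II5+!")
print(scores)  # [40, 40, 20, 10, 0]
```

For files in the older Phred+64 encoding, the same function could be called with `offset=64`.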
-1
votes
1 answer

Where can I download the RS126 protein dataset in *.mat format?

I've been working on a protein secondary structure prediction project. I am unable to find the RS126 dataset online. I found a list of proteins in that database. I am looking for the same proteins after running a PSI-BLAST search on them and in…
Xerneas
  • 11
  • 2