Questions tagged [biopython]

Biopython is a set of freely available tools for biological computation written in Python. Please only use this tag for issues relating to the Biopython suite of tools.

Biopython is developed by The Biopython Project, an international association of developers of Python tools for computational molecular biology. It includes a range of bioinformatics functionality, such as:

Parsing bioinformatics files into data structures usable by Python
Interfaces to commonly used bioinformatics programs (BLAST, ClustalW, and EMBOSS, among others)
Classes for dealing with DNA, RNA, and protein sequences, including feature annotations
Tools for performing common operations on sequences, such as translation, transcription, and weight calculations

amongst many, many others.

The `biopython` tag

Questions with the biopython tag should relate to issues involving the Biopython package of tools.

Learning More

The web site http://www.biopython.org provides an online resource for modules, scripts, and web links for developers of Python-based software for life science research. It also has a useful wiki site.

The Biopython Cookbook provides many examples of Biopython being used as well as installation instructions and a FAQ section.

1345 questions

5 answers

Protein sequence from uniprot protein id python

I was wondering if there is a way to get protein sequences from UniProt protein IDs. I checked a few online tools, but they only return one sequence at a time, and I have 5536 values. Is there any package in Biopython to do this?

python bioinformatics biopython

asked Sep 29 '18 at 15:04

AST

1 answer

"invalid sequence" error in seqio.write() of biopython

This question is related to bioinformatics. I did not receive any suggestions in the corresponding forums, so I am asking here. I need to remove non-ACTG nucleotides from a FASTA file and write the output to a new file using SeqIO from Biopython. My code is…

biopython

asked Jul 11 '17 at 16:30

Hrant
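
The filtering step described in the question above needs no special library. The following is a minimal sketch, not the asker's code: it assumes sequences are plain strings and that "non-ACTG" means any character other than A, C, G, or T in either case. The helper name `strip_non_acgt` is hypothetical and is not part of Biopython; reading and writing records with SeqIO is not shown.

```python
import re

# Matches any character that is not A, C, G, or T (case-insensitive).
_NON_ACGT = re.compile(r"[^ACGT]", re.IGNORECASE)


def strip_non_acgt(sequence: str) -> str:
    """Return the sequence with every non-ACGT character removed."""
    return _NON_ACGT.sub("", sequence)


print(strip_non_acgt("ACGTNNRYacgt-"))
```

Removing characters shortens the sequence, so any per-base annotations attached to a record (quality scores, for example) would need separate handling.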

1 answer

Issue with parsing publication data from PubMed with Entrez

I am trying to use Entrez to import publication data into a database. The search part works fine, but when I try to parse: from Bio import Entrez def create_publication(pmid): handle = Entrez.efetch("pubmed", id=pmid, retmode="xml") …

python bioinformatics biopython pubmed

asked Dec 22 '16 at 15:42

apiljic

1 answer

Phylo BioPython building trees

I am trying to build a tree with the Biopython Phylo module. What I've done so far is this image: each name has a four-digit number followed by - and a number; this number refers to the number of times that sequence is represented. That means 1578 - 22,…

python numpy graphviz biopython

asked Oct 29 '10 at 11:36

psoares

2 answers

Biopython parse from variable instead of file

import gzip import io from Bio import SeqIO infile = "myinfile.fastq.gz" fileout = open("myoutfile.fastq", "w+") with io.TextIOWrapper(gzip.open(infile, "r")) as f: line = f.read() fileout.write(line) fileout.seek(0) count = 0 for rec in…

python biopython fasta

asked Jul 13 '16 at 17:28

Stuber

2 answers

How can I extract the abstract from efetch (Biopython, Entrez)?

I am new to Python and would like to extract abstracts from PubMed using the Entrez system from the Bio package. I got esearch to give me my UIDs (stored in my_list_ges), and I can also download an entry using efetch. Now, however, the result is…

python biopython pubmed

asked Mar 18 '16 at 15:02

MaxS

1 answer

Can Biopython perform Seq.find() accounting for ambiguity codes

I want to be able to search a Seq object for a subsequence Seq object, accounting for ambiguity codes. For example, the following should be true: from Bio.Seq import Seq from Bio.Alphabet.IUPAC import IUPACAmbiguousDNA amb = IUPACAmbiguousDNA() s1 =…

python bioinformatics biopython

asked Aug 24 '15 at 22:46

Malonge
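
One way to approach an ambiguity-aware search is to turn the query into a regular expression. The following is a minimal sketch under stated assumptions, not a Biopython API: it works on plain strings, expands ambiguity codes in the query only, and assumes the subject contains unambiguous bases. The table `IUPAC_DNA` and the function `ambiguous_find` are hypothetical names written out for illustration.

```python
import re

# IUPAC nucleotide ambiguity codes, written out by hand for this sketch.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def ambiguous_find(subject: str, query: str) -> int:
    """Return the index of the first match of query in subject, or -1.

    Each query base becomes a character class of the bases it can stand for.
    """
    pattern = "".join(f"[{IUPAC_DNA[base]}]" for base in query.upper())
    match = re.search(pattern, subject.upper())
    return match.start() if match else -1


print(ambiguous_find("GGATCGTAA", "ATCR"))
```

The return convention mirrors `str.find`: a zero-based index, or -1 when nothing matches.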

3 answers

Frequencies not adding up to one

I am writing a function that is supposed to go through a .fasta file of DNA sequences and create a dictionary of nucleotide (nt) and dinucleotide (dnt) frequencies for each sequence in the file. I am then storing each dictionary in a list called…

python python-2.7 biopython

asked May 27 '15 at 16:28

Bantha
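
The cause of the asker's discrepancy is not visible in the excerpt above. As a point of comparison, here is a minimal sketch of nucleotide and dinucleotide frequencies that each sum to one, assuming sequences are plain strings and that dinucleotides are counted over overlapping windows. One possible pitfall is dividing dinucleotide counts by the sequence length rather than by the number of windows (length minus one). The function names are hypothetical.

```python
from collections import Counter


def nucleotide_frequencies(sequence: str) -> dict:
    """Map each nucleotide to its fraction of the sequence."""
    counts = Counter(sequence)
    total = len(sequence)
    return {base: n / total for base, n in counts.items()}


def dinucleotide_frequencies(sequence: str) -> dict:
    """Map each overlapping dinucleotide to its fraction of all windows."""
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    counts = Counter(pairs)
    total = len(pairs)  # len(sequence) - 1 windows, not len(sequence)
    return {pair: n / total for pair, n in counts.items()}


nt = nucleotide_frequencies("AACGT")
dnt = dinucleotide_frequencies("AACGT")
print(sum(nt.values()), sum(dnt.values()))
```

Because the values are floats, sums should be compared to one with a small tolerance rather than with `==`.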

3 answers

Convert FASTA to GenBank

Is there a way to use Biopython to convert FASTA files to GenBank format? There are many answers on how to convert from GenBank to FASTA, but not the other way around.

biopython fasta genbank

asked May 12 '15 at 03:59

Ricky Su

1 answer

Trying to parallelize a python algorithm using multithreading and avoiding GIL restrictions

I am implementing an algorithm in Python using Biopython. I have several alignments (sets of sequences of equal length) stored in FASTA files. Each alignment contains between 500 and 30000 seqs and each sequence is about 17000 elements long. Each…

python multithreading bioinformatics biopython gil

asked Dec 29 '14 at 13:28

Francisco Merino

4 answers

how to extend ambiguous dna sequence

Let's say you have a DNA sequence like this : AATCRVTAA where R and V are ambiguous values of DNA nucleotides, where R represents either A or G and V represents A, C or G. Is there a Biopython method to generate all the different combinations of…

python biopython dna-sequence

asked Dec 18 '14 at 17:05

jrjc
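
The expansion asked about above can be sketched in plain Python with `itertools.product`. This is an illustration under stated assumptions, not a Biopython method: the ambiguity table `IUPAC_DNA` is written out by hand, and `expand_ambiguous` is a hypothetical helper name.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes, written out by hand for this sketch.
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_ambiguous(sequence: str) -> list:
    """Return every unambiguous sequence the input could represent."""
    choices = [IUPAC_DNA[base] for base in sequence.upper()]
    return ["".join(combo) for combo in product(*choices)]


for candidate in expand_ambiguous("AATCRVTAA"):
    print(candidate)
```

For the sequence in the question, R contributes two options and V contributes three, so the sketch yields six sequences. The number of results grows multiplicatively with each ambiguous position, so long sequences with many N codes can become impractical to enumerate.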

3 answers

Installation of biopython - python 3.3 not found in registry

I am trying to install Biopython to run with Python 3.3 on a Windows 7 computer. I have downloaded the Biopython executable biopython-1.61.win32-py3.3-beta.exe. When I attempt to run the executable, however, I get the message "Python version 3.3 is…

python windows-7 registry biopython

asked Mar 01 '13 at 15:33

gwilymh

2 answers

Biopython class instance - output from Entrez.read: I don't know how to manipulate the output

I am trying to download some XML from PubMed - no problems there, Biopython is great. The problem is that I do not really know how to manipulate the output. I want to put most of the parsed XML into a SQL database, but I'm not familiar with the…

python class biopython

asked Jul 04 '12 at 04:00

PandaFacklerWeen

4 answers

Split a multifasta file to files with the same number of accession numbers

I have a file with thousands of accession numbers that looks like this: >NC_033829.1 Kallithea virus isolate DrosEU46_Kharkiv_2014, complete genome AGTCAGCAACGTCGATGTGGCGTACAATTTCTTGATTACATTTTTGTTCCTAACAAAATGTTGATATACT >NC_020414.2 Escherichia…

python bash awk biopython

asked Jul 25 '21 at 19:36

LDT

1 answer

How do I make more efficient code for a search for multiple strings in column in pandas

I am a newly self-taught (minus one class on the very basics) programmer working for a bio lab. I have a script that goes through RNA-seq data from two different cell types and runs a t-test if in another dataset. It worked for this application but the…

python pandas bioinformatics biopython

asked Jan 22 '20 at 20:55

David William Turnell
