
I have a list of protein IDs and I'm trying to retrieve the corresponding protein sequences from UniProt with Python. I came across the post "Protein sequence from uniprot protein id python", but the code there gives a list of elements rather than the actual sequence:

Code

import requests as r
from Bio import SeqIO
from io import StringIO

cID='P04637'

baseUrl="http://www.uniprot.org/uniprot/"
currentUrl=baseUrl+cID+".fasta"
response = r.get(currentUrl)  # GET is the usual method for retrieving a resource
cData=response.text  # response.text is already a str, so no join is needed

Seq=StringIO(cData)
pSeq=list(SeqIO.parse(Seq,'fasta'))

which gives output:

output

[SeqRecord(seq=Seq('MQAALIGLNFPLQRRFLSGVLTTTSSAKRCYSGDTGKPYDCTSAEHKKELEECY...SSS', SingleLetterAlphabet()), id='sp|O45228|PROD_CAEEL', name='sp|O45228|PROD_CAEEL', description='sp|O45228|PROD_CAEEL Proline dehydrogenase 1, mitochondrial OS=Caenorhabditis elegans OX=6239 GN=prdh-1 PE=2 SV=2', dbxrefs=[])]

How can I get the sequence itself, rather than the list of SeqRecord objects?

1 Answer


[record.seq for record in pSeq]

Edit: the list above contains Seq objects. To get the first record's sequence as a plain string, you'll want str(pSeq[0].seq)
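For a list of protein IDs, the same idea extends to a dictionary keyed by record ID. Below is a minimal sketch, assuming Biopython is installed; the FASTA text, the accession-like IDs, and the variable names (`fasta_text`, `records`, `sequences`) are made up for illustration and stand in for `response.text`, so it runs without a network call:

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical two-record FASTA text standing in for response.text.
# The IDs and sequences below are invented for this example.
fasta_text = """>sp|X00001|DEMO1_TEST hypothetical record one
MQAALIGLNF
PLQRRFLSGV
>sp|X00002|DEMO2_TEST hypothetical record two
MEEPQSDPSV
"""

# SeqIO.parse yields one SeqRecord per FASTA entry.
records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))

# str() converts a Seq object into a plain Python string.
first_sequence = str(records[0].seq)

# Map each record ID to its sequence string.
sequences = {record.id: str(record.seq) for record in records}

print(first_sequence)
```

With a real response you would pass `StringIO(response.text)` instead of the invented text.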

Pallie