Questions tagged [protein-database]

A file containing protein sequences together with corresponding metadata

Classical protein databases are text files containing a large number of protein sequences.

Protein sequences are represented as strings of uppercase letters, each letter corresponding to one amino acid. Each protein sequence is preceded by a header line containing metadata (protein reference number, name, description...).

The standard FASTA format looks like:

>P31946|1433B_HUMAN 14-3-3 protein beta/alpha OS=Homo sapiens GN=YWHAB PE=1 SV=3
MTMDKSELVQKAKLAEQAERYDDMAAAMKAVTEQGHELSNEERNLLSVAYKNVVGARRSS
YEILNSPEKACSLAKTAFDEAIAELDTLNEESYKDSTLIMQLLRDNLTLWTSENQGDEGD
AGEGEN
>P62258|1433E_HUMAN 14-3-3 protein epsilon OS=Homo sapiens GN=YWHAE PE=1 SV=1
MDDREDLVYQAKLAEQAERYDEMVESMKKVAGMDVELTVEERNLLSVAYKNVIGARRASW
YYKMKGDYHRYLAEFATGNDRKEAAENSLVAYKAASDIAMTELPPTHPIRLGLALNFSVF
YYEILNSPDRACRLAKAAFDDAIAELDTLSEESYKDSTLIMQLLRDNLTLWTSDMQGDGE
EQNKEALQDVEDENQ
>.........................................................
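
As a minimal sketch of how such a file can be read, the Python snippet below groups sequence lines under the header line that precedes them. The function name `parse_fasta` and the variable names are hypothetical, chosen only for illustration, and the example data is the two records shown above. It assumes every record starts with a `>` line and that a sequence may wrap over several lines.

```python
from typing import Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines.

    Hypothetical helper for illustration: the header is returned
    without the leading '>' and wrapped sequence lines are joined.
    """
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the last record once the input is exhausted.
    if header is not None:
        yield header, "".join(chunks)


EXAMPLE = """\
>P31946|1433B_HUMAN 14-3-3 protein beta/alpha OS=Homo sapiens GN=YWHAB PE=1 SV=3
MTMDKSELVQKAKLAEQAERYDDMAAAMKAVTEQGHELSNEERNLLSVAYKNVVGARRSS
YEILNSPEKACSLAKTAFDEAIAELDTLNEESYKDSTLIMQLLRDNLTLWTSENQGDEGD
AGEGEN
>P62258|1433E_HUMAN 14-3-3 protein epsilon OS=Homo sapiens GN=YWHAE PE=1 SV=1
MDDREDLVYQAKLAEQAERYDEMVESMKKVAGMDVELTVEERNLLSVAYKNVIGARRASW
YYKMKGDYHRYLAEFATGNDRKEAAENSLVAYKAASDIAMTELPPTHPIRLGLALNFSVF
YYEILNSPDRACRLAKAAFDDAIAELDTLSEESYKDSTLIMQLLRDNLTLWTSDMQGDGE
EQNKEALQDVEDENQ
"""

records = list(parse_fasta(EXAMPLE.splitlines()))
for header, sequence in records:
    # Assumption: the accession is the text before the first '|',
    # as in the example headers above.
    accession = header.split("|")[0]
    print(accession, len(sequence))
```

For real work, an established parser such as Biopython's `Bio.SeqIO.parse(handle, "fasta")` is usually a better choice than a hand-written loop.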

A large amount of work in bioinformatics involves storing (annotating), searching, and analyzing the sequences in these databases.

145 questions
-1
votes
1 answer

Replace values in a column while preserving the format

I have a file (.pdb) that looks like this: ATOM 1 BB MET A 1 4.171 16.195 -18.221 1.00 0.00 B ATOM 2 SC1 MET A 1 0.852 15.586 -20.418 1.00 0.00 S ATOM 3 BB GLU A 3 9.285 12.756…
Zeineb
  • 59
  • 1
  • 1
  • 7
-1
votes
1 answer

Extracting each file from a PDB trajectory

I have a PDB file that represents a trajectory. The file looks like REMARK GENERATED BY TRJCONV TITLE Protein in water t= 400.00000 REMARK THIS IS A SIMULATION BOX CRYST1 99.547 99.547 99.547 90.00 90.00 90.00 P 1 1 MODEL …
-1
votes
1 answer

Where can I download the RS126 protein dataset in *.mat format?

I've been working on a Protein Secondary Structures Prediction Project. I am unable to find the RS 126 dataset online. I found a list of proteins in that database. I am looking for the same proteins after running a PSI BLAST search on them and in…
Xerneas
  • 11
  • 2
-1
votes
1 answer

ProDy for modeling protein structure in Python

Can we use ProDy to model the structure of proteins? Is there any other way we can model the structure of a protein using Python? Thank you
Dan
  • 3
  • 2
-2
votes
1 answer

Signal Peptide Prediction Using Machine Learning

Can anyone please guide me on how to predict the signal peptide from a protein sequence using a machine learning technique? Any guide, reference or tutorial would be very helpful. Thank you in advance.
-2
votes
1 answer

How do I call a Python function without opening the file beforehand?

I'm using python2.7, and have written a few functions for analyzing protein structure files, which I have saved as pdbtools.py One function, for example, is getprot() which lets me pull protein structures from a database. After I open and edit the…
Devinity
  • 377
  • 1
  • 5
  • 17
-3
votes
1 answer

Quantifying hydrophobicity of just the amino acid sequence

fourth-year undergrad here so any help is super appreciated! Also this is not something I am working on for a grade, so pls don't think I am just looking for someone to do my homework lol! In a gist, the project I am currently working on requires me…
-3
votes
1 answer

Getting the protein names and their IDs for a given list of peptide sequences (using Python)

I have a list of peptide sequences that I want to map to the correct protein names from any open database such as UniProt, i.e., find the proteins the peptides belong to. Can someone explain how to find the protein names and map them? Thanks in advance.
Mathew
  • 61
  • 1
  • 8
-4
votes
1 answer

Downloading PDB files from BindingDB

I am trying to find a way to download PDB files of proteins using BindingDB. I have a file with different BindingDB IDs, and I want to download PDB files for every ligand that binds to the protein for each ID. I was using a script to download…
sergio
  • 31
  • 7
-8
votes
2 answers

How could I write the following functions in Matlab for MS protein analysis?

I need your help. I have more than 40000 proteins in fasta file format. First I want to write a function: that is able to calculate the masses of the b- and y-ions that creates a peptide database from the target proteins (mat-file) that…
Michael.Z
  • 37
  • 5