I wonder how multiple PDB structures can be written to a single PDB file using the Biopython
libraries. The documentation covers reading files that hold multiple structures, such as NMR structures, but I cannot find anything on writing them. Does anybody have an idea?
3 Answers
Yes, you can. It's documented here.
Imagine you have a list of structure objects; let's name it structures. You might want to try:
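from Bio import PDB

pdb_io = PDB.PDBIO()
target_file = 'all_struc.pdb'
# Open the output file once and write every structure into it
with open(target_file, 'w') as open_file:
    for struct in structures:
        # struct[0] is the first model of each structure
        pdb_io.set_structure(struct[0])
        pdb_io.save(open_file)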
That is the simplest solution to this problem. Some important things:
- Different protein crystal structures have different coordinate systems, so you will probably need to superimpose them, or apply some other transformation, before comparing them.
- In pdb_io.set_structure you can pass an entity, a chain, or even a bunch of atoms.
- pdb_io.save takes a second argument, which is an instance of a Select class. It will help you remove waters, heteroatoms, unwanted chains...
Be aware that NMR structures contain multiple models. You might want to select one. Hope this helps.
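To illustrate the Select argument mentioned above, here is a minimal sketch. The names ProteinOnly and save_protein_only are made up for this example; only PDBIO, Select, and the accept_residue hook come from Biopython. It assumes that "unwanted" means every residue with a non-blank hetero flag (waters and other HETATM records).

```python
from Bio.PDB import PDBIO, Select


class ProteinOnly(Select):
    """Hypothetical filter: keep ordinary residues, drop waters and heteroatoms."""

    def accept_residue(self, residue):
        # residue.id is (hetero_flag, sequence_number, insertion_code);
        # the hetero flag is a single space for residues read from ATOM records.
        return residue.id[0] == " "


def save_protein_only(entity, target):
    """Write `entity` to a path or an open text handle, skipping hetero residues."""
    pdb_io = PDBIO()
    pdb_io.set_structure(entity)
    pdb_io.save(target, select=ProteinOnly())
```

Select also offers accept_model, accept_chain, and accept_atom, so the same pattern could be used to keep a single chain or a single NMR model.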

Thank you, mithrado, but this does not solve my problem. – Exchhattu Apr 04 '17 at 17:58
Mithrado's solution may not actually achieve what you want. With his code, you will indeed write all the structures into a single file. However, it does so in a way that might not be readable by other software: it adds an "END" line after each structure, and many pieces of software will stop reading the file at that point, as that is how the PDB file format is specified.
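To see the effect of those extra END lines, the following sketch mimics a reader that stops at the first END record and compares what it would see with what the file actually contains. It is plain Python with no Biopython involved, and the function name is made up for illustration; real programs may behave differently.

```python
def atoms_visible_to_strict_reader(pdb_text):
    """Return (atom records before the first END, atom records in the whole text)."""
    seen_before_end = 0
    total = 0
    reached_end = False
    for line in pdb_text.splitlines():
        # The record name occupies columns 1-6, so "ENDMDL" is not mistaken for "END".
        record = line[:6].strip()
        if record == "END":
            reached_end = True
        elif record in ("ATOM", "HETATM"):
            total += 1
            if not reached_end:
                seen_before_end += 1
    return seen_before_end, total
```

If the two numbers differ for a file, everything after the first END would be invisible to a reader that follows the format strictly.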
A better solution, but still not perfect, is to remove a chain from one Structure and add it to a second Structure as a different chain. You can do this by:
# Get a list of the chains in a structure
chains = list(structure2.get_chains())
# Rename the chain (in my case, I rename from 'A' to 'B')
chains[0].id = 'B'
# Detach this chain from structure2
chains[0].detach_parent()
# Add it onto structure1
structure1[0].add(chains[0])
Note that you have to be careful that the name of the chain you're adding doesn't yet exist in structure1.
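One way to avoid a chain-name clash is to pick the new name from the IDs that are still free. The helper below is a hypothetical sketch in plain Python, not part of Biopython; it assumes single uppercase letters are acceptable chain IDs, since the PDB format reserves one column for the chain identifier.

```python
import string


def next_free_chain_id(used_ids, alphabet=string.ascii_uppercase):
    """Return the first character of `alphabet` that is not in `used_ids`."""
    for candidate in alphabet:
        if candidate not in used_ids:
            return candidate
    raise ValueError("no unused single-character chain ID left")
```

With Biopython, the used IDs could be collected as {chain.id for chain in structure1[0]} and the result assigned to chains[0].id before calling add.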
In my opinion, the Biopython library is poorly structured or non-intuitive in many respects, and this is just one example. Use something else if you can.

Ideally, this might provide the insight to solve the problem. Thank you Nate. – Exchhattu Apr 04 '17 at 18:00
Inspired by Nate's solution, but adding multiple models to one structure, rather than multiple chains to one model:
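from Bio import PDB

ms = PDB.Structure.Structure("master")
i = 0
for structure in structures:
    for model in list(structure):
        # Copy the model so the source structure is left untouched
        new_model = model.copy()
        # Give every model a unique id and a 1-based serial number
        new_model.id = i
        new_model.serial_num = i + 1
        i = i + 1
        ms.add(new_model)

pdb_io = PDB.PDBIO()
pdb_io.set_structure(ms)
pdb_io.save("all.pdb")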
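A quick way to check a multi-model file is to read it back and count the models. This is only a sketch: count_models is a made-up helper name, and it assumes the file parses cleanly with Biopython's PDBParser.

```python
from Bio.PDB import PDBParser


def count_models(path):
    """Parse the PDB file at `path` and return how many models it contains."""
    parser = PDBParser(QUIET=True)  # QUIET suppresses parser warnings
    structure = parser.get_structure("check", path)
    return len(structure)
```

For a file written as above, count_models("all.pdb") should equal the total number of models across the input structures.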