When I run the following code I get the error listed below. Could anyone give me some insight into how to append the CA atoms to the tagged_atoms and tag_atoms lists, which I will use for alignment? Please also point out any flaws in the code that I may be overlooking. I'm new to Python, so any insight would be very helpful.

def loadPDB(pdb_name):

    folder = pdb_name[1:3]
    pdbl = PDB.PDBList()
    pdbl.retrieve_pdb_file(pdb_name)
    parser = PDB.PDBParser(PERMISSIVE=1)
    structure = parser.get_structure(
        pdb_name, folder + "/pdb" + pdb_name + ".ent")

    return structure

def alignCoordinates(taggedProtein, potentialTag):
    for model in taggedProtein:
        firstModel = model
        break
    for chain in firstModel:
        firstChain = chain
        break

    for firstChain in firstModel:
        tagged_atoms = []
        tag_atoms    = []

        for residue in firstChain:
            tagged_res = residue

        for tagged_res in firstChain:
            tagged_atoms.append(firstChain['CA'])

    for model in potentialTag:
        firstTagModel = model
        break

    for chain in firstTagModel:
        firstTagChain = chain
        break

    for residue in firstTagChain:
        tag_res = residue

        for tag_res in firstTagChain:
            tag_atoms.append(firstTagChain['CA'])

    super_imposer = Bio.PDB.Superimposer()
    print repr(tagged_atoms)
    print repr(tag_atoms)
    super_imposer.set_atoms(tagged_atoms, tag_atoms)
    super_imposer.apply(tag_model.get_atoms())

    print super_imposer.rms

    io = Bio.PDB.PDBIO()
    io.set_structure(tag_model)
    io.save("Aligned.PDB")

def main():

    pdb1 = "2lyz"
    pdb2 = "4abn"

    potentialTag  = loadPDB(pdb1)
    taggedProtein = loadPDB(pdb2)

    alignCoordinates(taggedProtein, potentialTag)

main()

This is the error message:

Structure exists: '/Users/Azi_Ts/Desktop/ly/pdb2lyz.ent' 
Structure exists: '/Users/Azi_Ts/Desktop/ab/pdb4abn.ent' 
/Library/Python/2.7/site-packages/Bio/PDB/StructureBuilder.py:87:          PDBConstructionWarning: WARNING: Chain A is discontinuous at line 13957.
  PDBConstructionWarning)
/Library/Python/2.7/site-packages/Bio/PDB/StructureBuilder.py:87:   PDBConstructionWarning: WARNING: Chain B is discontinuous at line 14185.
  PDBConstructionWarning)

Traceback (most recent call last):
  File "alignPDB.py", line 76, in <module>
    main()
  File "alignPDB.py", line 74, in main
    alignCoordinates(taggedProtein, potentialTag)
  File "alignPDB.py", line 39, in alignCoordinates
    tagged_atoms.append(firstChain['CA'])
  File "/Library/Python/2.7/site-packages/Bio/PDB/Chain.py", line 70, in    __getitem__
    return Entity.__getitem__(self, id)
  File "/Library/Python/2.7/site-packages/Bio/PDB/Entity.py", line 38, in   __getitem__
    return self.child_dict[id]
KeyError: 'CA'
xbello

1 Answer

To get all the CA atoms you only have to do:

ca_atoms = [atom for atom in taggedProtein.get_atoms() if atom.name == "CA"]

Remember that the loaded structures, taggedProtein and potentialTag, have three methods that are useful here: get_chains(), get_residues() and get_atoms(). With those three you can get rid of every for loop in alignCoordinates().
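As for the KeyError itself: a chain is indexed by residue id and a residue by atom name, so firstChain['CA'] looks for a residue whose id is 'CA' and finds none. Also note that Superimposer.set_atoms() needs two lists of equal length, which two different proteins will not give you by default. Below is a minimal sketch of one way to build matching lists. The helper names (ca_by_residue, paired_ca_atoms, superimpose_first_chains) are made up for illustration, and pairing residues by shared residue id is an assumption of mine; for unrelated proteins you would probably want a sequence or structural alignment to choose the pairs instead.

```python
def ca_by_residue(chain):
    """Map residue id -> CA atom for the standard residues of one chain."""
    return {
        residue.get_id(): residue["CA"]
        for residue in chain
        # Hetero flag " " marks a standard residue; this skips waters and
        # ions (the atom of a calcium ion is also named "CA").
        if residue.get_id()[0] == " " and "CA" in residue
    }


def paired_ca_atoms(fixed_chain, moving_chain):
    """Return two equal-length CA lists for residue ids found in both chains."""
    fixed = ca_by_residue(fixed_chain)
    moving = ca_by_residue(moving_chain)
    shared = sorted(set(fixed) & set(moving))
    return [fixed[i] for i in shared], [moving[i] for i in shared]


def superimpose_first_chains(fixed_structure, moving_structure):
    """Superimpose moving_structure onto fixed_structure; return the RMSD."""
    # Imported here so the two helpers above work without Biopython installed.
    from Bio.PDB import Superimposer

    # get_chains() walks the models in order, so next() yields the first
    # chain of the first model.
    fixed_chain = next(fixed_structure.get_chains())
    moving_chain = next(moving_structure.get_chains())

    fixed_atoms, moving_atoms = paired_ca_atoms(fixed_chain, moving_chain)
    if not fixed_atoms:
        raise ValueError("the two chains share no residue ids with CA atoms")

    superimposer = Superimposer()
    superimposer.set_atoms(fixed_atoms, moving_atoms)
    # Move every atom of the second structure, not only the paired CA atoms.
    superimposer.apply(list(moving_structure.get_atoms()))
    return superimposer.rms
```

With the structures from the question, this would be called as superimpose_first_chains(taggedProtein, potentialTag). To save the result, pass potentialTag to PDBIO.set_structure(); the tag_model name used in the question is never defined, so that line would fail next.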

xbello