I have several CIF files, and from each of them I'd like to extract three chains and save them together in one PDB file. One complication is that the chains I want can have different chain IDs in different CIF structures. I solved this by parsing the RCSB web page with Beautiful Soup and saving the chain IDs for each CIF file in a corresponding YAML file. I then open each CIF file together with its YAML file and read the correct chain IDs for that structure from the YAML file.
Now I'd like to extract all three chains and save them. I tried to adapt code from a similar question. However, the last line of my code (the one with io.save) raises an error: IndexError: tuple index out of range
Code:

    # imports and parser setup (assumed: a standard MMCIFParser; not shown in my original snippet)
    import yaml
    from Bio.PDB import MMCIFParser, PDBIO, Select

    parser = MMCIFParser()

    cif_input = "AAA", "BBB", "CCC"
    for ID in cif_input:
        cif_file = "{}.cif".format(ID)  # these are my CIF files
        yaml_file_protein = "{}.yaml".format(ID)  # one yaml file per CIF file, holding the chain IDs
        with open(yaml_file_protein, "r") as file_protein:
            proteins = yaml.load(file_protein, Loader=yaml.FullLoader)  # load the yaml file
        chain_1 = proteins["chain1"]  # ID of chain no. 1; for AAA.cif this is for example: U
        chain_2 = proteins["chain2"]  # ID of chain no. 2
        chain_3 = proteins["chain3"]  # ID of chain no. 3
        structure = parser.get_structure(ID, cif_file)[0]  # parse the corresponding CIF file, take the first model

        # This is what I tried to implement from similar questions: extract all three
        # chains and save them in one PDB file; this has to be done for every CIF file.
        class ChainSelect(Select):
            def accept_chain(self, chain):
                if chain.get_id() == '{}'.format(chain_1):  # select chain_1 using the ID read from the yaml file
                    return True
                if chain.get_id() == '{}'.format(chain_2):
                    return True
                if chain.get_id() == '{}'.format(chain_3):
                    return True
                else:
                    return False

        io = PDBIO()
        io.set_structure(structure)
        io.save("{}_selection.pdb".format(ID), ChainSelect(chain))  # this line raises the IndexError
I assume the problem is somewhere in the "class ChainSelect..." part, but I can't figure out what exactly is causing the error. I'd be very grateful for any suggestions.
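For comparison, here is a minimal sketch of how such a selector could be written so that the chain IDs are passed in when it is created, rather than read from variables in the surrounding loop. It assumes Biopython's `Bio.PDB.Select` base class; the `wanted_ids` parameter and the example IDs "U", "V", "W" are hypothetical, and this is not confirmed to be the cause of the IndexError.

```python
from Bio.PDB import Select


class ChainSelect(Select):
    """Keep only chains whose ID is in a given collection."""

    def __init__(self, wanted_ids):
        # wanted_ids is a hypothetical parameter: any iterable of chain ID strings
        self.wanted_ids = set(wanted_ids)

    def accept_chain(self, chain):
        # chain.get_id() returns the chain ID as a plain string
        return chain.get_id() in self.wanted_ids


# Example IDs only; in the loop these would be chain_1, chain_2, chain_3
selector = ChainSelect(["U", "V", "W"])
```

With a selector like this, the save call would take the form `io.save("{}_selection.pdb".format(ID), ChainSelect([chain_1, chain_2, chain_3]))`, so the class only needs to be defined once, outside the loop.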
Edit: I also thought the problem could be in the lines if chain.get_id()=='{}'.format(chain_X): because the formatted value would be e.g. U and not "U", so I tried changing them to if chain.get_id()==" '{}' ".format(chain_X): which returns "U"; however, the error is still there. I also tried this code with only one chain, but I still get the same error.
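To check what the two format strings from the edit actually produce, here is a minimal sketch in plain Python. The value "U" is just the example ID mentioned above, and the variable names are hypothetical.

```python
chain_id = "U"  # example value, as it would be read from the yaml file

plain = "{}".format(chain_id)       # the original comparison value
quoted = " '{}' ".format(chain_id)  # the variant tried in the edit

print(repr(plain))
print(repr(quoted))
print(plain == chain_id)   # the plain form is identical to the ID string
print(quoted == chain_id)  # the quoted form has extra spaces and quote characters
```

Since the quotes shown around a string are only how Python displays it, the plain form already equals the ID string, while the second form adds literal spaces and quote characters and so would not match an ID such as U.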