I have alkene molecules of formula C9H17B. How can I separate these molecules into three classes: one that has C-B-H2, one that has C2-B-H, and one that has C3-B? I've tried working with both SMILES strings and mol objects, but my approaches aren't working.
2 Answers
4
To find specific substructures use SMARTS.
https://www.daylight.com/dayhtml/doc/theory/theory.smarts.html
If I understand correctly, these are the three types of boron you are looking for.
from rdkit import Chem
from rdkit.Chem import Draw
smiles = ['CCB', 'CCBC', 'CCB(C)(C)']
mols = [Chem.MolFromSmiles(s) for s in smiles]
Draw.MolsToGridImage(mols)
Write SMARTS for a boron with three connections (BX3) and the number of attached hydrogens: H2, H1, or H0.
smarts = ['[BX3;H2]', '[BX3;H1]', '[BX3;H0]']
patts = [Chem.MolFromSmarts(s) for s in smarts]
Now you can check each molecule for each substructure.
for p in patts:
    for m in mols:
        print(m.HasSubstructMatch(p))
    print()
True
False
False

False
True
False

False
False
True
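Building on the patterns above, here is a minimal sketch of how the actual sorting into three classes might look. The class labels, the helper name classify_boron, and the example SMILES (hypothetical dienes written to have the formula C9H17B) are assumptions made for illustration; they do not come from the question.

```python
from rdkit import Chem

# SMARTS patterns for the three boron environments.
# The dictionary keys are illustrative labels, not RDKit names.
BORON_CLASSES = {
    "C-B-H2": Chem.MolFromSmarts("[BX3;H2]"),
    "C2-B-H": Chem.MolFromSmarts("[BX3;H1]"),
    "C3-B": Chem.MolFromSmarts("[BX3;H0]"),
}


def classify_boron(smiles):
    """Return the label of the first matching boron class, or None."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # SMILES could not be parsed
        return None
    for label, patt in BORON_CLASSES.items():
        if mol.HasSubstructMatch(patt):
            return label
    return None


# Hypothetical C9H17B molecules, one per class.
examples = ["C=CCCCCC=CCB", "C=CCCC=CCCBC", "C=CCB(CC=C)CCC"]

# Sort the SMILES into one list per class.
groups = {label: [] for label in BORON_CLASSES}
for smi in examples:
    label = classify_boron(smi)
    if label is not None:
        groups[label].append(smi)

print(groups)
```

This assumes each molecule contains exactly one boron atom; with several borons, the function reports only the first class that matches.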

rapelpy
2
See https://www.rdkit.org/docs/GettingStartedInPython.html#looping-over-atoms-and-bonds
Copied from above link:
Atoms keep track of their neighbors:
>>> atom = m.GetAtomWithIdx(0) # You might need to adjust how you find the atom.
>>> [x.GetAtomicNum() for x in atom.GetNeighbors()]
[8, 6]
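Then just check how many neighbours are hydrogen atoms. Note that hydrogens are implicit by default in a molecule built from SMILES, so they only show up as neighbours after calling Chem.AddHs. Hope this works for you.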
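As a rough sketch of the neighbour-counting idea, the snippet below finds every boron atom and counts its hydrogen neighbours. The function name boron_h_counts is hypothetical, and the test SMILES are the same small examples used in the other answer. Chem.AddHs is called first so that the hydrogens exist as real atoms in the graph.

```python
from rdkit import Chem


def boron_h_counts(smiles):
    """Return the number of H neighbours for each boron atom in the molecule."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")
    # Make implicit hydrogens explicit so they appear in GetNeighbors().
    mol = Chem.AddHs(mol)
    counts = []
    for atom in mol.GetAtoms():
        if atom.GetAtomicNum() == 5:  # boron
            n_h = sum(1 for nbr in atom.GetNeighbors() if nbr.GetAtomicNum() == 1)
            counts.append(n_h)
    return counts


for smi in ["CCB", "CCBC", "CCB(C)(C)"]:
    print(smi, boron_h_counts(smi))
```

A count of 2, 1, or 0 hydrogens on the boron then corresponds to the C-B-H2, C2-B-H, and C3-B classes respectively.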

Andrew McClement