
I am trying to add atom numbers to a SMILES string:

from rdkit import Chem
mol=Chem.MolFromSmiles('c1ccccc(C(N)=O)1')
for i, atom in enumerate(mol.GetAtoms()):
  atom.SetProp('molAtomMapNumber',str(i))
smi=Chem.MolToSmiles(mol)
print(smi)

The output is:

[cH:0]1[cH:1][cH:2][cH:3][cH:4][c:5]1[C:6]([NH2:7])=[O:8]

Then I want to split the SMILES into atoms:

from rdkit import Chem
mol=Chem.MolFromSmiles('c1ccccc(C(N)=O)1')
for i, atom in enumerate(mol.GetAtoms()):
  atom.SetProp('molAtomMapNumber',str(i))
  print(i,atom.GetSymbol())

The output is:

0 C
1 C
2 C
3 C
4 C
5 C
6 C
7 N
8 O

**But what I actually want is something like this:**

0 cH
1 cH
...
7 NH2
8 O

Can anyone help me figure out how to get each atom together with its hydrogens from the SMILES, as shown above?

cybersam
hal

1 Answer


You can extract each atom together with its hydrogens from the atom-mapped SMILES string you already have in the smi variable, using a regular expression:

import re
# Capture the text between each "[" and the next ":" (the atom plus its hydrogens).
atoms_with_Hs = re.findall(r'\[(.*?):', smi)
print(atoms_with_Hs)

>> ['cH', 'cH', 'cH', 'cH', 'cH', 'c', 'C', 'NH2', 'O']
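
One caveat: MolToSmiles may write the atoms in a different order than their indices in the molecule, so relying on list position alone can mislabel atoms. A minimal sketch that also captures the map number and keys each label by it is below. The smi string is hard-coded here as an assumed example input (the mapped SMILES from the question), and the name atoms_by_index is only illustrative.

```python
import re

# Assumed example input: the atom-mapped SMILES from the question.
smi = '[cH:0]1[cH:1][cH:2][cH:3][cH:4][c:5]1[C:6]([NH2:7])=[O:8]'

# Group 1: the atom label before the colon (e.g. "cH", "NH2").
# Group 2: the atom map number after the colon.
pattern = re.compile(r'\[([^\]:]+):(\d+)\]')

# Key each label by its map number so the result does not depend on
# the order in which atoms appear in the SMILES string.
atoms_by_index = {int(num): label for label, num in pattern.findall(smi)}

for idx in sorted(atoms_by_index):
    print(idx, atoms_by_index[idx])
```

If you would rather not parse the string at all, RDKit atoms also expose GetTotalNumHs() and GetIsAromatic(), from which the same labels could be built directly from the molecule.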
Vandan Revanur