Hi, I want to group identical molecular structures by their SMILES strings.
However, even when the structures are the same, grouping them is difficult because the dummy atoms carry different numeric labels (for example `[1*]` vs. `[13*]`).
I'm using RDKit (version 2022.3.4) and have tried changing several options, but I haven't found a solution yet. I would appreciate your help.
Example SMILES (same structure but different SMILES strings -> desired format):
- [1*]C(=O)OC, [13*]C(=O)OC -> *C(=O)OC
- [31*]C1=CC=CC2=C1C=CC=N2, [5*]C1=CC=CC2=C1C=CC=N2 -> *C1=CC=CC2=C1C=CC=N2
- [45*]C(N)=O, [5*]C(N)=O, [19*]C(N)=O, [16*]C(N)=O -> *C(N)=O
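
To make the desired mapping concrete, here is a minimal sketch of the transformation at the string level, using only the Python standard library. It assumes every labelled dummy atom is written exactly as `[<digits>*]` (no charge, atom map, or hydrogen count inside the brackets) and that the rest of each SMILES is already written identically, as in the examples above. The helper names `strip_dummy_labels` and `group_by_unlabelled` are hypothetical, not RDKit functions.

```python
import re
from collections import defaultdict

# Matches a dummy atom with a numeric label, e.g. "[1*]" or "[45*]".
# Assumption: nothing else (charge, atom map, H count) is inside the brackets.
_LABELLED_DUMMY = re.compile(r"\[\d+\*\]")


def strip_dummy_labels(smiles: str) -> str:
    """Replace every labelled dummy atom such as "[13*]" with a bare "*"."""
    return _LABELLED_DUMMY.sub("*", smiles)


def group_by_unlabelled(smiles_list):
    """Group SMILES strings by their label-free form (plain string comparison)."""
    groups = defaultdict(list)
    for smi in smiles_list:
        groups[strip_dummy_labels(smi)].append(smi)
    return dict(groups)


examples = [
    "[1*]C(=O)OC", "[13*]C(=O)OC",
    "[31*]C1=CC=CC2=C1C=CC=N2", "[5*]C1=CC=CC2=C1C=CC=N2",
    "[45*]C(N)=O", "[5*]C(N)=O", "[19*]C(N)=O", "[16*]C(N)=O",
]

groups = group_by_unlabelled(examples)
for key, members in groups.items():
    print(key, "<-", members)
```

This only compares strings, so it would not merge two SMILES that describe the same structure with a different atom order. In that case the same idea could presumably be applied on the RDKit side instead: parse the SMILES, call `SetIsotope(0)` on every atom whose `GetAtomicNum()` is 0, and group by the output of `Chem.MolToSmiles`. Note that RDKit's canonical output may order the atoms differently from the format shown above.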