I am training a random forest with scikit-learn on Morgan fingerprints and would like to know which structural motifs are most important. For that I would like to draw all fragments that produce an on-bit in the x most important features.

I have found the Draw.DrawMorganBits function in the new release, along with these usage examples: https://iwatobipen.wordpress.com/2018/11/07/visualize-important-features-of-machine-leaning-rdkit/ and http://rdkit.blogspot.com/2018/10/using-new-fingerprint-bit-rendering-code.html

However, I don't know how to produce a unique set of fragments. Previously I went through my test set, collected the bitInfo and the molecular environments, and created SMILES with Chem.MolFragmentToSmiles. Then I created mols from the set of these SMILES and plotted them. However, this is a weak representation of the environment, and some fragments cannot be plotted. I can provide my old code. It follows the old documentation: https://rdkit.readthedocs.io/en/release_2017_03_1/GettingStartedInPython.html#explaining-bits-from-morgan-fingerprints
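
A minimal sketch of the older approach described above, assuming RDKit's `AllChem.GetMorganFingerprintAsBitVect` with `bitInfo`, `Chem.FindAtomEnvironmentOfRadiusN` and `Chem.MolFragmentToSmiles`, as in the 2017.03 documentation. The function name `bit_fragments` and the defaults (radius 2, 2048 bits) are illustrative assumptions, not the original code:

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def bit_fragments(mol, radius=2, n_bits=2048):
    """Map each on-bit of a Morgan fingerprint to the fragment SMILES that set it.

    Hypothetical helper: name and defaults are assumptions for illustration.
    """
    bit_info = {}
    # bitInfo is filled with {bit: ((center_atom_idx, radius), ...)}
    AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits, bitInfo=bit_info)

    fragments = {}
    for bit, environments in bit_info.items():
        for center, r in environments:
            # Bond indices of the circular environment; nothing to look up at radius 0.
            bonds = list(Chem.FindAtomEnvironmentOfRadiusN(mol, r, center)) if r > 0 else []
            atoms = {center}
            for bond_idx in bonds:
                bond = mol.GetBondWithIdx(bond_idx)
                atoms.update((bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()))
            if bonds:
                smiles = Chem.MolFragmentToSmiles(
                    mol, atomsToUse=sorted(atoms), bondsToUse=bonds, rootedAtAtom=center
                )
            else:
                # A single-atom environment: only the atom symbol survives.
                smiles = Chem.MolFragmentToSmiles(mol, atomsToUse=[center])
            fragments.setdefault(bit, set()).add(smiles)
    return fragments
```

For example, `bit_fragments(Chem.MolFromSmiles('c1ccccc1O'))` returns a dict from each on-bit to the set of fragment SMILES that produced it. Radius-0 environments collapse to a bare atom symbol, which is the loss of context mentioned above.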

evilolive
  • If I understand correctly, you want a MorganBit drawing for each on-bit, so that when the feature importance gives e.g. [8,405,879,...] you can display the corresponding MorganBit. I tried to build a dictionary with key=OnBit and value=MorganBit for a whole set of fingerprints, but the problem is that the MorganBits are not unique, because each depiction is based on the molecule it comes from. So for e.g. on-bit 405, every molecule gives a different picture. Maybe it is better to store SMARTS? – rapelpy Apr 21 '19 at 06:19
  • With the "old" MolFragmentToSmiles I got SMILES, yes. But the problem is that they sometimes don't show the whole information. Radius 0 gives only atoms, and for the other fragments we don't know whether e.g. a 'c' is terminal or not, or whether it is aromatic or just part of a double bond. This is much nicer with the new module, but I don't want to print duplicates. Can we store SMILES for r+1 and control the color for drawing to make it look like the new DrawMorganBits output? – evilolive Apr 23 '19 at 05:58
  • I don't know how to control the colors of a drawing. Have you tried SMARTS instead of SMILES? SMARTS describe fragments, and when converted to a mol they are drawn differently from SMILES. – rapelpy Apr 23 '19 at 18:05
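
One possible way to avoid duplicate drawings, sketched under assumptions: keep only the first molecule seen for each (bit, fragment) pair and pass those examples to `Draw.DrawMorganBits`, so every picture is still rendered in the context of a real molecule. The helper names `fragment_key`, `unique_bit_examples` and `draw_unique` are hypothetical, and using the fragment SMILES as the deduplication key is an assumption (it has the same weaknesses discussed above; SMARTS or another key could be swapped in). The sketch also assumes an RDKit version in which `DrawMorganBits` accepts 4-tuples `(mol, bitId, bitInfo, whichExample)`:

```python
from rdkit import Chem
from rdkit.Chem import AllChem, Draw


def fragment_key(mol, center, radius):
    """Fragment SMILES of one circular environment (assumed deduplication key)."""
    bonds = list(Chem.FindAtomEnvironmentOfRadiusN(mol, radius, center)) if radius > 0 else []
    if not bonds:
        return Chem.MolFragmentToSmiles(mol, atomsToUse=[center])
    atoms = {center}
    for bond_idx in bonds:
        bond = mol.GetBondWithIdx(bond_idx)
        atoms.update((bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()))
    return Chem.MolFragmentToSmiles(
        mol, atomsToUse=sorted(atoms), bondsToUse=bonds, rootedAtAtom=center
    )


def unique_bit_examples(mols, important_bits, radius=2, n_bits=2048):
    """Return {(bit, fragment_smiles): (mol, bit, bit_info, example_idx)}.

    Only the first molecule that shows a given fragment for a given bit is kept.
    """
    wanted = set(important_bits)
    examples = {}
    for mol in mols:
        bit_info = {}
        AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits, bitInfo=bit_info)
        for bit in wanted & bit_info.keys():
            for example_idx, (center, r) in enumerate(bit_info[bit]):
                key = (bit, fragment_key(mol, center, r))
                # setdefault keeps the earliest example and ignores later duplicates
                examples.setdefault(key, (mol, bit, bit_info, example_idx))
    return examples


def draw_unique(examples, mols_per_row=4):
    """Draw one picture per unique (bit, fragment) pair, labelled with both."""
    keys = sorted(examples)
    tuples = [examples[key] for key in keys]
    legends = [f"bit {bit}: {smiles}" for bit, smiles in keys]
    return Draw.DrawMorganBits(tuples, molsPerRow=mols_per_row, legends=legends)
```

Here `important_bits` would be the indices of the x most important features of the random forest, e.g. `[8, 405, 879]` as in the comment above. Because the key includes the fragment, different fragments that collide on the same bit are kept as separate pictures.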
