I'm using the RDKit package to convert some SMILES strings into fingerprints.
My problem is that I use scikit-learn and want to run cross-validation (CV). For the CV, I need the data
as a NumPy array (np.array).
For one kind of fingerprint, the conversion turns the RDKit data structure into an array of 0's and 1's.
Here is an arbitrary example:
print(x)
# prints a list of objects like
# <rdkit.DataStructs.cDataStructs.ExplicitBitVect object at 0x05E498F0>
x = np.array(x)
print(x)
# now prints an array of 1's and 0's instead
I don't understand why converting to a NumPy array changes the type.
For an object like rdkit.DataStructs.cDataStructs.LongSparseIntVect object at 0x05DDF960
Numpy changes the vector to the same structure.
I'm asking because for 2 of the 4 fingerprints I get the following error as a result of the NumPy conversion:
AttributeError: 'numpy.ndarray' object has no attribute 'GetNumBits'
This is the error for the Morgan fingerprint.
Code
import numpy as np
from rdkit import Chem
from rdkit import DataStructs
from rdkit.Chem import AllChem
from rdkit.Chem import MACCSkeys
from rdkit.Chem.Fingerprints import FingerprintMols
ms = [Chem.MolFromSmiles('CCOC'), Chem.MolFromSmiles('CCO'), Chem.MolFromSmiles('COC')]
fps = [MACCSkeys.GenMACCSKeys(x) for x in ms]
a = DataStructs.FingerprintSimilarity(fps[0], fps[1])
# everything is fine up to here
print(fps)
print(a)
# output: [<rdkit.DataStructs.cDataStructs.ExplicitBitVect object at 0x0325CE30>, <rdkit.DataStructs.cDataStructs.ExplicitBitVect object at 0x0325CE68>, <rdkit.DataStructs.cDataStructs.ExplicitBitVect object at 0x0325CEA0>]
# now the error occurs
fps = np.array(fps)
print(fps)
# output: [[0 0 0 0 1 0 1 .....] [1 0 0 0 1...0 1] [1 0 0 .... 1 1]]
a = DataStructs.FingerprintSimilarity(fps[0], fps[1])
# AttributeError: 'numpy.ndarray' object has no attribute 'GetNumBits'
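
To narrow the problem down, here is a minimal sketch of what I suspect is happening, without RDKit. It is only a guess: FakeBitVect is a hypothetical stand-in class I made up, and I am assuming the real bit vectors get unpacked because they behave like sequences (they have a length and can be indexed). The second half shows one possible way to keep the original objects in an array, by filling a pre-allocated object array one element at a time.

```python
import numpy as np


class FakeBitVect:
    """Hypothetical stand-in for an RDKit bit vector (not the real class)."""

    def __init__(self, bits):
        self._bits = list(bits)

    def __len__(self):
        return len(self._bits)

    def __getitem__(self, i):
        return self._bits[i]

    def GetNumBits(self):
        return len(self._bits)


fps = [FakeBitVect([0, 1, 1, 0]), FakeBitVect([1, 0, 1, 1])]

# np.array() treats the sequence-like elements as nested sequences and
# unpacks them into a 2-D numeric array, so the original objects
# (and their methods) are no longer there.
unpacked = np.array(fps)
print(unpacked.shape)
print(hasattr(unpacked[0], "GetNumBits"))

# Filling a pre-allocated object array element by element stores the
# original objects themselves, so their methods remain available.
kept = np.empty(len(fps), dtype=object)
for i, fp in enumerate(fps):
    kept[i] = fp
print(kept[0].GetNumBits())
```

If that guess is right, the numeric array would be what scikit-learn needs, while the original list (or an object array like the one above) would still be needed for calls such as DataStructs.FingerprintSimilarity.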