Questions tagged [rdkit]

RDKit is a popular open-source library for chemoinformatics and machine learning applied to chemoinformatics.

251 questions
3
votes
2 answers

How can I compute a count Morgan fingerprint as a numpy.array?

I would like to use RDKit to generate count Morgan fingerprints and feed them to a scikit-learn model (in Python). However, I don't know how to generate the fingerprint as a NumPy array. When I use from rdkit import Chem from rdkit.Chem import…
evilolive
  • 407
  • 1
  • 5
  • 12
3
votes
1 answer

How can I optimize this script so it does not take a week to finish the task it is doing? (Used BASH PARALLEL too.)

I have a directory full of 60,000 files that are named by their molid. I have a second file in CSV format that has molids in column 1 with their respective CHEMBLID in column 2. I need to match the file name molid in the directory with a molid in…
Lani
  • 33
  • 3
3
votes
0 answers

Converting an array into a NumPy array changes the values

I'm using the RDKit package to convert some SMILES into fingerprints. My problem is that I use scikit-learn and want to do cross-validation (CV). For the CV, I need the np.array data structure. For one kind of fingerprint, I convert a data structure to a structure of…
auronsen
  • 225
  • 1
  • 3
  • 12
3
votes
2 answers

How to handle/map a custom PostgreSQL type to a Django model

I am using RDKit, a cheminformatics toolkit that provides a PostgreSQL cartridge for storing chemical molecules. I want to create a Django model as follows: from rdkit.Chem import Mol class compound(models.Model): internal =…
harijay
  • 11,303
  • 12
  • 38
  • 52
3
votes
1 answer

Apache doesn't respect LD_LIBRARY_PATH?

In my web application I have this piece of code: from rdkit import Chem. This causes it to crash under Apache. In the logs I can see: [Fri Sep 06 10:35:44 2013] [error] [client 172.22.69.51] ImportError: libRDGeneral.so.1: cannot open shared object…
mnowotka
  • 16,430
  • 18
  • 88
  • 134
3
votes
2 answers

Cannot install RDKit on Ubuntu 11.10

I have spent many hours trying to build RDKit on Ubuntu 11.10 for Python 2.7 (rdkit_201106+dfsg.orig.tar.gz) using a precompiled version of Boost 1.49, and I am failing miserably. The recurring error is in the CMake GUI: CMake Error at…
zinon
  • 4,427
  • 14
  • 70
  • 112
2
votes
1 answer

Ubuntu with cc1plus - error Not Implemented

I am trying to use the make command on Ubuntu 11.10, but get an error. g++ -g -O2 -fPIC -fPIC -Wall -Wpointer-arith -Wendif-labels -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -fpic -Wno-deprecated…
bladepit
  • 853
  • 5
  • 14
  • 29
2
votes
1 answer

How to catch the error message from Chem.MolFromSmiles('Formula')

I'm new to RDKit. Below is the code that I'm using to get the chemical image from the formula: from rdkit import Chem m = Chem.MolFromSmiles('OCC1OC(C(C(C1O)O)O)[C]1(C)(CO)CC(=O)C=C(C1CCC(=O)C)C') m If the code is correct, it displays the…
Ajay Managaon
  • 450
  • 2
  • 9
2
votes
2 answers

Calculate the Tanimoto coefficient for a dataframe

I have a table that looks like this: and I want to calculate the Tanimoto coefficient (a molecular similarity measure) with RDKit in Python to get the result below: but I failed. My data: {'name': ['16β-hydro-ent-kauran-17-oic acid ', …
jacobdavis
  • 35
  • 6
2
votes
2 answers

How to separate a list of molecules based on how many hydrogens are attached to a certain atom?

I have alkene molecules of formula C9H17B. How can I separate these molecules into three classes: one that has C-B-H2, one that has C2-B-H, and one that has C3-B? I've tried using SMILES and also mol objects, but my…
BanAckerman
  • 103
  • 1
  • 8
2
votes
1 answer

How to save RDKit DrawMorganBit output as an image?

code: import numpy as np from rdkit import Chem from rdkit.Chem import Draw, AllChem, PandasTools, DataStructs mol = Chem.MolFromSmiles('O=C1N([C@@H](C)C2CC2)CC3=CC(C4=C(C)N=C(NC(C)=O)S4)=CC(S(=O)(C)=O)=C31') bi = {} fp =…
Park
  • 27
  • 6
2
votes
1 answer

Handling SMILES with metal ions in RDKit

I have the following function that takes a dictionary of SMILES strings and converts them to RDKit mol objects. def smiles_dict_to_mol_list(smiles_dict): """smiles dict is a dictionary object containing molecule names as keys and smiles…
Paul
  • 165
  • 1
  • 12
2
votes
1 answer

Python argument types in rdkit.Chem.rdmolfiles.MolToMolBlock(NoneType)

I am trying to convert InChI to SDF format using the RDKit Python library. I am running the following Python code: #convert inchi to sdf def MolFromInchi(id,inchi): mol = Chem.MolFromInchi(inchi) mol_block = Chem.MolToMolBlock(mol) …
rshar
  • 1,381
  • 10
  • 28
2
votes
1 answer

RDKit coordinates for atoms in a molecule

Hey everyone, I need some help formatting coordinates for atoms in a molecule, and I'm coding in Python. What I need is along the lines of: (atom) x y z coordinates for every atom in the molecule. So far my code is: for molecule in mol_list: …
2
votes
1 answer

RDKit: "TypeError: 'Mol' object is not iterable" when attempting looped enumeration

I am trying to use RDKit to enumerate large libraries of compounds and output the result as a single column of SMILES strings in a CSV file. I was able to use the following code successfully: import os os.chdir('xxx') from rdkit import Chem from…