I am using RDKit, a cheminformatics toolkit that provides a PostgreSQL cartridge for storing chemical molecules. I want to create a Django model as follows:
from django.db import models
from rdkit.Chem import Mol

class compound(models.Model):
    internal = models.CharField(max_length=10, db_index=True)
    external = models.CharField(max_length=15, db_index=True)
    smiles = models.TextField()
    # This is my proposed custom "mol" type defined by the rdkit cartridge, which probably maps
    # to the Mol object imported from rdkit.Chem
    rdkit_mol = models.MyCustomMolField()
So I want "rdkit_mol" to map to the "mol" type of the rdkit Postgres database cartridge. In SQL the "mol" column is created from the "smiles" string using syntax like
postgres@compounds=# insert into compound (smiles,rdkit_mol,internal,external) VALUES ('C1=CC=C[N]1',mol_from_smiles('C1=CC=C[N]1'), 'MYID-111111', 'E-2222222');
This calls the "mol_from_smiles" database function defined by the cartridge to create the mol object.
Should I have the database take care of populating this column during save? I could then define a custom TRIGGER in Postgres that runs the mol_from_smiles function to populate the rdkit_mol column.
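A rough sketch of the trigger I have in mind, kept as a Python string so it can be run through any DB-API cursor. The function and trigger names (`compound_set_mol`, `compound_set_mol_trg`) are hypothetical, and the `::cstring` cast assumes `mol_from_smiles` expects a cstring argument rather than text.

```python
# Sketch only: the function and trigger names are hypothetical.
TRIGGER_SQL = """
CREATE OR REPLACE FUNCTION compound_set_mol() RETURNS trigger AS $$
BEGIN
    -- mol_from_smiles() is provided by the RDKit cartridge.
    NEW.rdkit_mol := mol_from_smiles(NEW.smiles::cstring);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER compound_set_mol_trg
BEFORE INSERT OR UPDATE ON compound
FOR EACH ROW EXECUTE PROCEDURE compound_set_mol();
"""


def install_trigger(cursor):
    """Run the DDL above on a DB-API cursor connected to the database."""
    cursor.execute(TRIGGER_SQL)
```

With this in place, Django would only ever write the "smiles" column and the database would keep "rdkit_mol" in sync.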
I also want to be able to execute queries that use the custom mol features and return Django models. For example, one of the SQL queries could return the compound models that look like a given one chemically. Currently in SQL I do
select * from compound where rdkit_mol @> 'C1=CC=C[N]1';
This then essentially returns the chemical "compound" objects.
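One option I can imagine is building the same query with the SMILES passed as a parameter and handing it to Django's `Manager.raw()`, which returns model instances. The helper below is a sketch under that assumption; `substructure_query` is a hypothetical name, and the table and column names are the ones from my schema.

```python
def substructure_query(smiles, table="compound", column="rdkit_mol"):
    """Return (sql, params) selecting rows whose mol column matches via @>.

    The SMILES string is passed as a query parameter instead of being
    pasted into the SQL. The table and column names are identifiers and
    must come from trusted code, never from user input.
    """
    sql = "SELECT * FROM {} WHERE {} @> %s".format(table, column)
    return sql, [smiles]


sql, params = substructure_query("C1=CC=C[N]1")
# With the model above this could then be used as:
#     similar_compounds = compound.objects.raw(sql, params)
```

Because the query selects every column, including the primary key, `raw()` should be able to build full compound instances from the rows.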
My question is: given the custom nature of my field, is there a way to mix and match the features of the database "mol" type with the Django compound model? What are the ways to achieve this?
Currently I am leaning towards not using the Django ORM and just using raw SQL to round-trip to and from the database. I want to find out if there is a Django way of working with such custom types.
In my current hybrid approach my views would look like this.
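from django.db import connection

def get_similar_compounds(request):
    # code to get the raw smiles string, e.g. 'C1=CC=C[N]1', from a form
    smiles = 'C1=CC=C[N]1'
    with connection.cursor() as db_cursor:
        # pass the smiles string as a query parameter instead of hard-coding it
        db_cursor.execute("select internal from compound where rdkit_mol @> %s;", [smiles])
        # get internal ids from database cursor
        ids_from_query_above = [row[0] for row in db_cursor.fetchall()]
    similar_compounds = compound.objects.filter(internal__in=ids_from_query_above)
    # Then process queryset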
Is this hybrid method advisable, or is there a more Pythonic/Django way of dealing with this custom data type?