3

I need some coulomb matrices of molecules for a machine learning task. Coulomb Matrix? Here's a paper describing it

I found the Python package molml which has a method for it. However I can't figure out how to use the API for a single molecule only. In all examples they provide the method is called with two molecules, why?

How the example provides the method:

H2 = (['H', 'H'],
      [[0.0, 0.0, 0.0],
      [1.0, 0.0, 0.0]])

HCN = (['H', 'C', 'N'],
       [[-1.0, 0.0, 0.0],
        [ 0.0, 0.0, 0.0],
        [ 1.0, 0.0, 0.0]])

feat.transform([H2, HCN])

I need something like this:

 atomnames = [list of atomsymbols]
 atomcoords = [list of [x,y,z] for the atoms] 
 coulombMatrice = CoulombMatrix((atomnames,atomcoords)

I also found another lib (QML) which promises the possibility to generate coulomb matrices, but, I'm not able to install it on windows because it depends on Linux gcc-fortran compilers, I already installed cygwin and gcc-fortran for this purpose.

Thank you, guys

Glorfindel
  • 21,988
  • 13
  • 81
  • 109
maniac
  • 1,112
  • 1
  • 13
  • 19

1 Answers1

0

I've implemented my own solution for the problem. There's much room for improvements. E.g. randomly sorted coulomb matrix and bag of bonds are still not implemented.

    import numpy as np

def get_coulombmatrix(molecule, largest_mol_size=None):
    """
    This function generates a coulomb matrix for the given molecule
    if largest_mol size is provided matrix will have dimension lm x lm.
    Padding is provided for the bottom and right _|
    """
    numberAtoms = len(molecule.atoms)
    if largest_mol_size == None or largest_mol_size == 0: largest_mol_size = numberAtoms

    cij = np.zeros((largest_mol_size, largest_mol_size))

    xyzmatrix = [[atom.position.x, atom.position.y, atom.position.z] for atom in molecule.atoms]
    chargearray = [atom.atomic_number for atom in molecule.atoms]

    for i in range(numberAtoms):
        for j in range(numberAtoms):
            if i == j:
                cij[i][j] = 0.5 * chargearray[i] ** 2.4  # Diagonal term described by Potential energy of isolated atom
            else:
                dist = np.linalg.norm(np.array(xyzmatrix[i]) - np.array(xyzmatrix[j]))
                cij[i][j] = chargearray[i] * chargearray[j] / dist  # Pair-wise repulsion
    return cij
maniac
  • 1,112
  • 1
  • 13
  • 19
  • Any ideas on where the relation "0.5 * chargearray[i] ** 2.4" comes from? Besides the one-liner explanation in "Rupp et al Phys. Rev. Lett.,108(5):058301, 2012", do you know where can one find more detailed explanation of how this relation is derived? – Blade Sep 29 '19 at 00:58