When I run my Python script, it fails with the error "SMILES Parse Error". The code is shown below.

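import numpy as np
from rdkit import Chem as rdkit_chem  # assumed alias for the name used below

# SmilesVocab is defined elsewhere in the project; smi_vocab is its token list.
def convert_vocab_to_smiles_fn(smi_vocab_coding_padded,
                               token_EOS=SmilesVocab.smi_vocab.index('<EOS>')):
  # Positions of the <EOS> token in the padded integer encoding.
  index_eos = np.where(smi_vocab_coding_padded == token_EOS)[0]
  f_counter = 0
  if index_eos.size == 0:
    # No <EOS> found: return None plus the full decoded string.
    f_counter += 1
    return None, ''.join(
      [SmilesVocab.smi_vocab[x] for x in smi_vocab_coding_padded])
  elif index_eos[0] == 0:
    # <EOS> is the first token: return False plus the full decoded string.
    f_counter += 1
    return False, ''.join(
      [SmilesVocab.smi_vocab[x] for x in smi_vocab_coding_padded])
  else:
    # Decode only the tokens before the first <EOS>.
    int_encoding = smi_vocab_coding_padded[:index_eos[0]]
    smiles_encoding = ''.join(
      [SmilesVocab.smi_vocab[x] for x in int_encoding])

    try:
      # MolFromSmiles returns None (and logs a parse error) for invalid SMILES.
      rdkit_mol_encoding = rdkit_chem.MolFromSmiles(smiles_encoding)
    except Exception:
      rdkit_mol_encoding = None
  # Note: this branch returns (smiles, mol); the branches above return (flag, smiles).
  return smiles_encoding, rdkit_mol_encoding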

I searched for a solution on the Internet, but found nothing.

yan

1 Answer


I don't understand how you build smiles_encoding, but it gives you corrupted SMILES.

Here is an example where only the first SMILES is correct.

from rdkit import Chem

s = ['c1ccccc1', '-c1ccccc1c', 'c1c(cccc1', 'c1cccc']

mol = []

for smiles_encoding in s:
    try:
        # MolFromSmiles returns None for an invalid SMILES and logs a parse error
        rdkit_mol_encoding = Chem.MolFromSmiles(smiles_encoding)
        mol.append(rdkit_mol_encoding)
    except Exception:
        rdkit_mol_encoding = None

print(mol)

Output:

[<rdkit.Chem.rdchem.Mol object at 0x00000175DDA89F20>, None, None, None]


[22:43:58] SMILES Parse Error: syntax error while parsing: -c1ccccc1c
[22:43:58] SMILES Parse Error: Failed parsing SMILES '-c1ccccc1c' for input: '-c1ccccc1c'
[22:43:58] SMILES Parse Error: extra open parentheses for input: 'c1c(cccc1'
[22:43:58] SMILES Parse Error: unclosed ring for input: 'c1cccc'
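
The last two messages describe structural problems (an unmatched parenthesis and a ring bond that is never closed). As a rough illustration of what they mean, here is a minimal pre-check in plain Python. The function name `quick_smiles_check` is hypothetical, and this is only a sketch, not a SMILES parser: it tracks parentheses and ring-bond labels and nothing else, so RDKit remains the authority on validity.

```python
def quick_smiles_check(smiles):
    """Return a list of structural problems found by a rough scan.

    Hypothetical helper: only tracks parentheses and ring-bond labels.
    """
    problems = []
    depth = 0           # current nesting depth of branches
    open_rings = set()  # ring-bond labels opened but not yet closed
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == '[':
            # Skip bracket atoms such as [NH4+]; digits inside are not ring labels.
            end = smiles.find(']', i)
            if end == -1:
                problems.append('unclosed bracket atom')
                break
            i = end
        elif ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:
                problems.append('extra close parenthesis')
                depth = 0
        elif ch == '%':
            # Two-digit ring label written as %nn.
            label = smiles[i + 1:i + 3]
            if len(label) == 2 and label.isdigit():
                open_rings ^= {label}
                i += 2
        elif ch.isdigit():
            # A digit opens a ring on first sight and closes it on second.
            open_rings ^= {ch}
        i += 1
    if depth > 0:
        problems.append('extra open parenthesis')
    if open_rings:
        problems.append('unclosed ring')
    return problems


for smi in ['c1ccccc1', '-c1ccccc1c', 'c1c(cccc1', 'c1cccc']:
    print(smi, quick_smiles_check(smi))
```

The scan flags 'c1c(cccc1' and 'c1cccc', but it reports nothing for '-c1ccccc1c', whose leading bond symbol is a syntax error that only a real parser catches.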

Either the way you build the SMILES strings is faulty, or your input data already contains corrupted SMILES.
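
To tell the two cases apart, it may help to print the decoded string before it reaches RDKit. The sketch below runs under assumptions: the vocabulary `vocab`, the special tokens `<SOS>` and `<PAD>`, and the helper `decode_tokens` are hypothetical and are not taken from your `SmilesVocab`. It shows how special tokens that leak into the joined string would give text that no SMILES parser accepts.

```python
# Hypothetical vocabulary; the real SmilesVocab.smi_vocab may differ.
vocab = ['<PAD>', '<SOS>', '<EOS>', 'c', '1', '(', ')', 'C', 'O', '=']
SPECIAL = {'<PAD>', '<SOS>', '<EOS>'}


def decode_tokens(indices, vocab=vocab):
    """Join tokens up to the first <EOS>, dropping special tokens."""
    tokens = []
    for i in indices:
        tok = vocab[i]
        if tok == '<EOS>':
            break          # everything after <EOS> is padding
        if tok in SPECIAL:
            continue       # drop <SOS> / <PAD> instead of joining them
        tokens.append(tok)
    return ''.join(tokens)


# <SOS> c 1 c c c c c 1 <EOS> <PAD> <PAD>
encoded = [1, 3, 4, 3, 3, 3, 3, 3, 4, 2, 0, 0]

naive = ''.join(vocab[i] for i in encoded)  # joins every token, special ones too
clean = decode_tokens(encoded)

print(repr(naive))
print(repr(clean))
```

If the printed string still contains special tokens or stray characters, the decoding step is the likely culprit. If it looks like plain SMILES but RDKit still rejects it, the data itself is suspect.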

rapelpy