I want to add a custom resource that tells SyntaxNet to combine two tokens into a single token. I'm processing biomedical data from NCBI, where a species is almost always written together with its genus (genus + species). I need each genus + species pair to be kept as a single token.

For example:

Arthrobacter globiformis (genus = "Arthrobacter", species = "globiformis")
Desulfosporosinus meridiei (genus = "Desulfosporosinus", species = "meridiei")
E. coli (genus = "E.", species = "coli")

Is there a way to do this in SyntaxNet that does not include retraining?

Shane

1 Answer

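I'm afraid there is no solution to your problem that is both easy and principled. The simplest workaround is to preprocess your data before parsing it with SyntaxNet, so that each genus + species pair already looks like one token when it reaches the parser. A more principled solution would require code changes.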
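
As a rough illustration of the preprocessing idea, here is a minimal Python sketch that joins known genus + species pairs with an underscore before the text is handed to the parser. It does not use any SyntaxNet API. The `SPECIES` list, the function names, and the underscore joiner are all assumptions made for this example, and whether the tokenizer in your pipeline keeps an underscore-joined string as one token is something you would need to check.

```python
import re

# Hypothetical list of binomial names; in practice this might be
# loaded from a taxonomy file rather than hard-coded.
SPECIES = [
    "Arthrobacter globiformis",
    "Desulfosporosinus meridiei",
    "E. coli",
]


def build_pattern(names):
    """Compile one regex that matches any of the given names as whole words."""
    # Longest alternatives first, so a longer name wins over a shorter prefix.
    alternatives = sorted((re.escape(n) for n in names), key=len, reverse=True)
    return re.compile(r"(?<!\w)(?:" + "|".join(alternatives) + r")(?!\w)")


def merge_species(text, pattern, joiner="_"):
    """Replace the space inside each matched name with `joiner`."""
    return pattern.sub(lambda m: m.group(0).replace(" ", joiner), text)


PATTERN = build_pattern(SPECIES)

if __name__ == "__main__":
    sentence = "Growth of E. coli and Arthrobacter globiformis was measured."
    print(merge_species(sentence, PATTERN))
```

After parsing, the joiner can be replaced with a space again to recover the original spelling. A fixed name list only covers species you already know about; names missing from the list are left untouched.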