I want to use the GraphAware NLP package to automatically extract NLP features from Dutch texts in Neo4j.

For this purpose I wanted to use OpenNLP, as it should support Dutch. The installation went well and I can annotate English texts, but Dutch texts throw the following error:

Neo.ClientError.Procedure.ProcedureCallFailed
Failed to invoke procedure `ga.nlp.annotate`: Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Unsupported language : nl

I called the OpenNLP text processor with:

MATCH (n:News)
CALL ga.nlp.annotate({text:n.text, id: n.uuid, textProcessor: "com.graphaware.nlp.processor.opennlp.OpenNLPTextProcessor", pipeline: "tokenizer"}) YIELD result
MERGE (n)-[:HAS_ANNOTATED_TEXT]->(result)
RETURN n, result

So it successfully detects that the fragment is Dutch, but it cannot annotate it.

As a workaround, I tried manually downloading the Dutch models, but I don't know how to load them and connect them to a pipeline. It also seems odd that they are not included by default.

Ivo Merchiers
  • Right now the process of using other language models is very complex. That said, we are currently working on it, and in 2 or 3 weeks we expect to release a new version of NLP where you can just put the models in the plugins directory. – Christophe Willemsen Oct 11 '18 at 08:09
  • @ChristopheWillemsen Is there any update on this? – Ivo Merchiers Nov 02 '18 at 11:10
  • Yes, OpenNLP has been upgraded to catch up with the latest changes. We are refactoring language management in general and creating the basis for language packs, so people can choose to load models for their own language. We will officially support only English ourselves for Community, and Arabic, Chinese and German for Enterprise. – Christophe Willemsen Nov 02 '18 at 13:51
  • @ChristopheWillemsen Is there any progress on the language management yet? I am also interested in using the Dutch OpenNLP pack with GraphAware... – N Meibergen Jun 04 '19 at 15:44
