Stanford CoreNLP Dependency Parser Usage with Unsupported Languages

Question

I am trying to train CoreNLP's neural-network-based dependency parser on Turkish data. I found the command below in the parser's documentation:

Train a parser with CoNLL treebank data:

java edu.stanford.nlp.parser.nndep.DependencyParser -trainFile trainPath -devFile devPath -embedFile wordEmbeddingFile -embeddingSize wordEmbeddingDimensionality -model modelOutputFile.txt.gz

I couldn't figure out exactly what modelOutputFile is. The documentation states that this file is written during the training phase. Is modelOutputFile a pre-generated file that I should create myself, or is it just an empty file that will be written automatically during training?

Any help will be appreciated, thank you!

score 1 · Answer 1 · answered Nov 02 '17 at 20:04

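When the training process is done, it should write the trained model to modelOutputFile.txt.gz, so you should not need to create that file beforehand; the -model argument is simply the path where the model will be saved. You can then use that trained file to parse new text. Full documentation is here: https://nlp.stanford.edu/software/nndep.shtml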
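If you want to confirm that training actually produced the model, one option is to check that the file exists and peek at its first few lines. The sketch below is only an illustration and is not part of CoreNLP: the class name ModelFileCheck and the helper firstLines are hypothetical, and it assumes the model is gzip-compressed text, as the .txt.gz extension suggests. It uses only the Java standard library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not a CoreNLP class: inspects the file passed to -model.
class ModelFileCheck {

    // Reads up to maxLines lines from a gzip-compressed text file.
    // Assumption: the model file is gzipped text, as ".txt.gz" suggests.
    static List<String> firstLines(Path gzPath, int maxLines) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(gzPath)),
                StandardCharsets.UTF_8))) {
            String line;
            while (lines.size() < maxLines && (line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Default to the file name used in the question's training command.
        Path model = Paths.get(args.length > 0 ? args[0] : "modelOutputFile.txt.gz");
        if (!Files.exists(model)) {
            System.out.println("No model file yet: " + model);
            return;
        }
        System.out.println("Model file size: " + Files.size(model) + " bytes");
        for (String line : firstLines(model, 5)) {
            System.out.println(line);
        }
    }
}
```

Run it after training with the same path you passed to -model. If it reports that there is no model file, training has probably not finished or did not complete successfully.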

StanfordNLPHelp
