
I am working on a project in which I need to process text. I tried OpenNLP and experimented with FreeLing, and both gave me good results (lemmas, sentence splitting, phrase chunking and POS tags). I then trained MaltParser on this CoNLL file (http://www.linguateca.pt/floresta/CoNLL-X/), and the POS tags that MaltParser learned from it are different from the ones OpenNLP and FreeLing produce. I know that one way to deal with this is to convert the POS tags from OpenNLP (or FreeLing) into the tags MaltParser expects. What I would like to know is whether there is a program that can train its models on the CoNLL format, so that the POS tags and lemmas it produces are ones MaltParser knows. If possible, it should work with Java and on Windows.
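One way to avoid mapping tags by hand might be to retrain the tagger on the same treebank, so that it emits the tagset MaltParser was trained on. The sketch below is only an illustration of that idea: it assumes the file follows the standard ten-column, tab-separated CoNLL-X layout (ID, FORM, LEMMA, CPOSTAG, POSTAG, ...), with a blank line between sentences, and it assumes the target is the one-sentence-per-line `word_tag` format that the OpenNLP POS tagger trainer reads. The function name `conll_to_word_tag` and the sample rows are made up for the example and are not taken from the treebank.

```python
def conll_to_word_tag(lines, tag_column=4):
    """Yield one 'word_tag word_tag ...' string per CoNLL-X sentence.

    tag_column is the 0-based column holding the tag:
    3 = CPOSTAG (coarse), 4 = POSTAG (fine-grained).
    """
    tokens = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if tokens:
                yield " ".join(tokens)
                tokens = []
            continue
        cols = line.split("\t")
        if len(cols) <= tag_column:
            continue  # skip malformed rows instead of failing
        form, tag = cols[1], cols[tag_column]
        tokens.append(f"{form}_{tag}")
    if tokens:
        # Flush the last sentence if the input lacks a trailing blank line.
        yield " ".join(tokens)


# Hypothetical rows in CoNLL-X layout, invented for illustration only.
SAMPLE = (
    "1\tO\to\tart\tart\t_\t2\t>N\t_\t_\n"
    "2\tgato\tgato\tn\tn\t_\t3\tSUBJ\t_\t_\n"
    "3\tdorme\tdormir\tv\tv-fin\t_\t0\tSTA\t_\t_\n"
    "4\t.\t.\tpunc\tpunc\t_\t3\tPUNC\t_\t_\n"
    "\n"
)

if __name__ == "__main__":
    for sentence in conll_to_word_tag(SAMPLE.splitlines()):
        print(sentence)  # prints: O_art gato_n dorme_v-fin ._punc
```

To convert a real file, the same function can be fed an open file handle, for example `conll_to_word_tag(open(path, encoding="utf-8"))`, with the result written one sentence per line. Which column to use (CPOSTAG or POSTAG) depends on which one the MaltParser model was configured to read, so that choice is left as a parameter.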

Thanks.

user3591111
  • Have you tried `http://www.nltk.org/_modules/nltk/corpus/reader/conll.html` and `https://github.com/nltk/nltk/blob/develop/nltk/parse/malt.py`? – alvas Feb 23 '16 at 21:23
  • From what I can see, the first one only reads CoNLL files and the second one is hardcoded for English, and my language is not English. – user3591111 Feb 23 '16 at 23:13
  • The MaltParser API in NLTK is not specific to English. It can train new models from a CoNLL file and also parse sentences with other pre-trained models; see https://github.com/alvations/nltk/blob/f4c16c2f9c46cc42c9b68ae746832b622581c6b5/nltk/parse/malt.py#L435 – alvas Feb 24 '16 at 00:04

0 Answers