3

I found that NLTK in Python does this via its *raw_parse* function, but I need to use Java. I also found that ClearTK has a MaltParser wrapper, but there is no documentation for it. I'm looking for a function or a project that first converts raw English text to a CoNLL file that MaltParser can use and then parses it with MaltParser. Any help is appreciated.

demongolem
Halil

1 Answer

0

The MaltParser 1.7.2 distribution comes with examples in the folder examples/apiexamples/srcex.

However, these examples only show how to run MaltParser programmatically after tokenization and POS tagging have already been performed (and after the output of these steps has been converted to a CoNLL-like format).
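
To illustrate what that conversion step could look like, here is a minimal sketch in plain Java. It assumes you already have tokens and POS tags from some other tool, and it emits only the first six CoNLL-X columns (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS), with `_` for anything unknown. The class name `ConllFormatter` and the method `toConllLines` are made up for this example and are not part of MaltParser. Whether your parser model expects exactly these columns and this tag set is something you would need to check.

```java
// Hypothetical helper, not part of MaltParser: turns tokens plus POS tags
// into tab-separated CoNLL-X style lines (first six columns only).
class ConllFormatter {

    /**
     * Builds one line per token with the columns
     * ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS.
     * LEMMA and FEATS are unknown here, so they are written as "_".
     * The same tag is used for CPOSTAG and POSTAG (an assumption).
     */
    static String[] toConllLines(String[] tokens, String[] posTags) {
        if (tokens.length != posTags.length) {
            throw new IllegalArgumentException(
                "tokens and posTags must have the same length");
        }
        String[] lines = new String[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            lines[i] = String.join("\t",
                String.valueOf(i + 1), // ID is 1-based within the sentence
                tokens[i],             // FORM
                "_",                   // LEMMA (not available)
                posTags[i],            // CPOSTAG
                posTags[i],            // POSTAG
                "_");                  // FEATS (not available)
        }
        return lines;
    }

    public static void main(String[] args) {
        // Tokens and tags are hard-coded here; in practice they would come
        // from a tokenizer and a POS tagger.
        String[] tokens = {"The", "dog", "barks", "."};
        String[] tags = {"DT", "NN", "VBZ", "."};
        for (String line : toConllLines(tokens, tags)) {
            System.out.println(line);
        }
    }
}
```

The resulting lines for one sentence are the kind of input the API examples in the distribution start from; the tokenization and tagging that produce `tokens` and `tags` still have to come from a separate tool.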

Since I currently cannot offer a better (simpler/shorter) alternative, I can at least share a link to a Groovy script which performs tokenization, part-of-speech tagging (using OpenNLP) and dependency parsing (using MaltParser). The tools are made interoperable using UIMA. If you are familiar with Maven, it should be quite straightforward to derive a Java version of that script.

Mind you, this is not the best answer, but at this point it is possibly better than nothing.

Note: I'm a developer on both Apache UIMA and DKPro Core (the project to which the link points).

rec
  • I believe none of those parse raw text. They all take CoNLL-formatted input. – Dana Jul 29 '13 at 16:58
  • 1
    What should I say, you're right... Stupid me... in order to run MaltParser on raw text, one would require a tokenizer and a part-of-speech tagger. – rec Jul 29 '13 at 19:34