7

Can someone recommend an open source POS tagger for Korean, Indonesian, Thai and Vietnamese?

I need one that I can use to tag the corpus data I currently have (something like the stanford-postagger).

If you are a developer and are willing to share your POS tagger so I can test it, that would be welcome too.

With some modifications to the output, I've POS tagged the Vietnamese data with jvntextpro.
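For anyone doing a similar output clean-up, here is a minimal sketch of the kind of post-processing that is often needed. It assumes the tagger writes whitespace-separated `word/TAG` tokens, one sentence per line; that format, the `UNK` fallback tag, the example tags, and the function names are assumptions for illustration, not something confirmed for jvntextpro.

```python
from typing import List, Tuple


def parse_tagged_line(line: str, sep: str = "/") -> List[Tuple[str, str]]:
    """Split one line of 'word<sep>TAG' tokens into (word, tag) pairs.

    The 'word/TAG' layout is an assumption; pass a different `sep`
    (for example "_") if your tagger uses another separator.
    """
    pairs = []
    for token in line.split():
        # rpartition splits on the LAST separator, so a word that itself
        # contains the separator (e.g. "1/2/M") keeps its full form.
        word, found, tag = token.rpartition(sep)
        if not found:
            # No separator in the token: keep it and mark the tag as unknown.
            word, tag = token, "UNK"
        pairs.append((word, tag))
    return pairs


def to_two_column(pairs: List[Tuple[str, str]]) -> str:
    """Render (word, tag) pairs as tab-separated 'word<TAB>tag' lines."""
    return "\n".join(f"{word}\t{tag}" for word, tag in pairs)


if __name__ == "__main__":
    # Hypothetical tagged sentence; the tags are placeholders.
    sample = "Tôi/P là/V học_sinh/N 1/2/M"
    print(to_two_column(parse_tagged_line(sample)))
```

Splitting on the last separator matters because multi-syllable words or numbers can contain the separator character themselves; a plain `split(sep)` would break those tokens apart.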

But I'd still like more input on Korean, Indonesian and Thai POS tagging.

hippietrail
alvas

2 Answers

5

From the ACL wiki: Korean morphological analyzer and part-of-speech tagger

I would start by looking at the websites of NLP research departments in Korea, Thailand, and Indonesia. On this page, you will find links to the research departments.

Good luck!

UPDATE: OpenNLP has a Thai POS tagger. The models for the OpenNLP POS tagger are here: http://opennlp.sourceforge.net/models/thai/

Skarab
I've found the Korean tagger on this page: http://isoft.postech.ac.kr/Course/CS730b/2005/index.html. Now only the Thai tagger is missing, haha. Thanks for the page, but we need a better collation of NLP resources. – alvas Apr 16 '11 at 04:49
0

You might want to try RDRPOSTagger: a robust, easy-to-use and language-independent toolkit for POS and morphological tagging.

(Programming language: Python & Java)

RDRPOSTagger is fast in both training and tagging, and its accuracy is very competitive with state-of-the-art results. See this paper for experimental results, including speed and tagging accuracy.

RDRPOSTagger now supports pre-trained POS and morphological tagging models for 13 languages, including Thai and Vietnamese.

NQD