
I am a beginner in natural language processing. I have to work with several languages, and Tamil is one of them. Could any experts tell me whether there is a Tamil tokenizer (in Java, C, Python, etc.) and a part-of-speech tagger that I could use for my research?

I would really appreciate some expert opinions here. Any help is welcome.

Thanks

S.EB
  • I'm not aware of any corpora for the Tamil language. If there aren't any available, you could use [NLPf](https://gitlab.com/schrieveslaach/NLPf) to train custom NLP models (tokenization, sentence segmentation, POS tagging) for the Tamil language. I'm happy to demonstrate the project to you. – schrieveslaach Jan 23 '18 at 14:46
  • @Schrieveslaach Thank you very much for your help. You are really helpful. Thanks a lot. – S.EB Jan 29 '18 at 12:02
  • You are welcome. :-) Have you tried NLPf? If you did, would you fill out a questionnaire about the project? It shouldn't take longer than fifteen minutes. Could you reach out to me via e-mail? – schrieveslaach Jan 29 '18 at 12:10
  • @Schrieveslaach, my apologies, I did not notice that `NLPf` is your project. Thank you very much for your help. I will definitely complete the questionnaire and contact you via e-mail. Thanks a lot once again. You have been a great help. – S.EB Jan 29 '18 at 13:01
  • I'm happy to provide help with NLPf. Just send me an e-mail and I will send you the questionnaire. You will find my e-mail address in the commits or on my website (the link is available on my profile). – schrieveslaach Jan 29 '18 at 14:09
  • Did you find my e-mail address? – schrieveslaach Jan 31 '18 at 09:43
  • @S.EB I'm a beginner at NLP, looking for a library for tokenizing, POS tagging, and morphological analysis in Tamil. Were you able to find any good resources that you could recommend? Thank you so much! – Niveditha Karmegam Feb 12 '19 at 14:16
  • @Schrieveslaach Is there any chance you could help me with training NLP models for tokenizing and POS tagging in Tamil? I was able to find some Tamil corpora as well. – Niveditha Karmegam Feb 12 '19 at 14:20
  • @NivedithaKarmegam, I'm currently very busy due to my job and an upcoming conference. Could you reach out to me in about one and a half weeks? – schrieveslaach Feb 13 '19 at 15:17
  • @Schrieveslaach Thank you so much! I will contact you in another one and a half weeks' time. Thanks again, this would be of great help! – Niveditha Karmegam Feb 13 '19 at 18:57

1 Answer


I found one tool for tokenization: the Indic NLP Library, which supports Tamil.
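
To give a feel for what tokenizing Tamil text involves, here is a minimal sketch using only Python's standard library. It is not the Indic NLP Library's implementation, just a simple regex-based stand-in; the function name `tokenize_tamil`, the pattern, and the sample sentence are my own assumptions for illustration. It treats a run of characters from the Unicode Tamil block as one word and splits punctuation off as separate tokens.

```python
import re

# Assumption: Tamil text lives in the Unicode Tamil block (U+0B80 to U+0BFF).
# Vowel signs and the pulli are combining marks, which Python's \w does not
# match, so the block is listed explicitly to keep each word in one piece.
TOKEN_PATTERN = re.compile(
    r"[\u0B80-\u0BFF]+"   # a run of Tamil characters (letters, signs, digits)
    r"|[A-Za-z0-9]+"      # a run of ASCII letters or digits
    r"|\S"                # any other non-space character, e.g. punctuation
)


def tokenize_tamil(text):
    """Split text into Tamil words, ASCII words, and single punctuation marks."""
    return TOKEN_PATTERN.findall(text)


if __name__ == "__main__":
    # Hypothetical sample sentence ("Tamil is a language.")
    print(tokenize_tamil("தமிழ் ஒரு மொழி."))
```

A whitespace-and-punctuation split like this is only a starting point. It does no morphological segmentation, which matters for an agglutinative language like Tamil, so for real work a dedicated library or a trained model is probably the better choice.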


I could not find any POS tagging tools available online, but I did find some papers:

2008 Morpheme based Language Model for Tamil Part-of-Speech Tagging

2009 CRF Models for Tamil Part of Speech Tagging and Chunking

2009 Improvement of Rule Based Morphological Analysis and POS Tagging in Tamil Language via Projection and Induction Techniques

Maybe you can contact the authors for help.


Alternatively, if you can speak Tamil, try searching the internet in Tamil (especially university websites); you may find some resources and tools.

Lynten