Where can I get training data for a part-of-speech tagger?

Question

I want to implement a part-of-speech tagger, but I don't know where I can get a lot of training data. Thanks!

https://www.google.com/search?q=pos+corpus – tripleee Aug 16 '14 at 13:36

score 5 · Accepted Answer · answered Aug 16 '14 at 13:32

There's a training set and testing set from the chunking shared task of the CoNLL-2000 conference here:

Others have used this to train part-of-speech taggers:
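
For anyone wondering how that chunking data could feed a tagger: a minimal sketch, assuming the files keep one token per line as `word POS chunk` separated by whitespace, with a blank line between sentences. The function name `read_conll2000` and the inline `SAMPLE` text are made up for illustration; in practice you would pass an open file from the download instead.

```python
from typing import Iterable, List, Tuple

TaggedSentence = List[Tuple[str, str]]


def read_conll2000(lines: Iterable[str]) -> List[TaggedSentence]:
    """Collect (word, POS) pairs per sentence, ignoring the chunk column.

    Hypothetical helper: assumes three whitespace-separated columns per
    token line and blank lines as sentence boundaries.
    """
    sentences: List[TaggedSentence] = []
    current: TaggedSentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current sentence, if there is one.
            if current:
                sentences.append(current)
                current = []
            continue
        word, pos, _chunk = line.split()
        current.append((word, pos))
    if current:
        # The last sentence may not be followed by a blank line.
        sentences.append(current)
    return sentences


# Illustrative sample in the assumed format, not the real corpus file.
SAMPLE = """\
He PRP B-NP
reckons VBZ B-VP
the DT B-NP
deficit NN I-NP
. . O

Prices NNS B-NP
fell VBD B-VP
. . O
"""

if __name__ == "__main__":
    for sentence in read_conll2000(SAMPLE.splitlines()):
        print(sentence)
```

The resulting lists of (word, tag) pairs are the shape most tagger training code expects. With the real data you would call `read_conll2000(open(path, encoding="utf-8"))`, where `path` points at wherever you saved the training file.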

The link has changed to https://www.clips.uantwerpen.be/conll2000/chunking/ – Rouzbeh Sep 11 '17 at 17:58

score 3 · Answer 2 · answered Jul 18 '19 at 01:18

https://catalog.ldc.upenn.edu/LDC99T42 <--- They want $1700.00 or $850.00 if you have a Reduced-License :-(
