Part-Of-Speech tagging and Named Entity Recognition for C/C++/Obj-C

Question

Need some help!

I'm trying to write some code in Objective-C that requires part-of-speech tagging, and ideally also named entity recognition. I don't have much interest in "rolling my own", so I'm looking for a decent library to use for this purpose. Obviously the more accurate the better, but we're not talking anything critical here -- as long as it's generally pretty accurate, that's good enough.

It's going to be English-only, at least for the time being, but I don't want to have to do any training of models myself. So whatever the solution, it has to have an English language model already built.

And finally, it has to be available via a commercial-friendly license (e.g. BSD/Berkeley, LGPL). Can't do GPL or anything restrictive like that, though I'm open to paying a small amount for a commercial license if that's the only option.

C, C++ or Obj-C code is all fine.

So: Anyone familiar with something that'd do the trick here? Thanks!!

Yes. And now that iOS 5 is out from under the NDA, I can specifically tell you that what you want to look at is the NSLinguisticTagger class. It does its best to recognize proper names of people, places and organizations. It's reasonably successful. — DanM, Nov 03 '11 at 18:12

score 3 · Accepted Answer · answered Jul 09 '11 at 22:52

I suggest you check out the iOS 5 beta release notes.

jtbandes

Since the NDA no longer applies to iOS 5: The class that handles this stuff is NSLinguisticTagger. – DanM Nov 03 '11 at 18:13

score 1 · Answer 2 · answered Jul 09 '11 at 22:45

As you've probably figured out, most of the NLP code that's freely available is in Python, Perl or Java. However, a quick look at Stanford's NLP tools page shows a few things in C/C++ that are available. Another list of tools can be found in a blog post.

Of the POS taggers, YamCha is well-known, though I have not used it myself (being a Java/Python/Perl guy).

Unfortunately, I cannot suggest any NER tools. However, I bet there's a maxent or SVM implementation in C/C++ that you can work with: 1) create your training data and annotate it, 2) define your features, 3) use the ML library to train and apply a model.
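To make step 2 a bit more concrete, here is a minimal sketch in plain C of what "define your features" might look like for a single token. The feature set (capitalization, all-caps, digits, sentence position, a three-character suffix) and the names `token_features`, `extract_features` and `format_features` are my own assumptions for illustration, not the API of any particular maxent or SVM library. The `key=value` output line is likewise just a hypothetical format; check what your chosen library actually expects.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-token features; this set is an assumption, not a standard. */
typedef struct {
    int is_capitalized;  /* first character is an uppercase letter */
    int is_all_caps;     /* at least one letter, and every letter is uppercase */
    int has_digit;       /* contains at least one digit */
    int is_first;        /* token is the first one in its sentence */
    char suffix[4];      /* last (up to) 3 characters, lowercased */
} token_features;

/* Compute the features for one token at the given position in a sentence. */
static token_features extract_features(const char *token, int position)
{
    token_features f;
    size_t len, start, i;
    size_t letters = 0, uppers = 0;

    assert(token != NULL);
    len = strlen(token);
    start = len > 3 ? len - 3 : 0;

    memset(&f, 0, sizeof f);
    f.is_first = (position == 0);
    f.is_capitalized = len > 0 && isupper((unsigned char)token[0]) != 0;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)token[i];
        if (isalpha(c)) {
            letters++;
            if (isupper(c))
                uppers++;
        }
        if (isdigit(c))
            f.has_digit = 1;
    }
    f.is_all_caps = letters > 0 && letters == uppers;

    /* Copy the lowercased suffix and terminate it. */
    for (i = start; i < len; i++)
        f.suffix[i - start] = (char)tolower((unsigned char)token[i]);
    f.suffix[len - start] = '\0';

    return f;
}

/* Write the features as one "key=value" line (hypothetical format). */
static int format_features(const token_features *f, char *buf, size_t size)
{
    return snprintf(buf, size, "cap=%d allcaps=%d digit=%d first=%d suf=%s",
                    f->is_capitalized, f->is_all_caps, f->has_digit,
                    f->is_first, f->suffix);
}
```

You would call `extract_features` for each token of each annotated sentence, pair the resulting line with the token's label (person, place, organization, or "other"), and feed those pairs to whatever training routine the library provides. Real NER systems usually add context features as well, such as the previous and next tokens, but that is beyond this sketch.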

Sorry I can't be of more help, but if anything else comes to mind I'll add it.

Maybe once I figure out Objective-C to a respectable degree I'll write an NLP library for it!
