
We have a Spring Boot application integrated with a Node.js and socket.io chat application, and we want to add natural language processing to it. We can't decide whether Apache OpenNLP or NLTK would be the better choice, since both frameworks offer the kind of processing we need.

In terms of features, both frameworks are good, and both have what we are looking for. Rather than a feature-by-feature comparison, I would like a perspective on which one would suit our architecture better.

Any suggestions?

Sharanya K M

1 Answer


It is tough to say which product will meet your needs better without knowing what your needs are. OpenNLP can perform tokenization, sentence detection, POS tagging, named entity detection, language detection, document classification, chunking, and sentence parsing. It also gives lower-level access to maximum entropy and naive Bayes classifiers. I use OpenNLP often. NLTK appears to do the same things (I don't really use it, so I can't tell you all its benefits). One difference is that OpenNLP is Java whereas NLTK is Python, so your language preference can come into play. Another difference is that NLTK has built-in methods for downloading corpora.

If you were a little more specific about what you wanted, people could give you better advice.
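
Since the Java/Python split is the main architectural difference mentioned above, one way to keep the decision reversible is to hide the NLP library behind a small interface in the Spring Boot service. The following is only a minimal sketch under stated assumptions: `TextAnalyzer`, `WhitespaceAnalyzer`, and `ChatMessageService` are hypothetical names invented for illustration, and the whitespace splitter is a JDK-only stand-in so the example runs without extra jars. In a real service the implementation would presumably delegate to OpenNLP in-process (for example `SimpleTokenizer.INSTANCE.tokenize(...)`), or call out to a separate Python process if NLTK were chosen.

```java
import java.util.ArrayList;
import java.util.List;

class NlpArchitectureSketch {

    /** Hypothetical seam between the chat backend and whichever NLP library is chosen. */
    interface TextAnalyzer {
        List<String> tokenize(String message);
    }

    /**
     * Stand-in implementation that uses only the JDK.
     * Assumption: with OpenNLP on the classpath, this body could instead call
     * opennlp.tools.tokenize.SimpleTokenizer.INSTANCE.tokenize(message).
     */
    static class WhitespaceAnalyzer implements TextAnalyzer {
        @Override
        public List<String> tokenize(String message) {
            List<String> tokens = new ArrayList<>();
            for (String part : message.trim().split("\\s+")) {
                if (!part.isEmpty()) {   // a blank message yields no tokens
                    tokens.add(part);
                }
            }
            return tokens;
        }
    }

    /** Hypothetical chat-side service; it depends only on the interface. */
    static class ChatMessageService {
        private final TextAnalyzer analyzer;

        ChatMessageService(TextAnalyzer analyzer) {
            this.analyzer = analyzer;
        }

        int countTokens(String message) {
            return analyzer.tokenize(message).size();
        }
    }

    public static void main(String[] args) {
        ChatMessageService service = new ChatMessageService(new WhitespaceAnalyzer());
        System.out.println(service.countTokens("which library fits best"));
    }
}
```

With this shape, the chat-handling code never imports an NLP library directly, so swapping the stand-in for an OpenNLP-backed class (or an HTTP client talking to a Python service) would only touch one implementation class.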

bad_coder
HowYaDoing
  • I think I have. I know the features both frameworks provide, but what I would like is guidance on which one would suit an architecture built on the technologies mentioned above. – Sharanya K M Oct 31 '17 at 07:23
  • Personally, I really like OpenNLP, because you just need to add the OpenNLP dependencies to Maven and download the models. I think it will integrate easily with a Spring Boot app; I do this very thing, and it is easy. I am sure the NLTK community would argue that even though it is Python-based, you can integrate it into a Java app. Long story short, I suggest OpenNLP. But keep in mind that I work with AND ON OpenNLP, so I have a pro-OpenNLP bias. – HowYaDoing Oct 31 '17 at 13:55
  • What do you think about using [DKPro Core](https://dkpro.github.io/dkpro-core/)? This framework provides a generic API for many NLP frameworks and comes as a Maven dependency, which means you can exchange any of the frameworks. If you also use domain-specific NLP models built from your own corpus, you can use [this project](https://git.noc.fh-aachen.de/marc.schreiber/Towards-Effective-NLP-Application-Development) to determine the best-performing NLP pipeline for your domain. Let me know if you have any questions. – schrieveslaach Nov 01 '17 at 06:31
  • @HowYaDoing... Thanks a lot :) – Sharanya K M Nov 01 '17 at 08:18
  • @Schrieveslaach... Oh Wow!!... Will check it out .... – Sharanya K M Nov 01 '17 at 08:20
  • @Schrieveslaach I am not familiar with DKPro Core, but it appears that it wraps many different NLP projects for UIMA. This is very useful if you are creating a UIMA-based solution. But I am not sure that Sharanya is considering UIMA, which brings its own issues. – HowYaDoing Nov 01 '17 at 15:27
  • @HowYaDoing, it is true that UIMA adds another architectural layer, but DKPro Core unifies all NLP tools through their wrappers. This means Sharanya can exchange any of these tools. I don't know the application domain, but if it differs from newspaper text (in which case the NLP models won't work well), the models and tools can easily be exchanged for better-performing ones. Therefore, Sharanya can use the [second project](https://git.noc.fh-aachen.de/marc.schreiber/Towards-Effective-NLP-Application-Development) to find the best NLP tools for the application domain. – schrieveslaach Nov 01 '17 at 15:39
  • I checked it out and I'm not keen on using UIMA, so I have chosen to go with OpenNLP. Thanks a lot guys :-) – Sharanya K M Nov 02 '17 at 12:00
  • @HowYaDoing... Are there any tutorials for OpenNLP that can help us implement it rather than just understand the concepts? I found a couple of articles, but is there a blog or something similar that we can follow? – Sharanya K M Nov 06 '17 at 10:45
  • People put up tutorials on using OpenNLP pretty fast. I would recommend a quick Google search like "sentence parsing OpenNLP" or "part of speech tagging OpenNLP". That will get you started. Then you can ask a new, more specific question on how to use the sentence detector, tokenizer, POS tagger, etc. – HowYaDoing Nov 07 '17 at 14:27