From the website: "ClearTK provides a framework for developing statistical natural language processing (NLP) components in Java and is built on top of Apache UIMA. It is developed by the Center for Computational Language and Education Research (CLEAR) at the University of Colorado at Boulder."
Features (from the website):
- A common interface and wrappers for popular machine learning libraries such as SVMlight, LIBSVM, LIBLINEAR, OpenNLP MaxEnt, and Mallet.
- A rich feature extraction library that can be used with any of the machine learning classifiers. Under the covers, ClearTK understands each of the native machine learning libraries and translates your features into a format appropriate to whatever model you're using.
- Infrastructure for creating NLP components for specific tasks such as part-of-speech tagging, BIO-style chunking, named entity recognition, semantic role labeling, temporal relation tagging, etc.
- Wrappers for common NLP tools such as the Snowball stemmer, the OpenNLP tools, the MaltParser dependency parser, and the Stanford CoreNLP tools.
- Corpus readers for collections like the Penn Treebank, ACE 2005, CoNLL 2003, Genia, TimeBank and TempEval.
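To make the feature-translation point in the list above concrete, here is a minimal sketch of how named features might be turned into the sparse `label index:value` lines read by tools such as SVMlight and LIBSVM. It is an illustration under stated assumptions, not ClearTK's actual API: the class `FeatureEncodingSketch`, its `encode` method, and the example feature names `word` and `length` are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; none of these names come from ClearTK.
class FeatureEncodingSketch {
    // Assigns each distinct feature key a 1-based index in order of first appearance.
    private final Map<String, Integer> featureIndex = new LinkedHashMap<>();

    // Encodes one labeled instance as a sparse "label index:value ..." line.
    String encode(int label, Map<String, Object> features) {
        // TreeMap keeps indices ascending, as the sparse format expects.
        TreeMap<Integer, Double> sparse = new TreeMap<>();
        for (Map.Entry<String, Object> entry : features.entrySet()) {
            Object raw = entry.getValue();
            String key;
            double value;
            if (raw instanceof Number) {
                // Numeric features keep their name and their value.
                key = entry.getKey();
                value = ((Number) raw).doubleValue();
            } else {
                // Non-numeric features become binary "name=value" indicators.
                key = entry.getKey() + "=" + raw;
                value = 1.0;
            }
            Integer index = featureIndex.get(key);
            if (index == null) {
                index = featureIndex.size() + 1;
                featureIndex.put(key, index);
            }
            sparse.put(index, value);
        }
        StringBuilder line = new StringBuilder(Integer.toString(label));
        for (Map.Entry<Integer, Double> cell : sparse.entrySet()) {
            line.append(' ').append(cell.getKey()).append(':').append(cell.getValue());
        }
        return line.toString();
    }

    public static void main(String[] args) {
        FeatureEncodingSketch encoder = new FeatureEncodingSketch();

        Map<String, Object> first = new LinkedHashMap<>();
        first.put("word", "Boulder");
        first.put("length", 7);
        System.out.println(encoder.encode(1, first));   // prints "1 1:1.0 2:7.0"

        Map<String, Object> second = new LinkedHashMap<>();
        second.put("word", "the");
        second.put("length", 3);
        System.out.println(encoder.encode(-1, second)); // prints "-1 2:3.0 3:1.0"
    }
}
```

The sketch keeps one index table per encoder so that the same feature maps to the same column across instances. A different backend would presumably need a different serializer over the same named features, which is the kind of translation the feature list describes.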