I want to write my own model using OpenNLP MaxEnt. For that, I want to implement the ContextGenerator and EventStream interfaces (as mentioned in the documentation). I looked at the implementations for the OpenNLP Chunker, POSTagger and NameFinder, but all of them use 'Pair', which is deprecated, and just by looking at the code I don't understand what their respective ContextGenerators are doing. The model I will be creating will classify each token as a RoomNumber or not a RoomNumber by looking at the POS tag of each token. How should I start coding the ContextGenerator and EventStream for this model? I know what a context is and what a feature is, but I don't know what a ContextGenerator does or what an EventStream does. I did look at the OpenNLP MaxEnt page, but it was not helpful. Please help me understand this. Thank you.
Are you interested in using the pure Maxent classifier from OpenNLP or the higher-level API that uses Maxent? – Viliam Simko Nov 06 '14 at 16:24
1 Answer
The following code might help, although it does not use the ContextGenerator explicitly. Actually, the BasicContextGenerator is used within the BasicEventStream, and it just splits each input string into a list of features. E.g. the String "a=1 b=2 c=1" is split into 3 features: "a=1", "b=2" and "c=1".
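To make the splitting step concrete, here is a minimal sketch in plain Java. It is not the OpenNLP class itself: `SplitContextSketch` and `toContext` are hypothetical names, and the sketch assumes that features are separated by whitespace, as in the example string above.

```java
import java.util.Arrays;

final class SplitContextSketch {
    // Hypothetical helper (not part of OpenNLP): turns a whitespace-separated
    // string into an array of feature strings, one feature per token.
    static String[] toContext(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String[] features = toContext("a=1 b=2 c=1");
        System.out.println(Arrays.toString(features)); // prints [a=1, b=2, c=1]
    }
}
```

An array of this shape is what is passed as the context in the training and evaluation calls, so a context generator is essentially a function from your input to such a `String[]`.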
If you just want to use the Maxent API to train the model and then to use it for classification, you can use the following approach that worked for me:
package opennlptest;

import java.io.IOException;
import java.util.Arrays;
import java.util.List;

import opennlp.maxent.GIS;
import opennlp.model.Event;
import opennlp.model.EventStream;
import opennlp.model.ListEventStream;
import opennlp.model.MaxentModel;

public class TestMaxentEvents {

    static Event createEvent(String outcome, String... context) {
        return new Event(outcome, context);
    }

    public static void main(String[] args) throws IOException {

        // here are the input training samples
        List<Event> samples = Arrays.asList(new Event[] {
            // outcome + context
            createEvent("c=1", "a=1", "b=1"),
            createEvent("c=1", "a=1", "b=0"),
            createEvent("c=0", "a=0", "b=1"),
            createEvent("c=0", "a=0", "b=0")
        });

        // training the model
        EventStream stream = new ListEventStream(samples);
        MaxentModel model = GIS.trainModel(stream);

        // using the trained model to predict the outcome given the context
        String[] context = {"a=1", "b=0"};
        double[] outcomeProbs = model.eval(context);
        String outcome = model.getBestOutcome(outcomeProbs);

        System.out.println(outcome); // output: c=1
    }
}
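For the RoomNumber case from the question, the context for each token could be built from its POS tag and the tags of its neighbours. The following is only a sketch under assumptions: `PosContextSketch`, `contextFor`, the feature names (`pos=`, `prevpos=`, `nextpos=`) and the boundary markers `BOS`/`EOS` are all made up for illustration, and the POS tags are assumed to be already available as a `String[]`. It uses plain Java only, with no OpenNLP calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PosContextSketch {
    // Hypothetical context generator: describes the token at index i by its
    // own POS tag and the POS tags of the previous and next tokens.
    static String[] contextFor(String[] posTags, int i) {
        List<String> features = new ArrayList<>();
        features.add("pos=" + posTags[i]);
        // BOS / EOS are assumed markers for the sentence boundaries.
        features.add("prevpos=" + (i > 0 ? posTags[i - 1] : "BOS"));
        features.add("nextpos=" + (i < posTags.length - 1 ? posTags[i + 1] : "EOS"));
        return features.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Assumed POS tags for the tokens "Meet", "in", "room", "204".
        String[] posTags = {"VB", "IN", "NN", "CD"};
        String[] context = contextFor(posTags, 3);
        System.out.println(Arrays.toString(context)); // prints [pos=CD, prevpos=NN, nextpos=EOS]
    }
}
```

Each such array, paired with an outcome label of your choosing (for example "RoomNumber" or "Other"), could then be wrapped in an Event in the same way as createEvent does in the example above, and the same kind of array would be passed to model.eval at prediction time.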

Viliam Simko