I am working on collective classification of entities and am using the CRFClassifier class for sequence labelling. My requirement is that a certain feature F_i should NOT be paired with a certain class label C_i.

I have specified various flags in the properties file for CRFClassifier (from Stanford CoreNLP), and NERFeatureFactory generates the features accordingly. Internally, I think it generates a total of L*N binary feature functions (indicator functions), where L = #classLabels and N = #features. Out of all the functions in this cross product, I do not want to consider a few (F_i, C_i) pairs. What is the best way to achieve this?
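To make the cross product concrete, here is a minimal standalone sketch of what I mean by dropping a few pairs. It does not use any CoreNLP code; the class name, the method `allowedPairs`, the "feature|label" key format, and the example feature and label names are all hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class PairEnumerationSketch {

    /**
     * Enumerates the feature x label cross product as "feature|label" keys,
     * skipping every key listed in excluded. (Hypothetical helper, not CoreNLP.)
     */
    static List<String> allowedPairs(List<String> features,
                                     List<String> labels,
                                     Set<String> excluded) {
        List<String> pairs = new ArrayList<>();
        for (String f : features) {
            for (String c : labels) {
                String key = f + "|" + c;
                if (!excluded.contains(key)) {
                    pairs.add(key);
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        // N = 2 features, L = 3 labels, so the full cross product has 6 pairs.
        List<String> features = List.of("isCapitalized", "hasDigit");
        List<String> labels = List.of("PERSON", "ORG", "O");
        // The single pair that should not exist in the model.
        Set<String> excluded = Set.of("hasDigit|PERSON");

        for (String pair : allowedPairs(features, labels, excluded)) {
            System.out.println(pair);
        }
    }
}
```

With 2 features, 3 labels, and one excluded pair, this lists the 5 remaining pairs. What I am after is the equivalent of this filtering inside the classifier's own feature index.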

Note: I think the L*N functions are generated by getObjectiveFunction, which is called at the following location:

CRFClassifier {
    protected double[] trainWeights(int[][][][] data, .......) {
        CRFLogConditionalObjectiveFunction func =
            getObjectiveFunction(data, labels);
    }
}

The protected variable EHat in class CRFLogConditionalObjectiveFunction contains the empirical counts for these L*N features.

For the combinations that I do not want in my classifier, would it be okay to explicitly set the empirical count to 0 (in the EHat variable) before I call the Minimizer? Would that be the same as saying that I do not have that combination in my model?
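To spell out what worries me, here is a toy sketch. It is not CoreNLP code and every name in it is hypothetical. It assumes the standard log-linear result that, ignoring regularization, the gradient of the negative log-likelihood for each (feature, label) weight is the expected count minus the empirical count. I am assuming, but have not verified, that CRFLogConditionalObjectiveFunction behaves the same way.

```java
public class MaskedPairSketch {

    /**
     * Toy gradient of the negative log-likelihood, indexed [feature][label]:
     * expected count minus empirical count. Pairs flagged in excluded get a
     * gradient of exactly 0, so a minimizer starting from weight 0 never moves them.
     */
    static double[][] gradient(double[][] expected,
                               double[][] empirical,
                               boolean[][] excluded) {
        double[][] g = new double[expected.length][];
        for (int f = 0; f < expected.length; f++) {
            g[f] = new double[expected[f].length];
            for (int c = 0; c < expected[f].length; c++) {
                g[f][c] = excluded[f][c] ? 0.0 : expected[f][c] - empirical[f][c];
            }
        }
        return g;
    }

    public static void main(String[] args) {
        // Made-up numbers: one feature, two labels.
        double[][] expected = {{0.5, 0.5}};
        double[][] empirical = {{1.0, 0.0}};

        // Option A: only overwrite the empirical count of pair (0, 0) with 0.
        double[][] zeroedEmpirical = {{0.0, 0.0}};
        double[][] a = gradient(expected, zeroedEmpirical, new boolean[][] {{false, false}});

        // Option B: mask pair (0, 0) so that its gradient is forced to 0.
        double[][] b = gradient(expected, empirical, new boolean[][] {{true, false}});

        System.out.println("zeroed empirical count, gradient for pair (0,0): " + a[0][0]);
        System.out.println("masked pair, gradient for pair (0,0): " + b[0][0]);
    }
}
```

In this toy version, zeroing only the empirical count still leaves a nonzero gradient for the pair, so the minimizer would keep updating that weight, whereas masking keeps the weight at its initial value. That difference is why I am unsure whether editing EHat alone really removes the combination from the model.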

Does Mallet provide a way to do this?

sapan shah