I am using CRFsuite for sequence classification (POS tagging). To my surprise, CRFsuite does not seem to like the label ':'. Tokens whose actual label is ':' are skipped entirely, and the prediction output gives no indication of a missing or skipped item.

I use other punctuation-related labels such as '.' or ',', and these are handled and output correctly.

Has anyone had a similar experience, or does anyone know why ':' is skipped?

Franck Dernoncourt
toobee

1 Answer

From http://www.chokkan.org/software/crfsuite/tutorial.html:

CRFsuite accepts any string as an attribute name as long as the string does not contain a colon character (that is used to separate an attribute name and its weight).

So if you have an attribute like w[0]=the:0.5, the attribute name is "w[0]=the" and the weight is 0.5.
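One possible workaround is to replace the colon with a placeholder before writing the data file and to restore it after reading the predictions. The sketch below is only an illustration under assumptions: the placeholder string `__COLON__` and all function names are made up here, `split_attribute` merely mimics the name/weight split described above (it is not CRFsuite's parser), and it assumes the label field is affected by the same colon rule, which the symptom in the question suggests but the quoted tutorial text does not state.

```python
# Hypothetical placeholder; any string that cannot occur in your labels
# or attribute names would do.
COLON_PLACEHOLDER = "__COLON__"


def split_attribute(field):
    """Illustrate the 'name:weight' convention by splitting at the last colon.

    This only mimics the rule quoted above; it is not CRFsuite's own parser.
    A field without a colon is given a default weight of 1.0 (an assumption).
    """
    name, sep, weight = field.rpartition(":")
    if not sep:
        return field, 1.0
    return name, float(weight)


def escape_colons(text):
    """Replace every colon so the text is safe to use as a label or name."""
    return text.replace(":", COLON_PLACEHOLDER)


def unescape_colons(text):
    """Restore colons in a label read back from the prediction output."""
    return text.replace(COLON_PLACEHOLDER, ":")


def format_item(label, attributes):
    """Build one tab-separated data line: the label, then unweighted attributes."""
    fields = [escape_colons(label)] + [escape_colons(a) for a in attributes]
    return "\t".join(fields)


if __name__ == "__main__":
    print(split_attribute("w[0]=the:0.5"))
    print(format_item(":", ["w[0]=:", "pos[-1]=NN"]))
```

Applying `escape_colons` to both training and test data, and `unescape_colons` to the predicted labels, keeps the ':' tag distinguishable without ever putting a raw colon into the file.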

Tavian Barnes