I have been going through this blog post which contains a SimpleTagger example.

It says:

Given an input file "sample" as follows:

CAPITAL Bill  noun
        slept non-noun
        here non-noun
where all but the last token on each line is a binary feature, and the last token on the line is the label name.

So, how do I add the word-level features here?

For example: the number of syllables in the word, the length of the word, etc.

Dawny33

1 Answer

Everything before the last token is treated as a feature. You should be able to add arbitrary features before this:

CAP SYL1 CHAR4 Bill noun
SYL3 CHAR9 responded non-noun
...
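
Since the features are binary, a numeric property such as length has to be folded into the feature name itself (SYL1, CHAR4). As a rough sketch, lines in this format could be generated with a small script like the one below. The helper names `count_syllables` and `feature_line` are made up for illustration, and the syllable counter is only a crude vowel-run heuristic, not a real syllabifier:

```python
import re

def count_syllables(word):
    # Crude heuristic (an assumption, not a real syllabifier):
    # count each run of consecutive vowels as one syllable.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def feature_line(word, label):
    # Build one line: binary feature names first, label as the last token.
    features = []
    if word[:1].isupper():
        features.append("CAP")
    features.append(f"SYL{count_syllables(word)}")  # e.g. SYL3
    features.append(f"CHAR{len(word)}")             # e.g. CHAR9
    features.append(word)                           # the word itself is also a feature
    return " ".join(features + [label])

sentence = [("Bill", "noun"), ("responded", "non-noun")]
lines = [feature_line(word, label) for word, label in sentence]
print("\n".join(lines))
```

For the two words above, this prints the same two lines as the example. Any other word-level property can be added the same way, as long as each distinct value becomes its own feature name.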
David Mimno