
I have a list of words produced by POS tagging in Java. Now I want to remove the words that carry particular tags. How can I use StringTokenizer to remove tagged words, for example to-PRP, and more generally every word tagged PRP?

The input file:

mike-NNS

Buses-NNP

Walk_VRB

to_PRP

... and so on

Rohit Jain
  • 1
    If you post some code to show that you've attempted to solve your question, then we are more likely to help you correct any problems with it. We're not going to simply do your homework for you. – wattostudios Nov 19 '12 at 13:32
  • 1
    It seems that regex is a better choice to replace all your undesired words – cl-r Nov 19 '12 at 13:45

1 Answer

    // Needs java.util.ArrayList, java.util.List and java.util.StringTokenizer
    final List<String> result = new ArrayList<String>();

    final List<String> textList = getList(); // get your list

    for (final String line : textList) {
      // StringTokenizer takes a String, so tokenize each entry separately
      final StringTokenizer tokenizer =
        new StringTokenizer(line, delimiter); // your delimiter
      while (tokenizer.hasMoreTokens()) {
        final String token = tokenizer.nextToken();
        if (isValid(token)) { // implement your own isValid method
          result.add(token);
        }
      }
    }
    return result;
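
For illustration, here is a self-contained sketch of how the validity check could look. It assumes the tag is whatever follows the last `-` or `_` in a token (the sample input uses both separators) and that tokens are separated by whitespace. The names `TagFilter`, `tagOf` and `removeTagged` are made up for this example and do not come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TagFilter {

  // Returns the tag of a token such as "to_PRP" or "mike-NNS":
  // the text after the last '-' or '_'. Empty string if neither is present.
  static String tagOf(String token) {
    int cut = Math.max(token.lastIndexOf('-'), token.lastIndexOf('_'));
    return cut < 0 ? "" : token.substring(cut + 1);
  }

  // Keeps only the tokens whose tag differs from unwantedTag (case-insensitive).
  static List<String> removeTagged(String text, String unwantedTag) {
    List<String> result = new ArrayList<String>();
    // With no delimiter argument, StringTokenizer splits on whitespace.
    StringTokenizer tokenizer = new StringTokenizer(text);
    while (tokenizer.hasMoreTokens()) {
      String token = tokenizer.nextToken();
      if (!tagOf(token).equalsIgnoreCase(unwantedTag)) {
        result.add(token);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    String input = "mike-NNS\nBuses-NNP\nWalk_VRB\nto_PRP";
    // prints [mike-NNS, Buses-NNP, Walk_VRB]
    System.out.println(removeTagged(input, "PRP"));
  }
}
```

To drop several tags at once, one option is to pass a `Set<String>` of unwanted tags and test membership instead of comparing against a single tag.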
Jan Schmidt