Question on Java RegEx:
I have a tokenizer and I want it to return only tokens that are longer than a given length.
For example, I need to return all tokens that are longer than 1 character in this text: "This is a text ."
I need to get 3 tokens: "This", "is", and "text". The tokens "a" and "." are not needed. Note that the string can contain any characters (not only alphabetic ones).
I tried this code, but I am not sure how to complete it:
String[] lines = {"This is o n e l e tt e r $ % ! sentence"};
for (String line : lines)
{
    String orig = line;
    // Collapse runs of whitespace (including Unicode space separators) into a single space
    Pattern whitespace = Pattern.compile("[\\s\\p{Zs}]+");
    line = whitespace.matcher(orig).replaceAll(" ").trim();
    System.out.println("Test:\t'" + line + "'");
    Pattern singleWord = Pattern.compile(".+{1}"); //HOW CAN I DO IT?
    // The result must be assigned back, since replaceAll returns a new String
    line = singleWord.matcher(line).replaceAll(" ").trim();
    System.out.println("Test:\t'" + line + "'");
}
Thanks
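For reference, here is a minimal sketch of two possible ways to get the behavior described above, assuming tokens are separated by whitespace. The class name TokenFilter and the method names longTokens and stripShortTokens are hypothetical, chosen only for illustration. The first method splits and filters by length; the second stays closer to the replaceAll approach in the question and removes short tokens with a regex that uses lookarounds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class TokenFilter {
    // Same separator pattern as in the question: whitespace or Unicode space separators.
    private static final Pattern WHITESPACE = Pattern.compile("[\\s\\p{Zs}]+");

    // Returns the tokens of `line` whose length is greater than `minLength`.
    public static List<String> longTokens(String line, int minLength) {
        List<String> result = new ArrayList<>();
        for (String token : WHITESPACE.split(line.trim())) {
            // An empty leading token (possible after split) is dropped when minLength >= 0.
            if (token.length() > minLength) {
                result.add(token);
            }
        }
        return result;
    }

    // Regex-only variant: deletes every token of at most `maxLength` characters.
    // Assumes maxLength >= 1; otherwise the quantifier {1,0} is not a valid pattern.
    public static String stripShortTokens(String line, int maxLength) {
        String normalized = WHITESPACE.matcher(line).replaceAll(" ").trim();
        // (?<!\S) and (?!\S) require that the match is not adjacent to a non-space
        // character, so only whole tokens of 1..maxLength characters can match.
        Pattern shortToken = Pattern.compile("(?<!\\S)\\S{1," + maxLength + "}(?!\\S)");
        String stripped = shortToken.matcher(normalized).replaceAll("");
        // Removing tokens leaves runs of spaces behind, so normalize once more.
        return WHITESPACE.matcher(stripped).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        System.out.println(longTokens("This is a text .", 1));
        // prints [This, is, text]
        System.out.println(stripShortTokens("This is o n e l e tt e r $ % ! sentence", 1));
        // prints This is tt sentence
    }
}
```

The split-and-filter version is probably easier to read and adjust; the regex version may be preferable when the result has to remain a single string.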