I have written a custom analyzer that wraps a StandardAnalyzer with a LengthFilter. Now I want to filter out all terms that consist only of digits. What is the best way to implement this?
- Is it on just one field? If so, you can just use a FieldBridge to not add terms containing just numbers to the Document. – robertvoliva Apr 14 '12 at 03:21
- What is FieldBridge in Lucene? – Rohit Banga Apr 14 '12 at 04:20
2 Answers
You may be in for a custom TokenFilter. Check out one of the simplest filters out there, the LowerCaseFilter. I think you'll find it easy to write your own along those lines.
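The heart of such a filter is the per-term check. Here is a minimal sketch of that check in plain Java, with no Lucene dependency so it can be tried on its own. The class and method names (`NumericTermCheck`, `isAllDigits`) are made up for illustration. The `(buffer, length)` parameters are chosen to mirror what `CharTermAttribute` exposes through `buffer()` and `length()`.

```java
// Hypothetical helper, not part of Lucene: decides whether a term is digits only.
final class NumericTermCheck {
    private NumericTermCheck() {}

    /**
     * Returns true when the first `length` chars of `buffer` are all digits.
     * Only the first `length` chars are inspected, because a term buffer
     * may be larger than the term it currently holds.
     */
    static boolean isAllDigits(char[] buffer, int length) {
        if (length == 0) {
            return false; // an empty term is not treated as a number
        }
        for (int i = 0; i < length; i++) {
            if (!Character.isDigit(buffer[i])) {
                return false; // found a non-digit, so keep the term
            }
        }
        return true;
    }
}
```

In a TokenFilter subclass, `incrementToken()` would presumably keep calling `input.incrementToken()` and skip every term for which this check returns true. Note that skipping tokens also raises the question of whether position increments should be adjusted, which this sketch does not address.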

Marko Topolnik
You can use the PatternReplaceFilter to detect and remove numbers from the TokenStream by using a regular expression.
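One caveat: as far as I know, PatternReplaceFilter rewrites the text of a term rather than dropping the token, so an all-digit term ends up as an empty string that still has to be removed afterwards (for example by a LengthFilter with a minimum length of 1). The sketch below shows only the regular-expression part in plain Java, using `java.util.regex` directly. The names `NumericTermPattern` and `blankIfNumeric` are made up for illustration and are not Lucene API.

```java
import java.util.regex.Pattern;

// Hypothetical helper showing the effect of the replacement on single terms.
final class NumericTermPattern {
    // Anchored so that only terms made up entirely of digits match.
    static final Pattern ALL_DIGITS = Pattern.compile("^\\d+$");

    private NumericTermPattern() {}

    /** Returns "" for a digits-only term and the unchanged term otherwise. */
    static String blankIfNumeric(String term) {
        return ALL_DIGITS.matcher(term).replaceAll("");
    }
}
```

With this pattern, a term such as `2012` becomes empty while `abc123` and `v2` pass through untouched. The same pattern and an empty replacement string are what you would hand to the filter.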

Bertil Chapuis