2

I have a question which closely relates to this question.

In my schema I have a field

<field name="text" type="textgen" indexed="true" stored="true" required="true"/>

This gives exact matches only, i.e. stemming is disabled:

eat = eat

Is it possible, while the field is configured as textgen, to search for other variants of the word?

e.g. eat = eat, eats, eating

A fuzzy query such as eat~0 returns similarly spelled words such as meat and beat, but this is not what I want.

I'm starting to think that the only way to achieve this is to add another field with a type other than textgen, but if there is a simpler way I am very interested to hear it.

halfer
Ruth

2 Answers

7

Using copyField statements is the normal approach in Solr. Since stemming is exactly what you are asking for, I recommend copying the content into a second, stemmed field. You can set stored="false" on that field if you are worried about index size.
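As a minimal sketch of that approach, the schema.xml fragment below keeps the original exact-match field and adds a stemmed copy. The field name text_stemmed is hypothetical, and the sketch assumes your schema still defines the stock text field type with a stemming filter, as the example schema shipped with Solr does.

```xml
<!-- Existing exact-match field from the question -->
<field name="text" type="textgen" indexed="true" stored="true" required="true"/>

<!-- Hypothetical second field; assumes the "text" field type applies stemming.
     stored="false" keeps the index smaller because only the original is stored. -->
<field name="text_stemmed" type="text" indexed="true" stored="false"/>

<!-- Copy the raw input of "text" into the stemmed field at index time -->
<copyField source="text" dest="text_stemmed"/>
```

After reindexing, a query such as text_stemmed:eat should match the stemmed variants, while text:eat continues to match only the exact term.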

You might also use lemmatisation, which here works in the opposite direction to stemming: instead of reducing words to a stem, you add all of a word's inflected forms. This is typically performed on the search query, expanding e.g. eat to eat, eats, eating, etc.
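One way to approximate this kind of query-time expansion in Solr is a synonym filter in the query analyzer. The fragment below is only a sketch: the field type name text_expanded and the file inflections.txt are hypothetical, and you would have to maintain the list of inflected forms yourself.

```xml
<!-- Hypothetical field type: no stemming at index time,
     inflected forms are added to the query instead. -->
<fieldType name="text_expanded" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- inflections.txt (hypothetical) would contain lines such as:
         eat, eats, eating -->
    <filter class="solr.SynonymFilterFactory" synonyms="inflections.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With expand="true", a query for eat is rewritten to search for every form listed on the same line of the synonyms file.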

A third alternative is wildcard search, although I wouldn't encourage it, not least because it bypasses all schema-configured filters for the target field.

Johan Sjöberg
1

If you use text as the field type, then eat, eats, eaten and eating will all be indexed as eat, and a search for FieldName:eat will find all of them. If you change the field type to textgen, then a search for FieldName:eat will only find "eat", not eats, eaten or eating.

Michael Dillon