I have a use case where I need to attach dynamic keywords (NER tags) to words in a text field. The keywords have to reach the TokenFilter either from another field of the same document or from an HTTP endpoint. Since a TokenFilter doesn't have access to other fields, what is the best way to do this?

For example, here is a sample document:

{
 "title_field": "my name is Rahul",
 "tag_field": {"rahul": ["person", "developer"]}
}

So, while analysing `title_field`, the tokens `person` and `developer` should also be attached to `rahul`, so that when I search for "person" I get this document back.
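To make the desired result concrete, here is a rough sketch of the token stream I have in mind. It is plain Python, not Lucene or Elasticsearch code: `Token` and `analyze_with_tags` are hypothetical names, the whitespace split and lowercasing only stand in for a real analyzer, and placing the tags at the same position and offsets as the matching word is my assumption about what "pointing to" should mean (it is how stacked synonyms behave).

```python
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    term: str
    position: int      # token position, which phrase/proximity queries rely on
    start_offset: int  # character offsets into the original text,
    end_offset: int    # which highlighting relies on


def analyze_with_tags(text: str, tags: Dict[str, List[str]]) -> List[Token]:
    """Hypothetical stand-in for the analysis chain: split on whitespace,
    lowercase, and stack each tag on top of the word it belongs to."""
    tokens: List[Token] = []
    cursor = 0
    for position, word in enumerate(text.split()):
        start = text.index(word, cursor)
        end = start + len(word)
        cursor = end
        term = word.lower()
        tokens.append(Token(term, position, start, end))
        # Tags reuse the word's position and offsets, like stacked synonyms.
        for tag in tags.get(term, []):
            tokens.append(Token(tag, position, start, end))
    return tokens


doc = {
    "title_field": "my name is Rahul",
    "tag_field": {"rahul": ["person", "developer"]},
}
tokens = analyze_with_tags(doc["title_field"], doc["tag_field"])
for token in tokens:
    print(token)
```

With a stream like this, a search for "person" would match the document, highlight "Rahul" in the original text, and still take part in proximity queries such as "name" near "person". The open question is how to get the per-document `tag_field` into the real analysis chain.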

Note: I do not want to put the tags in a separate field and search there, because I want to support highlighting and proximity queries.

Any help would be appreciated!

Thank you!

  • Just for clarification: Does the question mean that only this one document should be using these tokens (and other documents would use other tokens)? Or are all tokens applicable to all documents? For example, in this specific document, `rahul` has 2 tokens that need to be used. But what if `rahul` appears in other documents as well? Should those other docs also be retrieved when you use these 2 tokens? – andrewJames Sep 04 '22 at 17:59
  • If the answer is: _use these tokens across all docs_, then you can [use Lucene synonyms](https://stackoverflow.com/q/42071623/12567365) for that. Otherwise, I don't know how that can be achieved using Elasticsearch. – andrewJames Sep 04 '22 at 18:01
  • @andrewJames Only this one document should be using these tokens. Another document with *rahul* can have different tokens. – Rahul Agarwal Sep 05 '22 at 08:05
  • Understood, thanks. Then the only way I can think to do this is to build a query using regexes and spans - see [ElasticSearch and Regex queries](https://stackoverflow.com/q/25313051/12567365). Someone else may know a better way (not sure if my suggestion even works). – andrewJames Sep 05 '22 at 13:09
