ElasticSearch: check how analyzers/tokenizers/filters applied to an index split text into tokens?

Question

I'm quite new to Elasticsearch, so please forgive me if I'm overlooking something obvious or basic.

I'm now using Elasticsearch at work, and I want to see how the complex analyzer/tokenizer/filter settings that my predecessors configured split text into tokens.

I did some research and found a way to do it:

GET /_analyze
{
  "tokenizer" : "whitespace",
  "filter" : ["lowercase", {"type": "stop", "stopwords": ["a", "is", "this"]}],
  "text" : "this is a test"
}

However, as I said, the analyzer/tokenizer/filter settings are so complicated that writing out the details every time I test them would slow me down horribly.

So I want to analyze text with the analyzer/tokenizer/filter settings already applied to an index. Is there a way to do that?

I would appreciate it if anyone could shed some light on this.

score 1 · Accepted Answer · answered Jan 09 '23 at 06:11


You don't have to supply the complete analyzer definition to the analyze API every time. You can simply call the _analyze API on the index, like this:

GET <your-index-name>/_analyze
{
  "analyzer" : "standard",
  "text" : "Quick Brown Foxes!"
}

Instead of calling the analyze API at the cluster level, you call it at the index level, where the analyzer definitions are already present. You then only need to provide the analyzer's name, not its definition (tokenizer, filters, etc.), to get the tokens that analyzer produces. Replace standard with the name of an analyzer defined in your index settings.
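
If you aren't sure which analyzer names the index defines, they should be listed under the analysis section of the response to GET <your-index-name>/_settings. As a rough sketch in Python, assuming the default nested (not flat) settings format, you could pull the names out of that response as shown here. The helper name custom_analyzer_names and the sample response are made up for illustration, and built-in analyzers such as standard won't appear in the settings.

```python
def custom_analyzer_names(settings_response, index_name):
    """Return the names of custom analyzers defined in an index's settings.

    `settings_response` is the parsed JSON body of GET <index>/_settings
    (nested format, i.e. without flat_settings).
    """
    index_settings = settings_response.get(index_name, {}).get("settings", {})
    analysis = index_settings.get("index", {}).get("analysis", {})
    return sorted(analysis.get("analyzer", {}))


# Hypothetical response for illustration only; real settings will differ.
sample = {
    "my-index": {
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        "my_ngram_analyzer": {"type": "custom", "tokenizer": "my_ngram"},
                        "my_kuromoji_analyzer": {"type": "custom", "tokenizer": "kuromoji_tokenizer"},
                    }
                }
            }
        }
    }
}

print(custom_analyzer_names(sample, "my-index"))
# → ['my_kuromoji_analyzer', 'my_ngram_analyzer']
```

Any name returned this way can be used as the "analyzer" value in the index-level _analyze request.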

Refer to the official Elasticsearch documentation on the analyze API for examples of using it on a specific index or on a specific field.
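
To illustrate the field variant: the request body can carry a "field" name instead of an "analyzer" name, in which case Elasticsearch derives the analyzer from that field's mapping. The Python helper below is only a sketch. build_analyze_request, my-index, my_ngram_analyzer and title are placeholder names, not anything from your cluster, and the helper only builds the path and JSON body. Sending it is left to whichever HTTP client you use.

```python
import json


def build_analyze_request(index, text, analyzer=None, field=None):
    """Build the path and JSON body for an index-level _analyze call.

    This helper accepts exactly one of `analyzer` (a name defined on the
    index, or a built-in one) or `field` (use the analyzer from that
    field's mapping), to keep the intent of each request explicit.
    """
    if (analyzer is None) == (field is None):
        raise ValueError("pass exactly one of analyzer or field")
    body = {"text": text}
    if analyzer is not None:
        body["analyzer"] = analyzer
    else:
        body["field"] = field
    return f"/{index}/_analyze", json.dumps(body)


# Placeholder index, analyzer, and field names for illustration.
print(build_analyze_request("my-index", "Quick Brown Foxes!", analyzer="my_ngram_analyzer"))
print(build_analyze_request("my-index", "Quick Brown Foxes!", field="title"))
```

The returned path and body correspond to what you would type in the console as GET my-index/_analyze followed by the JSON body.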

Hope this helps.

Amit


Thank you for sharing great information again! You really saved my day. By the way, is there a way to provide multiple analyzer names in one query? It looks like my predecessors applied multiple analyzers (kuromoji/n-gram) to an index... – Nullable Yogurt Jan 09 '23 at 06:22
@NullableYogurt can you please ask a follow-up question with all the details? Also, don't forget to mark this answer as accepted :) I'm glad my answer is helping you and the community :) – Amit Jan 09 '23 at 06:27

Definitely. Sorry, I'm not yet familiar with how things work here. I'll post a new question soon. – Nullable Yogurt Jan 09 '23 at 06:33
@NullableYogurt that's OK, it's just very difficult to put all the info in the comments. It's also not related to this question, so it wouldn't be useful for others :) – Amit Jan 09 '23 at 06:38
