I want to add synonyms to my index and am wondering whether to go the synonyms_path route or the synonyms route. I could not find out whether there is a performance difference between the two, so I wanted to make sure. My synonyms file is pretty big, so I was thinking of going with the path route to avoid cluttering the settings. Purely from a performance standpoint, is there any difference between referencing the file by path and putting the synonyms directly in the settings?
Amit
Belphegor21
1 Answer
If I understood correctly, you want to know the performance impact of creating the index settings with the synonym list included directly in the analyzer definition versus pointing to a synonym file by its path.
I am curious why you are so worried about it. It is normally a one-time process unless you update the index settings very frequently (which is very rare).
Also, you have not mentioned how many synonyms are in your list. Even if it is in the thousands, it shouldn't matter much; the main cost is that the API call gets slower as the amount of data sent over the network grows (I believe in your case it will be a few MB at most).
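To make the comparison concrete, here is a minimal Python sketch of the two request bodies and how large each one is on the wire. It only builds and measures the JSON; it does not talk to a cluster. The filter name my_synonyms, the analyzer name my_analyzer, the file location analysis/synonyms.txt, and the generated synonym rules are all assumptions made up for illustration.

```python
import json


def inline_settings(synonyms):
    """Index settings with the synonym rules embedded in the request body."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # "my_synonyms" is a hypothetical filter name.
                    "my_synonyms": {"type": "synonym", "synonyms": list(synonyms)}
                },
                "analyzer": {
                    # "my_analyzer" is a hypothetical analyzer name.
                    "my_analyzer": {
                        "tokenizer": "standard",
                        "filter": ["lowercase", "my_synonyms"],
                    }
                },
            }
        }
    }


def path_settings(synonyms_path):
    """Index settings that only reference a synonym file by its path."""
    body = inline_settings([])
    synonym_filter = body["settings"]["analysis"]["filter"]["my_synonyms"]
    del synonym_filter["synonyms"]
    # The file itself has to be available to the Elasticsearch nodes;
    # only this short string travels in the API call.
    synonym_filter["synonyms_path"] = synonyms_path
    return body


def payload_bytes(body):
    """Size of the JSON request body in bytes."""
    return len(json.dumps(body).encode("utf-8"))


if __name__ == "__main__":
    # Made-up rules standing in for a real synonym list of a few thousand lines.
    rules = [f"word{i}, term{i}" for i in range(4000)]
    print("inline body:", payload_bytes(inline_settings(rules)), "bytes")
    print("path body:  ", payload_bytes(path_settings("analysis/synonyms.txt")), "bytes")
```

The inline body grows with the number of rules, while the path body stays the same size no matter how long the file is. That request size is the difference being discussed here; either way it is paid only when the settings are created or updated.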

Amit
- The update, as you said, will be rare. The synonym file will be present on the same host as the ES cluster. My question is purely about performance: with a hardcoded synonym list vs synonyms_path, which is faster? Since updates will be very rare, both are viable options. – Belphegor21 Oct 16 '20 at 08:18
- @Belphegor21, thanks for your confirmation. As I mentioned, there would be hardly any performance difference between the two approaches. It's just that the API call will take more time when you send the hardcoded list, but that will not be significant unless you are sending more than 1 million hardcoded words. – Amit Oct 16 '20 at 08:26
- @Belphegor21 also, may I know the size of your list? – Amit Oct 16 '20 at 08:31
- 3-4 files of approx 1000 lines of synonyms each, so 3-4K synonyms in total. – Belphegor21 Oct 16 '20 at 11:18
- @Belphegor21 that's nothing, you can choose either approach; there will be very little perf difference :) – Amit Oct 16 '20 at 11:24
- In my particular case I'm aware that there would be no impact, but just to satisfy my curiosity I was wondering which would be better if the list were much bigger. – Belphegor21 Oct 16 '20 at 13:42
- @Belphegor21, file-based would be better if the list were much bigger, but as mentioned earlier it's a one-time thing and you should not think too much about its performance. – Amit Oct 17 '20 at 03:55