In most cases, including yours, the Standard Analyzer is sufficient. It is also the default analyzer in Elasticsearch, and it provides grammar-based tokenization. For example:
"The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
will be tokenized into [ the, 2, quick, brown, foxes, jumped, over, the, lazy, dog's, bone ].
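As a rough illustration of what this tokenization does, here is a minimal Python sketch. It only approximates the behavior for this example sentence and is not Elasticsearch's actual implementation; the function name approx_standard_analyze and the regular expression are assumptions made for illustration.

```python
import re

# Rough approximation: runs of ASCII letters/digits, optionally joined by an apostrophe.
# The real analyzer follows Unicode text segmentation rules, which this does not cover.
_TOKEN = re.compile(r"[a-z0-9]+(?:'[a-z0-9]+)*")


def approx_standard_analyze(text):
    """Lowercase the text and split it into word-like tokens."""
    return _TOKEN.findall(text.lower())


tokens = approx_standard_analyze(
    "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
)
print(tokens)
```

To see what Elasticsearch itself produces for a given text, the _analyze API is the authoritative check.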
In your case, the domain names are tokenized into the following list of terms: [techtarget, americanexpress, theamericanexpress, thefacebook].
Why does a search for facebook not return anything?
Because there is no facebook term stored in the dictionary. Elasticsearch looks up the search term facebook in the dictionary, but the dictionary only contains thefacebook, so the search returns no results.
Solution:
To match the search term facebook against thefacebook, you need to wrap your search term in wildcards. For example, the regular expression .*facebook will match thefacebook. However, you should know that regular expression and wildcard queries carry a performance overhead, especially when the pattern starts with a wildcard.
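The difference between the exact lookup and the pattern lookup can be sketched in plain Python, using the term list from above. The two query bodies at the end show how the same idea might be sent to Elasticsearch; the field name domain is an assumption, so substitute the field from your own mapping.

```python
import re

# Terms in the dictionary, as listed earlier in this answer.
terms = ["techtarget", "americanexpress", "theamericanexpress", "thefacebook"]

# Exact term lookup: "facebook" is not in the dictionary, so nothing matches.
exact_hits = [t for t in terms if t == "facebook"]

# Pattern lookup: the pattern has to match the whole term.
pattern_hits = [t for t in terms if re.fullmatch(r".*facebook", t)]

print(exact_hits)
print(pattern_hits)

# Query bodies expressing the same idea ("domain" is a hypothetical field name).
regexp_query = {"query": {"regexp": {"domain": {"value": ".*facebook"}}}}
wildcard_query = {"query": {"wildcard": {"domain": {"value": "*facebook"}}}}
```

The regexp query uses regular expression syntax (.*), while the wildcard query uses the simpler * and ? placeholders; both have to scan many terms when the pattern begins with a wildcard.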
Another workaround is to use synonyms. Synonyms let you specify a list of alternative terms for a search term, e.g. "facebook, thefacebook, facebooksocial, fb, fbook". With these synonyms configured, searching for any one of the terms matches all of the others. That is, if your search term is facebook and your domain is stored as thefacebook, the search will match.
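A synonym list like this is usually configured as a synonym token filter inside a custom analyzer. The following is a minimal sketch of such an index definition, built as a Python dictionary. The names domain_synonyms, domain_analyzer and the field domain are assumptions made for this example, and the mappings layout assumes an Elasticsearch version without mapping types (7.x or later).

```python
import json

# Synonym rule taken from the example above: all listed terms are treated as equivalent.
SYNONYMS = "facebook, thefacebook, facebooksocial, fb, fbook"

index_body = {
    "settings": {
        "analysis": {
            "filter": {
                # Hypothetical filter name.
                "domain_synonyms": {"type": "synonym", "synonyms": [SYNONYMS]}
            },
            "analyzer": {
                # Hypothetical analyzer name: standard tokenizer plus the synonym filter.
                "domain_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "domain_synonyms"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            # Hypothetical field name.
            "domain": {"type": "text", "analyzer": "domain_analyzer"}
        }
    },
}

# Request body that would be sent when creating the index.
print(json.dumps(index_body, indent=2))
```

Because the analyzer is applied to the field, existing documents would need to be reindexed for the synonyms to take effect at index time.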
Also, for prioritization, you first need to understand how scoring works in Elasticsearch; then you can use boosting.
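One possible way to apply boosting here is a bool query that scores an exact term match higher than a wildcard match. This is only a sketch: the helper prioritized_query, the field name domain, and the boost values 2.0 and 1.0 are all assumptions chosen for illustration, and the right values depend on your data.

```python
def prioritized_query(term, field="domain"):
    """Build a query body that ranks exact matches above wildcard matches.

    The helper name, the default field name and the boost values are hypothetical.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Exact term match gets the higher boost.
                    {"term": {field: {"value": term, "boost": 2.0}}},
                    # Wildcard match (e.g. thefacebook) gets the lower boost.
                    {"wildcard": {field: {"value": "*" + term + "*", "boost": 1.0}}},
                ],
                # At least one of the two clauses has to match.
                "minimum_should_match": 1,
            }
        }
    }


query = prioritized_query("facebook")
print(query)
```

A document whose term is exactly facebook satisfies both clauses and so scores above one that only matches the wildcard.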