I use Elasticsearch to search news articles. If I search for "Vladimir Putin", it works, because he is in the news a lot and neither "Vladimir" nor "Putin" is a very common name. But if I search for "Raja Ram", it does not work. I have a few articles about "Raja Ram", but also some about "Raja Mohanty" and "Ram Srivastava", and these rank higher than the articles that mention "Raja Ram". Is there something wrong with my tokenizer or my search call?
es.indices.create(
    index="article-index",
    body={
        'settings': {
            'analysis': {
                'analyzer': {
                    'my_ngram_analyzer': {
                        'tokenizer': 'my_ngram_tokenizer'
                    }
                },
                'tokenizer': {
                    'my_ngram_tokenizer': {
                        'type': 'nGram',
                        'min_gram': '1',
                        'max_gram': '50'
                    }
                }
            }
        }
    },
    # ignore already existing index
    ignore=400
)
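To make the tokenizer settings concrete, here is a rough pure-Python approximation of what a character n-gram tokenizer with `min_gram` 1 and `max_gram` 50 would emit. The function name `char_ngrams` is hypothetical, and this is only a sketch of the idea, not Elasticsearch's implementation. The snippet above also does not show the analyzer being attached to any field, so whether it affects the search at all is an assumption.

```python
def char_ngrams(text, min_gram=1, max_gram=50):
    """Return every substring of text whose length is in [min_gram, max_gram].

    Hypothetical helper approximating a character n-gram tokenizer;
    whitespace is kept as an ordinary character.
    """
    grams = []
    for size in range(min_gram, min(max_gram, len(text)) + 1):
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams


# N-grams that the query shares with a name that is only a partial match
query_grams = set(char_ngrams("Raja Ram"))
other_grams = set(char_ngrams("Ram Srivastava"))
shared = sorted(query_grams & other_grams)
print(shared)
```

With grams as short as one character, almost any two names share some tokens, which could plausibly blur the ranking if this analyzer is in effect.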
res = es.search(
    index="article-index",
    fields="url",
    body={
        "query": {
            "query_string": {
                "query": keywordstr,
                "fields": ["text", "title", "tags", "domain"]
            }
        }
    }
)
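One possible factor is that `query_string` combines terms with OR by default, so "Raja Ram" can match any document containing either word. Below is a minimal sketch of two alternative request bodies, assuming that is the cause. The helper names `build_and_query` and `build_phrase_query` are hypothetical, the field list is copied from the search call above, and the code only builds the body; it has not been run against this index.

```python
FIELDS = ["text", "title", "tags", "domain"]


def build_and_query(keywordstr, fields=FIELDS):
    """Build a query_string body that requires every term to match."""
    return {
        "query": {
            "query_string": {
                "query": keywordstr,
                "fields": list(fields),
                # The default operator is OR; AND requires all terms
                "default_operator": "AND",
            }
        }
    }


def build_phrase_query(keywordstr, fields=FIELDS):
    """Build a multi_match body that matches the terms as an exact phrase."""
    return {
        "query": {
            "multi_match": {
                "query": keywordstr,
                "type": "phrase",
                "fields": list(fields),
            }
        }
    }


body = build_phrase_query("Raja Ram")
```

Either body could then be passed as `body=` to `es.search` in place of the original one. The phrase variant is stricter, since it also requires the words to appear next to each other and in order.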