I am fairly new to the world of auto-suggestion. My goal is to return the top 'N' address suggestions (output) for a partial address (input), the way Google Maps or the Uber app does when you type in a partial address.
I have explored a few technologies, such as Elasticsearch's Completion Suggester and Apache Solr's Suggester component.
I have come up with multiple combinations of queries and index designs to get the best string match using the available geospatial information, such as geocode (latitude, longitude), city, or state (the unit varies from country to country, e.g. prefecture in Japan).
[Side Question-1: Which is better for this use case, Apache Solr or Elasticsearch?]
Assume that there is a standard address data store (holding around 100 million addresses) from which suggestions (output) are drawn, and a set of partial addresses (input, around 100 K of them). Also assume that I know the complete address for each of those 100 K partial addresses; in other words, I know the intended completion of every partial address.
Now I want to run experiments and evaluate each combination based on the relevance of the suggested addresses.
Here is my current understanding of relevance measurement:
for each keystroke count, take the match percentage (using the Levenshtein distance algorithm) between the suggested address and the partial address, multiplied by 1/position, where position is the suggestion's rank in the suggestion list.
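To make the formula above concrete, here is a minimal Python sketch of how I read it. The function names (`levenshtein`, `match_percentage`, `rank_weighted_match`) are my own, and normalising the edit distance by the longer string's length is an assumption on my part, not something taken from a library.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def match_percentage(a: str, b: str) -> float:
    """Similarity in [0, 1]; assumes normalisation by the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def rank_weighted_match(partial: str, suggestion: str, position: int) -> float:
    """Match percentage multiplied by 1/position (position is 1-based)."""
    return match_percentage(partial, suggestion) / position
```

The keystroke count is simply `len(partial)`, so the scores could be plotted against it for each query/index combination.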
I want to derive the quality of my suggestions mathematically. Please evaluate the measurement formula above (it may be completely wrong; if so, please explain why).
[Question-2] How should relevance be measured in this use case?
I have also read a couple of articles on measuring the quality of a recommendation engine/system, which discussed Mean Average Precision, Mean Absolute Error, Mean Squared Error, and Root Mean Squared Error.
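For reference, this is how I would expect Mean Average Precision to translate to my setup. Since each partial address has exactly one intended completion, average precision for a single query reduces to 1/position of that completion (or 0 if it is not in the top N). The sketch below assumes exact string equality between a suggestion and the intended address, and the names `reciprocal_rank` and `mean_average_precision` are hypothetical, not from any library.

```python
from typing import Sequence, Tuple


def reciprocal_rank(suggestions: Sequence[str], intended: str, n: int = 5) -> float:
    """1/position of the intended address within the top n, else 0."""
    for position, suggestion in enumerate(suggestions[:n], start=1):
        if suggestion == intended:  # assumption: exact match counts as relevant
            return 1.0 / position
    return 0.0


def mean_average_precision(runs: Sequence[Tuple[Sequence[str], str]], n: int = 5) -> float:
    """Mean over all (suggestions, intended) pairs; one relevant item per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(s, intended, n) for s, intended in runs) / len(runs)
```

Each element of `runs` would be the suggestion list returned for one partial address together with its known complete address.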
[Question-3] Are these strategies applicable to measuring the relevance of an address suggestion application?