
I'm writing a service that should sensibly suggest UK place names based on user-entered text. My data set has just under 2,500 entries. So far I'm applying a slightly modified version of the Damerau-Levenshtein algorithm that ignores the edit distance when comparing against longer strings.
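A minimal sketch of the kind of matching described above. It assumes the modification means that trailing characters of a longer candidate cost nothing, so the input is compared against the best-matching prefix of each place name. The function name `prefix_osa_distance` is illustrative, and it uses the restricted (optimal string alignment) form of Damerau-Levenshtein rather than the exact code in the service.

```python
def prefix_osa_distance(query: str, candidate: str) -> int:
    """Edit distance between `query` and the closest prefix of `candidate`.

    Uses the restricted Damerau-Levenshtein (optimal string alignment)
    recurrence: insert, delete, substitute, and adjacent transposition.
    Assumption: extra trailing characters in `candidate` are free.
    """
    q, c = query.lower(), candidate.lower()
    rows, cols = len(q) + 1, len(c) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of the query prefix
    for j in range(cols):
        d[0][j] = j  # insert every character of the candidate prefix
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if q[i - 1] == c[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # Adjacent transposition, e.g. "nwe" -> "new".
            if i > 1 and j > 1 and q[i - 1] == c[j - 2] and q[i - 2] == c[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    # The last row holds the distance from the whole query to every
    # prefix of the candidate; take the cheapest one.
    return min(d[len(q)])


for place in ("New Mills", "Newcastle"):
    print(place, prefix_osa_distance("new", place))
```

Under this assumption both New Mills and Newcastle score 0 for the input "new", so the distance alone cannot decide which should come first.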

This gives me a reasonable set of suggestions, but I'd like to manually weight some terms. For example, entering "new" currently gives New Mills as the top result.

I'd like to weight these results so that major cities appear above towns and villages. For example, entering "new" should give Newcastle as the top result.
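To make the desired ordering concrete, here is a rough sketch that treats the weight purely as a tie-break between equally close matches. The `Place` type, the `weight` values, and the distances are all hypothetical; this is only one possible way to express the behaviour, not a settled design.

```python
from typing import NamedTuple


class Place(NamedTuple):
    name: str
    weight: int  # hypothetical manual importance: higher means more prominent


def rank(matches: list[tuple[Place, int]]) -> list[str]:
    """Order (place, distance) pairs: closest first, then heaviest, then by name."""
    ordered = sorted(matches, key=lambda m: (m[1], -m[0].weight, m[0].name))
    return [place.name for place, _distance in ordered]


# Illustrative data only: distances as a fuzzy matcher might report them
# for the input "new", with made-up weights.
matches = [
    (Place("New Mills", 1), 0),
    (Place("Newcastle", 3), 0),
    (Place("Newark", 2), 0),
    (Place("Nelson", 1), 1),
]
print(rank(matches))
```

With these made-up numbers Newcastle comes first and New Mills third. Whether the weight should only break ties or should also be allowed to outrank a slightly closer match is part of what I'm unsure about.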

Can anyone suggest either a different search algorithm, or a separate weighting process I can apply to my results to achieve the weighted results I'm after?

ScouseChris

1 Answer


Levenshtein distance is better suited to catching typos. What you want is NLP: search for "NLP address", or see "Detect/Parse Mailing Addresses in Text".

Jim W