I'm trying to build a query-parsing algorithm for a local search site that classifies a free-text search query (from a single input text box) into the various types of searches the site supports.

For example, the user could type chinese restaurants near xyz. How should I go about breaking it down into cuisine:"chinese", locality:"xyz", given that:

- there could be spelling mistakes
- keywords may match in different columns e.g. a restaurant may have "chinese" in its name

This is not really a natural language parsing problem, since we're searching within a very limited set of possibilities.

My initial thought is to dump all values of a particular type from the database into a field, and match the user's query against all of those fields. Then, based on the score (and a predefined confidence level), divide the query into the 3-4 search fields such as name/cuisine/locality.

Is there a better/standard way of doing this?
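
A minimal sketch of that idea, assuming small in-memory value lists stand in for the database dump. The field names, sample values, stopword list, and the 0.8 confidence threshold are all made up for illustration, and `difflib` similarity is just one possible scoring function:

```python
from difflib import SequenceMatcher

# Hypothetical per-field vocabularies; in practice these would be dumped from the database.
FIELD_VALUES = {
    "cuisine": ["chinese", "italian", "mexican"],
    "locality": ["xyz", "downtown", "riverside"],
    "name": ["chinese garden", "luigi's"],
}
# Hypothetical filler words that carry no field information.
STOPWORDS = {"restaurant", "restaurants", "near", "in", "at"}
CONFIDENCE = 0.8  # assumed threshold; would need tuning on real queries


def best_match(token, values):
    """Return (score, value) for the known value most similar to token."""
    return max((SequenceMatcher(None, token, v).ratio(), v) for v in values)


def classify(query):
    """Assign each query token to the field whose values it matches best."""
    result = {}
    for token in query.lower().split():
        if token in STOPWORDS:
            continue
        candidates = [
            (*best_match(token, values), field)
            for field, values in FIELD_VALUES.items()
        ]
        score, value, field = max(candidates)
        if score >= CONFIDENCE:
            # Store the canonical value, so a misspelled token is corrected too.
            result.setdefault(field, []).append(value)
    return result


print(classify("chinese restaurants near xyz"))
```

Here "chinese" also partially matches the restaurant name "chinese garden", but the exact cuisine match scores higher, so it lands in the cuisine field. A token that scores closely in two fields would need extra handling, for example keeping both candidates.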

Gunjan

1 Answer

For spelling mistakes, you have to work with a dictionary/thesaurus. This can be part of your pre-processing and normalization.

For querying multiple columns, you can do: cuisine:chinese OR restaurant_name:chinese
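
As a rough illustration of that pre-processing step, here is a sketch that maps each query token to the closest known term using Python's standard `difflib.get_close_matches`. The vocabulary, the synonym table, and the 0.8 cutoff are assumptions made up for the example; a real site would build them from its own data:

```python
from difflib import get_close_matches

# Hypothetical dictionary of known terms.
VOCABULARY = ["chinese", "italian", "mexican", "restaurant", "restaurants", "near"]
# Hypothetical thesaurus mapping synonyms to a canonical term.
SYNONYMS = {"eatery": "restaurant"}


def normalize(query, cutoff=0.8):
    """Lowercase the query, apply synonyms, and fix near-miss spellings."""
    tokens = []
    for token in query.lower().split():
        token = SYNONYMS.get(token, token)
        matches = get_close_matches(token, VOCABULARY, n=1, cutoff=cutoff)
        # Unknown tokens (e.g. a locality not in the dictionary) pass through unchanged.
        tokens.append(matches[0] if matches else token)
    return " ".join(tokens)


print(normalize("Chinse restarants near xyz"))
```

The cutoff trades recall for safety: a lower value corrects more typos but also risks rewriting a valid word into a different one.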

You can boost one of the two: cuisine:chinese^0.8 OR restaurant_name:chinese
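
The query strings above can be generated from a term and a per-field boost table. This sketch assumes Lucene-style syntax (`field:term^boost`), where a clause without an explicit boost counts as 1, so `^0.8` weights the cuisine match slightly below the name match. The helper `build_query` is hypothetical, not part of any library:

```python
def build_query(term, boosts):
    """Build an OR query over several fields, with an optional boost per field."""
    # Multi-word terms are quoted so they are treated as a phrase.
    if " " in term:
        term = f'"{term}"'
    clauses = []
    for field, boost in boosts.items():
        clause = f"{field}:{term}"
        if boost != 1.0:
            clause += f"^{boost}"  # omit the boost when it equals the default
        clauses.append(clause)
    return " OR ".join(clauses)


print(build_query("chinese", {"cuisine": 0.8, "restaurant_name": 1.0}))
```

Special characters in the term are not escaped here; user input would need that before being sent to a real search engine.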

Bob Yoplait