Free text (natural language) query parsing with Solr

Question

I'm trying to build a query parsing algorithm for a local search site that can classify a free text search query (from a single input text box) into the various types of searches the site supports.

For example, the user could type chinese restaurants near xyz. How should I go about breaking it down to Cuisine:"chinese", locality:"xyz", given that

- there could be spelling mistakes
- keywords may match in different columns e.g. a restaurant may have "chinese" in its name

This is not really a natural language parsing problem, since we're trying to search within a very limited set of possibilities.

My initial thought is to dump all values of a particular type from the database into a field, and match the user's query against each of those fields. Then, based on the score (and a predefined confidence level), divide the query into the 3-4 search fields such as name/cuisine/locality.
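The idea above can be sketched roughly as follows. This is only an illustration in plain Python rather than Solr: the value lists, the stopword set and the 0.8 threshold are all made-up assumptions, and the standard library's difflib similarity ratio stands in for a real search score.

```python
from difflib import SequenceMatcher

# Hypothetical value lists; in practice these would be dumped from the database.
FIELD_VALUES = {
    "cuisine": ["chinese", "italian", "mexican"],
    "locality": ["xyz", "downtown", "riverside"],
    "name": ["chinese garden", "luigi's"],
}

# Assumed filler words that carry no field information.
STOPWORDS = {"restaurant", "restaurants", "near", "in", "at"}

# Assumed confidence level; would need tuning against real queries.
CONFIDENCE = 0.8


def best_match(token, values):
    """Return (score, value) for the known value most similar to token."""
    return max((SequenceMatcher(None, token, v).ratio(), v) for v in values)


def classify(query):
    """Assign each non-stopword token to the field whose values match it best."""
    result = {}
    for token in query.lower().split():
        if token in STOPWORDS:
            continue
        candidates = []
        for field, values in FIELD_VALUES.items():
            score, value = best_match(token, values)
            candidates.append((score, field, value))
        score, field, value = max(candidates)
        # Keep only matches above the confidence level; the fuzzy ratio
        # tolerates small spelling mistakes such as "chinees".
        if score >= CONFIDENCE:
            result.setdefault(field, []).append(value)
    return result


print(classify("chinese restaurants near xyz"))
# {'cuisine': ['chinese'], 'locality': ['xyz']}
```

Here "chinese" goes to cuisine rather than name because the exact cuisine value scores higher than the partial match on "chinese garden". This sketch works token by token, so multi-word values would need extra handling.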

Is there a better or more standard way of doing this?

score -1 · Answer 1 · answered Feb 17 '11 at 18:38

For spelling mistakes, you have to work with a dictionary/thesaurus of known terms. Correcting against it can be part of your pre-processing and normalization.

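To query multiple columns, you can combine field clauses with OR: cuisine:chinese OR restaurant_name:chinese

You can also weight one of the two with a boost: cuisine:chinese^0.8 OR restaurant_name:chinese. The default boost is 1, so 0.8 ranks cuisine matches slightly below name matches; use a value above 1 to favor cuisine instead.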
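If the query string is assembled in application code, a small helper along these lines may be useful. It is a sketch only: build_query and its arguments are hypothetical names, and it produces the query text without sending anything to Solr.

```python
def build_query(term, field_boosts):
    """Build an OR query over several fields, adding ^boost where it is not 1."""
    # Escape backslashes and double quotes so the term stays inside the phrase.
    safe = term.replace("\\", "\\\\").replace('"', '\\"')
    clauses = []
    for field, boost in field_boosts.items():
        clause = f'{field}:"{safe}"'
        if boost != 1.0:
            clause += f"^{boost}"
        clauses.append(clause)
    return " OR ".join(clauses)


print(build_query("chinese", {"cuisine": 0.8, "restaurant_name": 1.0}))
# cuisine:"chinese"^0.8 OR restaurant_name:"chinese"
```

The field names and boost values are taken from the example above; which field deserves the higher weight depends on your data.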
