
The Solr syntax for fuzzy search is:

q~n where q is the query term and n is the Levenshtein Distance (e.g. 1-3).

The syntax for prefix search is:

q* where q is a query term and the * indicates a wildcard.

Combining both as q~n* (even with n=1) has the side effect that nearly everything matches (for a reason I still need to find out).

Combining both as q*~n (even with n=1) has the side effect that the query behaves as if it were a prefix search only.

In our use case we need to offer suggestions based on historical queries stored in the index. That also seems to be what Google does when you type in a misspelled term, and it is a great solution for suggestions. The problem is that we can either offer suggestions which start with the same prefix, or suggestions within a defined Levenshtein distance <= 3, which is impractical when it comes to long terms.
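To make the desired behaviour concrete, here is a minimal Python sketch of "fuzzy prefix" matching done outside Solr: a stored query counts as a suggestion if some prefix of it is within a given edit distance of what the user typed. The function names `prefix_edit_distance` and `suggest` and the sample history are made up for illustration; this is not Solr syntax and not how Solr implements fuzzy queries.

```python
def prefix_edit_distance(query: str, term: str) -> int:
    """Smallest Levenshtein distance between `query` and any prefix of `term`."""
    # prev[j] holds the distance between the processed part of `query` and term[:j].
    prev = list(range(len(term) + 1))
    for i, qc in enumerate(query, start=1):
        cur = [i]
        for j, tc in enumerate(term, start=1):
            cost = 0 if qc == tc else 1
            cur.append(min(prev[j] + 1,          # drop a character of query
                           cur[j - 1] + 1,       # insert a character of term
                           prev[j - 1] + cost))  # match or substitute
        prev = cur
    # The best prefix of `term` is the cheapest entry in the final row.
    return min(prev)


def suggest(query: str, history: list[str], max_distance: int = 1) -> list[str]:
    """Return stored queries that have a prefix within `max_distance` edits of `query`."""
    return [t for t in history if prefix_edit_distance(query, t) <= max_distance]


# Hypothetical query history, for illustration only.
history = ["samsung spinpoint", "samson cable", "sony vaio"]
print(suggest("samsn", history, max_distance=1))
```

Because only a prefix has to match, the allowed distance can stay small even when the stored terms are long, which is the limitation described above for plain q~n.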

Now, I know that a similar question was asked 3 years ago, where the answer says it isn't possible to express this in Solr syntax and that the whole case does not make any particular sense. In my opinion it does make sense, and a combination would be a perfect solution to practical problems.

Macilias

2 Answers


Not a tested solution, but did you think of using q* OR q~1? For example: name:S* OR name:S~1

Larger example: name:Samson~3 OR name:Samson* returned: <str name="name">Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133</str></doc>
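As a rough sketch, the OR combination above could be assembled programmatically like this. The helper name `prefix_or_fuzzy` is hypothetical, and the sketch assumes the term is a single token without characters that need escaping in Solr query syntax.

```python
def prefix_or_fuzzy(field: str, term: str, distance: int = 1) -> str:
    """Build a query that matches either the prefix or the fuzzy form of `term`."""
    # Assumption: `term` is one token and contains no Solr special characters.
    return f"{field}:{term}* OR {field}:{term}~{distance}"


print(prefix_or_fuzzy("name", "Samson", 3))
```

Note that this is a union of the two result sets: a document matches if it satisfies the prefix clause or the fuzzy clause, not a fuzzy match on the prefix itself.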

Arun
  • Yes, I did indeed, and it's nice that you mention it here as well, but although it is a step in the right direction, it is not a solution. – Macilias Jan 21 '14 at 13:50

I have not tried this specifically, but it looks like you might be able to do what you want with the ComplexPhraseQueryParser.

It looks like the ComplexPhraseQueryParser is slated to be distributed with Solr 4.8, but for now you can get the plugin (there are install instructions in the zip files) from Solr's Jira: https://issues.apache.org/jira/browse/SOLR-1604

There is some discussion using distance here. http://lucene.472066.n3.nabble.com/ComplexPhraseQueryParser-and-wildcards-td2742244.html

I would expect with the ComplexPhraseQueryParser you could do a query like "q*"~n.
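A small, untested sketch of how such a query string might be put together in Python. It assumes a Solr version where the parser is available under the local-params name `complexphrase`; the helper name `complexphrase_query` is made up for illustration, and whether the `~n` in this position gives the fuzzy-prefix matching asked for is not verified here.

```python
def complexphrase_query(field: str, term: str, n: int) -> str:
    """Build a query of the form "q*"~n routed to the complex phrase parser."""
    # Assumption: the parser is registered as `complexphrase` on the Solr instance.
    return f'{{!complexphrase}}{field}:"{term}*"~{n}'


print(complexphrase_query("name", "Samson", 1))
```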

a_hardin