I am trying to search for a term that contains spaces and ends with a wildcard, e.g. name:John S*. Solr fails to return any results, although the entries below are indexed and are returned when querying *:* from the Solr web interface:

  • John Dow
  • Johny English
  • John Smith

I am using Solr 7.4 with DIH to index my DB, and I am adding a contact search (by name and phone) feature to my web app.

I have followed the thread "Solr wildcard query with whitespace", but it did not solve the problem:

  1. I have tried changing the field type of the name field to text_en, text_ws, and currently text_general. I also tried escaping spaces with a backslash ("\ "). Neither worked.
  2. I tried Solr's "complex phrase query parser", which only partially solved the issue, because it greatly increases the query time. In addition, Solarium throws an exception if the term ends with a space, e.g. "John\ *", and running the same query from the Solr web interface returns no results: http://localhost:8983/solr/collection/select?q{!complexphrase inOrder=true}displayName:John\ *
  3. I also tried the Prefix Query Parser, with no luck.

Note: I reloaded Solr, cleared the index, and re-indexed my data after each attempt.

Expected result:

  • when searching for "John" I should get all the 3 entries:

    • John Dow
    • Johny English
    • John Smith
  • when searching for "John\ " that will be parsed to "John "; I should get:

    • John Dow
    • John Smith
  • and when searching for "John\ S*", I should be getting:

    • John Smith

Update #1

search.php

...
    $term = str_replace(' ', '\ ', $request_params['term']);
    $query->setQuery('phone:"%1%" OR name:"%1%" OR contact:%2%*', [$request_params['term'], $term]);
    // $query->setQuery('phone:"%1%" OR name:"%1%" OR contact:"%2%*"', [$request_params['term'], $term]);
...

managed-schema

...
  <fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
...
  <field name="name" type="text_general" multiValued="false" indexed="true" stored="true"/>
  <field name="contact" type="lowercase" indexed="true" stored="true"/>
  <field name="phone" type="string" docValues="false" multiValued="false" indexed="true" required="true" stored="true"/>
  <copyField source="displayName" dest="card"/>
  <!-- <copyField source="phone" dest="card"/> -->
...
Geo Salameh

1 Answer

Use a second field for wildcard matches whose analyzer is a KeywordTokenizer with a LowerCaseFilterFactory attached. Use a copyField instruction to copy the content from the main field into the second, wildcard-based field.

That way you can perform regular searches against the regular field, while performing wildcard searches against the field that supports wildcards properly.
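
As a rough sketch of that setup, assuming the `name` field and the `lowercase` field type from the question's managed-schema: the field name `name_wildcard` is hypothetical and chosen only for illustration, and the attribute values are one reasonable choice rather than a requirement.

```xml
<!-- Field type from the question: the whole value is kept as one lowercased token,
     so "John Smith" is indexed as the single term "john smith". -->
<fieldType name="lowercase" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Regular, tokenized field for ordinary searches. -->
<field name="name" type="text_general" multiValued="false" indexed="true" stored="true"/>

<!-- Hypothetical second field used only for wildcard/prefix matching. -->
<field name="name_wildcard" type="lowercase" multiValued="false" indexed="true" stored="false"/>

<!-- Copy the raw name into the wildcard field at index time. -->
<copyField source="name" dest="name_wildcard"/>
```

With a schema along these lines, and after a full re-index, a query such as `name_wildcard:John\ S*` in the Solr admin query screen should be treated as a prefix match against the single term `john smith`, while `name:John` continues to use the tokenized field.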

Your second example above (John\ *) will in effect probably match only documents that contain the token John. You're also missing an = between the parameter name q and its value.

MatsLindh
  • Thank you Mats, but this still did not solve my issue. I have updated my question with code snippets from my files (I am using the Solarium lib). With the code provided above, `John a` did not return any results when using either `%2%*` or `"%2%*"`. – Geo Salameh Jan 22 '19 at 12:30
  • Wouldn't your first call to `str_replace` make Solarium attempt to escape the `\\` when using the placeholders syntax as well? Make sure you look at the Solr log for what Solarium is sending to the server, and use the actual query screen under Solr admin for getting your query syntax to work before handing it over to Solarium - so that you can properly know where the problem lies. – MatsLindh Jan 22 '19 at 18:06
  • Actually, I was using the Solr admin query screen to debug, but I still did not get any error. In the end, I used a dirty trick: replacing spaces with a "+" sign on import. So when I need to search for `John a*`, I use `John\+a*` in Solr admin, and on the code side I use str_replace to replace the spaces in the term with "+" before sending it to Solr through Solarium. – Geo Salameh Jan 29 '19 at 08:58