
I am new to Solr 6.0 and Solarium integration. I have the setup running, but results are not returned when a field does not exactly match the query. For example, I have a url field containing 'http://ayodeji.com' or 'http://ayo-tuntun.com', but a query for 'ayo' does not return these rows, although they are returned by *:* queries in the Solr admin section. I have changed string to text in the managed-schema file, but it still won't work. Please help. Below is the code from the Solarium DisMax example that I am using. Thank you.

    $client = new Solarium\Client($config);

    $query = $client->createSelect();

    $dismax = $query->getDisMax();
    $dismax->setQueryFields('url^5 author^3 body^1 title');

    $searchTerm = 'ayo';
    $query->setQuery($searchTerm);

    $resultset = $client->select($query);

    echo 'NumFound: '.$resultset->getNumFound();

    foreach ($resultset as $document) {
        echo '<hr/><table>';

        // the documents are also iterable, to get all fields
        foreach ($document as $field => $value) {
            // this converts multivalue fields to a comma-separated string
            if (is_array($value)) {
                $value = implode(', ', $value);
            }
            echo '<tr><th>' . $field . '</th><td>' . $value . '</td></tr>';
        }
        echo '</table>';
    }
Deji Epe

2 Answers


You need to use WordDelimiterFilter to split the URL into smaller parts.

https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" catenateWords="1" types="wdfftypes.txt"
                generateNumberParts="1" catenateNumbers="1" splitOnNumerics="1"
                catenateAll="1" splitOnCaseChange="1"
                stemEnglishPossessive="0" preserveOriginal="0" />
    </analyzer>

I have attached an image of tested results.

[Image: Solr analysis tool output for the tested field]

On the left side of the analysis tool you can see that the keyword ayo has been matched.
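To make the splitting concrete, here is a rough Python approximation of what the filter does to a whitespace-delimited token. It is only a sketch under assumptions: `word_parts` and `term_matches` are hypothetical helper names, the default character classes are assumed (the contents of wdfftypes.txt are not shown, so they are ignored), only ASCII letters and digits are handled, and the catenate* options are left out.

```python
import re


def word_parts(token: str) -> list[str]:
    """Approximate generateWordParts/generateNumberParts with splitOnNumerics.

    The token is lowercased first (as LowerCaseFilter would do), then split
    into runs of ASCII letters and runs of digits; anything else counts as
    a delimiter and is dropped.
    """
    return re.findall(r"[a-z]+|[0-9]+", token.lower())


def term_matches(term: str, field_value: str) -> bool:
    """Return True if the term equals one of the sub-tokens of the field value."""
    tokens = [part for tok in field_value.split() for part in word_parts(tok)]
    return term.lower() in tokens


print(word_parts("http://ayo-tuntun.com"))           # ['http', 'ayo', 'tuntun', 'com']
print(term_matches("ayo", "http://ayo-tuntun.com"))  # True
print(term_matches("ayo", "freeayo"))                # False: no delimiter to split on
```

The last call shows the limit of this approach: splitting only happens at delimiters, so a term buried inside a longer run of letters never becomes a token of its own.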

My example of a text_general fieldType:

    <fieldType name="text_general" class="solr.TextField" omitNorms="false" positionIncrementGap="100" multiValued="true">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SynonymFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" catenateWords="1" types="wdfftypes.txt"
                generateNumberParts="1" catenateNumbers="1" splitOnNumerics="1"
                catenateAll="1" splitOnCaseChange="1"
                stemEnglishPossessive="0" preserveOriginal="0"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" catenateWords="1" types="wdfftypes.txt"
                generateNumberParts="1" catenateNumbers="1" splitOnNumerics="1"
                catenateAll="1" splitOnCaseChange="1"
                stemEnglishPossessive="0" preserveOriginal="0"/>
      </analyzer>
    </fieldType>
Oyeme
  • Pardon my late reply. I was making the adjustments but was getting errors; it turns out the cause was `wdfftypes.txt`, so I created the file and placed it in the same location as managed-schema, and the queries now work partially. Cases like the two examples, where the word starts with `ayo` or has a hyphen in between, work flawlessly. But where the term is at the end of the whole word, e.g. `freeayo` or `needayo`, nothing is returned. I will read the doc link you provided. Accept my appreciation. – Deji Epe Apr 12 '16 at 16:43

Solr does not search for substrings. That is, it is normal behaviour that a search for “ello” does not find a document containing “helloworld”. If you want that, you should use *ello* as the search string.
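The difference between the two kinds of lookup can be illustrated with a small Python sketch. It is not Solr code: `term_match` and `wildcard_match` are hypothetical helpers, and shell-style patterns from `fnmatch` stand in for Solr's wildcard syntax. Also note that, as far as I know, the DisMax parser used in the question does not support wildcard characters in `q`, so a pattern such as `*ello*` would need the standard or eDisMax parser.

```python
from fnmatch import fnmatchcase


def term_match(query: str, indexed_terms: set[str]) -> bool:
    """A plain term query only matches a whole indexed term."""
    return query in indexed_terms


def wildcard_match(pattern: str, indexed_terms: set[str]) -> bool:
    """A wildcard query is compared against every indexed term as a pattern."""
    return any(fnmatchcase(term, pattern) for term in indexed_terms)


indexed_terms = {"helloworld"}

print(term_match("ello", indexed_terms))        # False: "ello" is not a whole term
print(wildcard_match("*ello*", indexed_terms))  # True: the pattern covers "helloworld"
```

Because the wildcard form has to be checked against many terms, it is usually slower than a plain term lookup on a large index.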

BlueM