2

I am trying to get search results for multiple terms. For example, if fulltext="Lee jeans", then regexResult={"lee","jeans"}.

Code:

IProviderSearchContext searchContext = index.CreateSearchContext();
IQueryable<SearchItem> scQuery = searchContext.GetQueryable<SearchItem>();
var predicate = PredicateBuilder.True<SearchItem>();

// checking if the fulltext includes terms within " "
var regexResult = SearchRegexHelper.getSearchRegexResult(fulltext);

regexResult.Remove(" ");

foreach (string term in regexResult)
{
    predicate = predicate.Or(p => p.TextContent.Contains(term));
}
scQuery = scQuery.Where(predicate);

IEnumerable<SearchHit<SearchItem>> results = scQuery.GetResults().Hits;

results = sortResult(results);

Sorting is based on sitecore fields:

switch (query.Sort)
{
    case SearchQuerySort.Date:
        results = results.OrderBy(x => GetValue(x.Document, FieldNames.StartDate));
        break;
    case SearchQuerySort.Alphabetically:
        results = results.OrderBy(x => GetValue(x.Document, FieldNames.Profile));
        break;
    case SearchQuerySort.Default:
    default:
        results = results.OrderByDescending(x => GetValue(x.Document, FieldNames.Updated));
        break;
}

What I need is to get the results for "lee" first and sort them, then get the results for "jeans" and sort them. The final search result should be the sorted items for "lee" followed by the sorted items for "jeans".

So the results have to be fetched for "lee" first and then for "jeans".

Is there a way to get results term by term?

  • 1
    I don't quite get the use case. You say you want "lee" hits first followed by "jeans" hits. But if you sort by say alphabetic then the entire resultset should be sorted alphabetically no? Can you provide some sample input with expected output? – Christian Hagelid Jul 28 '15 at 15:05
  • @Christian The current code does what you said, but that is not what I need. Consider the default sort, i.e. latest updated item. Expected case: if regexResult has {"Jack","Daniels"}, then we should first get the latest updated items for "Jack", and second get the latest updated items for "Daniels". The results will be the first set followed by the second. – Prathamesh dhanawade Jul 28 '15 at 16:17
  • I don't agree with your logic. If you type "Jack Daniels" in Google then you would expect it to match everything with *Jack AND Daniels* as most relevant; sorting by anything else should ignore everything else and just sort by that criterion. But anyway, I've updated my answer with a possible solution... – jammykam Jul 28 '15 at 16:43
  • @Jammy I couldn't agree more with you, so we can add "Jack Daniels" as the first term (most relevant) and leave the others be. – Prathamesh dhanawade Jul 28 '15 at 17:22

1 Answer

6

You can use query-time boosting to give the terms more relevance and therefore affect the ranking.

Give the first term the highest boost, then gradually reduce it for each additional term:

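var regexResult = SearchRegexHelper.getSearchRegexResult(fulltext);
regexResult.Remove(" ");
float boost = regexResult.Count();

foreach (string term in regexResult)
{
    // copy to a local first: an expression tree cannot contain the -- operator
    float termBoost = boost--;
    // Boost() is applied to the condition inside the lambda, not to the combined predicate
    predicate = predicate.Or(p => p.TextContent.Contains(term).Boost(termBoost));
}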
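To see why descending boosts roughly produce the requested order, here is a toy in-memory model. It is only a sketch: `BoostSketch`, `DescendingBoosts`, `Score` and `Rank` are hypothetical names, and the score (the sum of the weights of the matched terms) is a simplification, not the formula Lucene or Solr actually use.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BoostSketch
{
    // Assigns descending weights: the first term gets terms.Count, the last gets 1.
    public static Dictionary<string, float> DescendingBoosts(IList<string> terms)
    {
        var boosts = new Dictionary<string, float>(StringComparer.OrdinalIgnoreCase);
        float boost = terms.Count;
        foreach (var term in terms)
        {
            // Keep the higher weight if a term is repeated.
            if (!boosts.ContainsKey(term))
            {
                boosts[term] = boost;
            }
            boost--;
        }
        return boosts;
    }

    // Toy score (an assumption, not the engine's formula):
    // the sum of the weights of every term found in the text.
    public static float Score(string text, Dictionary<string, float> boosts)
    {
        return boosts
            .Where(b => text.IndexOf(b.Key, StringComparison.OrdinalIgnoreCase) >= 0)
            .Sum(b => b.Value);
    }

    // Ranks texts by toy score, highest first, dropping texts that match no term.
    public static List<string> Rank(IEnumerable<string> texts, IList<string> terms)
    {
        var boosts = DescendingBoosts(terms);
        return texts
            .Select(t => new { Text = t, Score = Score(t, boosts) })
            .Where(x => x.Score > 0)
            .OrderByDescending(x => x.Score)
            .Select(x => x.Text)
            .ToList();
    }
}
```

In this model, calling `BoostSketch.Rank` with the terms "lee" and "jeans" puts texts containing both terms first, then texts containing only "lee", then texts containing only "jeans". A real index also factors in term frequency and field length, so the tiers are not guaranteed to be this clean.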

EDIT: Boosting and sorting in the same query is not possible; at the very least, the sorting will undo the relevance-based ordering that the boosting produced.

An alternative is to run one search per term and concatenate the results into a single list. This is less efficient, since you are essentially making multiple searches:

IProviderSearchContext searchContext = index.CreateSearchContext();
var items = new List<SearchResultItem>();

var regexResult = SearchRegexHelper.getSearchRegexResult(fulltext);

regexResult.Remove(" ");

foreach (string term in regexResult)
{
    var results = searchContext.GetQueryable<SearchResultItem>()
                               .Where(p => p.Content.Contains(term));
    // LINQ ordering returns a new sequence instead of sorting in place, so keep the return value
    var sorted = SortSearchResults(results);

    items.AddRange(sorted);
}

NOTE: The above does not take into account duplicates between the result sets.
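One way to handle those duplicates is to remember which items have already been added and skip them when they reappear for a later term. The sketch below shows the idea on plain in-memory objects rather than a Sitecore index; `Doc`, `TermByTermSearch` and `Search` are hypothetical names, and the per-term sort is hard-coded to "latest updated first" purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an indexed search result item.
public class Doc
{
    public string Id { get; set; }
    public string Content { get; set; }
    public DateTime Updated { get; set; }
}

public static class TermByTermSearch
{
    // Returns the matches for each term in turn, each group sorted by
    // latest update first, skipping items already returned for an earlier term.
    public static List<Doc> Search(IEnumerable<Doc> docs, IEnumerable<string> terms)
    {
        var seen = new HashSet<string>();
        var items = new List<Doc>();
        var source = docs.ToList();

        foreach (var term in terms)
        {
            var sorted = source
                .Where(d => d.Content.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0)
                .OrderByDescending(d => d.Updated);

            foreach (var doc in sorted)
            {
                // HashSet.Add returns false when the id was already present.
                if (seen.Add(doc.Id))
                {
                    items.Add(doc);
                }
            }
        }

        return items;
    }
}
```

An item that matches both "lee" and "jeans" then appears only once, in the "lee" group. Against a real index the same `HashSet` check could be applied to each result's item id after every per-term query.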

jammykam
  • I am mainly concerned with getting all results for "lee" and sorting them before getting results for "jeans". – Prathamesh dhanawade Jul 28 '15 at 11:30
  • @Prathameshdhanawade What are you sorting your results by? Pls post up the code for the sort method. – jammykam Jul 28 '15 at 13:30
  • 2
    "Boosting and sorting in the same query is not possible", actually it's possible if you are using solr (not sure about lucene). You can write the sort query like: dataQuery.OrderByDescending(i => i["score"]).ThenByDescending(i => i.Title); – Ehab ElGindy Jul 31 '15 at 09:45
  • @EhabElGindy That's good to know and makes sense. I guess it would be sensible to increase your boost factor to skew the results more so there is less chance of crossover. – jammykam Jul 31 '15 at 10:25
  • I would definitely shy away from the multiple queries + concat results method because that will start to create performance issues for you. The boosting is a nice approach. Also note that when you search with an OR predicate, results with multiple matches for the terms already score higher in SOLR. Something with "Lee" and "Jeans" will be given a higher score by SOLR than something with only "Lee" or "Jeans". – Daved Sep 21 '15 at 13:50