I want to know whether the Retrieve & Rank service, and especially the ranking step, supports searching by proximity.

Example:

The Ranker was trained with:

a. Query = "I have a problem with my mailbox"

b. Documents with relevance scores: "Doc1": 3, "Doc2": 4, "Doc3": 1

So we can imagine that when I use the Retrieve service only, the result of the query is:

1. Doc1 
2. Doc2
3. Doc3

And when I use the Ranker to re-order the previous result, we get:

1. Doc2 
2. Doc1
3. Doc3

So far, everything is OK.

Now I want to run a new (and similar) query through the Ranker: "I encountered a problem with my mailbox"

The question is:

  1. Will the Ranker match my new query with the query it learned previously? In that case the result will be:

     1. Doc2 
     2. Doc1
     3. Doc3
    
  2. Or will the Ranker fail to match my new query with the query it learned previously, so that the result is simply the one returned by the Retrieve service:

     1. Doc1
     2. Doc2
     3. Doc3
    

This documentation, https://www.ibm.com/watson/developercloud/doc/retrieve-rank/plugin_query_syntax.shtml, and especially the following text, makes me think that the Ranker will not match the queries:

The following modifiers are not supported with the /fcselect request handler:
 - [...]
 - Search by proximity
 - [...]

But when I try this example, it seems that the Ranker does match the queries...

Thanks for your time.

1 Answer


The ranker does not work by memorizing your training questions, nor by mapping new questions to the closest question in the training data set. In fact, the ranker doesn't directly work with questions at all.

Instead, as per the overview material in the RnR documentation, the ranker uses an approach called 'learning-to-rank' (it might be helpful to take a look through the Wikipedia entry for it: https://en.wikipedia.org/wiki/Learning_to_rank).

Essentially, the learning-to-rank approach is to first generate a bunch of features that capture some notion of how well each of the candidate documents returned from the initial Retrieve phase matches the query. See this post for more info on features: watson retrieve-and-rank - manual ranking.
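
To make the idea of features a bit more concrete, here is a minimal sketch in Python. The function `match_features`, the three features it returns, and the sample document text are all made up for illustration; the features the ranker actually computes are not listed here and are likely more elaborate.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_features(query, document):
    """Return a few hypothetical query-document match features."""
    q_terms = set(tokenize(query))
    d_terms = set(tokenize(document))
    shared = q_terms & d_terms
    union = q_terms | d_terms
    return {
        # Number of distinct terms that appear in both texts.
        "overlap_count": len(shared),
        # Fraction of the query's terms that the document contains.
        "query_coverage": len(shared) / len(q_terms) if q_terms else 0.0,
        # Jaccard similarity of the two term sets.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical document text, not taken from any real collection.
doc = "How to fix a problem with your mailbox quota"

f1 = match_features("I have a problem with my mailbox", doc)
f2 = match_features("I encountered a problem with my mailbox", doc)
print(f1)
print(f2)
```

With this toy document, the two queries from the question produce the same feature values, because they differ only in a word the document does not contain. A model that looks only at such features cannot tell the two queries apart, which is one way to see why a reworded query can still be re-ranked the same way without any query-to-query matching.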

Then, based on the training data, the ranker learns how much weight to give each of these features so that re-ranking the candidate documents optimizes for relevance. Because it learns from features rather than from the question text itself, it can generalize to different questions that come in the future (these might have the same topics, or they might not).
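
As a rough illustration of that training step, the sketch below fits a simple linear scoring function to feature vectors and then uses it to re-rank candidates for an unseen query. This is a pointwise linear model trained with plain gradient descent, chosen only because it is short; it is not claimed to be the algorithm the ranker uses. The two features (query coverage and a title-match flag), the numbers, and the document names are all invented.

```python
def score(weights, bias, features):
    """Linear relevance score: dot(weights, features) + bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train_linear_ranker(examples, learning_rate=0.1, epochs=5000):
    """Fit weights by batch gradient descent on mean squared error.

    examples: list of (feature_vector, relevance_label) pairs.
    """
    n_features = len(examples[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, relevance in examples:
            error = score(weights, bias, features) - relevance
            for i, value in enumerate(features):
                grad_w[i] += 2 * error * value / n
            grad_b += 2 * error / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Invented training data: ([query_coverage, title_match], relevance label).
training = [
    ([0.9, 1.0], 4),
    ([0.8, 0.0], 3),
    ([0.2, 0.0], 1),
    ([0.5, 1.0], 3),
    ([0.1, 0.0], 0),
]
weights, bias = train_linear_ranker(training)

# Candidates returned by the retrieve step for a query that was never in
# the training set, described only by their (invented) feature vectors.
candidates = {
    "DocA": [0.3, 0.0],
    "DocB": [0.85, 1.0],
    "DocC": [0.6, 0.0],
}
ranking = sorted(
    candidates,
    key=lambda name: score(weights, bias, candidates[name]),
    reverse=True,
)
print(ranking)
```

Note that the query text never appears in the model: only feature vectors do. That is why the learned weights apply equally to a question seen in training and to a reworded or entirely new one.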

chakravr