I learned that some popular bot frameworks like Rasa or LUIS provide "confidence scores" to evaluate out-of-domain questions, but none of them document how these scores are calculated. Information retrieval also has approaches for computing similarity, but I don't know which of them apply to out-of-domain classification. Could someone point me to papers, directions, or code I could work from?
1 Answer
What I usually do is create an intent out_of_scope and add example messages that are out of scope to it. If a message is out of scope, the prediction will then either have low confidence or be classified as out_of_scope. With Rasa you can also run an evaluation on a test set, which gives you a histogram of the confidence levels; this helps in selecting a suitable threshold for the confidence scores. Regarding the confidence calculation: for Rasa it differs depending on which pipeline component you are using.
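The thresholding idea in the answer can be sketched as follows. This is a minimal, hypothetical illustration (the intent names, confidence values, and the `classify_with_fallback` helper are made up, not part of any framework's API): pick a threshold from the evaluation histogram, then fall back to an out-of-scope label whenever the top intent's confidence sits below it.

```python
def classify_with_fallback(confidences, threshold):
    """Return the top intent, or 'out_of_scope' if the model is not confident.

    confidences: dict mapping intent name -> model confidence score.
    threshold: minimum confidence required to accept the top intent,
               e.g. chosen by inspecting a confidence histogram on a test set.
    """
    intent, score = max(confidences.items(), key=lambda kv: kv[1])
    if score < threshold:
        return "out_of_scope"
    return intent


# In-scope message: the model is confident in one intent.
print(classify_with_fallback({"greet": 0.92, "goodbye": 0.05}, threshold=0.6))
# -> greet

# Out-of-domain message: confidence is spread thin, so we fall back.
print(classify_with_fallback({"greet": 0.31, "goodbye": 0.28}, threshold=0.6))
# -> out_of_scope
```

The threshold trades precision for coverage: raising it routes more borderline messages to out_of_scope, lowering it accepts more low-confidence predictions.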

Tobias
- Do I need to add examples for the out_of_scope intent? I mean, can't I say that if the user input does not match the other training data, then it is automatically out_of_scope? – Sunil Garg Mar 09 '21 at 07:53
- You would need to add examples for this. In Rasa Open Source there is the concept of a fallback intent, but that is rather for intent classifications with low model prediction confidence scores. The docs are pretty good here: https://rasa.com/docs/rasa/fallback-handoff#handling-out-of-scope-messages – Tobias Mar 16 '21 at 08:45
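For the confidence-based fallback mentioned in the last comment, Rasa Open Source ships a `FallbackClassifier` pipeline component that predicts a fallback intent when the top intent's confidence is below a threshold. A minimal config sketch (the threshold value here is an arbitrary example; the rest of the pipeline is elided and would need to match your own setup):

```yaml
# config.yml (fragment)
pipeline:
  # ... your tokenizer, featurizers, and intent classifier go here ...
  - name: FallbackClassifier
    threshold: 0.6
```

Note this only catches low-confidence predictions; messages the model is confidently wrong about still need explicit out_of_scope training examples, as described in the answer.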