How to fix Rasa NLU confidence being 0 when a word contains an underscore?

Question

I am trying to build a simple chatbot application using Rasa, but my bot returns a confidence of 0 when the input word contains an underscore.

Below is my config.yml configuration:

language: en  
pipeline: supervised_embeddings  
policies:  
  - name: KerasPolicy  
  #- name: MappingPolicy  
  #- name: MemoizationPolicy  
  #- name: FallbackPolicy

nlu.md training data:

## intent:name
- name
- nmae
- nme
- what is my name?

## intent: firstname
- firstName
- FName
- first name

## intent: gender
- gender
- sex
- gnder
- gendr
- sx

## intent: lastname
- lastName
- lname
- surname
- lstnme
- lstname

## intent: username
- userName
- uname
- usrnme
- usernme
- userid

If I pass firstname, I get the correct intent and confidence. If I try _firstname or first_name, I get the result below:

first_name
{
  "intent": {
    "name": null,
    "confidence": 0.0
  },
  "entities": [],
  "intent_ranking": [],
  "text": "first_name"
}

score 1 · Answer 1 · answered Sep 04 '19 at 12:24

You're getting 0 confidence precisely because of the underscore. The token first_name never appears in your training data, so it is out of vocabulary for your model, and the model has nothing to base a prediction on. (By default, this pipeline uses a whitespace tokenizer, so text is split only on whitespace and first_name stays a single token.)

So, to fix your issue, either avoid underscores in the input, or modify the whitespace tokenizer so that it splits on underscores as well as whitespace.
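
As a rough illustration of the second option, the splitting rule itself can be sketched in plain Python. The function name `tokenize_whitespace_and_underscore` is hypothetical and this is not Rasa's tokenizer API. How the rule gets wired into Rasa (for example as a custom tokenizer component, or as a preprocessing step before the text reaches the bot) depends on the Rasa version and is not shown here.

```python
import re


def tokenize_whitespace_and_underscore(text):
    """Split text on runs of whitespace and/or underscores.

    Hypothetical helper for illustration only; empty pieces produced by
    leading or trailing separators are dropped.
    """
    return [token for token in re.split(r"[\s_]+", text) if token]


if __name__ == "__main__":
    for sample in ("first_name", "_firstname", "first name"):
        print(sample, "->", tokenize_whitespace_and_underscore(sample))
```

With this rule, first_name yields the same tokens as the training example "first name", and _firstname reduces to firstname, so neither input is left as a single unseen token.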

Hope that helps.

lahsuk


Thanks for the help. I was able to solve this issue by using **CountVectorsFeaturizer** with analyzer 'char_wb'. – Karthik Mannava Sep 05 '19 at 10:13
That works too, but I don't really like the `char_wb` option. I suspect it gives more false positives. – lahsuk Sep 05 '19 at 11:07
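
To illustrate the trade-off discussed in these comments, here is a simplified sketch of character n-grams taken inside word boundaries, which is the idea behind the 'char_wb' analyzer. It is not the featurizer's actual implementation: the function name `char_wb_ngrams` and the n-gram range of 1 to 4 are assumptions chosen for the example.

```python
def char_wb_ngrams(text, min_n=1, max_n=4):
    """Return the set of character n-grams found inside each word.

    Simplified illustration: every whitespace-separated word is
    lower-cased, padded with one space on each side, and cut into all
    substrings of length min_n..max_n.
    """
    ngrams = set()
    for word in text.lower().split():
        padded = f" {word} "
        for n in range(min_n, max_n + 1):
            for start in range(len(padded) - n + 1):
                ngrams.add(padded[start:start + n])
    return ngrams


if __name__ == "__main__":
    unseen = char_wb_ngrams("first_name")
    # Compare the unseen input against two training examples.
    for trained in ("firstName", "lastName"):
        shared = unseen & char_wb_ngrams(trained)
        print(trained, "shares", len(shared), "n-grams with first_name")
```

Because first_name shares many character n-grams with the training example firstName, the model no longer sees an entirely unknown input. The same mechanism also makes first_name overlap with lastName (for instance through "name"), which is one way the false-positive concern above could show up.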
