Rasa NLU version: 0.12.3

Operating system (windows, osx, ...): Ubuntu 18.04

Content of model configuration file:

language: "en"


pipeline:
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
  intent_tokenization_flag: true
  intent_split_symbol: "_"

Issue:

My intent training data includes the following example:

{{ firstName | limitTo: 7 }}{{firstName.length > 7 ? '...' : ''}}

After training, this intent is matched during testing for the input 20/7/2018.

1 Answer

There are some typical questions it would be useful to know the answers to, such as:

  • How much training data do you have?
  • How many intents do you have?
  • How similar are the intents?

But I think in your case two things (at minimum) are working against you:

  • The featurizer you are using lumps numerals together. See this note in the code:

    Creates bag-of-words representation of intent features using sklearn's `CountVectorizer`. All tokens which consist only of digits (e.g. 123 and 99 but not ab12d) will be represented by a single feature.

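  • The tokenizer is more or less whitespace based. The token pattern keeps only runs of two or more word characters, so punctuation such as / or { acts as a separator and is discarded: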

    "token_pattern": r'(?u)\b\w\w+\b'
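To see how these two points interact, here is a minimal sketch using only Python's `re` module. It is not Rasa's actual code: the placeholder name `NUMBER`, the lowercasing step, and the helper `tokenize` are assumptions made for illustration. Only the token pattern and the "digit-only tokens share one feature" behaviour come from the notes above.

```python
import re

# Token pattern quoted above: runs of two or more word characters.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")

# Tokens made only of digits (123, 99) but not mixed ones (ab12d).
DIGITS_ONLY = re.compile(r"\b[0-9]+\b")

# Hypothetical placeholder standing in for the "single feature"
# that all digit-only tokens are mapped to.
NUMBER_PLACEHOLDER = "NUMBER"


def tokenize(text):
    """Approximate the featurizer's view of a message (illustrative only)."""
    # Lowercase first so the placeholder itself is not lowercased.
    text = DIGITS_ONLY.sub(NUMBER_PLACEHOLDER, text.lower())
    return TOKEN_PATTERN.findall(text)


template = "{{ firstName | limitTo: 7 }}{{firstName.length > 7 ? '...' : ''}}"

print(tokenize(template))
# ['firstname', 'limitto', 'NUMBER', 'firstname', 'length', 'NUMBER']
print(tokenize("20/7/2018"))
# ['NUMBER', 'NUMBER', 'NUMBER']
print(tokenize("11 55 220"))
# ['NUMBER', 'NUMBER', 'NUMBER']
```

Under these assumptions, an input made only of numbers and punctuation collapses to the numeric feature alone, and the template example is one of the training examples that carries that feature.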

Also keep in mind that Rasa NLU will always choose an intent, so you should be monitoring confidence as well. Perhaps the confidence for that input is low enough that you can add a threshold and fall back to some sort of canned response.
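A confidence threshold of this kind could be sketched as follows. This assumes the parse result is a dict with an `intent` entry holding `name` and `confidence`. The function name `resolve_intent`, the fallback intent name, and the 0.6 cut-off are all hypothetical and would need tuning against your own data.

```python
FALLBACK_INTENT = "fallback"   # hypothetical intent name
CONFIDENCE_THRESHOLD = 0.6     # hypothetical cut-off; tune on your own data


def resolve_intent(parse_result, threshold=CONFIDENCE_THRESHOLD):
    """Return the predicted intent name, or FALLBACK_INTENT if confidence is too low."""
    intent = parse_result.get("intent") or {}
    name = intent.get("name")
    confidence = intent.get("confidence", 0.0)
    # Treat a missing prediction the same way as a low-confidence one.
    if name is None or confidence < threshold:
        return FALLBACK_INTENT
    return name


# Example with hand-written results (not real model output):
print(resolve_intent({"intent": {"name": "greet", "confidence": 0.35}}))  # fallback
print(resolve_intent({"intent": {"name": "greet", "confidence": 0.80}}))  # greet
```

A threshold only helps when wrong predictions actually score low, so it is worth logging confidences for known-bad inputs before settling on a value.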

Can you provide the expected intent and entities for both given examples?

Caleb Keller
  • I am expecting the fallback intent, since what I trained on is a pattern. The confidence score I am getting is 0.91. – Sharma Saravanan Jul 18 '18 at 14:14
  • That's not how Rasa NLU works at all, or ML-based natural language processing in general. You train it with real full-text examples and the algorithm determines the pattern itself. Also, fallbacks are not implemented by Rasa; you must implement them yourself. – Caleb Keller Jul 18 '18 at 17:38
  • Yes, I implemented the fallback with the intent name as default. But I don't know how or why that intent matches input in the format 11 55 220 or 20/07/2018. – Sharma Saravanan Jul 19 '18 at 09:25