
I need to extract free-text entities from sentences like:

"Can you research for coconut nut is not a nut"

Here, the entity should be "coconut nut is not a nut".

So there is no precisely defined entity to match. Wit, Dialogflow and LUIS handle this with wildcards (@sys.any, wit/local_search_query ...).

Is there a wildcard like this in Rasa NLU? I cannot find a list of prebuilt entities in the documentation...

Thank you.

DeepProblems

1 Answer


Rasa NLU doesn't treat this any differently from other entities: entities can span multiple words, so you can annotate this as:

Can you research for [coconut nut is not a nut](query) (in markdown format)

For built-in entities you can use the duckling and spaCy NER components.
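Because the entity is learned from annotated examples rather than matched by a wildcard, the training data needs varied entity values in varied phrasings. Below is a minimal sketch that generates example lines in the markdown annotation format shown above. The templates, the sample queries, and the helper names `annotate` and `build_examples` are hypothetical and only illustrate the idea; they are not part of Rasa.

```python
import itertools

# Hypothetical phrasings and entity values, for illustration only.
TEMPLATES = [
    "Can you research for {q}",
    "Please look up {q}",
    "Search for {q}",
]
QUERIES = [
    "coconut nut is not a nut",
    "how tall is mount everest",
    "cheap flights to oslo in march",
]


def annotate(template, value, entity="query"):
    """Return one training line with the value marked as [value](entity)."""
    return template.format(q=f"[{value}]({entity})")


def build_examples(templates, values, entity="query"):
    """Combine every template with every entity value."""
    return [
        annotate(t, v, entity)
        for t, v in itertools.product(templates, values)
    ]


examples = build_examples(TEMPLATES, QUERIES)
for line in examples:
    # Each line can be pasted as a list item under the intent's heading
    # in a markdown training file.
    print("- " + line)
```

The more the entity values differ from each other, the less the model can rely on memorising specific words and the more it has to rely on the surrounding context.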

amn41
  • This doesn't work for me. After training, Rasa NLU does not detect free text, only text similar to "coconut nut is not a nut". But that is not a free-text entity. I expect it to detect any kind of text after "Can you research for". – asmaier Aug 27 '18 at 15:43
  • How many training examples did you use? That worked for me with around 15-20 examples. – DeepProblems Sep 04 '18 at 12:45
  • This doesn't really work. I have an intent like: find (reports)[object_type] that use the (Item Count)[obj_name] (metric)[search_object_type]. In this case, the obj_name entity should take ANY string. I've trained my bot with 30+ examples, but it will only pick up the entity if it's an exact match or a partial match (e.g. Item instead of Item Count). I can't put in every string imaginable to train the bot; there has to be a way for it to just take anything a user enters, since other platforms have figured it out (like @sys.any in Dialogflow). – userwithquestions Oct 18 '18 at 14:55
  • If you don't need flexibility in how the user can express the query, you can write a regex like `'Can you research for (.*)'` and catch every case that way. If you want a system that is flexible with respect to the phrasing, you can (1) write a whole bunch of regexes, or (2) train a model, as in Rasa NLU. Neither will catch every variation perfectly, because your regexes will never be complete and your model will never be perfect. Maybe try [lookup tables](https://medium.com/rasa-blog/entity-extraction-with-the-new-lookup-table-feature-in-rasa-nlu-94c6c30876a3) – amn41 Oct 19 '18 at 15:12
  • @userwithquestions I trained my bot with 30 very different examples (think random strings and numbers) using the tensorflow pipeline and this seems to work. Note that the tensorflow pipeline in general needs more examples, see https://stackoverflow.com/a/53412974/179014 . – asmaier Nov 28 '18 at 13:45
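
The regex fallback suggested in the comments can be sketched in plain Python with the standard `re` module, outside of any Rasa pipeline. The second pattern and the helper name `extract_query` are assumptions added for illustration; only the "Can you research for" phrasing comes from the thread.

```python
import re

# Hypothetical trigger phrasings; add one pattern per phrasing to cover.
PATTERNS = [
    re.compile(r"^can you research for (?P<query>.+)$", re.IGNORECASE),
    re.compile(
        r"^(?:please )?(?:search|look up) (?:for )?(?P<query>.+)$",
        re.IGNORECASE,
    ),
]


def extract_query(text):
    """Return the free text after a known trigger phrase, or None."""
    text = text.strip()
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return match.group("query").strip()
    # No pattern matched: the phrasing is not covered by the list above.
    return None


print(extract_query("Can you research for coconut nut is not a nut"))
print(extract_query("What is the weather today"))
```

This accepts any string after the trigger phrase, but only for the phrasings listed, which is the trade-off described in the comment above.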