3

My team and I have been using Rasa NLU as a replacement for MS LUIS for over 2 months now, and it has worked out pretty well for us so far. We now have around 900 entries as Entity Synonyms (we were using a List entity in LUIS).

The entity is resolved through its synonyms for only some utterances; for the majority of utterances, Rasa fails to detect the Entity Synonyms at all. To get the synonyms detected, I have to create another simple entity, which we again train manually with all the synonym values. Once the intents are trained with this simple entity, Rasa seems to detect the entity for that intent both as the simple entity and as a synonym.

Another quick question: are Entity Synonyms in Rasa designed to return only one matched entity (unlike LUIS, which returns all the matched entity values)?

Is there any alternative in Rasa to the List entity from LUIS?

Hari Govind

2 Answers

8

Entity Synonyms in Rasa can lead to some confusion. The actual functionality that they provide is very simple. For each entity that is parsed by the model the value of that entity is checked against the list of entity synonyms. If the value matches an entity synonym then it is replaced with the synonym value.

The big catch in the above statement is that the entity has to be identified by the model before it can be replaced with a synonym.

So take this as a simplified example. Here is my entity synonym definition:

{
  "value": "New York City",
  "synonyms": ["NYC", "nyc", "the big apple"]
}

If my training data only provides this example:

{
  "text": "in the center of NYC",
  "intent": "search",
  "entities": [
    {
      "start": 17,
      "end": 20,
      "value": "New York City",
      "entity": "city"
    }
  ]
}

It is very unlikely that my model will be able to detect an entity in a sentence like "In the center of the big apple". As I said above, if "the big apple" isn't parsed as an entity by the model, it cannot be replaced by the entity synonyms to read "New York City".
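To make that order of operations concrete, here is a minimal Python sketch of a synonym step that runs only on entities the extractor has already found. It is an illustration, not Rasa's actual code: the function names `build_synonym_lookup` and `apply_synonyms` are hypothetical, and the case-insensitive matching is an assumption.

```python
# Hypothetical sketch of a synonym-replacement step; not Rasa's implementation.

def build_synonym_lookup(entity_synonyms):
    """Map each lowercased synonym to its canonical value."""
    lookup = {}
    for entry in entity_synonyms:
        for synonym in entry["synonyms"]:
            lookup[synonym.lower()] = entry["value"]
    return lookup


def apply_synonyms(extracted_entities, lookup):
    """Replace the value of each already-extracted entity if it is a known synonym."""
    resolved = []
    for entity in extracted_entities:
        # Assumption: matching ignores case. Unknown values pass through unchanged.
        value = lookup.get(entity["value"].lower(), entity["value"])
        resolved.append({**entity, "value": value})
    return resolved


synonyms = [{"value": "New York City", "synonyms": ["NYC", "nyc", "the big apple"]}]
lookup = build_synonym_lookup(synonyms)

# The extractor found "NYC", so the synonym step can map it to the canonical value.
found = [{"entity": "city", "value": "NYC", "start": 17, "end": 20}]
print(apply_synonyms(found, lookup))

# The extractor found nothing in "in the center of the big apple",
# so the synonym step has nothing to replace.
print(apply_synonyms([], lookup))
```

The second call is the failure case described above: the synonym list contains "the big apple", but with no extracted entity there is nothing to look up.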

For this reason, you should include more examples with the entities labeled in the actual common_examples of the training data. Once all of the variations of the entity are being classified correctly, add those values to the entity synonym and they will be replaced.

[
  {
    "text": "in the center of NYC",
    "intent": "search",
    "entities": [
      {
        "start": 17,
        "end": 20,
        "value": "New York City",
        "entity": "city"
      }
    ]
  },
  {
    "text": "in the centre of New York City",
    "intent": "search",
    "entities": [
      {
        "start": 17,
        "end": 30,
        "value": "New York City",
        "entity": "city"
      }
    ]
  }
]
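With hundreds of entity values, writing these entries by hand is impractical. The following Python sketch generates labeled examples by filling a slot in a sentence template and computing the offsets. The helper name `make_example`, the `{}` slot convention, and the template sentences are all assumptions made for illustration.

```python
import json


def make_example(template, entity, surface, value, intent=None):
    """Fill the first `{}` slot in `template` with `surface` and label that span."""
    start = template.index("{}")
    text = template.replace("{}", surface, 1)
    example = {
        "text": text,
        "entities": [
            {
                "start": start,
                "end": start + len(surface),  # end is exclusive: text[start:end] == surface
                "value": value,
                "entity": entity,
            }
        ],
    }
    if intent is not None:
        example["intent"] = intent
    return example


# Hypothetical templates and surface forms, for illustration only.
templates = ["in the center of {}", "show me places in {}"]
surfaces = ["NYC", "the big apple", "New York City"]

common_examples = [
    make_example(template, "city", surface, "New York City", intent="search")
    for template in templates
    for surface in surfaces
]
print(json.dumps(common_examples[:2], indent=2))
```

Each surface form appears in every template, so the extractor sees the variations in more than one context.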

I've opened a pull request into the Rasa docs page to add a note to this effect.

Jack
Caleb Keller
  • Thanks for the answer mate. So is there something else like LUIS's list entity. I have around 900 values to be added as synonyms, any suggestions? – Hari Govind Nov 15 '17 at 06:28
  • We use a simple script (ours is in node, but yours could be in python, etc) that swaps the value out, calculates the start/end position, and pushes that to the `common_examples` array. It's worth noting that in `common_examples` you don't have to label the intent. If you have an object with just `text` and `entities` then it will just impact entity classification without impacting intent classification. – Caleb Keller Nov 15 '17 at 06:33
  • Hey @caleb keller, sorry to disturb you again. As you recommended, I have an automated script that constructs the entries for me to add inside common_examples. I have around 900 names, and all I want is for them to be detected as a custom "People" entity (with their synonyms, the total comes to around 2000 entities). Even after training on all of the entities, Rasa is not able to detect all the names. Any suggestions on how I can optimise the detection? – Hari Govind Nov 16 '17 at 14:32
  • @HariGovind did you sort your issue? – Kunal Mukherjee Dec 04 '17 at 10:33
0

First, I downloaded a LUIS model JSON export to work with, as shown in the following screenshot:

[screenshot]

Next, I wrote a sample C# console app that converts the LUIS model schema into the Rasa format.

Here is the LUISModel model class.

using Newtonsoft.Json;
using System;
using System.Collections.Generic;

namespace JSONConversion.Models
{
    public class LuisSchema
    {
        public string luis_schema_version { get; set; }
        public string versionId { get; set; }
        public string name { get; set; }
        public string desc { get; set; }
        public string culture { get; set; }
        public List<Intent> intents { get; set; }
        public List<entity> entities { get; set; }
        public object[] composites { get; set; }
        public List<Closedlist> closedLists { get; set; }
        public List<string> bing_entities { get; set; }
        public object[] actions { get; set; }
        public List<Model_Features> model_features { get; set; }
        public List<regex_Features> regex_features { get; set; }
        public List<Utterance> utterances { get; set; }
    }

    public class regex_Features
    {
        public string name { get; set; }
        public string pattern { get; set; }
        public bool activated { get; set; }
    }

    public class Intent
    {
        public string name { get; set; }
    }

    public class entity
    {
        public string name { get; set; }
    }

    public class Closedlist
    {
        public string name { get; set; }
        public List<Sublist> subLists { get; set; }
    }

    public class Sublist
    {
        public string canonicalForm { get; set; }
        public List<string> list { get; set; }
    }

    public class Model_Features
    {
        public string name { get; set; }
        public bool mode { get; set; }
        public string words { get; set; }
        public bool activated { get; set; }
    }

    public class Utterance
    {
        public string text { get; set; }
        public string intent { get; set; }

        [JsonProperty("entities")]
        public List<Entities> Entities { get; set; }
    }

    public class Entities
    {
        [JsonProperty("entity")]
        public string Entity { get; set; }
        public int startPos { get; set; }
        public int endPos { get; set; }
    }
}

Here is the RASAModel model class:

using Newtonsoft.Json;
using System;
using System.Collections.Generic;

namespace JSONConversion.Models
{
    public class RASASchema
    {
        public Rasa_Nlu_Data rasa_nlu_data { get; set; }
    }

    public class Rasa_Nlu_Data
    {
        public List<Entity_Synonyms> entity_synonyms { get; set; }

        public List<Regex_Features> regex_features { get; set; }
        public List<Common_Examples> common_examples { get; set; }

    }

    public class Entity_Synonyms
    {
        public string value { get; set; }
        public List<string> synonyms { get; set; }
    }

    public class Common_Examples
    {
        public string text { get; set; }
        public string intent { get; set; }
        public List<Entity> entities { get; set; }
    }


    public class Entity
    {
        public string entity { get; set; }
        public string value { get; set; }
        public int start { get; set; }
        public int end { get; set; }
    }

    public class Regex_Features
    {
        public string name { get; set; }
        public string pattern { get; set; }
    }
}

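I have also written two helper methods. The first reads the LUIS JSON file. The second deserializes it into the LUISModel class and maps the closed lists and phrase lists to entity_synonyms, the regex features to regex_features, and the labeled utterances to common_examples in the Rasa NLU training object.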

using JSONConversion.Models;
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

namespace JSONConversion.Services
{
    public static class JSONHelper
    {
        // Reads the whole file on a background thread and returns its contents.
        public static Task<string> ReadFromFile(string FilePath)
        {
            return Task.Run(() => File.ReadAllText(FilePath));
        }

        public static RASASchema ConvertLUISJSON(string StringifiedLUISJson)
        {
            try
            {
                LuisSchema luisSchema = JsonConvert.DeserializeObject<LuisSchema>(StringifiedLUISJson);

                RASASchema rasaSchema = new RASASchema();
                rasaSchema.rasa_nlu_data = new Rasa_Nlu_Data();
                rasaSchema.rasa_nlu_data.common_examples = new List<Common_Examples>();
                rasaSchema.rasa_nlu_data.entity_synonyms = new List<Entity_Synonyms>();
                rasaSchema.rasa_nlu_data.regex_features = new List<Regex_Features>();


                luisSchema.closedLists.ForEach(x =>
                {
                    x.subLists.ForEach(y =>
                    {
                        rasaSchema.rasa_nlu_data.entity_synonyms.Add(new Entity_Synonyms()
                        {
                            value = y.canonicalForm,
                            synonyms = y.list
                        });
                    });
                });

                luisSchema.model_features.ForEach(x =>
                {
                    rasaSchema.rasa_nlu_data.entity_synonyms.Add(new Entity_Synonyms()
                    {
                        value = x.name,
                        synonyms = x.words.Split(',').ToList()
                    });
                });

                luisSchema.regex_features.ForEach(x =>
                {
                    rasaSchema.rasa_nlu_data.regex_features.Add(new Regex_Features()
                    {
                        name = x.name,
                        pattern = x.pattern
                    });
                });

                luisSchema.utterances.ForEach(x =>
                {
                    Common_Examples rasaUtterances = new Common_Examples();
                    rasaUtterances.text = x.text;
                    rasaUtterances.intent = x.intent;

                    List<Entity> listOfRASAEntity = new List<Entity>();

                    x.Entities.ForEach(y =>
                    {
                        listOfRASAEntity.Add(new Entity()
                        {
                            start = y.startPos,
                            // LUIS endPos is inclusive; Rasa's end is exclusive, so add 1.
                            end = y.endPos + 1,
                            entity = y.Entity,
                            value = x.text.Substring(y.startPos, (y.endPos - y.startPos) + 1)
                        }); 
                    });

                    rasaUtterances.entities = listOfRASAEntity;
                    rasaSchema.rasa_nlu_data.common_examples.Add(rasaUtterances);
                });

                return rasaSchema;
            }
            catch (Exception ex)
            {
                throw;
            }
        }
    }
}

Finally, I call those conversion methods to turn the LUIS model into a Rasa model.

using System.Text;
using JSONConversion.Services;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

namespace JSONConversion
{
    class Program
    {
        static void Main(string[] args)
        {

            // Read the LUIS export and convert it to the Rasa NLU training schema.
            string luisJson = JSONHelper.ReadFromFile(@"C:\Users\xyz\Documents\luis.json").Result;
            var rasaSchema = JSONHelper.ConvertLUISJSON(luisJson);

            // Serialize the result as indented JSON.
            string json = JsonConvert.SerializeObject(rasaSchema, new JsonSerializerSettings()
            {
                ContractResolver = new CamelCasePropertyNamesContractResolver(),
                Formatting = Formatting.Indented
            });

            // Write the Rasa training file.
            File.WriteAllText(@"C:\Users\xyz\Desktop\RASA\data\examples\RasaFormat.json", json, Encoding.UTF8);

        }
    }
}

Once you have the Rasa training file, you can train Rasa on it, synonyms included.
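One detail worth checking in any LUIS-to-Rasa conversion is the end offset of each entity. The Python sketch below converts a single utterance under the assumption that LUIS's `endPos` points at the last character of the entity (inclusive), while Rasa's `end` points one past it (exclusive), as in the "NYC" example with `start` 17 and `end` 20 in the other answer. The function name `luis_utterance_to_rasa` and the sample utterance are hypothetical.

```python
def luis_utterance_to_rasa(utterance):
    """Convert one LUIS utterance dict into a Rasa common_examples entry."""
    text = utterance["text"]
    entities = []
    for ent in utterance.get("entities", []):
        start = ent["startPos"]
        # Assumption: LUIS endPos is inclusive, Rasa end is exclusive.
        end = ent["endPos"] + 1
        entities.append(
            {"start": start, "end": end, "value": text[start:end], "entity": ent["entity"]}
        )
    return {"text": text, "intent": utterance["intent"], "entities": entities}


# Hypothetical LUIS utterance, for illustration only.
luis_utterance = {
    "text": "in the center of NYC",
    "intent": "search",
    "entities": [{"entity": "city", "startPos": 17, "endPos": 19}],
}
rasa_example = luis_utterance_to_rasa(luis_utterance)
print(rasa_example)
```

A quick sanity check on converted data is that `text[start:end]` reproduces the labeled span for every entity.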

Kunal Mukherjee
  • Hey @kunal thanks for your answer but in my scenario I have around 980 values as entity synonyms and I have already written methods to load the data from LUIS into Rasa, but my issue is the inconsistency of detection of the synonyms. – Hari Govind Dec 08 '17 at 07:22
  • @Caleb Keller currently I am stuck on how to feed the synonyms to RASA; the MITIE pipeline will not install on my Windows Server machine. If you have any clue, please do let me know. I have used CMake with Visual Studio 2015 to build MITIE from their Git link, but I am not able to integrate the MITIE pipeline as a Python PyPI package. – Kunal Mukherjee Dec 08 '17 at 07:33
  • @HariGovind what kind of setup are you using for RASA, are you using Windows server machine and running it on docker or an Ubuntu VM? – Kunal Mukherjee Dec 08 '17 at 07:34
  • @HariGovind are you referring entity synonyms as LUIS "phraselist" or what? Please explain. – Kunal Mukherjee Dec 08 '17 at 07:35
  • Hey @kunal, I am talking about something in the lines of List entity in LUIS, which has a canonical form and synonyms. And my Configuration is SkLearn + Spacy on Windows Server. – Hari Govind Dec 08 '17 at 08:45
  • @HariGovind I am also using the same spaCy + sklearn pipeline. From what I have read, spaCy is a poor choice for custom entities and synonyms; MITIE is recommended for custom entities, and TensorFlow is recommended for synonyms. But I have not figured out how to integrate it with RASA. – Kunal Mukherjee Dec 08 '17 at 09:13
  • Yes @kunal, MITIE is recommended, but its training time does not fit my client's requirements. We also have around 200 intents, and sklearn does pretty well at intent classification. TensorFlow is a really good alternative, but I am not sure about integrating it with Rasa. – Hari Govind Dec 08 '17 at 09:57