To be more specific, the traditional chatbot framework consists of three components:

  1. NLU (intent classification and entity recognition)
  2. Dialogue Management (dialogue state tracking (DST) and dialogue policy)
  3. NLG

I am just confused: if I use a deep learning model (seq2seq, LSTM, transformer, attention, BERT…) to train a chatbot, does it cover all three of those components? If so, could you explain more specifically how it relates to those three parts? If not, how can I combine them?

For example, I have built a closed-domain chatbot, but it is only task-oriented and cannot handle other parts like greetings… It also can't handle coreference resolution (it seems it doesn't have Dialogue Management).

2 Answers


It seems like your question can be split into two smaller questions:

  1. What is the difference between machine learning and deep learning?
  2. How does deep learning factor into each of the three components of chatbot frameworks?

For #1, deep learning is a kind of machine learning. Think of your task as a graphing problem: you transform your data so each example becomes a point in an n-dimensional space. The goal of the algorithm is to learn a function, a boundary drawn through that space, that (ideally) cleanly separates the points from one another. Each region of the space then corresponds to whatever output you want (a class/label, related words, etc.). Basic machine learning draws a straight boundary, which works on 'linearly separable' problems (i.e. it's easy to draw a line that cleanly separates the categories). Deep learning lets you tackle problems where the boundary isn't so clean by composing many simple nonlinear transformations (layers) into a really, really, really complex function. This is a very surface-level look at what deep learning does, but it should be enough to handle the first part of your question.
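
To make the separability picture concrete, here is a minimal sketch (assuming scikit-learn is installed; the four XOR points are just the textbook non-linearly-separable toy data). A linear classifier cannot draw one straight line through XOR, while a small neural network can bend the boundary:

```python
# XOR: the classic problem where no single straight line separates the classes.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])  # XOR labels

# A linear model: no straight line can score above 0.75 on these four points.
linear = LogisticRegression().fit(X, y)
print("linear accuracy:", linear.score(X, y))

# A small neural network composes nonlinear layers and typically reaches 1.0.
mlp = MLPClassifier(hidden_layer_sizes=(16,), activation="tanh",
                    max_iter=10000, random_state=0).fit(X, y)
print("MLP accuracy:", mlp.score(X, y))
```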

For #2, a good quick answer for you is that deep learning can be a part of each component of the chatbot framework depending on how complex your task is. If it's easy, then classical machine learning might be good enough to solve your problem. If it's hard, then you can begin to look into deep learning solutions.

Since it sounds like you want the chatbot to go a bit beyond simple input-output matching and handle complicated semantics like coreference resolution, your task seems sufficiently difficult and a good candidate for a deep learning solution. I wouldn't worry so much about identifying a specific solution for each of the chatbot framework steps, because with deep learning the tasks involved in those steps blend into one another (e.g. a deep learning solution wouldn't need to classify intent and then manage dialogue; it would simply learn from hundreds of thousands of similar situations and apply a variation of the most similar response).
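
As a toy illustration of that "learn from similar situations and reuse the closest response" idea, here is a minimal retrieval-style sketch. The TF-IDF representation and the tiny query-response pairs are stand-ins made up purely for the example:

```python
# Retrieval-style responding: embed stored queries with TF-IDF and answer a new
# query with the response attached to its nearest stored neighbor.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Made-up (query, response) pairs purely for illustration.
pairs = [
    ("hi there", "Hello! How can I help you?"),
    ("what is my account balance", "Your balance is shown on the dashboard."),
    ("transfer money to savings", "Sure, how much would you like to transfer?"),
]

vectorizer = TfidfVectorizer()
query_matrix = vectorizer.fit_transform([q for q, _ in pairs])

def respond(user_input: str) -> str:
    # Pick the stored response whose query is most similar to the input.
    sims = cosine_similarity(vectorizer.transform([user_input]), query_matrix)
    return pairs[sims.argmax()][1]

print(respond("hi"))                     # -> greeting response
print(respond("check account balance"))  # -> balance response
```

A deep learning version would replace TF-IDF with learned sentence embeddings (or generate the response outright), but the point stands: the pipeline stages collapse into one matching step.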

I would recommend handling the problem as a translation problem, but instead of translating from one language to another, you're translating from the input query to the output response. Translation frequently needs to resolve coreference, and the solutions people have used for that might be an ideal course of action for you.
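
To show what that framing looks like in code, here is a toy GRU encoder-decoder sketch in PyTorch. The vocabulary, model sizes, and the single (query, response) pair are placeholders; a real system would train the same shape of model on a large parallel corpus:

```python
# Toy "translation" from query to response: encode the query, then decode the
# response from the encoder's final hidden state, as in machine translation.
import torch
import torch.nn as nn

vocab = ["<pad>", "<sos>", "<eos>", "hi", "there", "hello", "how", "are", "you"]
stoi = {w: i for i, w in enumerate(vocab)}

def encode(words):
    return torch.tensor([[stoi[w] for w in words]])  # shape: (1, seq_len)

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size, hidden=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, src, tgt):
        _, h = self.encoder(self.embed(src))       # summarize the query
        dec, _ = self.decoder(self.embed(tgt), h)  # generate conditioned on it
        return self.out(dec)

model = Seq2Seq(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

src = encode(["hi", "there"])                              # input query
tgt_in = encode(["<sos>", "hello", "how", "are", "you"])   # decoder input
tgt_out = encode(["hello", "how", "are", "you", "<eos>"])  # shifted target

for _ in range(100):  # overfit a single pair just to show the mechanics
    logits = model(src, tgt_in)
    loss = loss_fn(logits.view(-1, len(vocab)), tgt_out.view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print("final loss:", loss.item())
```

At inference time you would feed the trained decoder `<sos>` and generate one token at a time, feeding each prediction back in.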

There are excellent resources on neural machine translation worth reading in order to frame your problem and how to solve it.


There is always a trade-off between using traditional machine learning models and using deep learning models.

  1. Deep learning models require large amounts of data to train, and both training and inference times increase. In return, they generally give better results.

  2. Traditional ML models work well with less data, with comparatively moderate performance. Inference time is also lower.

For chatbots, latency matters a lot, and the acceptable latency depends on the application/domain.

If the domain is banking or finance, people are okay with waiting a few seconds, but they are not okay with wrong results. In the entertainment domain, on the other hand, you need to deliver results as quickly as possible.

The decision depends on the application domain, the amount of data you have, and the expected precision.

RASA is something worth looking into.
