I am going to build a chatbot from scratch with Rasa. The biggest difficulty right now is how to automate the production of training data. The training data consists of nlu.md and stories.md.

I have tried rasa-nlu-trainer and Chatito, but both still involve a lot of manual work. If I end up with tens of thousands of utterances in the future, how can I label the data so that it meets the format of nlu.md and stories.md?

Is there an automated tool or program to do this? Thanks a lot!

shaojie
  • I'm not a pro, but have you tried https://botsociety.io/chat ? – Rajitha Fernando Oct 16 '19 at 09:16
  • I didn't downvote, but you have to explain in detail what you mean by "production training data". Are the data text documents (you mentioned corpora), or something else? Please edit your question and add examples. – Giorgio Robino May 01 '20 at 16:27

1 Answer

Well, if you're doing anything ML-related, your data is the most important thing, because it is what the model learns from. We create the data ourselves and then train the model on it. What you're asking for is something that creates that data for you. It's precisely because nothing like that exists that we build datasets by hand, so that the model can learn from them. So, if you automate the data creation process, what do you expect the model to learn?

So, you can't create the data automatically, because if that were possible, we would already have Artificial General Intelligence (AGI) by now.

But if your goal is just to format the data, then you can write a script for that.
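As a rough sketch of what such a script could look like: the example below assumes your utterances are already labeled and stored as CSV with two columns named `intent` and `text` (that layout, and the function names, are made up for illustration), and it writes them out in the Rasa 1.x Markdown NLU layout, i.e. a `## intent:<name>` header followed by `- example` lines. It does not handle entity annotations, and stories.md would need a similar but separate script.

```python
import csv
import io
from collections import defaultdict


def rows_to_nlu_md(rows):
    """Group (intent, text) pairs by intent and render them as Markdown NLU data."""
    examples = defaultdict(list)
    for intent, text in rows:
        text = text.strip()
        # Skip empty utterances and exact duplicates within an intent.
        if text and text not in examples[intent]:
            examples[intent].append(text)

    blocks = []
    for intent, texts in examples.items():
        lines = [f"## intent:{intent}"] + [f"- {t}" for t in texts]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


def csv_to_nlu_md(csv_text):
    """Read CSV text with 'intent' and 'text' columns (an assumed layout)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return rows_to_nlu_md((row["intent"], row["text"]) for row in reader)


# Hypothetical sample data, only to show the input shape.
SAMPLE_CSV = """intent,text
greet,hello
greet,hi there
goodbye,see you later
"""

if __name__ == "__main__":
    print(csv_to_nlu_md(SAMPLE_CSV))
```

To use it on a real file, you would read the file's contents, pass them to `csv_to_nlu_md`, and write the returned string to nlu.md. The labeling itself still has to come from you; the script only takes care of the formatting.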

lahsuk