
I'm currently working on an app in which I have blocks of text, and I'd like to know whether they're related to cooking / recipe vocabulary. I've seen and tried a few things, but I'm starting to wonder whether I'm overengineering this (I don't want to reinvent the wheel).

My current approach is to collect all the words related to this vocabulary (ingredients, actions, objects, etc., in many languages), compare each word in my text blocks against that database, and then compute a score for each block. Depending on my threshold, that score decides whether I keep the block or not.
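The scoring idea above could be sketched roughly as follows. This is only a minimal illustration: the `COOKING_TERMS` set, the function names, and the `0.2` threshold are hypothetical placeholders, and the score is assumed to be the fraction of words in a block that appear in the vocabulary.

```python
import re

# Hypothetical, tiny vocabulary; a real one would be far larger and multilingual.
COOKING_TERMS = {"flour", "sugar", "bake", "oven", "whisk", "simmer", "pan", "butter"}


def cooking_score(text, vocabulary=COOKING_TERMS):
    """Return the fraction of words in `text` that appear in `vocabulary`."""
    # Lowercase the text and keep runs of letters only (Unicode-aware).
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words:
        return 0.0
    # Membership tests on a set take constant time on average,
    # so the cost depends on the text length, not the vocabulary size.
    hits = sum(1 for word in words if word in vocabulary)
    return hits / len(words)


def is_cooking_related(text, threshold=0.2):
    """Keep a block when its score reaches the (assumed) threshold."""
    return cooking_score(text) >= threshold
```

Storing the vocabulary in a `set` (or any hash-based structure) means each lookup stays cheap as the vocabulary grows, so the comparison step itself should not slow down much.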

The main problem with this method is that I need to build a very large database myself (which is a very long process), and the bigger my database gets, the slower and less efficient the comparison might become. Any ideas on how to do this? Thank you!

sblr

1 Answer


Consider training a text classifier on relevant examples. A simple starting point is a Naive Bayes text classifier: it is fast to train and produces models of reasonable size.
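To make the idea concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with Laplace smoothing, written with the standard library only. The class name, the `tokenize` helper, the labels, and the six training sentences are all made up for illustration; a real model would need many more labelled examples, and an existing implementation (for example scikit-learn's `MultinomialNB`) is likely a better choice in practice.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[^\W\d_]+", text.lower())


class NaiveBayesTextClassifier:
    """Illustrative multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        words = [w for w in tokenize(text) if w in self.vocabulary]  # skip unseen words
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + self.alpha) / (total_words + self.alpha * vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training examples, for illustration only.
train_texts = [
    "Whisk the eggs and fold in the flour",
    "Simmer the sauce and season with salt",
    "Preheat the oven and bake the bread",
    "The meeting was moved to Monday morning",
    "Install the package and restart the server",
    "The train leaves the station at noon",
]
train_labels = ["cooking", "cooking", "cooking", "other", "other", "other"]

classifier = NaiveBayesTextClassifier().fit(train_texts, train_labels)
print(classifier.predict("Bake the bread with flour and salt"))
```

The advantage over a hand-built word list is that the vocabulary and its weights are learned from labelled examples, so the effort goes into collecting example texts rather than enumerating words.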

sks