How to identify/detect a vocabulary in a text (Node JS)

Question

I'm currently working on an app in which I have blocks of text, and I would like to know whether they're related to cooking / recipe vocabulary. I've seen and tried a few things, but I'm starting to wonder if I'm going overboard (I don't want to reinvent the wheel).

The approach I'm working on now is to collect all the words related to this vocabulary (ingredients, actions, objects... in many languages), compare each word in my text blocks against that database, and compute a score for each block. The score would then decide, depending on my threshold, whether I keep the block or not.
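A rough sketch of that scoring idea, to make it concrete. The word list, the helper names (`tokenize`, `vocabularyScore`, `isCookingRelated`) and the 0.2 threshold are all placeholders for illustration, not a real dataset or a tuned value:

```javascript
// Hypothetical vocabulary; a real one would be far larger and multilingual.
// A Set gives average constant-time membership checks.
const COOKING_VOCABULARY = new Set([
  'flour', 'butter', 'sugar', 'egg', 'eggs', 'oven', 'bake', 'whisk',
  'simmer', 'saucepan', 'recipe', 'farine', 'beurre', 'cuire',
]);

// Split text into lowercase word tokens (runs of Unicode letters).
function tokenize(text) {
  return text.toLowerCase().match(/\p{L}+/gu) || [];
}

// Fraction of tokens found in the vocabulary, between 0 and 1.
function vocabularyScore(text, vocabulary) {
  const tokens = tokenize(text);
  if (tokens.length === 0) return 0;
  const hits = tokens.filter((token) => vocabulary.has(token)).length;
  return hits / tokens.length;
}

// Hypothetical threshold; it would need tuning on real text blocks.
const THRESHOLD = 0.2;

function isCookingRelated(text) {
  return vocabularyScore(text, COOKING_VOCABULARY) >= THRESHOLD;
}
```

For example, `isCookingRelated('Whisk the eggs and sugar, then bake in the oven.')` would keep the block, since half of its words are in the list.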

The main problem with this method is that I need to build a very big database myself (which is a very long process), and the bigger my database gets, the slower and less efficient the comparison might become. Any ideas on how to do that? Thank you!

Take a look at [Natural Language AI](https://cloud.google.com/natural-language) — Vahid, Nov 04 '21 at 17:34

score 0 · Answer 1 · answered Jan 31 '22 at 17:06

Consider using a text classifier and training it on relevant examples. A simple starting point is a Naive Bayes text classifier: it is fast and produces models of reasonable size.
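To illustrate the idea, here is a minimal sketch of a multinomial Naive Bayes classifier written from scratch in plain JavaScript, with add-one (Laplace) smoothing. The class name, the `cooking` / `other` labels and the four training sentences are made up for the example; a usable model would need many more examples per class, and an existing library may well be a better fit than hand-rolled code.

```javascript
// Minimal multinomial Naive Bayes with add-one (Laplace) smoothing.
class NaiveBayesClassifier {
  constructor() {
    this.docCounts = new Map();  // label -> number of training documents
    this.wordCounts = new Map(); // label -> Map(word -> count)
    this.totalWords = new Map(); // label -> total number of words seen
    this.vocabulary = new Set(); // every distinct word seen in training
    this.totalDocs = 0;
  }

  // Split text into lowercase word tokens (runs of Unicode letters).
  static tokenize(text) {
    return text.toLowerCase().match(/\p{L}+/gu) || [];
  }

  // Add one labelled example to the model.
  train(text, label) {
    if (!this.wordCounts.has(label)) {
      this.docCounts.set(label, 0);
      this.wordCounts.set(label, new Map());
      this.totalWords.set(label, 0);
    }
    this.docCounts.set(label, this.docCounts.get(label) + 1);
    this.totalDocs += 1;
    const counts = this.wordCounts.get(label);
    for (const word of NaiveBayesClassifier.tokenize(text)) {
      counts.set(word, (counts.get(word) || 0) + 1);
      this.totalWords.set(label, this.totalWords.get(label) + 1);
      this.vocabulary.add(word);
    }
  }

  // Return the label with the highest log-probability, or null if untrained.
  classify(text) {
    const tokens = NaiveBayesClassifier.tokenize(text);
    let bestLabel = null;
    let bestLogProb = -Infinity;
    for (const [label, docCount] of this.docCounts) {
      let logProb = Math.log(docCount / this.totalDocs); // class prior
      const counts = this.wordCounts.get(label);
      const denominator = this.totalWords.get(label) + this.vocabulary.size;
      for (const word of tokens) {
        logProb += Math.log(((counts.get(word) || 0) + 1) / denominator);
      }
      if (logProb > bestLogProb) {
        bestLogProb = logProb;
        bestLabel = label;
      }
    }
    return bestLabel;
  }
}

// Illustrative training data only.
const classifier = new NaiveBayesClassifier();
classifier.train('Whisk the eggs and fold in the flour', 'cooking');
classifier.train('Simmer the sauce and season with salt', 'cooking');
classifier.train('The meeting is moved to Monday morning', 'other');
classifier.train('Please review the quarterly sales report', 'other');
```

With this toy data, `classifier.classify('Season the eggs with salt')` returns `'cooking'`. Unlike a hand-built word list, the model learns which words matter from the labelled examples, so no vocabulary database has to be maintained by hand.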

sks
