To improve the recomender system for Buyer Material Groups, our company is willing to train a model using customer historial spend data. The model should be trained on historical "Short text descriptions" to predict the appropriate BMG. The dataset has more that 500.000 rows and the text descriptions are multilingual (up to 40 characters).
1.Question: can i use supervised learning if i consider the fact that the descriptions are in multiple languages? If Yes, are classic approaches like multinomial naive bayes or SVM suitable?
2.Question: if i want to improve the first model in case it is not performing well, and use unsupervised multilingual emdedding to build a classifier. how can i train this classifier on the numerical labels later?
if you have other ideas or approaches please feel free :). (It is a matter of a simple text classification problem)