Multi-class text classification with only one sample for most classes

Asked Feb 06 '21 at 17:50

Active Feb 07 '21 at 07:34

I have a dataset of texts, each labeled with an ID number. For each new incoming text, I would like to predict the best-matching ID. I am not sure whether multi-class text classification is the right approach, since most IDs have only one text. In that case I could not hold out a test set. Would up-sampling help? Or is there an approach other than classification for this kind of problem?

The data set looks like this:

    id1    'text1'
    id2    'text2'
    id3    'text3'
    id3    'text4'
    id3    'text5'
    id4    'text6'
    ...
    id200  'text170'

I would appreciate any guidance to find the best approach for this problem.
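
To make the "other than classification" option concrete: one common framing for data with a single example per ID is nearest-neighbor matching, where a new text is compared against every stored text and the ID of the most similar one is returned. The sketch below is only an illustration under stated assumptions, not a recommended solution. It uses plain bag-of-words cosine similarity from the Python standard library, and the example texts and the helper names (`bag_of_words`, `cosine`, `best_match`) are hypothetical placeholders that do not come from the question.

```python
import math
from collections import Counter

# Hypothetical placeholder data in the same (id, text) shape as the question.
labeled = [
    ("id1", "invoice payment overdue reminder"),
    ("id2", "password reset request for account"),
    ("id3", "shipping delay for recent order"),
    ("id3", "order has not shipped yet"),
]


def bag_of_words(text):
    """Return token counts after lowercasing and splitting on whitespace."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(new_text, examples):
    """Return (id, similarity) of the stored text most similar to new_text."""
    query = bag_of_words(new_text)
    scored = [(label, cosine(query, bag_of_words(text))) for label, text in examples]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    print(best_match("my order shipped late", labeled))
```

Because nothing is trained here, an ID with a single text is usable as it stands. Evaluation is still limited: a text can only be held out and matched back if its ID has at least one other text, as id3 does in the listing above.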

edited Feb 07 '21 at 07:18

STA

asked Feb 06 '21 at 17:50

Fara

It seems to be a question for ai.stackexchange.com – user31264 Feb 07 '21 at 07:26
Thank you very much. I will just post the question there also. – Fara Feb 07 '21 at 09:29

0 Answers