I have a question about a text classification project. I have several thousand questions (strings) that need to be sorted into six categories: tech, sports, politics, history, science and geography. My labeled training set currently contains 200 questions, and I can easily expand it. I tried TextBlob (which uses NLTK) with its Naive Bayes classifier, but that only reached an accuracy of 28%. I am now looking for other approaches that could improve accuracy (k-NN, SVM, ...).
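To make the setup concrete, a bag-of-words Naive Bayes baseline of the kind described above can be sketched in plain Python. This is only an illustration under assumptions: it is not TextBlob's or NLTK's implementation, the helper names (`tokenize`, `train_nb`, `predict_nb`, `accuracy`) are hypothetical, and the toy questions are made up and cover only three of the six categories.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train_nb(examples):
    """Collect per-label document and word counts from (question, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # words never seen in training are skipped
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    """Fraction of (question, label) pairs that the model labels correctly."""
    correct = sum(predict_nb(model, text) == label for text, label in examples)
    return correct / len(examples)


# Hypothetical toy data, only to show the train / held-out split.
train = [
    ("Which company makes the iPhone?", "tech"),
    ("What programming language runs in the browser?", "tech"),
    ("Who won the football world cup in 2014?", "sports"),
    ("How many players are on a basketball team?", "sports"),
    ("Who was the first president of the United States?", "politics"),
    ("Which party won the last election?", "politics"),
]
held_out = [
    ("Which programming language does the company use?", "tech"),
    ("Which football team won the cup?", "sports"),
]

model = train_nb(train)
print(predict_nb(model, "Which football team won the cup?"))
print(accuracy(model, held_out))
```

Accuracy is measured here on questions that were not used for training. As a point of comparison, random guessing over six categories would score about 17% if the categories are roughly balanced.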
Do you have any suggestions for what I should use to categorize these questions?
Sincerely, James