I have a list of reviews; each element of the list is one review from the IMDB data set on Kaggle. There are 25,000 reviews in total, and each review has a label: +1 for positive and -1 for negative.
I want to train a Hidden Markov Model (HMM) on these reviews and labels.
1- What sequence should I give to the HMM? Is it something like bag of words, or something else, such as probabilities that I need to calculate? Which feature extraction method is appropriate? I was told to use bag of words on the list of reviews, but after some searching I found that an HMM depends on the order of its observations, whereas bag of words does not preserve the order of words in a sequence. How should I prepare this list of reviews so that I can feed it into an HMM?
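To make the concern concrete, here is a small sketch of what I mean. The two toy reviews and the helper names (`bag_of_words`, `build_vocab`, `to_id_sequence`) are made up for illustration and are not from the data set or from any library. Both reviews get the same bag of words, while an order-preserving encoding as integer word ids keeps them distinct. I am assuming the second form is closer to what an HMM needs, but I am not sure.

```python
from collections import Counter

# Two made-up toy reviews that use the same words in a different order.
review_a = "not good , really bad"
review_b = "not bad , really good"

def bag_of_words(text):
    """Count each whitespace-separated token; word order is discarded."""
    return Counter(text.split())

def build_vocab(texts):
    """Map each distinct token to an integer id, in order of first appearance."""
    vocab = {}
    for text in texts:
        for token in text.split():
            vocab.setdefault(token, len(vocab))
    return vocab

def to_id_sequence(text, vocab):
    """Replace each token with its id, keeping the original word order."""
    return [vocab[token] for token in text.split()]

vocab = build_vocab([review_a, review_b])
seq_a = to_id_sequence(review_a, vocab)
seq_b = to_id_sequence(review_b, vocab)

# The bags are identical, so bag of words cannot tell the reviews apart.
print(bag_of_words(review_a) == bag_of_words(review_b))  # True

# The id sequences differ because they keep the order of the words.
print(seq_a)  # [0, 1, 2, 3, 4]
print(seq_b)  # [0, 4, 2, 3, 1]
```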
2- Is there a framework for this? I know hmmlearn, and I think I should use MultinomialHMM; correct me if I'm wrong. However, it is not supervised: its models do not take labels as input during training. I also get errors that I don't know how to solve, probably because I am unsure about the correct type of input (my first question). I recently found seqlearn; is it a good choice, or is there a better one to use?
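For context, my current understanding of the input layout is sketched below, and it may be exactly what I am getting wrong. I am assuming that hmmlearn's `fit(X, lengths)` wants all sequences concatenated into one column of discrete symbols, plus a list with the length of each sequence, and that there is no argument for labels. The encoded reviews and the helper `to_concatenated` are hypothetical; the sketch only builds the two inputs in plain Python and does not call hmmlearn.

```python
# Hypothetical, already-encoded reviews: each inner list is one review
# written as a sequence of integer word ids (toy values, not real data).
encoded_reviews = [
    [0, 1, 2],
    [3, 1],
    [2, 2, 4, 0],
]

def to_concatenated(sequences):
    """Flatten the sequences into single-feature rows and record each length."""
    # One row per observed symbol, i.e. shape (n_samples, 1).
    X = [[symbol] for seq in sequences for symbol in seq]
    # The lengths tell the model where one review ends and the next begins.
    lengths = [len(seq) for seq in sequences]
    return X, lengths

X, lengths = to_concatenated(encoded_reviews)

print(len(X))    # 9 rows in total
print(lengths)   # [3, 2, 4]
```

If this layout is right, the +1/-1 labels have no place in it, which is the part I do not know how to handle.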
I would appreciate any guidance, since I have almost zero knowledge of NLP.