I am new to Mallet and am using it to build a MaxEnt model. I want to classify text into a few categories (the category names here are just samples). My training data is in a folder named fruits_training_data, which has 4 files:
apples.txt
bananas.txt
oranges.txt
mangoes.txt
First, I imported this data into Mallet using this command:
bin\mallet import-dir --input fruits_training_data --output fruits_training.mallet
I also have test data for each category in a separate folder, with one file per category. Its hierarchy is the same as the training folder: it is named fruits_testing_data and contains the same file names. I imported the test data the same way:
bin\mallet import-dir --input fruits_testing_data --output fruits_testing.mallet
Then I trained a MaxEnt model using this command:
bin\mallet train-classifier --training-file fruits_training.mallet --testing-file fruits_testing.mallet --trainer MaxEnt --report test:accuracy
This gives me the following error:
Training and testing alphabets don't match! at cc.mallet.classify.tui.Vectors2Classify.main(Vectors2Classify.java:275)
I searched for this error but have not found anything helpful so far. Can someone help me figure out at which step I am making a mistake? Thank you.