I have an ARFF file that was built with StringToWordVector.
It contains the features and their TF-IDF weights, like this:
@relation 'sss'
-weka.filters.unsupervised.attribute.StringToWordVector-R-W100-prune-rate-1.0-C-T-I-N0-S-stemmerweka.core.stemmers.NullStemmer -tokenizerweka.core.tokenizers.WordTokenizer -delimiters \" ؟،؛\\r\\t\\n.,;:\\\'\\\"()?!-><#$\\\%&*+/@^_=[]{}|`~0123456789\"'
@attribute @@class@@ {mis,pol}
@attribute water numeric
@attribute start numeric
@attribute government numeric
{2 0.285724,6 0.338022,7 0.517187,8 0.164801,9 ...}
{7 1.191401,8 0.560813,9 0.904039,10 0.322267....}
..
....
{0 pol,6 1.276448,36 0.702977,...}
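To be explicit about how I read the data rows above: they are in sparse ARFF format, where each entry is `index value`, the attribute index is 0-based, and any attribute that is left out has the value 0. For the nominal class attribute at index 0, this means a row without a `0 ...` entry has the first label, mis. Below is a minimal sketch in plain Java, without Weka, of that reading. `SparseArffRow` and `parseSparseRow` are hypothetical names of my own, not a Weka API, and the naive comma split assumes no quoted values containing commas.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SparseArffRow {

    // Hypothetical helper (not part of Weka): parses one sparse ARFF data row
    // into a map from 0-based attribute index to the raw value string.
    // Assumption: values are unquoted and contain no commas.
    static Map<Integer, String> parseSparseRow(String line) {
        String body = line.trim();
        if (!body.startsWith("{") || !body.endsWith("}")) {
            throw new IllegalArgumentException("not a sparse row: " + line);
        }
        body = body.substring(1, body.length() - 1).trim();

        Map<Integer, String> values = new LinkedHashMap<>();
        if (body.isEmpty()) {
            return values; // every attribute takes its default (0 / first label)
        }
        for (String entry : body.split(",")) {
            // Each entry is "<index> <value>", separated by whitespace.
            String[] parts = entry.trim().split("\\s+", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("bad entry: " + entry);
            }
            values.put(Integer.parseInt(parts[0]), parts[1].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        Map<Integer, String> row = parseSparseRow("{0 pol,6 1.276448,36 0.702977}");
        System.out.println(row); // prints {0=pol, 6=1.276448, 36=0.702977}
    }
}
```

So in the last row above, attribute 0 (the class) is pol, and attributes 6 and 36 carry the TF-IDF weights.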
Now I have a test folder that contains text files for the same two classes as the training set (pol and mis). I want to classify this test data and use it to evaluate the model trained on my training set. I know that for this purpose I should use batch filtering, so I read this page: http://weka.wikispaces.com/Use+WEKA+in+your+Java+code#Filter-Batch%20filtering

As I understand that page, my training and test sets should be in the same format (plain text) before the filter is applied. I don't know what to do when my training set is in ARFF format and my test set is in plain text files. (I don't have the training set as text files.)