I'm trying to use the LibSVM classifier in Weka to build a one-class SVM classifier.
My training file contains a list of nouns. My test file contains many words, not all of them nouns. My aim is to use the classifier to predict which words in the test file are nouns.
My training file (ip.arff) looks like this:
@relation test1
@attribute name string
@attribute class {yes}
@data
'building',yes
'car',yes
..... and so on
My test file (test.arff) looks like this:
@relation test2
@attribute name string
@attribute class {yes}
@data
'car',?
'window',?
'running',?
..... and so on
Here's what I've done:
- Since the attribute type is string, I used batch filtering on both input files to generate ipstd.arff and teststd.arff, as described at http://weka.wikispaces.com/Batch+filtering
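As a rough illustration of what the batch-filtering step does, here is a minimal sketch in plain Java. It does not use the Weka API. It assumes the filter is a bag-of-words filter along the lines of StringToWordVector, which the post does not actually state, and the class and method names (BatchFilterSketch, buildVocabulary, encode) are hypothetical. The idea shown is that the vocabulary is built from the training words only and the test words are then encoded against that same vocabulary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical illustration only: plain Java, not Weka's filter classes.
public class BatchFilterSketch {

    // Build the vocabulary from the training words only, in first-seen order.
    public static List<String> buildVocabulary(List<String> trainingWords) {
        return new ArrayList<>(new LinkedHashSet<>(trainingWords));
    }

    // Encode one word as a 0/1 vector over the training vocabulary.
    // A word that is not in the vocabulary becomes an all-zero vector.
    public static int[] encode(String word, List<String> vocabulary) {
        int[] vector = new int[vocabulary.size()];
        int index = vocabulary.indexOf(word);
        if (index >= 0) {
            vector[index] = 1;
        }
        return vector;
    }

    public static void main(String[] args) {
        // Words taken from the example ARFF files in the question.
        List<String> training = Arrays.asList("building", "car");
        List<String> test = Arrays.asList("car", "window", "running");

        List<String> vocabulary = buildVocabulary(training);
        for (String word : test) {
            System.out.println(word + " -> " + Arrays.toString(encode(word, vocabulary)));
        }
    }
}
```

With the three test words above, 'car' is encoded as [0, 1], while 'window' and 'running' both become [0, 0], so under this assumed encoding any test word that was not in the training file looks identical to every other unseen word.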
Next, I load ipstd.arff and run the classifier on it. (Note: all the words are classified as yes.)
Next, I load the test set teststd.arff and re-evaluate the model on it.
But all the words are classified as nouns ('yes'):
=== Predictions on user test set ===
inst#   actual   predicted   error   prediction
1       1:?      1:yes               1
2       1:?      1:yes               1
3       1:?      1:yes               1
and so on
My problem is that every word in the test file (teststd.arff) is classified as a noun.
Can someone tell me where I'm going wrong? What should I do to classify the nouns in the test set as 'yes' and the other words as outliers? Thanks.