I want to run text classification on a dataset, but all the classifiers are locked and I can't use any of them. I applied the StringToWordVector filter to unlock them, but they are still locked. This is my ARFF file:

@relation emails
@attribute email string
@attribute class {normal,spam}


@data
"Hi, How are you I wish you are fine, would you fill this"  normal
"buy this awesome shirt and get a free one" normal
"this is our bestselling product"   normal
"hi, we just wanted to inform you that you had been fired"  normal
"what's wrong with you are you crazy! Answer me now"    spam
"Are you free now? I want to call you"  spam
"reset your password now"   normal
"your subscription is about to expire"  normal
"what the hell is this, you are awesome"    spam
"this is just a reminder, don't forget your assignment" normal
"you are a stupid manager"  spam
"I just wanted to inform you that you are a silly man"  spam
"hi, thanks for your interest in our platform, kindly pay"  normal
"confirm your email to get started" normal
"your course had been withdrawed, see you later"    normal
"you are my best friend"    normal
"why didn't you came!"  spam
"hi doctor, I didn't make any assignment, what should I do" spam

Values in [ARFF](https://waikato.github.io/weka-wiki/formats_and_processing/arff_stable/) files are to be separated by commas, not blanks or tabs. Otherwise you might experience strange behavior if the parser is less lenient in the future. – fracpete Jan 17 '22 at 20:14
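
To illustrate the comment above, here is a sketch of how the header and three of the data rows from the question might look with commas as separators. Only the separator between the quoted string and the class label changes, and the remaining rows would follow the same pattern. Commas inside the double-quoted strings are part of the string value and stay as they are.

```
@relation emails
@attribute email string
@attribute class {normal,spam}

@data
"Hi, How are you I wish you are fine, would you fill this",normal
"buy this awesome shirt and get a free one",normal
"you are a stupid manager",spam
```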

1 Answer

After applying the StringToWordVector filter, make sure you select the correct class attribute (class in your case) on the Classify tab. By default, the Explorer selects the last attribute as the class. However, the StringToWordVector filter moves the class attribute to the start, so a word attribute, which is most likely numeric, ends up being selected automatically as the class. Classifiers that need a nominal class are likely to stay unavailable until you switch the selection back to class.

fracpete