I am using KNIME to run the WEKA node AttributeSelectedClassifier, but I keep getting an exception claiming that my attribute is nominal and has duplicate values.

However, the attribute is numeric, and duplicate values are entirely expected in this dataset.

AttributeSelectedClassifier - How to deal with error "A nominal attribute (likes) cannot have duplicate labels ('(0.045455-0.045455]')"

I found similar topics, but none of them covers how to choose the scalar to scale the values by.

1st Question: Can someone explain why this happens? Why are duplicate values a problem?

One of the threads on a similar topic recommended scaling the values by a large enough number (a scalar).

Based on that, I multiplied the values by 10^6 and got an error about this value: 27027.027027-27027.027027

I then multiplied by 10^7 and got an error about this value: 270270.27027-270270.27027

When I multiplied by 10^8, it succeeded.
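
To illustrate what I think I am seeing (this is my assumption, not something I have confirmed in the WEKA source): the labels in the error messages all look like bin boundaries printed with about six decimal places, so two different boundaries that agree to six decimals would print as the same label. The sketch below is plain Python, not WEKA code; `bin_label`, `labels`, and the cut points are made up for illustration.

```python
def bin_label(lo, hi, decimals=6):
    """Render a bin as '(lo-hi]' with a fixed number of decimals.

    The six-decimal precision is an assumption inferred from the labels
    in the error messages, not a confirmed WEKA setting.
    """
    return f"({lo:.{decimals}f}-{hi:.{decimals}f}]"


def labels(cut_points, scale=1.0):
    """Build the labels of the bins between consecutive cut points."""
    scaled = [c * scale for c in cut_points]
    return [bin_label(a, b) for a, b in zip(scaled, scaled[1:])]


# Hypothetical cut points that differ only beyond the sixth decimal place.
cuts = [0.04545451, 0.04545453, 0.04545456]

unscaled = labels(cuts)       # both bins render identically
scaled = labels(cuts, 1e8)    # after scaling, the bins render differently

print(unscaled)
print(scaled)
```

If this assumption holds, the underlying numbers are distinct and only their printed labels collide, which would explain why a large enough multiplier makes the error go away.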

2nd Question: What is the right way to deal with this, and how can I programmatically choose the scalar to scale by?
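
For reference, this is the kind of programmatic choice I have in mind, as a rough sketch only. It assumes that the bin boundaries are midpoints between adjacent distinct values and that labels are printed with six decimals; both are guesses on my part. The function name `smallest_power_of_ten` and the sample values are hypothetical.

```python
def smallest_power_of_ten(values, decimals=6, max_exp=15):
    """Return the smallest power of ten that keeps all candidate
    boundaries distinct when printed with `decimals` decimal places.

    Candidate boundaries are taken to be the midpoints between adjacent
    distinct values (an assumption about how the bins are formed).
    """
    distinct = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    for exp in range(max_exp + 1):
        scale = 10 ** exp
        rendered = {f"{m * scale:.{decimals}f}" for m in midpoints}
        if len(rendered) == len(midpoints):
            return scale
    raise ValueError("no power of ten up to 10**max_exp separates the values")


# Hypothetical sample; duplicates in the data are fine.
sample = [0.04545451, 0.04545453, 0.04545456, 0.04545451]
scale = smallest_power_of_ten(sample)
print(scale)
```

Checking every midpoint is conservative, since the classifier may only use some of them as boundaries, so the result could be larger than strictly necessary.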

The full error:

ERROR AttributeSelectedClassifier - Execute failed: IllegalArgumentException in Weka during training. Please verify your settings. A nominal attribute (Meanlikes) cannot have duplicate labels ('(0.045455-0.045455]').

Samer Aamar
  • Your input label is nominal, not numeric. Probably you have binned the numeric values before. – Gábor Bakos Sep 14 '17 at 19:22
  • As @GáborBakos says, have you pre-binned the numeric values? Alternatively, the values are so small that they fall within whatever tolerance is used internally to decide numeric equality, which is why multiplying by a 'big enough' scalar works – SteveR Sep 15 '17 at 09:33
  • @SteveR No, I didn't bin them. Maybe the AttributeSelectedClassifier itself does that, but I certainly didn't. The type of the field is double. If I have to multiply by a "big enough" scalar, how do I determine what "big" is? – Samer Aamar Sep 16 '17 at 04:59
  • @SamerAamar - I'm guessing that depends to some extent on the node configuration details and the data you have. You may be able to get some clues based on that from the WEKA source code at [https://github.com/bnjmn/weka/](https://github.com/bnjmn/weka/) – SteveR Sep 19 '17 at 12:26
