I'm using the SMOTE filter in WEKA to balance my data.
I have doubts about the two parameters nearestNeighbors and percentage.
nearestNeighbors -- The number of nearest neighbors to use.
percentage -- The percentage of SMOTE instances to create.
How should I set them?
I thought the number of neighbors was the number of synthetic samples it is going to create.
So what does percentage mean? It should be less than or equal to the number of neighbors, right? Is it the percentage of the synthetic samples that is actually used?
For example:
If I set nearestNeighbors to 10 and percentage to 200, what will happen?
Can anyone give me some examples of correct use?
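
To make the example concrete, here is a rough sketch of the arithmetic under one reading of the parameter descriptions. It assumes, as in the original SMOTE algorithm, that percentage is measured relative to the size of the minority class and that nearestNeighbors only controls which neighbors are used for interpolation, not how many samples are created. The function name and the minority class size of 50 are made up for illustration; this is not WEKA code.

```python
def smote_synthetic_count(minority_size, percentage):
    """Estimate how many synthetic instances SMOTE would add.

    Assumption: percentage is relative to the minority class size,
    so 100 doubles the minority class and 200 triples it.
    """
    return int(minority_size * percentage / 100)


minority_size = 50     # hypothetical number of minority-class instances
nearest_neighbors = 10  # assumed to affect interpolation only, not the count
percentage = 200

created = smote_synthetic_count(minority_size, percentage)
print("synthetic instances created:", created)
print("minority class size afterwards:", minority_size + created)
```

Under these assumptions, 10 neighbors and 200% would add 100 synthetic instances to a minority class of 50, each interpolated between a real instance and one of its 10 nearest minority-class neighbors.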