I am using the scikit-learn KNeighborsClassifier for classification on a dataset with 4 output classes. The following is the code that I am using:
from sklearn import neighbors
knn = neighbors.KNeighborsClassifier(n_neighbors=7, weights='distance', algorithm='auto', leaf_size=30, p=1, metric='minkowski')
The model works correctly. However, I would like to provide user-defined weights for each sample point. The code currently scales by the inverse of the distance, via the weights='distance' parameter.
I would like to keep the inverse distance scaling, but I also have a probability weight for each sample point, and I would like to apply it as a weight in the distance calculation. For example, if x is the test point and y, z are the two nearest neighbors for which the distance is being calculated, then I would like the distances to be calculated as (sum|x-y|)*wy and (sum|x-z|)*wz respectively, where wy and wz are the weights of y and z.
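To make the intended calculation concrete, here is a minimal sketch in plain NumPy. It is not scikit-learn's API; the function name weighted_manhattan and the example points and weights are made up for illustration.

```python
import numpy as np

def weighted_manhattan(x, neighbor, sample_weight):
    """Manhattan (p=1) distance from x to neighbor, multiplied by the neighbor's weight."""
    x = np.asarray(x, dtype=float)
    neighbor = np.asarray(neighbor, dtype=float)
    return np.sum(np.abs(x - neighbor)) * sample_weight

# Hypothetical test point and its two nearest neighbors
x = np.array([1.0, 2.0])
y = np.array([2.0, 4.0])
z = np.array([1.0, 0.0])

# Hypothetical per-sample probability weights for y and z
wy, wz = 0.5, 0.25

d_y = weighted_manhattan(x, y, wy)  # (sum|x-y|)*wy
d_z = weighted_manhattan(x, z, wz)  # (sum|x-z|)*wz
```

These weighted distances are what I would then like the inverse distance scaling to be applied to.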
I tried to define a function to pass as the weights argument, but I would also like to keep the inverse distance scaling in addition to the user-defined weight, and I do not know what the inverse distance scaling function is. I could not find an answer in the documentation.
Any suggestions?