
I am using scikit-learn's KNN regressor to fit a model to a large dataset with n_neighbors = 100-500. Given the nature of the data, some parts (think: sharp, delta-function-like peaks) are better fit with fewer neighbors (n_neighbors ~ 20-50) so that the peaks are not smoothed out. The locations of these peaks are known (or can be measured).

Is there a way to vary the n_neighbors parameter?

I could fit two models and stitch them together, but that would be inefficient. It would be preferable to either prescribe 2-3 values for n_neighbors or, at worst, send in a list of n_neighbors values.
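For concreteness, here is a minimal sketch of the stitching idea, assuming a 1-D feature; `peak_centers` and `peak_width` are hypothetical placeholders for the known peak locations, and the k values are illustrative:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical setup: 1-D feature with one sharp peak at x = 5.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 10, 2000)).reshape(-1, 1)
y = np.sin(X.ravel()) + 5 * np.exp(-((X.ravel() - 5) ** 2) / 0.01)
peak_centers, peak_width = [5.0], 0.5  # known peak locations (hypothetical)

smooth = KNeighborsRegressor(n_neighbors=200).fit(X, y)  # broad regions
sharp = KNeighborsRegressor(n_neighbors=30).fit(X, y)    # near the peaks

def predict_stitched(X_new):
    """Low-k model inside the peak windows, high-k model elsewhere."""
    y_pred = smooth.predict(X_new)
    near_peak = np.zeros(len(X_new), dtype=bool)
    for c in peak_centers:
        near_peak |= np.abs(X_new.ravel() - c) < peak_width
    if near_peak.any():
        y_pred[near_peak] = sharp.predict(X_new[near_peak])
    return y_pred
```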

saud

1 Answer


I'm afraid not. In part, this is due to an algebraic assumption that the neighbour relationship is symmetric: A is a neighbour of B iff B is a neighbour of A. If you allowed different k values, you would be guaranteed to break that symmetry.

I think the major reason, though, is simply that the algorithm is simpler with a fixed number of neighbors, which yields better results in general. You have a specific case that KNN doesn't fit so well.

I suggest that you stitch together your two models, switching between them depending on the estimated second derivative.
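A rough sketch of that switch, assuming a 1-D feature with distinct x values; `k_smooth`, `k_sharp`, and `curvature_threshold` are hypothetical knobs you would tune for your data:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def stitched_predict(X_train, y_train, X_new,
                     k_smooth=200, k_sharp=30, curvature_threshold=1.0):
    """Blend two KNN fits: use the low-k model wherever the high-k
    prediction shows strong curvature (i.e. near sharp peaks)."""
    smooth = KNeighborsRegressor(n_neighbors=k_smooth).fit(X_train, y_train)
    sharp = KNeighborsRegressor(n_neighbors=k_sharp).fit(X_train, y_train)

    # Sort by x so finite differences make sense.
    order = np.argsort(X_new.ravel())
    x_sorted = X_new.ravel()[order]
    y_smooth = smooth.predict(X_new[order])

    # Estimate the second derivative of the smooth fit by finite differences.
    d2 = np.gradient(np.gradient(y_smooth, x_sorted), x_sorted)

    y_out = y_smooth.copy()
    peaky = np.abs(d2) > curvature_threshold  # threshold is a tuning knob
    if peaky.any():
        y_out[peaky] = sharp.predict(X_new[order][peaky])

    # Restore the original ordering of X_new.
    result = np.empty_like(y_out)
    result[order] = y_out
    return result
```

The threshold on |d2| stands in for "is this a peak region"; since you already know the peak locations, a simple window around each known peak (as in the sketch in the question) may be the cheaper switch.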

Prune
  • Thanks, I was afraid that was the case. I did not know about the symmetry postulate but it makes sense. – saud Nov 10 '16 at 22:24