I'm messing around with machine learning, and I've written a K-means implementation in Python. It takes two-dimensional data and organises the points into clusters. Each data point also has a class value of either 0 or 1.
What confuses me is how I can then use the clusters to predict the class of points in another two-dimensional data set where the class (0 or 1) is unknown. For each cluster, should I average the class values of the points within it and round the result to 0 or 1, so that an unknown point takes on the value of whichever cluster it is closest to? Or is there a smarter method?
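To make the averaging idea concrete, here is a minimal sketch, assuming NumPy and centroids that the K-means step has already produced. The function names `cluster_majority_labels` and `predict` are hypothetical, and breaking ties towards 1 and defaulting empty clusters to 0 are arbitrary choices made for illustration.

```python
import numpy as np

def nearest_centroid(centroids, X):
    """Return, for each row of X, the index of the closest centroid."""
    # Pairwise Euclidean distances, shape (n_points, n_clusters).
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def cluster_majority_labels(centroids, X_train, y_train):
    """Give each cluster the rounded average (majority) of its members' 0/1 classes."""
    assignment = nearest_centroid(centroids, X_train)
    labels = np.zeros(len(centroids), dtype=int)
    for k in range(len(centroids)):
        members = y_train[assignment == k]
        # Assumption: ties round up to 1, empty clusters default to 0.
        labels[k] = int(members.mean() >= 0.5) if members.size else 0
    return labels

def predict(centroids, cluster_labels, X_new):
    """An unknown point takes the class of the cluster it is closest to."""
    return cluster_labels[nearest_centroid(centroids, X_new)]

# Toy data made up purely for illustration.
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
X_train = np.array([[0.0, 1.0], [1.0, 0.0], [0.0, 0.0], [10.0, 9.0], [9.0, 10.0]])
y_train = np.array([0, 0, 1, 1, 1])

cluster_labels = cluster_majority_labels(centroids, X_train, y_train)
predictions = predict(centroids, cluster_labels, np.array([[1.0, 1.0], [8.0, 8.0]]))
```

With this toy data the cluster near the origin has two 0s and one 1, so it is labelled 0, and the other cluster is labelled 1.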
Cheers!