I'm new to machine learning and recently got a job doing R&D related to Big Data.
The main idea is to get insight from an arbitrary collection of big data (I don't know yet what the data will be), turn it into information, and then turn that information into knowledge. The usual goals.
I realized that most Big Data analysis ends up using Machine Learning to do some of its work automatically. Therefore, my focus for now has shifted to Machine Learning first.
The first thing I learned is that getting insight from data we know nothing about is most likely a job for Unsupervised Learning. So I tried clustering first, using K-means.
This is where my questions started:
In K-means, we need to decide K up front. That seems weird to me: why do we have to fix the number of clusters in advance, when I expected the algorithm to draw its own borders and decide how many clusters it found?
Even once the clusters are decided, how do I know what insight I have gained, when I don't even know how the clusters were decided? So in the end, do we still need manual analysis for this kind of thing?
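To make the first question concrete: a common workaround, as far as I understand it, is to run K-means for several values of K and keep the K with the best cluster-quality score, for example the mean silhouette coefficient. The sketch below is only an illustration under stated assumptions: the data is synthetic (three invented 2-D blobs), the helper names (`make_blobs`, `farthest_first`, `kmeans`, `silhouette`) are my own, and the farthest-first initialization is a deterministic simplification rather than the usual random or k-means++ start.

```python
import math
import random


def make_blobs(centers, n_per_center=30, spread=0.5, seed=1):
    """Synthetic 2-D points scattered around the given centers (invented data)."""
    rng = random.Random(seed)
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread))
            for cx, cy in centers for _ in range(n_per_center)]


def farthest_first(points, k):
    """Deterministic initialization: start at the first point, then repeatedly
    add the point that is farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm; returns (centroids, labels)."""
    centroids = farthest_first(points, k)
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # leave an empty cluster in place
        if updated == centroids:  # converged: nothing moved
            break
        centroids = updated
    return centroids, labels


def silhouette(points, labels):
    """Mean silhouette coefficient; values near 1 mean tight, well-separated clusters."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes a score of 0
        # a: mean distance to the other points in the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the points of the nearest other cluster
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_label, other in clusters.items()
                if other_label != label)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


points = make_blobs([(0, 0), (10, 0), (5, 9)])
scores = {k: silhouette(points, kmeans(points, k)[1]) for k in range(2, 7)}
best_k = max(scores, key=scores.get)

for k, score in scores.items():
    print(f"K={k}: silhouette={score:.3f}")
print("best K by silhouette:", best_k)
```

On this toy data the score should peak at K = 3, matching the three blobs that were generated. Real data is rarely this clean, so I assume the score would only narrow down the candidates rather than settle the choice.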
I wonder: is there a way to get insight from arbitrary data without additional manual analysis, or is it supposed to be like that?
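To make the interpretation question concrete: one simple way to see what a clustering found is to profile each cluster, that is, compare its size and per-feature averages against the whole dataset. This is a minimal sketch, not a full answer. The records, the feature names (`age`, `monthly_spend`), the labels, and the helper `profile_clusters` are all invented for illustration; the labels stand in for whatever a clustering step such as K-means would return.

```python
from statistics import mean

# Hypothetical records, invented for illustration: (age, monthly_spend)
feature_names = ["age", "monthly_spend"]
rows = [(22, 40.0), (25, 55.0), (24, 45.0),
        (58, 300.0), (61, 280.0), (55, 320.0)]

# Assumed output of a clustering step: one cluster label per row
labels = [0, 0, 0, 1, 1, 1]


def profile_clusters(rows, labels, feature_names):
    """Summarize each cluster by its size and per-feature mean,
    alongside the difference from the mean over all rows."""
    overall = [mean(column) for column in zip(*rows)]
    profiles = {}
    for cluster in sorted(set(labels)):
        members = [row for row, label in zip(rows, labels) if label == cluster]
        cluster_means = [mean(column) for column in zip(*members)]
        summary = {"size": len(members)}
        for name, m, o in zip(feature_names, cluster_means, overall):
            summary[name] = {"mean": m, "diff_from_overall": m - o}
        profiles[cluster] = summary
    return profiles


profiles = profile_clusters(rows, labels, feature_names)
for cluster, summary in profiles.items():
    print(f"cluster {cluster}: {summary}")
```

The profile only shows how the clusters differ (here, a younger low-spend group and an older high-spend group). Deciding whether that difference is a useful insight still looks like a manual step, which is exactly what I am asking about.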