I have recently got into using scikit-learn (sklearn), especially classification models, and have a question that is more about use cases than about being stuck on a particular bit of code, so apologies in advance if this isn't the right place for it.
So far I have been using sample data where the model is trained on data that has already been classified. In the 'Iris' data set, for example, every row is already labelled as one of the three species. But what if one wants to group/classify the data without knowing the classifications in the first place?
Let's take this imaginary data:
  Name  Feat_1  Feat_2  Feat_3  Feat_4
0    A      12    0.10       0    9734
1    B      76    0.03       1   10024
2    C      97    0.07       1    8188
3    D      32    0.21       1    6420
4    E      45    0.15       0    7723
5    F      61    0.02       1   14987
6    G      25    0.22       0    5290
7    H      49    0.30       0    7107
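For anyone who wants to reproduce it, the imaginary data above can be rebuilt as a pandas DataFrame. This is just a sketch; the variable name `df` is my own choice, and the values are copied from the table.

```python
import pandas as pd

# Imaginary data from the table above, one list per column.
df = pd.DataFrame({
    "Name": list("ABCDEFGH"),
    "Feat_1": [12, 76, 97, 32, 45, 61, 25, 49],
    "Feat_2": [0.10, 0.03, 0.07, 0.21, 0.15, 0.02, 0.22, 0.30],
    "Feat_3": [0, 1, 1, 1, 0, 1, 0, 0],
    "Feat_4": [9734, 10024, 8188, 6420, 7723, 14987, 5290, 7107],
})

print(df)
```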
If one wanted to split the names into 4 separate classes using the different features, is this possible, and which scikit-learn model(s) would be needed? I'm not asking for any code; I'm quite able to research on my own if someone could point me in the right direction. So far I can only find examples where the classifications are already known.
In the example above, if I wanted to break the data down into 4 classes, I would want my outcome to be something like this (note the new column, denoting the class):
  Name  Feat_1  Feat_2  Feat_3  Feat_4  Class
0    A      12    0.10       0    9734      4
1    B      76    0.03       1   10024      1
2    C      97    0.07       1    8188      3
3    D      32    0.21       1    6420      3
4    E      45    0.15       0    7723      2
5    F      61    0.02       1   14987      1
6    G      25    0.22       0    5290      4
7    H      49    0.30       0    7107      4
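To make the shape of that outcome concrete, here is a minimal sketch of how such a column might be attached without any pre-existing labels. It assumes an unsupervised clustering estimator; `KMeans` with `n_clusters=4` and the `StandardScaler` step are assumptions chosen for illustration, not something I know to be the right model for this data. The group numbers it produces are arbitrary and will not match the hand-written ones above.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Same imaginary data as in the first table.
df = pd.DataFrame({
    "Name": list("ABCDEFGH"),
    "Feat_1": [12, 76, 97, 32, 45, 61, 25, 49],
    "Feat_2": [0.10, 0.03, 0.07, 0.21, 0.15, 0.02, 0.22, 0.30],
    "Feat_3": [0, 1, 1, 1, 0, 1, 0, 0],
    "Feat_4": [9734, 10024, 8188, 6420, 7723, 14987, 5290, 7107],
})

feature_cols = ["Feat_1", "Feat_2", "Feat_3", "Feat_4"]

# Put the features on a comparable scale so that Feat_4 (thousands)
# does not dominate the distance calculation. This step is an assumption.
X = StandardScaler().fit_transform(df[feature_cols])

# Ask for 4 groups; no labels are supplied, only the features.
model = KMeans(n_clusters=4, n_init=10, random_state=0)

# fit_predict returns a cluster index (0..3) per row; shift to 1..4
# to mirror the numbering used in the desired output.
df["Class"] = model.fit_predict(X) + 1

print(df)
```

Which rows end up together depends on the scaling and on the algorithm used, so this only shows the mechanics of getting a `Class` column, not a definitive grouping.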
Many thanks for any help.