My dataset has several features, named A, B, C, D, E, F, G and H.

These features are correlated with each other:

Features   Correlation
----------------------
A B        70
A C        78
B C        96
A G        93
.
.
.

Therefore, I would like to group similar features together so that each group can be represented by a single feature.

Something like this:

Seed   Group        Correlation Avg
-----------------------------------
A      D & G         (98 + 93) / 2 = 95.5
B      F & C & E     (85 + 96 + 79) / 3 = 86.7
..
..
..
H      -             -

That way, all closely correlated features end up in the same group.
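To make the idea concrete, here is a minimal sketch of one possible heuristic, not a definitive method: keep only pairs at or above a cut-off, then repeatedly pick as seed the feature that covers the most still-ungrouped features. The cut-off `THRESHOLD = 79` and the function names are hypothetical, the values for A-D, B-F and B-E are read off the example groups above, and any pair that is not listed is assumed to be uncorrelated.

```python
from collections import defaultdict

# Pairwise correlations (in %) taken from the tables in the question.
# Assumption: pairs that are not listed count as "not correlated".
PAIRS = {
    ("A", "B"): 70, ("A", "C"): 78, ("B", "C"): 96, ("A", "G"): 93,
    ("A", "D"): 98, ("B", "F"): 85, ("B", "E"): 79,
}
FEATURES = list("ABCDEFGH")
THRESHOLD = 79  # hypothetical cut-off for "close enough"


def build_neighbours(pairs, threshold):
    """Map each feature to {neighbour: correlation} for pairs >= threshold."""
    neighbours = defaultdict(dict)
    for (x, y), corr in pairs.items():
        if corr >= threshold:
            neighbours[x][y] = corr
            neighbours[y][x] = corr
    return neighbours


def greedy_groups(features, pairs, threshold):
    """Greedy heuristic: the next seed is the ungrouped feature with the
    most ungrouped neighbours (ties are broken alphabetically)."""
    neighbours = build_neighbours(pairs, threshold)
    ungrouped = set(features)
    groups = {}
    while ungrouped:
        seed = max(sorted(ungrouped),
                   key=lambda f: len(ungrouped & set(neighbours[f])))
        members = sorted(ungrouped & set(neighbours[seed]))
        groups[seed] = members
        ungrouped -= {seed, *members}
    return groups, neighbours


def average_correlation(seed, members, neighbours):
    """Mean correlation between a seed and its group, or None if it is alone."""
    if not members:
        return None
    return sum(neighbours[seed][m] for m in members) / len(members)


groups, neighbours = greedy_groups(FEATURES, PAIRS, THRESHOLD)
for seed, members in groups.items():
    print(seed, members, average_correlation(seed, members, neighbours))
```

With this data and this cut-off, the sketch puts C, E and F with seed B, D and G with seed A, and leaves H on its own. A greedy choice like this is not guaranteed to give the smallest possible number of groups, and the result depends heavily on the chosen cut-off.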

Another view of the problem:

There are multiple cities in a country (cities A, B, C, D, ..., H).

Each city has a connection to other cities:

Cities   Connection %
----------------------
A B        70
A C        78
B C        96
A G        93
.
.
.

We would like to hire area managers, so that cities with close connections can be served by the same area manager.

We want to find the optimal number of area managers and the cities where they should reside:

Office Area   Other Served Areas   Connection Avg
------------------------------------------------------
A             D & G                (98 + 93) / 2 = 95.5
B             F & C & E            (85 + 96 + 79) / 3 = 86.7
..
..
..
H             -             -

I am looking for a method to split these features/cities in an optimal way, so that as many features/cities as possible are covered with the minimum number of seeds/area managers.
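For a handful of cities, the "minimum number of area managers" can be checked exhaustively. The sketch below is only an illustration under assumptions: an office serves its own city plus every city linked to it at or above a hypothetical cut-off `THRESHOLD = 79`, unlisted pairs count as not connected, and the values for A-D, B-F and B-E are read off the example table above. Stated this way, the task is the minimum dominating set problem, which is NP-hard in general, so brute force like this is only practical for small inputs.

```python
from itertools import combinations

# Connection strengths (in %) taken from the tables in the question.
# Assumption: pairs that are not listed count as "not connected".
LINKS = {
    ("A", "B"): 70, ("A", "C"): 78, ("B", "C"): 96, ("A", "G"): 93,
    ("A", "D"): 98, ("B", "F"): 85, ("B", "E"): 79,
}
CITIES = list("ABCDEFGH")
THRESHOLD = 79  # hypothetical cut-off for a "close" connection


def served_by(city, links, threshold):
    """Cities an office in `city` can serve: itself plus closely linked cities."""
    served = {city}
    for (x, y), pct in links.items():
        if pct >= threshold:
            if x == city:
                served.add(y)
            elif y == city:
                served.add(x)
    return served


def minimum_offices(cities, links, threshold):
    """Try every set of 1, 2, 3, ... office cities and return the first
    (hence smallest) set whose offices together serve all cities."""
    coverage = {c: served_by(c, links, threshold) for c in cities}
    everything = set(cities)
    # k = len(cities) always succeeds, because every city serves itself.
    for k in range(1, len(cities) + 1):
        for offices in combinations(cities, k):
            if set().union(*(coverage[c] for c in offices)) == everything:
                return offices


offices = minimum_offices(CITIES, LINKS, THRESHOLD)
for office in offices:
    others = sorted(served_by(office, LINKS, THRESHOLD) - {office})
    print(office, others)
```

With this data and this cut-off, three offices are needed, in A, B and H. The search returns the first smallest set it finds, so when several equally small solutions exist it does not prefer the one with the highest average connection.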

asmgx