
I wrote a function in Python to compute a lower-triangular mutual information matrix.

import numpy as np
from sklearn.feature_selection import mutual_info_regression

def get_mutual_information_matrix(X_train):
    p = len(X_train.columns)

    MI_matrix = np.zeros((p, p))

    for i in range(p):
        # Diagonal: 1 is only a placeholder (the MI estimate is not normalized)
        MI_matrix[i, i] = 1
        # Fill the lower triangle only (j < i); the upper triangle stays 0
        for j in range(i):
            MI_matrix[i, j] = mutual_info_regression(
                X_train.iloc[:, i].to_frame(), X_train.iloc[:, j], discrete_features=[False]
            )[0]
    return MI_matrix

I would like to use this matrix to drop redundant features. For each pair of features whose mutual information is above a certain threshold, I would like to remove the one that has the lower mutual information with the target.

Does that make sense? How could I do that?
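
To make the idea concrete, here is a minimal sketch of the selection step described above. It is only one possible greedy interpretation, not a definitive implementation. The function name `select_non_redundant` and the arguments `target_mi` (mutual information of each feature with the target), `names` and `threshold` are hypothetical and do not appear in the code above. The result depends on the order in which pairs are visited, and on ties the earlier feature is dropped.

```python
def select_non_redundant(mi_matrix, target_mi, names, threshold):
    """Greedy sketch: for each feature pair whose MI exceeds `threshold`,
    drop the feature with the lower MI with the target.

    mi_matrix : lower-triangular matrix (nested lists or a NumPy array)
    target_mi : MI of each feature with the target (hypothetical input)
    names     : feature names, in the same order as the matrix rows
    """
    p = len(names)
    dropped = set()
    for i in range(p):
        for j in range(i):  # lower triangle only, the diagonal is never read
            if i in dropped or j in dropped:
                continue  # a feature that is already removed cannot be redundant again
            if mi_matrix[i][j] > threshold:
                # Keep the feature that is more informative about the target
                weaker = i if target_mi[i] < target_mi[j] else j
                dropped.add(weaker)
    return [name for k, name in enumerate(names) if k not in dropped]


# Toy example with made-up numbers (not real results)
mi = [
    [1.0, 0.0, 0.0],
    [0.9, 1.0, 0.0],
    [0.1, 0.2, 1.0],
]
target_mi = [0.5, 0.3, 0.4]
kept = select_non_redundant(mi, target_mi, ["a", "b", "c"], threshold=0.5)
print(kept)
```

With the data from the question, `target_mi` could presumably be obtained with `mutual_info_regression(X_train, y_train)` and `names` with `list(X_train.columns)`, where `y_train` is assumed to be the target. Since the estimates are not normalized, the threshold has to be chosen on the same scale as the values in the matrix.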
