I have a pandas DataFrame where each row corresponds to one sample and each column represents one feature. One of my columns is a string column containing text like "This is a red apple". How can I convert it into a form that lets me compute a Pearson correlation matrix for this DataFrame? Similarly, I have another column that holds a list of identifiers.
Below is an example:
id  text                 list_of_ids   score1  score2
1   "This is An apple"   [1, 2, 3, 4]  4.6     1.0
2   "This is An orange"  [1, 5, 6]     5.2     1.4
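
For reference, the example can be reproduced with something like the snippet below. This is only a minimal sketch: the variable names df and numeric_corr are my own, and the dtypes are assumed from the values shown in the example. It also shows the current situation, where Pearson correlation can be computed for the two numeric score columns but not for the text or list_of_ids columns.

```python
import pandas as pd

# Rebuild the two example rows (dtypes assumed from the values shown).
df = pd.DataFrame({
    "id": [1, 2],
    "text": ["This is An apple", "This is An orange"],
    "list_of_ids": [[1, 2, 3, 4], [1, 5, 6]],
    "score1": [4.6, 5.2],
    "score2": [1.0, 1.4],
})

# Pearson correlation is only defined for numeric columns, so for now
# it is restricted to the two score columns; "text" and "list_of_ids"
# are the columns that still need a numeric encoding.
numeric_corr = df[["score1", "score2"]].corr(method="pearson")
print(numeric_corr)
```

With only two rows the resulting matrix is not informative in itself; the snippet is just meant to show the shape of the data and which columns are currently left out.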