I want to compute the Pearson correlation p-value for the values of two dictionaries using a for loop. The dictionaries represent the data of two dataframes, one of which has some changes. Each dictionary holds, for every column, the column name, the keys, and the histogram values. So basically I want to compute the p-value for each column across these two dictionaries.
Both dictionaries have the following structure:
{'columnname1': {'keys': [0, 46.72, 50], 'values': [41, 13, 23, 21...0, 0, 1]},
 'columnname2': {'keys': [0, 20, 50], 'values': [21, 43, 25, 2...0, 3, 15]}, ...}
To compute the p-value for each column, I wrote the following function:
from scipy.stats import pearsonr

def ChiTest(hist_1, hist_2):
    hist = {}
    for column1 in hist_1.keys():
        for column2 in hist_1.keys():
            hist[column1] = {}
            hist[column1]['keys'] = hist_2[column2]['keys']
            hist[column1]['pearson'] = pearsonr(hist_1[column1]['values'], hist_2[column2]['values'])
    return hist

test = ChiTest(one, two)
The hist[column1]['keys'] assignment works fine, but the line hist[column1]['pearson'] = pearsonr(hist_1[column1]['values'], hist_2[column2]['values']) raises a KeyError:
KeyError: 'values'
I can't figure out what I have missed. Any help is appreciated.
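
A minimal sketch of the per-column comparison described above, assuming pearsonr is scipy.stats.pearsonr and that each column should be paired with the column of the same name in the other dictionary. The helper name pearson_per_column and the sample data are hypothetical. Since KeyError: 'values' means some inner dictionary has no 'values' entry, the sketch collects such columns in a list instead of raising, which may help locate the offending entry:

```python
from scipy.stats import pearsonr


def pearson_per_column(hist_1, hist_2):
    """Pearson r and p-value for same-named columns of two histogram dicts."""
    result = {}
    skipped = []  # columns that cannot be compared
    for column in hist_1:
        entry_1 = hist_1[column]
        entry_2 = hist_2.get(column, {})
        # Guard against the situation behind KeyError: 'values'.
        if "values" not in entry_1 or "values" not in entry_2:
            skipped.append(column)
            continue
        # pearsonr expects two sequences of equal length.
        r, p_value = pearsonr(entry_1["values"], entry_2["values"])
        result[column] = {
            "keys": entry_2.get("keys"),
            "pearson": r,
            "p_value": p_value,
        }
    return result, skipped


# Hypothetical sample data, not the real histograms.
one = {"a": {"keys": [0, 50], "values": [1, 2, 3, 4]},
       "b": {"keys": [0, 50], "values": [4, 1, 3, 2]}}
two = {"a": {"keys": [0, 50], "values": [2, 4, 6, 8]},
       "b": {"keys": [0, 50]}}  # 'values' missing on purpose

result, skipped = pearson_per_column(one, two)
print(skipped)  # names of columns that lack 'values' in either dict
```

Printing skipped for the real dictionaries one and two would show which columns have no 'values' entry. The single loop is a design choice: it compares each column only with its counterpart rather than with every other column.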