I have a dataset with 683 samples and 9 features. I want to compute the KL divergence between two datasets (the original and a generated one), column by column. This is what I do for column i:
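import numpy as np
import scipy.stats as st

# Column i of the original data as a 1-D array of length row
originalAttribute = np.asarray(originalData[:, i]).reshape(row)
histOriginal = np.histogram(originalAttribute, bins=binSize)
hist_original_dist = st.rv_histogram(histOriginal)
# Same steps for column i of the generated data
generatedAttribute = np.asarray(generatedData[:, i]).reshape(row)
histGenerated = np.histogram(generatedAttribute, bins=binSize)
hist_generated_dist = st.rv_histogram(histGenerated)
# Evaluate both histogram PDFs on a fixed grid and accumulate the KL divergence
x = np.linspace(-5, 5, 100)
summation += st.entropy(hist_original_dist.pdf(x), hist_generated_dist.pdf(x))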
It returns infinity, so I think I did something wrong. Also, hist_original_dist.pdf(x)
returns some values such as 2.65, which I did not expect to be possible for a PDF in Python.
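
For reference, here is a minimal sketch of the per-column comparison I am aiming for. It is not a definitive implementation. The helper name `column_kl`, the smoothing constant `eps`, and the synthetic data are my own assumptions for illustration and are not part of the code above. It puts both columns on the same bin edges and adds a small constant to every bin so that empty bins do not produce an infinite result. The last lines check that a histogram density can exceed 1 when the bins are narrower than 1, as long as the density times the bin widths sums to 1.

```python
import numpy as np
from scipy import stats as st


def column_kl(original_col, generated_col, bins=10, eps=1e-10):
    """Estimate KL(original || generated) from histograms on shared bin edges.

    `eps` is an assumed smoothing constant, not something from the question.
    """
    original_col = np.asarray(original_col, dtype=float).ravel()
    generated_col = np.asarray(generated_col, dtype=float).ravel()

    # Use one set of edges covering both samples so bin k means the same
    # interval in both histograms.
    lo = min(original_col.min(), generated_col.min())
    hi = max(original_col.max(), generated_col.max())
    edges = np.linspace(lo, hi, bins + 1)

    p, _ = np.histogram(original_col, bins=edges)
    q, _ = np.histogram(generated_col, bins=edges)

    # Smooth the counts so a bin that is empty in q but not in p
    # does not make the divergence infinite.
    p = p + eps
    q = q + eps

    # scipy.stats.entropy normalizes both inputs to sum to 1 and returns
    # sum(p * log(p / q)) when a second argument is given.
    return st.entropy(p, q)


# Synthetic stand-ins for one column of each dataset (assumed data).
rng = np.random.default_rng(0)
original = rng.normal(0.0, 0.1, size=683)
generated = rng.normal(0.02, 0.12, size=683)

kl = column_kl(original, generated)

# Density values may exceed 1 when bins are narrower than 1;
# what must equal 1 is the sum of density * bin width.
density, edges = np.histogram(original, bins=10, density=True)
total_mass = np.sum(density * np.diff(edges))
```

With real data, `column_kl(originalData[:, i], generatedData[:, i])` could be summed over the 9 columns. The result depends on the choice of `bins` and `eps`, so those would need to be fixed across comparisons.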