
If I have two lists of numbers (shown below), how can I compute the KL divergence between them? Do I first have to estimate a probability distribution from each list and, if so, how?

I've tried fitting a kernel density estimate to each list, but it has not worked:

import numpy as np
from scipy.special import kl_div
from sklearn.neighbors import KernelDensity

data = [18, 16, 46, 4, 10, 7, 14, 51, 7, 4, 49, 9, 7, 7]
data = np.reshape(data, (-1, 1))  # KernelDensity expects shape (n_samples, n_features)
data2 = [0, 17, 0, 20, 77, 23, 7, 8, 8, 19, 0, 48, 19, 7, 4, 7, 16]
data2 = np.reshape(data2, (-1, 1))  # KernelDensity expects shape (n_samples, n_features)

kd = KernelDensity(kernel='gaussian', bandwidth=0.75).fit(data)
kd2 = KernelDensity(kernel='gaussian', bandwidth=0.75).fit(data2)

kl_div(kd, kd2)  # passing the fitted estimators directly raises the TypeError below

When I run the code, I receive the following error:

TypeError: ufunc 'kl_div' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

I've been trying to figure this out for a few hours. Thanks in advance.

  • Did you try rel_entr? `from scipy.special import rel_entr` `sum(rel_entr(kd, kd2))` should return the KL divergence of the two distributions kd and kd2. – Sadcow Apr 15 '22 at 18:27
  • Hi, thanks for the response. When I try rel_entr, I get the same error message: `TypeError: ufunc 'rel_entr' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''` – BKBlaze king Apr 15 '22 at 18:34
  • Yes. Your inputs should be list-like. Is your input the raw data, or the distribution of the data? – Sadcow Apr 15 '22 at 19:15
  • @Sadcow My data is the actual data, not the distribution of the data. How can I turn it into a distribution? – BKBlaze king Apr 15 '22 at 19:24
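
One possible way to turn the raw samples into distributions, sketched here under some assumptions: `kl_div` and `rel_entr` operate on numeric arrays of probabilities, not on fitted `KernelDensity` objects, so each estimator can be evaluated on a shared grid with `score_samples` (which returns log-densities) and then normalized. The grid range, the 500 grid points, the padding of 5, and the helper name `kde_probabilities` are illustrative choices, not something from the question.

```python
import numpy as np
from scipy.special import rel_entr
from sklearn.neighbors import KernelDensity

data = np.array([18, 16, 46, 4, 10, 7, 14, 51, 7, 4, 49, 9, 7, 7], dtype=float)
data2 = np.array([0, 17, 0, 20, 77, 23, 7, 8, 8, 19, 0, 48, 19, 7, 4, 7, 16], dtype=float)


def kde_probabilities(samples, grid, bandwidth=0.75):
    """Hypothetical helper: fit a Gaussian KDE and return probabilities on `grid`."""
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth)
    kde.fit(samples.reshape(-1, 1))
    # score_samples returns the log-density at each grid point
    density = np.exp(kde.score_samples(grid.reshape(-1, 1)))
    # Normalize so the values sum to 1 over the grid (a discrete distribution)
    return density / density.sum()


# Both distributions must be evaluated at the same points to be comparable.
# The padding of 5 and the 500 grid points are arbitrary assumptions.
lo = min(data.min(), data2.min()) - 5
hi = max(data.max(), data2.max()) + 5
grid = np.linspace(lo, hi, 500)

p = kde_probabilities(data, grid)
q = kde_probabilities(data2, grid)

# KL(p || q) in nats; rel_entr computes p * log(p / q) elementwise
kl_pq = rel_entr(p, q).sum()
print(kl_pq)
```

Note that KL divergence is not symmetric, so `rel_entr(q, p).sum()` will generally give a different value. The result also depends on the bandwidth: with a very narrow kernel, `q` can underflow to zero where `p` is still positive, which makes the sum infinite.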
