
I was trying to run some `entropy()` calculations on force platform data and I get a warning message:

```
> library(entropy)
> d2 <- read.csv("c:/users/SLA9DI/Documents/data2.csv")
> entropy(d2$CoPy, method="MM")
[1] 10.98084
> entropy(d2$CoPx, method="MM")
[1] 391.2395
Warning message:
In log(freqs) : NaNs produced
```

I am sure it is because `entropy()` is trying to take the log of a negative number. I also know R can handle complex numbers using `complex()`, but I have not been able to get it to work with my data. A force platform records Center of Pressure data in two dimensions, and I only get this warning on the CoPx data, not the CoPy data. The entropy shouldn't be that much greater in CoPx than in CoPy. I tried data sets from other subjects and saw the same pattern: the CoPx entropy measures gave warning messages and the CoPy measures did not.

Does anyone have suggestions for getting `complex()` to work on my data set, or is there another function that would give a proper entropy calculation? I am attaching a link to a data set so anyone can try it for themselves, as the data is too long to post here.

Data

Edit: Correct Answer

As suggested, I tried the `table(...)` function and received no warning or error message, and the entropy output was in the expected range. However, I had apparently overlooked the `discretize()` function in the package, which is what you are supposed to use to correctly set up the data for the entropy calculation.
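For reference, a minimal sketch of the `discretize()` route. The data here is synthetic (the original data set is not reproduced), and `numBins = 20` is an arbitrary value chosen for illustration, not a recommendation:

```r
library(entropy)

set.seed(42)
# Synthetic stand-in for a continuous signal such as d2$CoPx (assumption).
x <- rnorm(6000)

# discretize() sorts the values into equal-width bins and returns the count per bin.
counts <- discretize(x, numBins = 20)

# entropy() is given the bin counts, not the raw measurements.
h <- entropy(counts, method = "MM")
print(h)
```

The result depends on the number of bins, so the same `numBins` should probably be used when comparing CoPx against CoPy.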

Kunio

1 Answer


I think there's no point in applying the `entropy()` function to your raw data. According to `?entropy`, it

> estimates the Shannon entropy H of the random variable Y from the corresponding observed *counts* y

(emphasis mine). This means that you need to convert your data (which seems to be continuous) to count data first, for instance by binning it.
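A rough base-R sketch of why this matters, using synthetic data in place of the posted file (an assumption). When signed measurements are treated as counts and normalised to frequencies, some frequencies come out negative and `log()` returns `NaN`, which would explain the warning; binning the values first gives non-negative counts. The 20 bins are an arbitrary choice:

```r
set.seed(1)
# Synthetic stand-in for a continuous, signed signal (assumption).
x <- rnorm(1000)

# Treating raw values as counts: normalising gives some negative "frequencies".
freqs <- x / sum(x)
has_nan <- any(is.nan(suppressWarnings(log(freqs))))
print(has_nan)

# Binning the measured values first yields proper count data.
counts <- table(cut(x, breaks = 20))
p <- as.numeric(counts) / sum(counts)
p <- p[p > 0]                      # empty bins contribute nothing
H <- -sum(p * log(p))              # plug-in Shannon entropy in nats
print(H)
```

This computes the plain plug-in estimate by hand rather than the package's `"MM"` estimator, only to show what the count data looks like.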

krlmlr
  • Do you mean cutting it into smaller intervals? The data posted is 60 seconds of 1 trial. In some reporting, this type of data is split into 20-second intervals to look at how it changes over the data collection period, say 20 sec, 40 sec, 60 sec, etc. – Kunio Sep 11 '15 at 21:20
  • I'm not familiar with your particular data collection method, but simply looking at the data it seems that they are continuous. My remark was about binning the *measured values*, not about splitting the time axis. – krlmlr Sep 11 '15 at 22:17
  • I tried using the `shingle()` function from the lattice package, `equal.count()`, and a method from [a binning question](http://stackoverflow.com/questions/24359863/binning-data-in-r), and none of them got rid of the warning message. Is there another method of binning that I have not tried? – Kunio Sep 12 '15 at 13:56
  • @technos_eric: Try `cut()`. – krlmlr Sep 12 '15 at 20:08
  • I tried using just the `cut()` function, like so: `datum <- cut(d2$CoPx, breaks=3)` followed by `entropy(datum, method="MM")`, which gave: `Error in Summary.factor(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, : ‘sum’ not meaningful for factors`. `cut()` was also used in the other method I linked, as was `tapply()`. – Kunio Sep 13 '15 at 21:02
  • Forgot -- you'll need to tabulate the result to obtain count data, e.g., `table(cut(...))`. – krlmlr Sep 13 '15 at 22:12
  • I used the `table(cut(...))` method and it worked! No warnings. The calculated entropy, though, was much lower than expected. I then tried applying only `table(d2$CoPx)`, and that worked as well and gave a value at the level I would expect for this type of data. I will edit the correct answer into my question and select your answer as correct. Thank you! – Kunio Sep 15 '15 at 14:18
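The gap between the two results in the last comment is consistent with the number of bins: entropy over k bins cannot exceed log(k), so `breaks=3` caps the plug-in value at about 1.1 nats, while `table()` on the raw values uses one bin per distinct value. A small sketch with synthetic data (an assumption; the rounding to two decimals is a hypothetical stand-in for sensor resolution, and `plugin_entropy` is a helper defined here, not part of any package):

```r
# Plug-in Shannon entropy (in nats) from a vector or table of counts.
plugin_entropy <- function(counts) {
  p <- as.numeric(counts) / sum(counts)
  p <- p[p > 0]
  -sum(p * log(p))
}

set.seed(7)
# Synthetic signal, rounded to mimic a finite measurement resolution (assumption).
x <- round(rnorm(1000), 2)

h_three <- plugin_entropy(table(cut(x, breaks = 3)))  # three coarse bins
h_raw   <- plugin_entropy(table(x))                   # one bin per distinct value

print(c(three_bins = h_three, raw_values = h_raw, cap_three = log(3)))
```

Neither extreme is necessarily the right one; the point is only that entropy values are comparable across signals when the binning is the same.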