My dataset has 523 rows and 93 columns and it looks like this:
data <- structure(list(`2018-06-21` = c(0.6959635416667, 0.22265625,
0.50341796875, 0.982942708333301, -0.173828125, -1.229259672619
), `2018-06-22` = c(0.6184895833333, 0.16796875, 0.4978841145833,
0.0636718750000007, 0.5338541666667, -1.3009207589286), `2018-06-23` = c(1.6165364583333,
-0.375, 0.570800781250002, 1.603515625, 0.5657552083333, -0.9677734375
), `2018-06-24` = c(1.3776041666667, -0.03125, 0.7815755208333,
1.5376302083333, 0.5188802083333, -0.552966889880999), `2018-06-25` = c(1.7903645833333,
0.03125, 0.724609375, 1.390625, 0.4928385416667, -0.723074776785701
)), row.names = c(NA, 6L), class = "data.frame")
Each row is a city, and each column is a day of the year.
After calculating the row mean like this:
data$mn <- apply(data, 1, mean)
I want to create another column, data$duration, that gives the average length of the runs of consecutive days on which the values are greater than data$mn.
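To make the goal concrete, here is a small sketch of the quantity I am after for a single row. The vector `x` and its values are made up for illustration and are not taken from my dataset.

```r
# Toy example: one city's values over seven days (made-up numbers)
x <- c(1, 3, 4, 0, 5, 6, 7)
row_mean <- mean(x)            # 26 / 7, roughly 3.71

# TRUE on days where the value exceeds the row mean
above <- x > row_mean          # FALSE FALSE TRUE FALSE TRUE TRUE TRUE

# Run-length encoding: runs of lengths 2, 1, 1, 3
runs <- rle(above)

# Keep only the runs that are above the mean, then average their lengths
duration <- mean(runs$lengths[runs$values])   # (1 + 3) / 2 = 2
```

Here the days above the mean form two runs (lengths 1 and 3), so the value I would expect for this row is 2.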
I tried with this code:
data$duration <- apply(data[-6], 1, function(x) with(rle(x > data$mean), mean(lengths[values])))
But it does not seem to work. In particular, it appears that rle(x > data$mean) fails to recognize the end of a row.
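One thing that may be relevant: inside apply(..., 1, ...) the argument x is a single row, while the mean column holds one value per row, so the comparison is not row against its own mean. Below is a minimal sketch of a per-row comparison, as a possible direction rather than a confirmed fix. The helper name `mean_run_above` and the data frame `toy` are hypothetical and use made-up values.

```r
# Hypothetical helper: average length of the runs where x exceeds `threshold`
mean_run_above <- function(x, threshold) {
  r <- rle(x > threshold)
  if (!any(r$values)) return(NA_real_)   # no value above the threshold
  mean(r$lengths[r$values])
}

# Small made-up data frame: 2 cities (rows), 5 days (columns)
toy <- data.frame(d1 = c(1, 0), d2 = c(5, 0), d3 = c(6, 9),
                  d4 = c(0, 0), d5 = c(8, 9))

# Remember the day columns before any summary column is added
day_cols <- names(toy)

toy$mn <- rowMeans(toy[day_cols])

# Each row is compared against its own mean only
toy$duration <- apply(toy[day_cols], 1,
                      function(x) mean_run_above(x, mean(x)))
```

Selecting the day columns by name, instead of by position as in data[-6], is a design choice in this sketch so that the result does not depend on where the mean column ends up.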
What are your suggestions?
Many thanks
EDIT
The example data frame above has been reduced to 6 rows and 5 columns.