How can I count the number of values per column above a sequence of thresholds?
That is, for each column, calculate the number of values above 100, then above 150, and so on, and store the results in a data frame.
# Reproducible data
# (Original data is daily streamflow values organized in columns per year)
set.seed(1234)
data = data.frame("1915" = runif(365, min = 60, max = 400),
"1916" = runif(365, min = 60, max = 400),
"1917" = runif(365, min = 60, max = 400))
# my code chunk
mymin = 75
mymax = 400
mystep = 25
apply(data, 2, function (x) {
for(i in seq(mymin,mymax,mystep)) {
res = (sum(x > i)) # or nrow(data[x > i,])
return(res)
}
})
This code only returns the count for the first threshold (75), because return() exits the loop on its first iteration, and I can't work out how to store the result of each iteration in a data frame.
I also tried approaches such as:
for (i in 1:n) {
seuil = seq(mymin, mymax, mystep)
lapply(data, function(x) {
res[[i]] = nrow(data[x > seuil[i], ])
return(res)
})
}
This does not work either.
The desired output would look something like this:
year | n values above 75 | n values above 100 | n values above ... |
---|---|---|---|
1915 | 348 | 329 | ... |
1916 | 351 | 325 | ... |
... | ... | ... | ... |
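To make the target shape concrete, here is a minimal sketch of one way such a table could be built. It assumes that a nested sapply()/colSums() approach is acceptable. The names thresholds, counts and result are purely illustrative, and check.names = FALSE is my addition so that the columns stay "1915" rather than "X1915". I have not checked it against the real streamflow data.

```r
# Illustrative sketch only: `thresholds`, `counts` and `result` are hypothetical names
set.seed(1234)
data <- data.frame("1915" = runif(365, min = 60, max = 400),
                   "1916" = runif(365, min = 60, max = 400),
                   "1917" = runif(365, min = 60, max = 400),
                   check.names = FALSE)  # assumption: keep "1915", not "X1915"

thresholds <- seq(75, 400, by = 25)

# For each threshold, count the values above it in every column.
# sapply() simplifies the per-threshold vectors into a matrix with
# one row per column of `data` (year) and one column per threshold.
counts <- sapply(thresholds, function(th) colSums(data > th))
colnames(counts) <- paste0("above_", thresholds)

# Move the years from the row names into a regular column.
result <- data.frame(year = rownames(counts), counts, row.names = NULL)
print(result)
```

The comparison data > th is vectorised over the whole data frame, so no explicit loop over rows is needed; only the thresholds are iterated.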
Thanks for your comments and suggestions :)