I have a dataframe, similar to the example below, but larger (15000 rows):
df.example <- structure(list(
  Date = structure(c(3287, 3386, 4286, 5286, 6286), class = "Date"),
  v1 = c(1L, 1L, 1L, 1L, 1L),
  v2 = c(0.60378, 12.82581, 3.55357, 4.96079, 0.0422),
  perc = c(0.598, 0.598, 0.609, 1, 0.609),
  v3 = c(-99, -99, 5.83509031198686, 4.96079, 0.0692939244663383)),
  .Names = c("Date", "v1", "v2", "perc", "v3"),
  row.names = c(1L, 100L, 1000L, 2000L, 3000L),
  class = "data.frame")
df.example:
           Date v1       v2  perc           v3
1    1979-01-01  1  0.60378 0.598 -99.00000000
100  1979-04-10  1 12.82581 0.598 -99.00000000
1000 1981-09-26  1  3.55357 0.609   5.83509031
2000 1984-06-22  1  4.96079 1.000   4.96079000
3000 1987-03-19  1  0.04220 0.609   0.06929392
I would like to calculate the percentage of rows whose value in column "perc" is at or below a certain threshold value, and to repeat this calculation for each of the threshold values given below:
### "certain threshold values":
seq(from =0, to = 1, by = 0.1)
### formula to be repeated/iterated/looped: (the i stands for "certain value")
100*sum(df.example$perc<=i)/nrow(df.example)
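To make the formula concrete, here is a small check for a single threshold. It is only a sketch: it copies the perc column from the five-row df.example above into a standalone vector, and the threshold 0.6 is picked purely for illustration.

```r
# perc column copied from df.example so the snippet runs on its own
perc <- c(0.598, 0.598, 0.609, 1, 0.609)

i <- 0.6                                      # one illustrative threshold value
pct <- 100 * sum(perc <= i) / length(perc)    # two of the five rows are <= 0.6
pct                                           # 40
```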
I would like the outcome to be a vector called "vector1", like the example below:
vector1 <- c(0,0,0,0,0,0,0.2,0.6,0.6,0.6,1.0)
This is what I have so far, but it is not working:
### create vector to store calculated values in
vector1=c()
vector1[1]=3
### loop calculation of percentage of rows that are below "certain threshold value" in column df.example$perc
for(i in seq(0,1, by=0.1)){
vector1[i]=sum(df.example$perc<=i)/nrow(df.example)
}
I only get one value, which I would expect to be the last element of vector1.
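A likely cause, offered as a sketch rather than a definitive fix: the loop above uses the threshold value itself as the subscript. R truncates fractional subscripts toward zero and silently ignores assignments to index 0, so only i = 1 actually writes to vector1. Looping over positions with seq_along() avoids this. The names thresholds, perc and vector1_alt are introduced here only for illustration, and perc is copied from df.example so the snippet runs on its own.

```r
perc <- c(0.598, 0.598, 0.609, 1, 0.609)   # perc column of df.example
thresholds <- seq(0, 1, by = 0.1)

# Preallocate the result, then index by position, not by threshold value
vector1 <- numeric(length(thresholds))
for (k in seq_along(thresholds)) {
  vector1[k] <- sum(perc <= thresholds[k]) / length(perc)
}

# Same calculation without an explicit loop:
# mean() of a logical vector is the fraction of TRUE values
vector1_alt <- sapply(thresholds, function(t) mean(perc <= t))
```

Both versions return fractions between 0 and 1, one per threshold; multiply by 100 if percentages are wanted.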
I have already looked at similar questions on SO, such as "R create a vector with loop structure" and "How to make a vector using a for loop".
Any suggestions?
By the way: please comment if the dput() output I posted doesn't recreate the data properly; it's the first time I have used dput().