I have a data frame like the following:
dist <- c(1.1,1.0,10.0,5.0,2.1,12.2,3.3,3.4)
id <- rep("A",length(dist))
df <- data.frame(id, dist)
df
id dist
1 A 1.1
2 A 1.0
3 A 10.0
4 A 5.0
5 A 2.1
6 A 12.2
7 A 3.3
8 A 3.4
I need to clean it up so that no value in the dist column is ever more than 2 times the value in the next row. The cleaned-up data frame would look like this:
id dist
1 A 1.1
2 A 1.0
5 A 2.1
7 A 3.3
8 A 3.4
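To make the rule concrete, here is a minimal sketch that flags the rows breaking it in the original data. The names `x` and `too_big` are made up for illustration only.

```r
x <- c(1.1, 1.0, 10.0, 5.0, 2.1, 12.2, 3.3, 3.4)

# Compare each value with the value in the next row.
# The last row has no successor, so it can never break the rule.
too_big <- c(x[-length(x)] > 2 * x[-1], FALSE)

which(too_big)  # rows 4 and 6
```

Row 3 (10.0) is not flagged here because 10.0 is not greater than 2 * 5.0. It only breaks the rule once row 4 has been removed, which is why a single pass over the data is not enough.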
I have tried writing a function with a for loop and an if statement to clean it:
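cleaner <- function(df, dist, times_larger) {
  # dist is the column name as a string, e.g. "dist"
  # seq_len() avoids the 1:0 trap when df has fewer than two rows
  for (i in seq_len(max(nrow(df) - 1, 0))) {
    # Is this value more than times_larger times the next one?
    if (df[[dist]][i] > df[[dist]][i + 1] * times_larger) {
      df <- df[-i, ]
      break  # stop here: the row indices have shifted after the removal
    }
  }
  df
}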
Obviously, if I don't break out of the loop it throws an error, because the number of rows in df changes while the loop is running. If I manually run the function on df several times:
df<-cleaner(df,"dist",2)
it cleans up the data frame as I want.
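What I am after is roughly the following: a minimal sketch, assuming it is acceptable to keep removing the first offending row until a pass removes nothing. `drop_first_offender` and `clean_until_stable` are names made up for illustration, and the block defines its own one-row-at-a-time remover so that it runs on its own.

```r
# Hypothetical helper: remove the first offending row, or return df unchanged
drop_first_offender <- function(df, col, times_larger) {
  x <- df[[col]]
  n <- length(x)
  if (n < 2) return(df)
  # Positions whose value exceeds times_larger times the next value
  bad <- which(x[-n] > x[-1] * times_larger)
  if (length(bad) == 0) return(df)
  df[-bad[1], , drop = FALSE]
}

# Hypothetical wrapper: repeat until a pass removes nothing
clean_until_stable <- function(df, col, times_larger) {
  repeat {
    cleaned <- drop_first_offender(df, col, times_larger)
    if (nrow(cleaned) == nrow(df)) return(df)
    df <- cleaned
  }
}

df <- data.frame(id = "A", dist = c(1.1, 1.0, 10.0, 5.0, 2.1, 12.2, 3.3, 3.4))
result <- clean_until_stable(df, "dist", 2)
result
```

On the example data this keeps rows 1, 2, 5, 7 and 8, but it rescans from the top after every removal, so I suspect there is a more efficient way.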
I have also tried structuring the function in other ways and applying it to the data frame with apply, but without any luck.
Does anyone have a good suggestion for how to repeat the function on the data frame until it no longer changes, for a better function structure, or maybe for a better way of cleaning altogether?
Any suggestions are much appreciated.