I have the following dataset:
>dput(df)
structure(list(Author = c("hitham", "Ow", "WPJ4", "Seb", "Karen", "Ow", "Ow", "hitham", "Sarah",
"Rene"), diff = structure(c(28, 2, 8, 3, 7, 8, 11, 1, 4, 8), class = "difftime", units = "secs")),
row.names = 1:10, class = "data.frame")
As we can see, the author Ow appears three times and the author hitham twice:
Author diff
1 hitham 28 secs
2 Ow 2 secs
3 WPJ4 8 secs
4 Seb 3 secs
5 Karen 7 secs
6 Ow 8 secs
7 Ow 11 secs
8 hitham 1 secs
9 Sarah 4 secs
10 Rene 8 secs
These rows represent activities performed by the authors. For example, hitham performs one activity after 1 sec and another after 28 secs.
I would like to make sure that there are at least 10 seconds between one activity and another.
I would like to delete the activities (rows) that do not meet this requirement. For example, Ow performs an activity after 2 secs and then another after 8 secs: the latter should be deleted. The desired result is then:
Author diff
1 hitham 28 secs
2 Ow 2 secs
3 WPJ4 8 secs
4 Seb 3 secs
5 Karen 7 secs
6 Ow 11 secs
7 hitham 1 secs
8 Sarah 4 secs
9 Rene 8 secs
Edit. I am adding this in the hope of being clearer. Let us consider hitham. The hitham rows (sorted by the diff field) are:
hitham 1 secs
hitham 28 secs
Since (28-1)+1 = 28, which is at least 10, there is no need to delete either of them.
Let us now consider Ow.
Ow 2 secs
Ow 8 secs
Ow 11 secs
The differences in seconds between consecutive rows, computed as (current - previous) + 1, are (see last column):
Ow 2 secs -
Ow 8 secs 7
Ow 11 secs 4
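For reference, the last column above can be reproduced with a short base R sketch. It assumes the gap is (current - previous) + 1, as in the hitham example; the names ow, secs and gap are only illustrative.

```r
# Rebuild the example data (same values as the dput output above)
df <- data.frame(
  Author = c("hitham", "Ow", "WPJ4", "Seb", "Karen", "Ow", "Ow", "hitham", "Sarah", "Rene"),
  diff = as.difftime(c(28, 2, 8, 3, 7, 8, 11, 1, 4, 8), units = "secs"),
  stringsAsFactors = FALSE
)

# Take the rows of one author and sort them by diff
ow <- df[df$Author == "Ow", ]
ow <- ow[order(ow$diff), ]

# Gap to the previous row, assumed to be (current - previous) + 1
secs <- as.numeric(ow$diff, units = "secs")
ow$gap <- c(NA, diff(secs) + 1)
print(ow)
```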
The desired result can be obtained by deleting the first row that shows a number less than 10 in the last column. In fact:
Ow 2 secs -
Ow 11 secs 10
We don't have to delete the last row because the difference there is exactly 10.
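To make the rule concrete, here is a minimal base R sketch of one possible reading of it: within each author, walk through the rows in increasing diff order and keep a row only if its gap to the last kept row, again taken as (current - last kept) + 1, is at least 10. The helper keep_spaced and its min_gap argument are hypothetical names introduced for illustration, not an existing function.

```r
# Rebuild the example data (same values as the dput output above)
df <- data.frame(
  Author = c("hitham", "Ow", "WPJ4", "Seb", "Karen", "Ow", "Ow", "hitham", "Sarah", "Rene"),
  diff = as.difftime(c(28, 2, 8, 3, 7, 8, 11, 1, 4, 8), units = "secs"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: flags which values of one author to keep.
# Values are visited in increasing order; a value is kept when
# (value - last kept value) + 1 >= min_gap.
keep_spaced <- function(secs, min_gap = 10) {
  keep <- logical(length(secs))
  last_kept <- -Inf                      # the smallest value is always kept
  for (i in order(secs)) {
    if ((secs[i] - last_kept) + 1 >= min_gap) {
      keep[i] <- TRUE
      last_kept <- secs[i]
    }
  }
  keep
}

# Apply the helper per author; ave() returns the flags as 0/1 in row order
secs <- as.numeric(df$diff, units = "secs")
keep <- ave(secs, df$Author, FUN = keep_spaced) == 1

result <- df[keep, ]
rownames(result) <- NULL                 # renumber rows 1..n
print(result)
```

On the example data this drops only the Ow row with 8 secs and leaves the remaining rows in their original order, which matches the desired result shown earlier.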