This may be a silly question, but when I filter a data.frame I get different results depending on whether I compare against a variable or a constant in dplyr::filter, or use base subsetting. First, an example where everything behaves as expected:
library(dplyr)
tt <- data.frame(t = runif(100, max = 100)) %>% mutate(period = trunc((t + 3) / 12))
i <- 0
tt %>% filter(period==0)
tt %>% filter(period==i)
tt[tt$period == i,]
The three results are equivalent (only the row names differ):
> tt %>% filter(period==0)
t period
1 4.047352 0
2 2.391890 0
3 6.050928 0
4 1.646503 0
5 2.335137 0
> tt %>% filter(period==i)
t period
1 4.047352 0
2 2.391890 0
3 6.050928 0
4 1.646503 0
5 2.335137 0
> tt[tt$period == i,]
t period
23 4.047352 0
47 2.391890 0
75 6.050928 0
93 1.646503 0
95 2.335137 0
Then I ran the same operations on the real (big) data.frame and did not get equivalent results:
patch_sparse <- patch_sparse %>% mutate(period = trunc( (t+3) / 12))
str(patch_sparse)
'data.frame': 768307 obs. of 7 variables:
$ t : num 1 1 1 1 1 1 1 1 1 1 ...
$ i : int 2864 2864 2864 2864 2876 2876 2875 2876 2875 2857 ...
$ j : int 3109 3110 3111 3112 3112 3113 3114 3114 3115 3116 ...
$ data : logi TRUE TRUE TRUE TRUE TRUE TRUE ...
$ date : chr "2000-11-01" "2000-11-01" "2000-11-01" "2000-11-01" ...
$ region: chr "Australia" "Australia" "Australia" "Australia" ...
$ period: num 0 0 0 0 0 0 0 0 0 0 ...
#
i <- 0
patch_sparse %>% filter(period==0)
patch_sparse %>% filter(period==i)
patch_sparse[patch_sparse$period == i,]
And the results are:
> patch_sparse %>% filter(period==0)
t i j data date region period
1 1 2864 3109 TRUE 2000-11-01 Australia 0
2 1 2864 3110 TRUE 2000-11-01 Australia 0
3 1 2864 3111 TRUE 2000-11-01 Australia 0
...
142 2 3457 1524 TRUE 2000-12-01 Australia 0
[ reached 'max' / getOption("max.print") -- omitted 2346 rows ]
> patch_sparse %>% filter(period==i)
[1] t i j data date region period
<0 rows> (or 0-length row.names)
> patch_sparse[patch_sparse$period == i,]
t i j data date region period
1 1 2864 3109 TRUE 2000-11-01 Australia 0
2 1 2864 3110 TRUE 2000-11-01 Australia 0
3 1 2864 3111 TRUE 2000-11-01 Australia 0
..
142 2 3457 1524 TRUE 2000-12-01 Australia 0
[ reached 'max' / getOption("max.print") -- omitted 2346 rows ]
I tried converting the data.frame to a tibble, and replacing trunc() with as.integer(), with the same results. I can't produce a reproducible example. Any ideas?
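One difference between the two data frames that may be relevant: the str() output shows that patch_sparse has a column named i, while tt does not. Below is a minimal sketch that assumes this name clash is what matters. The data frame tt2 and its values are hypothetical, made up only for illustration. As far as I understand dplyr's data masking, a column name takes precedence over a variable of the same name in the calling environment, and the .env pronoun refers to the environment variable explicitly.

```r
library(dplyr)

# Hypothetical toy data: like `tt`, but with an extra column named `i`
# (mirroring the `i` column of patch_sparse; the values are made up).
tt2 <- data.frame(t = c(1, 5, 20, 40),
                  i = c(2864L, 2864L, 2876L, 2875L)) %>%
  mutate(period = trunc((t + 3) / 12))   # period is 0, 0, 1, 3

i <- 0

n_const  <- nrow(tt2 %>% filter(period == 0))       # constant: 2 rows
n_masked <- nrow(tt2 %>% filter(period == i))       # `i` is the column tt2$i: 0 rows
n_env    <- nrow(tt2 %>% filter(period == .env$i))  # `.env$i` is the variable i: 2 rows
n_base   <- nrow(tt2[tt2$period == i, ])            # base R uses the variable i: 2 rows

print(c(const = n_const, masked = n_masked, env = n_env, base = n_base))
```

If this is indeed the cause, renaming the variable (for example to a name that is not a column of patch_sparse) should make all three calls agree on the real data as well.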