I have a dataset with 3.9M rows and 5 columns, stored as a tibble. When I try to convert it to a tsibble, I run out of memory, even though I have 32 GB of RAM, which should be far more than enough. The weird thing is that if I pipe it through filter() before as_tsibble(), the conversion works, even though the filter doesn't actually remove any rows.

This does not work:

    dataset %>% as_tsibble(index = TimeStamp, key = c("TSSU", "Phase"))

This works, even though there are no Phase values less than 1, so the filter removes no rows:

    dataset %>% filter(Phase > 0) %>% as_tsibble(index = TimeStamp, key = c("TSSU", "Phase"))

Any ideas why the second option works? Here's what the dataset looks like:
Volume <dbl> | Travel_Time <dbl> | TSSU <chr> | Phase <int> | TimeStamp <dttm> |
---|---|---|---|---|
105 | 1.23 | 01017 | 2 | 2020-09-28 10:00:00 |
20 | 1.11 | 01017 | 2 | 2020-09-28 10:15:00 |
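
For reference, here is a minimal sketch that rebuilds the two rows above and runs both pipelines on them. The name sample_data and the UTC time zone are assumptions made for illustration (the real time zone isn't stated above). At this size both calls succeed, so this only shows the structure of the data and the calls, not the memory problem itself.

```r
library(dplyr)
library(tsibble)

# Two rows copied from the table above; the real dataset has about 3.9M rows.
# sample_data is a hypothetical name, and tz = "UTC" is an assumption.
sample_data <- tibble(
  Volume      = c(105, 20),
  Travel_Time = c(1.23, 1.11),
  TSSU        = c("01017", "01017"),
  Phase       = c(2L, 2L),
  TimeStamp   = as.POSIXct(
    c("2020-09-28 10:00:00", "2020-09-28 10:15:00"),
    tz = "UTC"
  )
)

# Direct conversion (the call that runs out of memory on the full dataset)
direct <- sample_data %>%
  as_tsibble(index = TimeStamp, key = c("TSSU", "Phase"))

# Conversion after a filter that keeps every row (the call that works)
filtered <- sample_data %>%
  filter(Phase > 0) %>%
  as_tsibble(index = TimeStamp, key = c("TSSU", "Phase"))
```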