I'm struggling to regularize a time series using the tsibble package. The documentation indicates that this can be done using index_by() + summarise(), but I'm clearly missing some details. Here's what I've tried:

library(tidyverse)
library(lubridate)
library(tsibble)

# example data set
date <- ymd(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01"))
fish <- c(203, 282, 301, 89)
volume <- c(210749, 287555, 378965, 308935)
n <- c(5, 7, 10, 8)
tbl <- tibble(date, fish, volume, n)
tsbl <- tsibble(tbl, index = date, regular = FALSE)
  
# regularize the tsibble (ie time series)
tsbl %>% 
  index_by(date, unit = "day") %>% # unit value "day" is intuitive but incorrect?
  mutate(week = isoweek(date)) %>% # add (numeric) week column
  summarise(date = date,
            fish = sum(fish),
            volume = sum(volume),
            n = sum(n), 
            cpue = fish/volume) # calculate catch per unit effort

TIA!

Peter Nelson
  • Just set `regular = TRUE` in the `tsibble` creation. The summarise isn't doing anything -- did you mean to group by week? – Rob Hyndman May 19 '22 at 22:00
  • Setting `regular = TRUE` didn't work for me. I get a tsibble of 4 rows with the same irregular interval between rows: `tsbl <- tsibble(tbl, index = date, regular = TRUE)`. I was able to regularize the example using `complete(tbl, date = seq.Date(min(date), max(date), by = "day"))`. What am I missing? – Peter Nelson May 19 '22 at 23:35

1 Answer

With so little information provided about what you are actually trying to do, I will have to guess.

Perhaps you want daily data with each day explicitly included. In that case, do this:

library(tidyverse)
library(lubridate)
library(tsibble)

# example data set
date <- ymd(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01"))
fish <- c(203, 282, 301, 89)
volume <- c(210749, 287555, 378965, 308935)
n <- c(5, 7, 10, 8)
tbl <- tibble(date, fish, volume, n)
tsbl <- tsibble(tbl, index = date, regular = TRUE) %>%
  fill_gaps()
tsbl
#> # A tsibble: 15 x 4 [1D]
#>    date        fish volume     n
#>    <date>     <dbl>  <dbl> <dbl>
#>  1 1976-05-18   203 210749     5
#>  2 1976-05-19   282 287555     7
#>  3 1976-05-20    NA     NA    NA
#>  4 1976-05-21    NA     NA    NA
#>  5 1976-05-22    NA     NA    NA
#>  6 1976-05-23    NA     NA    NA
#>  7 1976-05-24   301 378965    10
#>  8 1976-05-25    NA     NA    NA
#>  9 1976-05-26    NA     NA    NA
#> 10 1976-05-27    NA     NA    NA
#> 11 1976-05-28    NA     NA    NA
#> 12 1976-05-29    NA     NA    NA
#> 13 1976-05-30    NA     NA    NA
#> 14 1976-05-31    NA     NA    NA
#> 15 1976-06-01    89 308935     8

Created on 2022-05-20 by the reprex package (v2.0.1)
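
As a side note on why `regular = TRUE` alone leaves the tsibble at 4 rows: it only records the interval (here 1 day, the greatest common divisor of the date differences) and does not insert the missing days; `fill_gaps()` is the step that makes them explicit. The sketch below is an illustration rather than part of the original reprex, and the name `tsbl_raw` is made up for it. It uses `has_gaps()` and `count_gaps()` from tsibble to inspect the implicit gaps before filling them.

```r
library(tsibble)
library(lubridate)

# Same example dates as above; tsbl_raw is a hypothetical name for the unfilled tsibble
date <- ymd(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01"))
fish <- c(203, 282, 301, 89)
tsbl_raw <- tsibble(date = date, fish = fish, index = date, regular = TRUE)

nrow(tsbl_raw)               # still 4 rows: the gaps are implicit
has_gaps(tsbl_raw)           # one-row tibble with a logical .gaps column
gaps <- count_gaps(tsbl_raw) # one row per gap, with .from, .to and .n
gaps
sum(gaps$.n)                 # number of missing days that fill_gaps() would add

filled <- fill_gaps(tsbl_raw)
nrow(filled)                 # 15 rows, matching the output above
```

If zeros make more sense than `NA` for days without sampling, `fill_gaps()` also accepts replacement values, for example `fill_gaps(tsbl_raw, fish = 0)`; whether that is appropriate depends on what a missing day means in the data.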

I'm not sure what you are trying to achieve with the summarize, but perhaps you want to create weekly data from these daily data. In that case, do this:

tsbl %>% 
  mutate(week = isoweek(date)) %>% # add (numeric) week column
  index_by(week) %>%
  summarise(fish = sum(fish, na.rm=TRUE),
            volume = sum(volume, na.rm=TRUE),
            n = sum(n, na.rm=TRUE), 
            cpue = fish/volume) # calculate catch per unit effort
#> # A tsibble: 3 x 5 [1]
#>    week  fish volume     n     cpue
#>   <dbl> <dbl>  <dbl> <dbl>    <dbl>
#> 1    21   485 498304    12 0.000973
#> 2    22   301 378965    10 0.000794
#> 3    23    89 308935     8 0.000288

Created on 2022-05-20 by the reprex package (v2.0.1)
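
One caveat with a bare `isoweek()` number is that it repeats every year, so data spanning several years would collapse week 21 of 1976 and week 21 of 1977 into one row. A possible variant, sketched here under the assumption that year-aware weekly totals are wanted, uses tsibble's `yearweek()` inside `index_by()`; the name `weekly` is made up for this illustration.

```r
library(dplyr)
library(lubridate)
library(tsibble)

# Same example data as above, with the daily gaps filled
date <- ymd(c("1976-05-18", "1976-05-19", "1976-05-24", "1976-06-01"))
tsbl <- tsibble(
  date = date,
  fish = c(203, 282, 301, 89),
  volume = c(210749, 287555, 378965, 308935),
  n = c(5, 7, 10, 8),
  index = date
) %>%
  fill_gaps()

# Group by year-week (ISO weeks by default) instead of a bare week number
weekly <- tsbl %>%
  index_by(week = ~ yearweek(.)) %>%
  summarise(fish = sum(fish, na.rm = TRUE),
            volume = sum(volume, na.rm = TRUE),
            n = sum(n, na.rm = TRUE),
            cpue = fish / volume) # catch per unit effort
weekly
```

The result has a `<week>` index such as `1976 W21`, so it remains a regular weekly tsibble that can itself be passed to `fill_gaps()` if some weeks have no observations.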

Rob Hyndman