1

I would like to inquire about suggestions on most efficient approach(es) to convert a dataframe (or tibble) into a tsibble.

The dataframe has dates in the first column, and all other columns represent various time series with values given at the corresponding date. I would like to efficiently create a tsibble with key = Name of each Time Series and index = each Date.

So the output would be a tsibble that would show like this:

Key                  Index             Value
TimeSeriesOne       FirstDate        Value TimeSeriesOne on first date
TimeSeriesOne       SecondDate       Value TimeSeriesOne on second date
......................................................................
TimeSeriesOne       LastDate         Value TimeSeriesOne on last date
TimeSeriesTwo       FirstDate        Value TimeSeriesTwo on first date
......................................................................
TimeSeriesN         LastDate         Value TimeSeriesN on last date

Example of input data:

numRows <- 15
startDate <- lubridate::as_date('2018-06-10')
endDate <- startDate + base::months(x = numRows-1)
theDates <- base::seq.Date(
    from = startDate,
    to = endDate,
    by = "month")  
inputData <- tibble::tibble(
    "Dates" = theDates,
    "SeriesOne" = stats::rnorm(numRows),
    "SeriesTwo" = stats::rnorm(numRows),
    "SeriesThree" = stats::rnorm(numRows), 
    "SeriesFour" = stats::rnorm(numRows))
Aex
  • 131
  • 1
  • 11
  • Here is an effiicient method `as_tibble(melt(setDT(inputData), id.var = 'Dates', variable.name = 'Key', value.name = 'Value')[, Key := paste0("Time", Key)])` (I have to delete my answer due to the downvoting ) – akrun Jan 05 '20 at 20:54
  • 1
    It appears to be efficient and work (almost) as intended. I am not clear who (and why) downvoted your answer – Aex Jan 05 '20 at 21:11

3 Answers3

4

You can convert to "long format" using tidyr:

tsibble_input <- tidyr::pivot_longer(inputData, cols = -Dates, names_to = "Key", values_to = "Value") 

And get the tsibble:

tsibble::as_tsibble(tsibble_input, index = "Dates", key = "Key")
robertdj
  • 909
  • 1
  • 5
  • 10
1

We can use melt from data.table to do this efficiently and then convert it to tibble

library(data.table)
library(tibble)
as_tibble(melt(setDT(inputData), id.var = 'Dates', variable.name = 'Key', 
      value.name = 'Value')[, Key := paste0("Time", Key)])
akrun
  • 874,273
  • 37
  • 540
  • 662
  • 1
    Thank you. It is almost there (but the remaining step is just to convert it to tsibble). I have upvoted it – Aex Jan 06 '20 at 00:34
1

Convert to zoo and then to long data frame and finally to tsibble

library(tsibble)
library(zoo)

inputData %>%
  read.zoo %>%
  fortify.zoo(melt = TRUE) %>%
  as_tsibble(key = "Series", index = "Index")

or use stack (or any of a number of other reshaping functions including reshape, melt, gather, pivot_longer) to create a long data frame and then to tsibble. If by efficient you mean minimal prerequisites then this only uses the tsibble package and its dependencies.

library(tsibble)

inputData %>%
  { cbind(.[1], stack(.[-1])) } %>%
  as_tsibble(key = "ind", index = "Dates")
G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • Thank you. Your second approach indeed is preferable. I have upvoted it – Aex Jan 06 '20 at 00:32
  • I wouldn't say that the second is preferable. There are pros and cons of each. The first is more consistent with R's whole object approach. No components are specified at all but it does have an extra dependency.. The second is messier but has no additional dependencies. – G. Grothendieck Jan 06 '20 at 03:12