0

I have read the following post, as well as and related ones, and I have found it really useful: Interpolate / Extend quarterly to monthly series

I have a similar but more general/extended problem, which I still haven't figure how to solve. Got a matrix of seven time series (named value1, ..., value7, below) including quarterly data for 63 dates, as well as NAs.

> str(test) 
'data.frame':      63 obs. of  8 variables: 
$ Date  : Date, format: "2001-03-30" "2001-06-29" "2001-09-28" ... 
$ value1: num 320 181.1 19.7 133.1 160.6 ... 
$ value2: num 4741 4556 4115 3892 3605 ... 
$ value3: num 146.8 -163.9 73.2 111.6 210.5 ... 
$ value4: num -135 -383.3 104.3 74.7 -75.4 ... 
$ value5: num 21.6 20.2 NA NA NA ... 
$ value6: num -19.1 -82.4 85 134.6 111 ... 
$ value7: num -163 -215 -164 -137 -199 ...


> test
          Date   value1   value2   value3   value4  value5   value6   value7 
1  2001-03-30  319.952 4740.905  146.756 -134.998  21.645  -19.0611 -162.713 
2  2001-06-29  181.103 4555.732 -163.867 -383.334  20.199   -82.3660 -215.105 
3  2001-09-28   19.724 4115.053   73.189  104.300      NA    84.9740 -164.073 
4  2001-12-31  133.134 3891.754  111.567   74.683      NA   134.6460 -136.974 
5  2002-03-28  160.564 3605.080  210.533  -75.351      NA   110.9770 -199.083 
6  2002-06-28 -111.902 3220.115 -107.759  -22.624      NA   408.4770 -172.327 
7  2002-09-30 -127.751 2962.472  -93.616  241.749      NA   687.2240 -195.772 
8  2002-12-31  -59.553 2697.029  -98.068  119.288      NA   903.8211 -137.965 
9  2003-03-31   86.427 2509.511  -78.662 -124.428      NA  1130.9380 -180.496 
10 2003-06-30   90.070 2554.473  -14.345 -66.764      NA   925.9010 -103.080 
11 2003-09-30  246.801 3000.005    0.001 -244.487      NA  1005.6370 -123.959 
12 2003-12-31  325.088 3519.168  388.592  129.915      NA   739.5460 -162.781 
13 2004-03-31  359.263 4041.043  206.260 -101.966      NA   745.8810 -202.047 
14 2004-06-30  367.347 4657.622  254.678  -59.913      NA   852.4181 -360.963 
15 2004-09-30  373.089 4943.322  263.395  -37.116      NA   857.8670 -406.748 
16 2004-12-31  351.817 5001.434  362.188  118.842      NA   663.5370 -470.379 
17 2005-03-31  287.224 4991.632  251.327  39.029  24.245   785.3220 -518.472 
18 2005-06-30  311.324 4989.710  265.163   11.546  25.653   676.1650 -303.265 
19 2005-09-30  369.478 5273.006  429.086  133.030  30.615   667.2330 -362.296 
20 2005-12-30  482.974 5847.577  537.279   63.616  24.447  -265.5200 -329.140 
21 2006-03-31  432.157 5953.107  566.349  196.971  -4.915 -1807.2560 -310.326 
22 2006-06-30  295.014 5909.556  218.850   -6.842 -17.449 -1837.8140 -455.364 
23 2006-09-29  318.926 5714.423  230.185   14.135 -13.551 -1667.5960 -424.892 
24 2006-12-29  232.784 5649.147  271.616  142.736  46.000  2256.0000 -666.418 
25 2007-03-30 -190.000 5549.989   41.000  373.000  62.000  2674.0000 -586.000 
26 2007-06-29  -70.000 5642.622 -635.000 -412.000  80.000  3943.0000 -414.000 
27 2007-09-28  153.000 5873.000  223.000  168.000  76.000  3807.0000 -419.000 
28 2007-12-31  234.000 5858.000   61.000 -153.000  76.000  3380.0000 -266.000 
29 2008-03-31   83.000 6112.000   16.000  110.000  86.000  3534.0000 -323.000 
30 2008-06-30  -18.000 6165.000 -242.000  -82.000  91.000  3694.0000 -106.000 
31 2008-09-30  426.000 6404.000 -216.000 -497.000  87.000  3799.0000  -82.000 
32 2008-12-31 -237.000 5808.000 -250.000  110.000  88.000  3680.0000 -113.000 
33 2009-03-31  -18.000 5498.000 -391.000 -252.000  94.000  2844.0000  -84.000 
34 2009-06-30   33.000 5320.000 -144.000 -120.000 102.000  3107.0000 -112.000 
35 2009-09-30  205.000 4919.000 -142.000 -288.000 110.000  3059.0000 -97.000 
36 2009-12-31 1572.000 5403.000 1150.000 -361.000 116.000  1884.0000 -174.000 
37 2010-03-31  282.000 5800.000   23.000 -237.000  46.000   672.0000  -48.000 
38 2010-06-30  221.000 6269.000  -98.000 -279.000  52.000   684.0000  -31.000 
39 2010-09-30  217.000 6491.000 -124.000 -343.000  53.000   671.0000  -31.000 
40 2010-12-31  511.000 6494.000 -213.000 -647.000  37.000   632.0000  -38.000 
41 2011-03-31  142.000 6533.000 -168.000 -326.000  45.000   485.0000  -38.000 
42 2011-06-30  185.000 6454.000  174.000   17.000  45.000   338.0000 -67.000 
43 2011-09-30  217.000 6526.000  189.000   -5.000  39.000   203.0000  -58.000 
44 2011-12-30  140.000 6568.000  187.000   63.000  41.000   102.0000  -87.000 
45 2012-03-30 -517.000 6540.000  107.000  384.000  41.000   306.0000  -40.000 
46 2012-06-29  142.000 6379.000   81.000  -49.000  41.000   262.0000  -39.000 
47 2012-09-28  -65.000 5958.000 -240.000 -185.000  42.000   560.0000  -32.000 
48 2012-12-31 -356.000 5422.000 -286.000   82.000  43.000   859.0000  -22.000 
49 2013-03-28  -32.000 4925.000 -155.000 -159.000  43.000   861.0000 -20.000 
50 2013-06-28   30.000 4673.000  -35.000   -8.000  40.000   930.0000  -28.000 
51 2013-09-30  152.000 4865.000   21.000  -61.000  46.000   868.0000  -15.000 
52 2013-12-31  189.000 5299.000   21.000 -128.000  43.000   871.0000  -21.000 
53 2014-03-31  102.000 5608.000 -204.000 -277.000  46.000  1156.0000  -21.000 
54 2014-06-30  116.000 5888.000  -28.000 -118.000  46.000  1262.0000  -23.000 
55 2014-09-30  112.000 5856.000   18.000  -65.000  42.000  1270.0000  -29.000 
56 2014-12-31 -282.000 5506.000  116.000  170.000  40.000  1172.0000 -22.000 
57 2015-03-31  -91.000 5139.000 -172.000 -129.000  40.000  1362.0000  -22.000 
58 2015-06-30  -92.000 4640.000  -57.000   55.000      NA  1440.0000  -17.000 
59 2015-09-30 -116.000 4272.000  -59.000 64.000      NA  1505.0000  -25.000 
60 2015-12-31  -15.000 3991.000   53.000  112.000      NA  1477.0000  -32.000 
61 2016-03-31  -35.000 3793.000  -42.000   19.000      NA  1520.0000  -26.000 
62 2016-06-30   25.000 3878.000  -85.000  -67.000      NA  1281.0000  -21.000 
63 2016-09-30 -260.000 4124.000   29.000   67.000      NA   374.0000 -9.000

I want to interpolate daily values (output will include 5664 days in total), using a cubic spline or linear relation. The solution provided in the link above is good but it works only if I apply it on each time series separately, to which I always need to associate the "Date" column: (Date, value1); (Date, value2); ..., as below, which is quite time-consuming:

DateSeq <- seq(test$Date[1],tail(test$Date,1),by="1 day")   
test1 <- test[1:2] 
test2 <- test[c(1,3)] 
...   
test1Daily <- data.frame(test=DateSeq, Interp.Value=spline(test1, method="natural", xout=DateSeq)$y) 
test2Daily <- data.frame(test=DateSeq, Interp.Value=spline(test2, method="natural", xout=DateSeq)$y) 
...
merge1 <- merge(test1, testDaily1, by='Date', all.y = T) 
merge2 <- merge(test2, testDaily2, by='Date', all.y = T)
...

...then finally merge all the merged variables above. Does anyone knows how to apply the interpolation once to the whole matrix (meaning to each column, or time series)?

Many thanks in advance.

Community
  • 1
  • 1
stem
  • 103
  • 1
  • 8

1 Answers1

1

I have found the following solution, and the way to plot it to verify that things work out well. Hope it may be useful for others!

test_z1 <- zoo(test, order.by = test$Date, frequency = 1)
test_t1 <- as.ts(x=test_z1)
test_t2 <- as.zoo(test_t1)
index(test_t2) <- as.Date(index(test_t2), origin = '1970-01-01')
test_t2_ncol <- test_t2[,-c(1)]
test_g <- na.spline(test_t2_ncol)

Now I put together each time series ("value1, value2,...", in "test") with its own interpolated time series in "test_g", and plot them to verify by eye the goodness of the interpolation:

interp_val1 <- test_g[,-c(2:7)]
orig_val1 <- test[,-c(3:8)]
orig_val1_z <- read.zoo(orig_val1)
merge_val1 <- merge(orig_val1_z, interp_val1)
options(stringsAsFactors = FALSE) # to avoid conversion to factors
merge_val1_df <- data.frame(Date=time(merge_val1), merge_val1, check.names=FALSE, row.names=NULL)
plot(merge_val1_df$orig_val1_z, lwd=2)
lines(merge_val1_df$interp_val1, lwd=1, col="green")

It seems that the interpolation works well!

stem
  • 103
  • 1
  • 8