-3

Say I have the following sample minute data.

> data = xts(1:12, as.POSIXct("2020-01-01")+(1:12)*60*20)
> data
                    [,1]
2020-01-01 00:20:00    1
2020-01-01 00:40:00    2
2020-01-01 01:00:00    3
2020-01-01 01:20:00    4
2020-01-01 01:40:00    5
2020-01-01 02:00:00    6
2020-01-01 02:20:00    7
2020-01-01 02:40:00    8
2020-01-01 03:00:00    9
2020-01-01 03:20:00   10
2020-01-01 03:40:00   11
2020-01-01 04:00:00   12

This already aligned minute data, but now I want to get hourly.

Easy, just use the to.hourly command right?

> to.hourly(data)
                    data.Open data.High data.Low data.Close
2020-01-01 00:40:00         1         2        1          2
2020-01-01 01:40:00         3         5        3          5
2020-01-01 02:40:00         6         8        6          8
2020-01-01 03:40:00         9        11        9         11
2020-01-01 04:00:00        12        12       12         12

The problem is that it puts the end values of each bar into the next bar, and the last value is creates its own hour period.

Now to only show correct hourly bars I use align.time.

> align.time(to.hourly(data),60*60)
                    data.Open data.High data.Low data.Close
2020-01-01 01:00:00         1         2        1          2
2020-01-01 02:00:00         3         5        3          5
2020-01-01 03:00:00         6         8        6          8
2020-01-01 04:00:00         9        11        9         11
2020-01-01 05:00:00        12        12       12         12

The previous last entry creates its own hour bar which I need to remove.

The same issue occurs if I convert to daily, the last enry goes to the next day and an extra day is created.

The question is how to convert to different periods correctly?

The desired result for the example is:

                    data.Open data.High data.Low data.Close
2020-01-01 01:00:00         1         3        1          3
2020-01-01 02:00:00         4         6        4          6
2020-01-01 03:00:00         7         9        7          9
2020-01-01 04:00:00        10        12       10         12

This seems like a very basic option and I have searched and found many examples, but not one that considers the last value in a period. Thank you.

UPDATE: Allan Cameron gave a fantastic answer and it absolutely works, I am just concerned that it will fail at some point with different time periods.

My workflow starts with tick data which I convert to second and minute and so on. Converting tick to higher periods would work perfectly, but it is too much data to handle at once, hence the staggered approach. That is why the aligned data needs to work with any period conversion.

I made as small modification to Allan's code:

setNames(shift.time(to.hourly(shift.time(data, -.0000001193)), .0000001193), c("Open", "High", "Low", "Close"))

.0000001193 was the smallest value I found to work with simple trial and error.

Is there any time where this would not work or would the min value be different?

Is this the best way to handle this issue?

Thank you.

MichaelE
  • 763
  • 8
  • 22
  • 1
    We can distract 1 second before calling `to.hourly`. But you should really reconsider your chosen approach. The cutoff for the hour makes more sense at 0:00 than at 0:01... The new year starts at 12:00 not at 12:01, doesn't it ;) – dario Feb 27 '20 at 21:51
  • That was my thinking too @dario, though you can add the second back to make it look tidier. – Allan Cameron Feb 27 '20 at 21:56
  • The example is basic, the workflow starts with tick data, which is then converted to second data, then minute, hours and days. At each point it has to be aligned correctly. Tick to daily will always work, but there is far too much data. I need to work in stages so that bars must align in the end. – MichaelE Feb 27 '20 at 22:05

1 Answers1

2

You can shift the time back 60 seconds, do as.hourly, then shift the time forward 60 seconds. This maintains the groupings. You'll need to rename the columns too:

setNames(shift.time(to.hourly(shift.time(data, -60)), 60), c("Open", "High", "Low", "Close"))
#>                     Open High Low Close
#> 2020-01-01 01:00:00    1    3   1     3
#> 2020-01-01 02:00:00    4    6   4     6
#> 2020-01-01 03:00:00    7    9   7     9
#> 2020-01-01 04:00:00   10   12  10    12
Allan Cameron
  • 147,086
  • 7
  • 49
  • 87
  • Thanks, I did think of that, however I need to be able to handle finer time intervals than minutes, even sub second. Is it possible to subtract something as small as a nanosecond? I will try. Thanks again. – MichaelE Feb 27 '20 at 22:07
  • @MichaelE so you post a specific question with a specific desired output, get a simple solution to the question which gives the expected output, but now you're saying that wasn't really the question you wanted answered? Do you maybe want to update the question so that it reflects the actual problem you need solved? – Allan Cameron Feb 27 '20 at 22:16
  • Thank you, it works, but with a limit. I need the xts to go to the previous second, and the smallest value I found quickly was `.0000001193` which is about 120 nanoseconds, which I really hope is good enough. – MichaelE Feb 27 '20 at 22:20
  • 1
    @MichaelE Please do not use the comments to ask questions. If it's just a clarification of the original question, edit the original question. Or open a new question if it's a new requirement. Also consider accepting the answer if the solution works. Thank you. – dario Feb 27 '20 at 22:34