Not a duplicate of: Binning Dates in R or Binning time data in R
Context
I am using getMultipleTicks
in Rblpapi
to pull tick data for a stock (TSLA in this example) over a month:
rawData = getMultipleTicks("tsla us equity", eventType = "TRADE", startTime = as.POSIXlt("2017-03-10 13:30:00"), endTime = as.POSIXlt("2017-04-10 20:00:00"), tz="America/New_York")
> str(rawData)
'data.frame': 1130690 obs. of 3 variables:
$ times: POSIXct, format: "2017-03-10 08:30:07" ...
$ value: num 246 246 246 246 246 ...
$ size : num 58 42 80 5 9 1 4 73 100 941 ...
Objective
This data needs to be converted from this:
Raw Data:
> head(rawData, 5)
times value size
1 2017-04-10 09:30:00 309 1
2 2017-04-10 09:30:00 309 1
3 2017-04-10 09:30:02 309 1
4 2017-04-10 09:30:02 308 1
5 2017-04-10 09:30:04 309.38 1
To this:
Clean Data:
> head (cleanData, 5)
times value size
1 2017-04-10 09:30:00 309 2
2 2017-04-10 09:30:01 0
3 2017-04-10 09:30:02 308.5 2
4 2017-04-10 09:30:03 0
5 2017-04-10 09:30:04 309.38 1
- Missing times (in seconds) are filled in
- Prices (values are in VWAP)
- Volumes (size) are added together
Compute time is not a concern.
Things I tried
I naively attempted using ?cut
but could not achieve any meaningful results per Binning time data in R.
A colleague suggested using a for-loop but not sure how to begin implementing that with the requirements above.