I have a problem with missing data. So I have a returns table created this way:
returns:update logret:log ret from update ret:{0.0, 1_deltas x} mid by sym from spots;
and looking like this:
meta returns
c | t f a
------| -----
date | d
time | p
sym | s
mid | f
ret | f
logret| f
and I pivot it this way:
rettbl:0!exec (raze `$raze ( string (distinct sym)),\:/:("_",/:string `mid`ret`logret))!(mid, ret, logret) by time:time from returns;
so that works on a small range of time, but when I extend it fails with a "type: Mismatched types"
I believe the problem is that I don't have the same observations for all sym in my returns
table, as
select count time by sym from returns
gives me a different count by sym.
So my question is: how would you go about filling any missing mid
for all times I have in my table returns
?
To make it quick and dirty i was thinking about having a linear interpolation for missing points between two `time ...
Any hint, link, etc... would be greatly appreciated.
EDIT:
I can't share the data, but this should allow to reproduce the issue:
mid1: 1.2 + ({rand 1.0} each til 10) %100
mid2: 0.8 + ({rand 1.0} each til 10) %100
mid3: 104 + ({rand 1.0} each til 10) %100
sym1:{`$"EUR/USD"} each til 10
sym2:{`$"GBP/USD"} each til 10
sym3:{`$"USD/JPY"} each til 10
time1:2020.07.01D00:00:00.000000000 + 1D00:00:00 * til 10
/time2:2020.07.01D00:00:00.000000000 + 1D00:00:00 * til 10
time2:(2020.07.01D00:00:00.000000000 + 1D00:00:00 * til 3) , (2020.07.06D00:00:00.000000000 + 1D00:00:00 * til 3) , (2020.07.10D00:00:00.000000000 + 1D00:00:00 * til 4)
time3:2020.07.01D00:00:00.000000000 + 1D00:00:00 * til 10
spots:([] sym:sym1,sym2,sym3; time:time1,time2,time3; mid:mid1,mid2,mid3)
spots:update date:"d"$time from spots
returns:update logret:log ret from update ret:{0.0, 1_deltas x} mid by sym from spots;
rettbl:0!exec (raze `$raze ( string (distinct sym)),\:/:("_",/:string `mid`ret`logret))!(mid, ret, logret) by time:time from returns;
do note the time2 definition.
as per the desired output, if you replace the time2 definition with the one I commented out you will see.
as per the solution, I suspect that adding to the table returns the missing time, so that the list of time is the same for all sym are identical will solve the issue, ie value select time by sym from returns
should be identical, and a quick a dirty work around would be to add missing mid with a linear interpolation of the surrounding mid.