I have a large table that contains a date
field of type datetime
. As part of a function that takes as input two lists of datetime
namely a list of afroms
and atos
I'd like to compute for each of these afrom,ato
pairs all the rows of the large table whose date is between them.
I worked out a not very efficient way to do this i.e. it has a serious scalability drawback:
/ t1 contains my afrom,ato pairs
q)t1:([] afrom:`datetime$(2017.10.01T10:00:00.000 2017.10.02T10:00:00.000);ato:`datetime$(2017.10.01T12:00:00.000 2017.10.02T12:00:00.000));
q)t1
afrom ato
-----------------------------------------------
2017.10.01T10:00:00.000 2017.10.01T12:00:00.000
2017.10.02T10:00:00.000 2017.10.02T12:00:00.000
/ t2 contains my very very large dataset
q)t2:([] date:`datetime$(2017.10.01T10:01:00.000 2017.10.01T10:02:00.000 2017.10.01T10:03:00.000 2017.10.02T10:01:00.000 2017.10.02T10:02:00.000 2017.10.02T10:03:00.000); ccypair:(3#`EURUSD),(3#`USDCHF); mid:6?1.05);
q)t2
date ccypair mid
----------------------------------------------------
2017.10.01T10:01:00.000 EURUSD 0.24256133290473372
2017.10.01T10:02:00.000 EURUSD 0.091602176288142809
2017.10.01T10:03:00.000 EURUSD 0.10756538207642735
2017.10.02T10:01:00.000 USDCHF 0.91046513157198206
2017.10.02T10:02:00.000 USDCHF 0.76424539103172717
2017.10.02T10:03:00.000 USDCHF 0.17090452200500295
Then I can use cross
like this:
select from (t1 cross t2) where afrom<date,date<ato
and this produces the correct results:
afrom ato date ccypa..
-----------------------------------------------------------------------------..
2017.10.01T10:00:00.000 2017.10.01T12:00:00.000 2017.10.01T10:01:00.000 EURUS..
2017.10.01T10:00:00.000 2017.10.01T12:00:00.000 2017.10.01T10:02:00.000 EURUS..
2017.10.01T10:00:00.000 2017.10.01T12:00:00.000 2017.10.01T10:03:00.000 EURUS..
2017.10.02T10:00:00.000 2017.10.02T12:00:00.000 2017.10.02T10:01:00.000 USDCH..
2017.10.02T10:00:00.000 2017.10.02T12:00:00.000 2017.10.02T10:02:00.000 USDCH..
2017.10.02T10:00:00.000 2017.10.02T12:00:00.000 2017.10.02T10:03:00.000 USDCH..
However, when I have a large list of afroms
and atos
the cross
will "unnecessarily" expand the potentially large table t2
times the size of t1
and that doesn't scale well.
Is there a better way to do this? e.g. I have tried things like:
select from t2 where (exec afrom from t1)<date,date<(exec ato from t1)
error: `length
I would need to do a loop I guess but not sure how. Subquestion .. is it possible to have a sigle list of interval tuples i.e. intervals(afrom;ato)
instead of having separated afroms
and atos
?