EXAMPLE:
We have two time-series lazy sequences of maps, created by reading CSV files. The two sequences start on different days:
INPUT
lazy-seq1
({:date "20110515" :val1 123}
{:date "20110516" :val1 143}
{:date "20110517" :val1 1153} ...)
lazy-seq2
({:date "20110517" :val2 151}
{:date "20110518" :val2 1330} ...)
EXPECTED OUTPUT
lazy-seq3
({:date "20110515" :vals {:val1 123}}
{:date "20110516" :vals {:val1 143}}
{:date "20110517" :vals {:val1 1153 :val2 151}}
{:date "20110518" :vals {:val1 ... :val2 1330}}
...)
To be exact, the :date values are not strings but Joda-Time objects coerced by clj-time, and each sequence is sorted by :date.
My first thought was to use group-by, but I don't think it can produce a lazy seq; I believe group-by requires eager evaluation.
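A small check of that belief, using a few rows from the example above (string dates stand in for the real Joda-Time values, and the name `grouped` is only for illustration). As far as I can tell, group-by returns an ordinary map, so it has to consume the whole input before returning anything:

```clojure
;; group-by builds a persistent map keyed by (f item), so it must walk
;; the entire input before it can return.
(def grouped
  (group-by :date
            (concat [{:date "20110516" :val1 143}
                     {:date "20110517" :val1 1153}]
                    [{:date "20110517" :val2 151}])))

(map? grouped)
;; => true

(get grouped "20110517")
;; => [{:date "20110517", :val1 1153} {:date "20110517", :val2 151}]
```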
My second thought was to use partition-by, but I could not work out how to apply it to my inputs because my Clojure skills are limited.
The input sequences are huge (~1GB each) and I want to process many (~100) of them at once, so I need lazy evaluation to avoid an OutOfMemoryError.
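To make the goal concrete, here is a rough sketch of the behaviour I am after for two sequences: a lazy merge that walks both inputs in date order. It is only a sketch under assumptions: both inputs are sorted ascending by :date, each date appears at most once per sequence, and the :date values work with `compare` (strings are used below; I assume the Joda-Time values would behave the same way since they are Comparable). The names `entry` and `merge-by-date` are hypothetical, not from any library.

```clojure
(defn entry
  "Builds one output map from the input maps that share `date`."
  [date & ms]
  {:date date :vals (dissoc (apply merge ms) :date)})

(defn merge-by-date
  "Lazily merges two seqs of maps, each sorted ascending by :date."
  [xs ys]
  (lazy-seq
    (let [xs (seq xs)
          ys (seq ys)]
      (cond
        ;; Both inputs still have rows: emit the earlier date, or
        ;; combine the two rows when the dates match.
        (and xs ys)
        (let [x (first xs)
              y (first ys)
              c (compare (:date x) (:date y))]
          (cond
            (neg? c) (cons (entry (:date x) x)
                           (merge-by-date (rest xs) ys))
            (pos? c) (cons (entry (:date y) y)
                           (merge-by-date xs (rest ys)))
            :else    (cons (entry (:date x) x y)
                           (merge-by-date (rest xs) (rest ys)))))

        ;; One input is exhausted: pass the remainder through lazily.
        xs (map #(entry (:date %) %) xs)
        ys (map #(entry (:date %) %) ys)))))

(def lazy-seq1 [{:date "20110515" :val1 123}
                {:date "20110516" :val1 143}
                {:date "20110517" :val1 1153}])

(def lazy-seq2 [{:date "20110517" :val2 151}
                {:date "20110518" :val2 1330}])

(def lazy-seq3 (merge-by-date lazy-seq1 lazy-seq2))
```

With these truncated inputs the last entry only has :val2, because lazy-seq1 ends at 20110517 here. This handles only two sequences; merging ~100 at once would need a generalization that I have not worked out.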