It's actually quite hard to approach this with pure LINQ. To make life easier, you'll need to write at least one helper method that lets you transform an enumeration. Take a look at the example below. Here I use an IEnumerable of TimeInterval and a custom Split method (implemented with C# iterators) that joins two consecutive elements together in one Tuple:
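class TimeInterval
{
    // Public so the object initializers in the methods below can set them.
    public DateTime Start { get; set; }
    public DateTime End { get; set; }
    public int Value { get; set; }
}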
IEnumerable<TimeInterval> ToHourlyIntervals(
    IEnumerable<TimeInterval> halfHourlyIntervals)
{
    return
        from pair in Split(halfHourlyIntervals)
        select new TimeInterval
        {
            Start = pair.Item1.Start,
            End = pair.Item2.End,
            Value = pair.Item1.Value + pair.Item2.Value
        };
}
// Lazily pairs up consecutive elements: (a, b), (c, d), ...
// A trailing element without a partner is dropped.
static IEnumerable<Tuple<T, T>> Split<T>(
    IEnumerable<T> source)
{
    using (var enumerator = source.GetEnumerator())
    {
        while (enumerator.MoveNext())
        {
            T first = enumerator.Current;
            if (enumerator.MoveNext())
            {
                T second = enumerator.Current;
                yield return Tuple.Create(first, second);
            }
        }
    }
}
The same can be applied to the first part of the problem (extracting half-hourly TimeIntervals from the list of strings):
IEnumerable<TimeInterval> ToHalfHourlyIntervals(
    IEnumerable<string> inputLines)
{
    return
        from triple in TripleSplit(inputLines)
        select new TimeInterval
        {
            Start = DateTime.Parse(triple.Item1.Replace("Start: ", "")),
            End = DateTime.Parse(triple.Item2.Replace("End: ", "")),
            Value = Int32.Parse(triple.Item3)
        };
}
Here I use a custom TripleSplit method that returns a Tuple<T, T, T> (which will be easy to write). With this in place, the complete solution would look like this:
// Read data lazily from disk (or any other source)
var lines = File.ReadLines(path);
var halfHourlyIntervals = ToHalfHourlyIntervals(lines);
var hourlyIntervals = ToHourlyIntervals(halfHourlyIntervals);
foreach (var interval in hourlyIntervals)
{
    // process
}
What's nice about this solution is that it is completely deferred. It processes one line at a time, so you can process arbitrarily large sources without risking an OutOfMemoryException. That seems important considering your stated requirement:
This data keeps going for a week then 30 days and 365days.
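For completeness, here is a minimal sketch of what the TripleSplit helper mentioned above might look like. It follows the same iterator pattern as Split. The wrapping class name EnumerableHelpers is only a placeholder of mine, and dropping an incomplete trailing group is an assumption; you may prefer to throw instead.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container class; in practice the method can live
// next to Split so it can be called unqualified.
static class EnumerableHelpers
{
    // Lazily groups a sequence into consecutive triples:
    // (a, b, c), (d, e, f), ...
    // Assumption: a trailing group of fewer than three items is dropped.
    public static IEnumerable<Tuple<T, T, T>> TripleSplit<T>(
        IEnumerable<T> source)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                T first = enumerator.Current;
                if (!enumerator.MoveNext()) yield break;
                T second = enumerator.Current;
                if (!enumerator.MoveNext()) yield break;
                T third = enumerator.Current;
                yield return Tuple.Create(first, second, third);
            }
        }
    }
}
```

With seven input items this yields two triples and silently skips the seventh. If an incomplete record in your file should count as an error, replace the yield break statements with a throw.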