1

I'm not a super expert on LINQ, I've a data below provided by third party: Data

Start: 6:00
End: 6:30
value: 1 
Start: 7:00
End: 7:30
value: 1
Start: 8:00
End: 8:30
value: 1
Start: 9:00
End: 9:30
value: 1
Start: 10:00
End: 10:30
value: 1
Start: 11:00
End: 11:30
value: 1
Start: 12:00
End: 12:30
value: 1
Start: 13:00
End: 13:30
value: 1
Start: 14:00
End: 14:30
value: 1
...
Start: 05:00
End: 05:30
value: 1

This data keeps going for a week then 30 days and 365days.

I need to transform each 30minute block in to an hour.

e.g

Start: 6:00
End: 7:00
Value: 2
Start:7:00
End: 8:00
Value:2
......

Assuming that Start, End and Value comes as one row, could someone help how above can be achieved?

Darin Dimitrov
  • 1,023,142
  • 271
  • 3,287
  • 2,928
Nil Pun
  • 17,035
  • 39
  • 172
  • 294
  • I notice on the second example, the value is 2. Are you changing the end values AND adding the current value with the next? – Thinking Sites May 08 '12 at 12:55
  • Your example gives gaps of 30 minutes each time. Is that correct? `End` is always `**:30` and `Start` is always `**:00` – hwcverwe May 08 '12 at 12:56
  • I don't agree with the closers to close this question because of it being too localized. The question is about parsing a structures collection of items of several types and this gives and is very interesting for other visitors. However, the question actually IS phrased very poorly, which would be a good reason to close it. – Steven May 08 '12 at 13:25
  • Use Rx for time based LINQ queries. – Richard Anthony Hein May 08 '12 at 13:39

3 Answers3

1

This query is able to group by the given AggregationType and it is able to filter out incomplete groups using the second parameter checkType.

private enum AggerationType { Year = 1, Month = 2, Day = 3, Hour = 4 }

private IList<Data> RunQuery(AggerationType groupType, AggerationType checkType)
{
    // The actual query which does to trick
    var result =
        from d in testList
        group d by new {
            d.Start.Year,
            Month = (int)groupType >= (int)AggerationType.Month ? d.Start.Month : 1,
            Day = (int)groupType >= (int)AggerationType.Day ? d.Start.Day : 1,
            Hour = (int)groupType >= (int)AggerationType.Hour ? d.Start.Hour : 1
        } into g
        // The where clause checks how much data needs to be in the group
        where CheckAggregation(g.Count(), checkType)
        select new Data() { Start = g.Min(m => m.Start), End = g.Max(m => m.End), Value = g.Sum(m => m.Value) };

    return result.ToList();
}

private bool CheckAggregation(int groupCount, AggerationType checkType)
{
    int requiredCount = 1;
    switch(checkType)
    {
        // For year all data must be multiplied by 12 months
        case AggerationType.Year:
            requiredCount = requiredCount * 12; 
            goto case AggerationType.Month;
        // For months all data must be multiplied by days in month
        case AggerationType.Month:
            // I use 30 but this depends on the given month and year
            requiredCount = requiredCount * 30; 
            goto case AggerationType.Day;
        // For days all data need to be multiplied by 24 hour
        case AggerationType.Day:
            requiredCount = requiredCount * 24;
            goto case AggerationType.Hour;
        // For hours all data need to be multiplied by 2 (because slots of 30 minutes)
        case AggerationType.Hour:
            requiredCount = requiredCount * 2;
            break;

    }
    return groupCount == requiredCount;
}

Here some Test data if you want:

class Data
{
    public DateTime Start { get; set; }
    public DateTime End { get; set; }
    public int Value { get; set; }
}

// Just setup some test data simulary to your example
IList<Data> testList = new List<Data>();
DateTime date = DateTime.Parse("6:00"); 

// This loop fills just some data over several years, months and days
for (int year = date.Year; year > 2010; year--)
{
    for(int month = date.Month; month > 0; month--)
    {
        for (int day = date.Day; day > 0; day--)
        {
            for(int hour = date.Hour; hour > 0; hour--)
            {
                DateTime testDate = date.AddHours(-hour).AddDays(-day).AddMonths(-month).AddYears(-(date.Year - year));
                testList.Add(new Data() { Start = testDate, End = testDate.AddMinutes(30), Value = 1 });
                testList.Add(new Data() { Start = testDate.AddMinutes(30), End = testDate.AddHours(1), Value = 1 });
            }
        }
    }
}
hwcverwe
  • 5,287
  • 7
  • 35
  • 63
  • Thanks @hwcverwe, the solution seems to break if we have multiple dates. – Nil Pun May 09 '12 at 05:49
  • @nilpun. My answer contains now a query which does the trick with multiple days – hwcverwe May 09 '12 at 11:29
  • Thanks a lot @hwcverwe. Is there anyway we can use this trick to get aggregated values for day, week, month and year for given data at all? – Nil Pun May 09 '12 at 12:43
  • @nilpun, I am not sure I understand your question well. Are you asking for a more generic way to get aggregated on Hours OR Days OR Months OR Years? – hwcverwe May 09 '12 at 13:40
  • yes that's correct. For a given data i'm trying to get aggregated values by hour, day, week, month and year. Thanks again – Nil Pun May 09 '12 at 22:28
  • also if there is no 30 minute pair e.g. 10:00AM to 10:30AM the solution seems to aggregate this into 30min interval instead of ignoring it. – Nil Pun May 10 '12 at 02:07
0

Below is the code. It seems a little bit ugly because of switch statement. It would be better to refactor it but it should show the idea.

var items = input.Split('\n');

Func<string, string> f = s =>
{
    var strings = s.Split(new[] {':'}, 2);
    var key = strings[0];
    var value = strings[1];

    switch (key.ToLower())
    {
        case "start":
            return s;
        case "value":
            return String.Format("{0}: {1}", key, Int32.Parse(value) + 1);
        case "end": 
            return String.Format("{0}: {1:h:mm}", key,
                DateTime.Parse(value) +
                TimeSpan.FromMinutes(30));
        default:
            return "";
    }
};

var resultItems = items.Select(f);

Console.Out.WriteLine("result = {0}",
                          String.Join(Environment.NewLine, resultItems));
the_joric
  • 11,986
  • 6
  • 36
  • 57
0

It's actually quite hard to completely approach this with with pure LINQ. To make life easier, you'll need to write atleast one helper method that allows you to transform an enumeration. Take a look at the example below. Here I make use of an IEnumerable of TimeInterval and have a custom Split method (implemented with C# iterators) that Joins two elements together in one Tuple:

class TimeInterval
{
    DateTime Start;
    DateTime End;
    int Value;
}

IEnumerable<TimeInterval> ToHourlyIntervals(
    IEnunumerable<TimeInterval> halfHourlyIntervals)
{
    return
        from pair in Split(halfHourlyIntervals)
        select new TimeInterval
        {
            Start = pair.Item1.Start,
            End = pair.Item2.End,
            Value = pair.Item1.Value + pair.Item2.Value
        };
}

static IEnumerable<Tuple<T, T>> Split<T>(
    IEnumerable<T> source)
{
    using (var enumerator = source.GetEnumerator())
    {
        while (enumerator.MoveNext())
        {
            T first = enumerator.Current;

            if (enumerator.MoveNext())
            {            
                T second = enumerator.Current;
                yield return Tuple.Create(first, second);
            }
        }
    }
}

The same can be applied to the first part of the problem (extracting half hourly TimeIntervals from the list of strings):

IEnumerable<TimeInterval> ToHalfHourlyIntervals(
    IEnumerable<string> inputLines)
{
    return
        from triple in TripleSplit(inputLines)
        select new TimeInterval
        {
            Start = DateTime.Parse(triple.Item1.Replace("Start: ", "")),
            End = DateTime.Parse(triple.Item2.Replace("End: ", "")),
            Value = Int32.Parse(triple.Item3)
        };
}

Here I make use of a custom TripleSplit method that returns a Tuple<T, T, T> (which will be easy to write). With this in place, the complete solution would look like this:

// Read data lazilzy from disk (or any other source)
var lines = File.ReadLines(path);

var halfHourlyIntervals = ToHalfHourlyIntervals(lines);

var hourlyIntervals = ToHourlyIntervals(halfHourlyIntervals);

foreach (var interval in hourlyIntervals)
{
    // process
}

What's nice about this solution is that it is completely deferred. It processes one line at a time, which allows you to process indefinately big sources without the danger of any out of memory exception, which seems important considering your given requirement:

This data keeps going for a week then 30 days and 365days.

Steven
  • 166,672
  • 24
  • 332
  • 435