5

I've got an IQueryable(Of Job) where Job has, amongst other things:

Property CreatedOn as DateTime
Property JobType as JobTypes

Enum JobTypes
    JobType1
    JobType2
    JobType3
End Enum

What I want to get out of it is a list ordered by CreatedOn, then grouped by JobType with a count.

E.g., say I've got (abbreviated dates):

11:00  JobType1
11:01  JobType2
11:02  JobType2
11:03  JobType2
11:04  JobType2
11:05  JobType3
11:06  JobType1
11:07  JobType1

I want

JobType1 1
JobType2 4
JobType3 1
JobType1 2

I don't know how to take ordering into account when grouping. Can someone point me at the right way to do this? I'd prefer fluent syntax. VB.Net or C# is fine.

Basic

5 Answers

3

This trick is fairly easy to train LinqToObjects to do:

public static IEnumerable<IGrouping<TKey, TSource>> GroupContiguous<TKey, TSource>(
  this IEnumerable<TSource> source,
  Func<TSource, TKey> keySelector)
{
  bool firstIteration = true;
  MyCustomGroupImplementation<TKey, TSource> currentGroup = null;

  foreach (TSource item in source)
  {
    TKey key = keySelector(item);
    if (firstIteration)
    {
      currentGroup = new MyCustomGroupImplementation<TKey, TSource>();
      currentGroup.Key = key;
      firstIteration = false;
    }
    else if (!EqualityComparer<TKey>.Default.Equals(key, currentGroup.Key)) // null-safe comparison
    {
      yield return currentGroup;
      currentGroup = new MyCustomGroupImplementation<TKey, TSource>();
      currentGroup.Key = key;
    }
    currentGroup.Add(item);
  }
  if (currentGroup != null)
  {
    yield return currentGroup;
  }
}

public class MyCustomGroupImplementation<TKey, TSource> : IGrouping<TKey, TSource>
{
  //TODO implement IGrouping and Add
}

Used by

IEnumerable<IGrouping<JobTypes, Job>> query = Jobs
  .OrderBy(j => j.CreatedOn)
  .GroupContiguous(j => j.JobType);

It's not so easy to do a "look at the previous row" with just any old linq provider. I hope you don't have to teach LinqToSql or LinqToEntities how to do this.
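The `MyCustomGroupImplementation` class above is left as a TODO. A minimal sketch of what it could look like, assuming a plain `List<TSource>` as backing storage (that choice is an assumption, not part of the original code): it implements `IGrouping`, makes the key settable, and lets items be added.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Sketch only: a mutable IGrouping with a settable Key and an Add method.
// The backing List<TSource> is an assumption made for illustration.
public class MyCustomGroupImplementation<TKey, TSource> : IGrouping<TKey, TSource>
{
    private readonly List<TSource> items = new List<TSource>();

    // IGrouping only requires a getter; the setter lets GroupContiguous assign the key.
    public TKey Key { get; set; }

    public void Add(TSource item)
    {
        items.Add(item);
    }

    public IEnumerator<TSource> GetEnumerator()
    {
        return items.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

With a class like this in place, the counts can be projected from the grouped query with something like `.Select(g => new { g.Key, Count = g.Count() })`.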

Amy B
  • It's actually Entities but I have no problem with `.ToList()`ing and doing it in-memory - it's not a huge list (hopefully no more than a few hundred objects) and I'm already retrieving the list in full for other reasons. I'll give this a spin – Basic Jul 03 '12 at 13:23
  • What is MyCustomGroupImplementation? – Hogan Jul 03 '12 at 15:55
  • @Hogan : class MyCustomGroupImplementation : IGrouping. implement the interface and make the key settable and the items addable. – Amy B Jul 03 '12 at 20:40
  • ... um.. I guess I was really asking "Why didn't you post the source to MyCustomGroupImplementation?" – Hogan Jul 03 '12 at 20:59
  • @Hogan because it's trivial. 90% of it is generated by typing in the declaration in my last comment and clicking "implement interface". – Amy B Jul 03 '12 at 23:37
2

Here's a reasonable approach that uses the Aggregate method.

If you start with a list of JobTypes like this:

var jobTypes = new []
{
    JobTypes.JobType1,
    JobTypes.JobType2,
    JobTypes.JobType2,
    JobTypes.JobType2,
    JobTypes.JobType2,
    JobTypes.JobType3,
    JobTypes.JobType1,
    JobTypes.JobType1,
};

You can use Aggregate by first defining the accumulator like so:

var accumulator = new List<KeyValuePair<JobTypes, int>>()
{
    new KeyValuePair<JobTypes, int>(jobTypes.First(), 0),
};

Then the Aggregate method call looks like this:

var results = jobTypes.Aggregate(accumulator, (a, x) =>
{
    if (a.Last().Key == x)
    {
        a[a.Count - 1] =
            new KeyValuePair<JobTypes, int>(x, a.Last().Value + 1);
    }
    else
    {
        a.Add(new KeyValuePair<JobTypes, int>(x, 1));
    }
    return a;
});

And finally, calling this gives you this result:

[Image: Job Types Results]

Simple, sort of...
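A minimal sketch of the same Aggregate idea wrapped in a reusable method, assuming the input is already in date order. It starts from an empty accumulator rather than seeding it with `jobTypes.First()`, so an empty sequence does not throw. The names `RunCounter` and `CountRuns` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RunCounter
{
    // Hypothetical helper: counts consecutive runs of equal values in source.
    public static List<KeyValuePair<T, int>> CountRuns<T>(IEnumerable<T> source)
    {
        var comparer = EqualityComparer<T>.Default;
        return source.Aggregate(new List<KeyValuePair<T, int>>(), (a, x) =>
        {
            if (a.Count > 0 && comparer.Equals(a[a.Count - 1].Key, x))
            {
                // Same value as the previous item: bump the count of the current run.
                a[a.Count - 1] = new KeyValuePair<T, int>(x, a[a.Count - 1].Value + 1);
            }
            else
            {
                // First item, or the value changed: start a new run.
                a.Add(new KeyValuePair<T, int>(x, 1));
            }
            return a;
        });
    }
}
```

For the jobs in the question, this could be called as `RunCounter.CountRuns(jobs.OrderBy(j => j.CreatedOn).Select(j => j.JobType))` once the list is in memory.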

Enigmativity
  • Are you getting around the problem by ordering by date, then ignoring the date and just aggregating on the Job Type? Sorry, I'm not completely clear how your approach is working – Basic Jul 03 '12 at 13:26
  • @Basic - I'm just assuming that you've already queried your data and have it in date order. The grouping that you're doing only requires an enumerable of job types. – Enigmativity Jul 03 '12 at 13:58
  • Thanks Enigmativity, I eventually went with David's answer but your answer certainly expanded my understanding of aggregators. +1 – Basic Jul 04 '12 at 07:12
2

This updated version uses a subroutine to do the same thing as before, but doesn't need the extra internal field. (I have kept my earlier version, which, to avoid using a Zip routine, needed the extra OrDer field.)

Option Explicit On
Option Strict On
Option Infer On
Imports so11310237.JobTypes
Module so11310237
 Enum JobTypes
  JobType1
  JobType2
  JobType3
 End Enum
Sub Main()
 Dim data = {New With{.CO=#11:00#, .JT=JobType1, .OD=0},
  New With{.CO=#11:03#, .JT=JobType2, .OD=0},
  New With{.CO=#11:05#, .JT=JobType3, .OD=0},
  New With{.CO=#11:02#, .JT=JobType2, .OD=0},
  New With{.CO=#11:06#, .JT=JobType1, .OD=0},
  New With{.CO=#11:01#, .JT=JobType2, .OD=0},
  New With{.CO=#11:04#, .JT=JobType2, .OD=0},
  New With{.CO=#11:07#, .JT=JobType1, .OD=0}}

 ' Check that there's any data to process
 If Not data.Any Then Exit Sub

 ' Both versions include a normal ordering first.
 Dim odata = From q In data Order By q.CO

 ' First version here (and variables reused below):

 Dim ljt = odata.First.JT

 Dim c = 0
 For Each o In odata
  If ljt <> o.JT Then
   ljt = o.JT
   c += 1
  End If
  o.OD = c
 Next

 For Each p In From q In data Group By r=q.JT, d=q.OD Into Count()
  Console.WriteLine(p)
 Next

 Console.WriteLine()

 ' New version from here:

 ' Reset variables (still needed :-()
 ljt = odata.First.JT
 c = 0 
 For Each p In From q In odata Group By r=q.JT, d=IncIfNotEqual(c,q.JT,ljt) Into Count()
  Console.WriteLine(p)
 Next

End Sub

Function IncIfNotEqual(Of T)(ByRef c As Integer, ByVal Value As T, ByRef Cmp As T) As Integer
 If Not Object.Equals(Value, Cmp) Then
  Cmp = Value
  c += 1
 End If
 Return c 
End Function

End Module
Mark Hurd
  • Thanks Mark but while I appreciate the brevity, it's not very maintainable. It also forces me to either use an intermediate type or modify my existing class to have a `.OD` property which wouldn't make sense elsewhere in my code. – Basic Jul 04 '12 at 07:17
  • Updated to not need either, just the two local variables that would need to be reset (well the `c` wouldn't _have_ to be reset) if the enumeration needed to be enumerated multiple times. – Mark Hurd Jul 05 '12 at 15:45
1

The Non-Linq answer

Given

public enum JobTypes
{
    JobType1,
    JobType2,
    JobType3
}

public class Job
{
    public JobTypes JobType { get; set; }
    public DateTime CreatedOn { get; set; }
}

public class JobSummary
{
    public JobSummary(JobTypes jobType, long count)
    {
        this.JobType = jobType;
        this.Count = count;
    }

    public JobTypes JobType { get; set; }
    public long Count { get; set; }
}

then you could

private List<JobSummary> GetOrderedSummary(List<Job> collection)
{
    var result = new List<JobSummary>();
    if (!collection.Any())
    {
        return result;
    }
    var orderedCollection = collection.OrderBy(j => j.CreatedOn).ToList(); // sort once; First() and Skip(1) below would otherwise each re-run the sort
    var temp = orderedCollection.First();
    var count = 1;

    foreach (var job in orderedCollection.Skip(1))
    {
        if (temp.JobType == job.JobType)
        {
            count++;
            continue;
        }

        result.Add(new JobSummary(temp.JobType, count));
        temp = job;
        count = 1;
    }

    result.Add(new JobSummary(temp.JobType, count));

    return result;
}

using

private void DoSomething()
{
    var collection = new List<Job>
    {
        new Job{JobType = JobTypes.JobType1, CreatedOn = DateTime.Now},
        new Job{JobType = JobTypes.JobType2, CreatedOn = DateTime.Now.AddSeconds(1)},
        new Job{JobType = JobTypes.JobType2, CreatedOn = DateTime.Now.AddSeconds(2)},
        new Job{JobType = JobTypes.JobType2, CreatedOn = DateTime.Now.AddSeconds(3)},
        new Job{JobType = JobTypes.JobType2, CreatedOn = DateTime.Now.AddSeconds(4)},
        new Job{JobType = JobTypes.JobType3, CreatedOn = DateTime.Now.AddSeconds(5)},
        new Job{JobType = JobTypes.JobType3, CreatedOn = DateTime.Now.AddSeconds(6)},
        new Job{JobType = JobTypes.JobType1, CreatedOn = DateTime.Now.AddSeconds(7)},
        new Job{JobType = JobTypes.JobType1, CreatedOn = DateTime.Now.AddSeconds(8)},
    };

    var summary = GetOrderedSummary(collection);

}
G2Mula
  • Thanks for the suggestion but this is more verbose than I was hoping for and as mentioned, isn't a LINQ solution. That said, welcome to SO and hopefully see you around in future.+ – Basic Jul 03 '12 at 13:25
  • you're welcome oh, also came across http://tomasp.net/blog/custom-linq-grouping.aspx might be interesting – G2Mula Jul 03 '12 at 13:28
1

The Linq Answer

public enum JobTypes
{
    JobType1,
    JobType2,
    JobType3
}

static void Main(string[] args)
{
    var collection = new[]
    {
        new {JobType = JobTypes.JobType1, CreatedOn = DateTime.Now},
        new {JobType = JobTypes.JobType2, CreatedOn = DateTime.Now.AddSeconds(1)},
        new {JobType = JobTypes.JobType2, CreatedOn = DateTime.Now.AddSeconds(2)},
        new {JobType = JobTypes.JobType2, CreatedOn = DateTime.Now.AddSeconds(3)},
        new {JobType = JobTypes.JobType2, CreatedOn = DateTime.Now.AddSeconds(4)},
        new {JobType = JobTypes.JobType3, CreatedOn = DateTime.Now.AddSeconds(5)},
        new {JobType = JobTypes.JobType1, CreatedOn = DateTime.Now.AddSeconds(7)},
        new {JobType = JobTypes.JobType1, CreatedOn = DateTime.Now.AddSeconds(8)}
    };

    var orderedCollection = collection.OrderBy(job => job.CreatedOn);
    var temp = orderedCollection.First().JobType;
    var identifier = 0;
    var summary = orderedCollection.Select(job =>
    {
        if (job.JobType == temp)
        {
            return new { JobType = job.JobType, Id = identifier };
        }

        temp = job.JobType;
        return new { JobType = job.JobType, Id = ++identifier };
    })
    .GroupBy(job => new { job.JobType, job.Id })
    .Select(job => new { JobType = job.Key.JobType, Count = job.Count() });

    foreach (var sum in summary)
    {
        Console.WriteLine("JobType: {0}, Count: {1}", sum.JobType, sum.Count);
    }

    Console.ReadLine();
}
G2Mula
  • +1 That's a nice approach, thanks. I've already implemented David B's method so will leave that as accepted but it's a nice alternative, thanks – Basic Jul 04 '12 at 10:28