Here is a very simplified version of the code that I have:

class PrintJob : IEntity
{
    public string UserName { get; set; }
    public string Departmen { get; set; }
    public int PagesPrinted { get; set; }
}

class PrintJobReportItem
{
    public int TotalPagesPrinted { get; set; }
    public int AveragePagesPrinted { get; set; }
    public int PercentOfSinglePagePrintJobs { get; set; }
}

class PrintJobByUserReportItem : PrintJobReportItem
{
    public string UserName { get; set; }
}

class PrintJobByDepartmenReportItem : PrintJobReportItem
{
    public string DepartmentName { get; set; }
    public int NumberOfUsers { get; set; }
}

Then I have 2 queries:

var repo = new Repository(...);

var q1 = repo.GetQuery<PrintJob>()
    .GroupBy(pj => pj.UserName)
    .Select(g => new PrintJobByUserReportItem
    {
    #region this is PrintJobReportItem properties
        TotalPagesPrinted = g.Sum(p => p.PagesPrinted),
        // Average returns a double, so cast it to fit the int property
        AveragePagesPrinted = (int)g.Average(p => p.PagesPrinted),
        // multiply before dividing so integer division does not truncate to 0; a group is never empty
        PercentOfSinglePagePrintJobs = g.Count(p => p.PagesPrinted == 1) * 100 / g.Count(),
    #endregion    
        UserName = g.Key
    });

var q2 = repo.GetQuery<PrintJob>()
    .GroupBy(pj => pj.Departmen)
    .Select(g => new PrintJobByDepartmenReportItem
    {
    #region this is PrintJobReportItem properties
        TotalPagesPrinted = g.Sum(p => p.PagesPrinted),
        // Average returns a double, so cast it to fit the int property
        AveragePagesPrinted = (int)g.Average(p => p.PagesPrinted),
        // multiply before dividing so integer division does not truncate to 0; a group is never empty
        PercentOfSinglePagePrintJobs = g.Count(p => p.PagesPrinted == 1) * 100 / g.Count(),
    #endregion    
        DepartmentName = g.Key,
        NumberOfUsers = g.Select(u => u.UserName).Distinct().Count()
    });

What would you suggest for extracting the parts where I assign values to TotalPagesPrinted, AveragePagesPrinted and PercentOfSinglePagePrintJobs out of those 2 queries, so that they can be reused and the code follows the DRY principle?

I'm using the EF 4.1 code-only approach, and switching to another technology or approach is not an option. I also cannot materialize the data: I need to keep it as a query, because my grid component will add more things to the query later, so I can't switch to LINQ to Objects.

Andrej Slivko

2 Answers

I would create a new class CLASSNAME<TKey> that has two properties:

  • PrintJobReportItem, of type PrintJobReportItem
  • GROUPING, of type IGrouping<TKey, PrintJob>

Then create an extension method (it has to live in a static class):

public static IQueryable<CLASSNAME<TKey>> EXTENSIONNAME<TKey>(this IQueryable<IGrouping<TKey, PrintJob>> source)
{
  // Project each group into the shared aggregates plus the group itself,
  // so callers can still reach the key and the grouped rows.
  return from g in source
         select new CLASSNAME<TKey>
         {
           PrintJobReportItem = new PrintJobReportItem
                                {
                                  TotalPagesPrinted = g.Sum(p => p.PagesPrinted),
                                  AveragePagesPrinted = etc...,
                                  PercentOfSinglePagePrintJobs = etc...,
                                },
           GROUPING = g
         };
}

Then use it like so. I haven't tested it, but I think it would work:

var q2 = repo.GetQuery<PrintJob>()
    .GroupBy(pj => pj.Departmen)
    .EXTENSIONNAME()
    .Select(g => new PrintJobByDepartmenReportItem
                 {
                    PrintJobReportItem = g.PrintJobReportItem,
                    DepartmentName = g.GROUPING.Key,
                    NumberOfUsers = g.GROUPING.Select(u => u.UserName).Distinct().Count()
                 });
Aducci
  • If that worked, my ReportItems would have complex property types like PrintJobReportItem, and my grid component only deals with simple types. But your code gave me some ideas; I will try it out tomorrow at work. – Andrej Slivko Jun 08 '11 at 18:44
  • This worked. What I did differently is that I derived CLASSNAME from PrintJobReportItem, so I only had to put the Group property there, but then of course I had to repeat all the properties in the select expression, like TotalPagesPrinted = g.TotalPagesPrinted, AveragePagesPrinted = g.AveragePagesPrinted ... This is better than what I had, at least in the C# code, but it creates 2 nested selects (with joins) in SQL where before there was only one, so it's doing 2 transformations instead of one. – Andrej Slivko Jun 09 '11 at 08:38
  • @qrow - I think your rdbms will be able to optimize the query and the performance will be the same – Aducci Jun 09 '11 at 15:27
  • I'm accepting this answer for now; I didn't figure out a better way to do it. Maybe later versions of EF will support other ways of reusing transformations. – Andrej Slivko Jun 13 '11 at 06:11
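
The variant described in the comments above (an intermediate class derived from PrintJobReportItem that also carries the group) can be sketched roughly as follows. This is only a sketch under assumptions: GroupedReportItem, WithReportTotals and ByDepartment are hypothetical names that do not appear in the original posts, and the sketch was written against a plain IQueryable<PrintJob> (for example an in-memory list via AsQueryable()), not verified against EF 4.1 itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class PrintJob
{
    public string UserName { get; set; }
    public string Departmen { get; set; }
    public int PagesPrinted { get; set; }
}

public class PrintJobReportItem
{
    public int TotalPagesPrinted { get; set; }
    public int AveragePagesPrinted { get; set; }
    public int PercentOfSinglePagePrintJobs { get; set; }
}

public class PrintJobByDepartmenReportItem : PrintJobReportItem
{
    public string DepartmentName { get; set; }
    public int NumberOfUsers { get; set; }
}

// Hypothetical intermediate type: the shared aggregates plus the group they came from.
public class GroupedReportItem<TKey> : PrintJobReportItem
{
    public IGrouping<TKey, PrintJob> Group { get; set; }
}

public static class PrintJobReportExtensions
{
    // Hypothetical helper: the only place where the shared aggregates are written out.
    public static IQueryable<GroupedReportItem<TKey>> WithReportTotals<TKey>(
        this IQueryable<IGrouping<TKey, PrintJob>> source)
    {
        return source.Select(g => new GroupedReportItem<TKey>
        {
            TotalPagesPrinted = g.Sum(p => p.PagesPrinted),
            AveragePagesPrinted = (int)g.Average(p => p.PagesPrinted),
            // Multiply first so integer division keeps the percentage; a group is never empty.
            PercentOfSinglePagePrintJobs = g.Count(p => p.PagesPrinted == 1) * 100 / g.Count(),
            Group = g
        });
    }

    // The department report copies the shared values into flat properties
    // and adds the department-specific ones from the group.
    public static IQueryable<PrintJobByDepartmenReportItem> ByDepartment(
        this IQueryable<PrintJob> jobs)
    {
        return jobs
            .GroupBy(pj => pj.Departmen)
            .WithReportTotals()
            .Select(r => new PrintJobByDepartmenReportItem
            {
                TotalPagesPrinted = r.TotalPagesPrinted,
                AveragePagesPrinted = r.AveragePagesPrinted,
                PercentOfSinglePagePrintJobs = r.PercentOfSinglePagePrintJobs,
                DepartmentName = r.Group.Key,
                NumberOfUsers = r.Group.Select(u => u.UserName).Distinct().Count()
            });
    }
}
```

With EF the source would be repo.GetQuery<PrintJob>() rather than an in-memory list. The aggregate expressions then exist in one place, at the cost of repeating the flat property copies in each report query and, as noted in the comments, an extra nested select in the generated SQL.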

The most straightforward thing I can think of is to give PrintJobReportItem a constructor that accepts a single IGrouping<string, PrintJob> parameter (which I believe is the type of the variable g in your sample). Keep in mind that this also requires a parameterless constructor, and each derived class needs its own constructor that passes the parameter on to the base class constructor:

Constructor

public PrintJobReportItem()
{
}

public PrintJobReportItem(IGrouping<string, PrintJob> g)
{
    // A group produced by GroupBy always has at least one element, so g.Count() is never 0.
    this.TotalPagesPrinted = g.Sum(i => i.PagesPrinted);
    this.AveragePagesPrinted = (int)g.Average(i => i.PagesPrinted);
    this.PercentOfSinglePagePrintJobs = g.Count(i => i.PagesPrinted == 1) * 100 / g.Count();
}

Inherited Constructor

public PrintJobByDepartmentReportItem(IGrouping<string, PrintJob> g) : base(g)
{
    this.DepartmentName = g.Key;
    this.NumberOfUsers = g.Select(i => i.UserName).Distinct().Count();
}

Queries

var q1 = repo.GetQuery<PrintJob>()
    .GroupBy(pj => pj.UserName)
    .Select(g => new PrintJobByUserReportItem(g));

var q2 = repo.GetQuery<PrintJob>()
    .GroupBy(pj => pj.Department)
    .Select(g => new PrintJobByDepartmentReportItem(g));

This does have the one downside of assuming you will always be grouping by a string member, but you could presumably GroupBy(i => i.MyProperty.ToString()) when appropriate, or possibly change the key type in the constructor parameter from string to object.

lsuarez
  • Sorry for the edit stream. My C# is rusty as hell. Tend to develop in VB.NET but the concepts are the same. :] – lsuarez Jun 08 '11 at 18:57
  • In L2E you can only use a parameterless constructor – Aducci Jun 08 '11 at 20:02
  • PrintJobReportItem doesn't look like an entity class to me... does that really rule it out here? – lsuarez Jun 08 '11 at 21:52
  • PrintJob is the only entity here. All the other classes are for the view only – Andrej Slivko Jun 09 '11 at 06:24
  • In that case, you should have no problems implementing the constructor-based solution. I didn't have any issues when building the project using the prototypes provided in the sample with my edits. Looks like it would clean up the code pretty handily as well. – lsuarez Jun 09 '11 at 14:26
  • I think this `.Select(g => new PrintJobByDepartmentReportItem(g));` won't be valid. LINQ to Entities will not let you use a constructor other than the parameterless one. It doesn't matter that it's not an entity; it still goes through the EF LINQ provider because it's a query and not a materialized object – Andrej Slivko Jun 09 '11 at 14:54
  • I see your point. I suppose by the time the deferred select command is executed it may not work. – lsuarez Jun 09 '11 at 15:22