12

Consider the code below:

StockcheckJobs = 
     (from job in (from stockcheckItem in MDC.StockcheckItems
                   where distinctJobs.Contains(stockcheckItem.JobId)
                   group stockcheckItem by new { stockcheckItem.JobId, stockcheckItem.JobData.EngineerId } into jobs
                   select jobs).ToList()
      let date = MJM.GetOrCreateJobData(job.Key.JobId).CompletedJob.Value
      orderby date descending 
      select new StockcheckJobsModel.StockcheckJob()
      {
          JobId = job.Key.JobId,
          Date = date,
          Engineer = (EngineerModel)job.Key.EngineerId,
          MatchingLines = job.Count(sti => sti.Quantity == sti.ExpectedQuantity),
          DifferingLines = job.Count(sti => sti.Quantity != sti.ExpectedQuantity)
      }).ToList()

There is a ToList() in the middle because the GetOrCreateJobData method can't be translated into sql.

As a result I've had to surround the first part of my query in brackets to do this, then I've used an outer query to finish up.

I know I could split this into two variables, but I don't want to do that (this is within an object initialiser too).

Is there some other syntax I can use to increase readability, preferably removing the need for an outer an inner query, when I have to do a ToList (or otherwise get to linq-to-objects) in the middle of a linq query?


In an ideal world I'd like something like this (as close as is possible anyway):

StockcheckJobs =
     from stockcheckItem in MDC.StockcheckItems
     where distinctJobs.Contains(stockcheckItem.JobId)
     group stockcheckItem by new { stockcheckItem.JobId, stockcheckItem.JobData.EngineerId } into jobs
     MAGIC_DO_BELOW_AS_LINQ-TO-OBJECTS_KEYWORD_OR_SYNTAX
     let date = MJM.GetOrCreateJobData(jobs.Key.JobId).CompletedJob.Value
     orderby date descending 
     select new StockcheckJobsModel.StockcheckJob()
     {
         JobId = jobs.Key.JobId,
         Date = date,
         Engineer = new ThreeSixtyScheduling.Models.EngineerModel() { Number = jobs.Key.EngineerId },
         MatchingLines = jobs.Count(sti => sti.Quantity == sti.ExpectedQuantity),
         DifferingLines = jobs.Count(sti => sti.Quantity != sti.ExpectedQuantity)
     };
George Duckett
  • 31,770
  • 9
  • 95
  • 162
  • 2
    "I know I could split this into two variables, but I don't want to do that" Well, it's probably a good idea anyway. Trying to do too much on one line leads to problems. – Servy Sep 20 '12 at 15:15
  • Will it help to say `MDC.StockcheckItems.AsEnumerable()` to force that "source" to be queried as Linq-To-Objects? – Jeppe Stig Nielsen Sep 20 '12 at 15:15
  • @Servy: It's not in one line though, the query is spread out neatly (IMO) over multiple lines (like an sql query is). – George Duckett Sep 20 '12 at 15:16
  • @JeppeStigNielsen: I don't want the whole thing done in linq to objects, I want the main bit of it (the filtering etc) to be done on the sql server. – George Duckett Sep 20 '12 at 15:16
  • @GeorgeDuckett I don't mean one texual line, I mean one executable line. In any event, since you're interested in readability, I'd say that if you're going to be performing two different queries it would be beneficial for each trip to the DB to be separated out, rather than having an outer/inner query on a single assignment that results in two distinct DB queries. – Servy Sep 20 '12 at 15:18
  • Doing too much on one executable line will make stacktraces' line numbers not very helpful. Good luck debugging a `NullReferenceException` in that code. – Jesse Webb Sep 20 '12 at 15:19
  • My point is, I feel it would be readable had I not had to add the `ToList` and do an inner/outer query. The point is that conceptually it's one query, the only reason I need to separate it is because the syntax doesn't allow me to do part on the server and another part in linq to objects. – George Duckett Sep 20 '12 at 15:20
  • No, there isn't a way. All the LINQ query syntax results in `IQueryable` or `IEnumerable`. You have to invoke the extension method to get something like `List`. Kind of the side effect of having local code and database code... You have to take the hit somewhere. – Peter Ritchie Sep 20 '12 at 15:20
  • @JesseWebb: I realise i'm fighting a bit of a loosing battle here, but... you can put breakpoints on different parts of a query (e.g. the where clause, select clause, group by etc. both of the inner and outer ones). It's really not difficult. – George Duckett Sep 20 '12 at 15:21
  • I did not realize that, I've never bothered trying. I always though a breakpoint could only be placed on one line, ending in a semi-colon. – Jesse Webb Sep 20 '12 at 15:23
  • Maybe you can edit your question to show what you hope your query will look like without an "outer and inner query"? Just write `MAGIC_WORD_HERE` or something, at the part where you need to do a `ToList` equivalent. – Jeppe Stig Nielsen Sep 20 '12 at 15:24
  • @JeppeStigNielsen:Edited – George Duckett Sep 20 '12 at 15:35

4 Answers4

5

You can fix the issue of GetOrCreateJobData not being translatable to SQL.

By implementing a custom query translator for the specified method call expression, you can gain control over how LINQ-to-SQL interprets the method. There is a good article explaining this procedure and linking to relevant resources available at: http://www.codeproject.com/Articles/32968/QueryMap-Custom-translation-of-LINQ-expressions

Alternatively, you could refactor the GetOrCreateJobData method to an extension method which builds the same logic with expressions, so that LINQ-to-SQL can interpret it naturally. Depending on the complexity of the method, this may be more or less feasible than my first suggestion.

smartcaveman
  • 41,281
  • 29
  • 127
  • 212
  • This is the best idea, as long as `GetOrCreateJobData` actualy involves the database. – Jodrell Sep 20 '12 at 15:33
  • Great link, thanks for that info. I've always used 2 properties, one returning an `Expression` and another for the actual value. Nice to know of a way to combine the two. – George Duckett Sep 20 '12 at 15:38
3

I find that using method syntax makes things clearer, but that's just personal preference. It certainly makes the top half of the query better, but using a let, while possible in method syntax, is a bit more work.

var result = stockcheckItem in MDC.StockcheckItems
    .Where(item => distinctJobs.Contains(item.JobId))
    .GroupBy(item => new { item.JobId, item.JobData.EngineerId })
    .AsEnumerable() //switch from Linq-to-sql to Linq-to-objects
    .Select(job => new StockcheckJobsModel.StockcheckJob()
    {
        JobId = job.Key.JobId,
        Date = MJM.GetOrCreateJobData(job.Key.JobId).CompletedJob.Value,
        Engineer = (EngineerModel)job.Key.EngineerId,
        MatchingLines = job.Count(sti => sti.Quantity == sti.ExpectedQuantity),
        DifferingLines = job.Count(sti => sti.Quantity != sti.ExpectedQuantity)
    })
    .Orderby(item => item.Date)
    .ToList()
Servy
  • 202,030
  • 26
  • 332
  • 449
  • Gotta say I do like query syntax a bit more (if only for fewer `item =>` bits, and the `let`) but this does look pretty readable. Only disadvantage of this over the query syntax is that would be one long breakpoint. Still, +1 – George Duckett Sep 20 '12 at 15:40
  • @GeorgeDuckett You specifically said you wanted it on one line. I still support breaking it up into multiple lines. Also note that while there are a number of item => here, there is a lot of redundancy in query syntax too. There are a lot of identity selects for example, and the group by takes more than this syntax. – Servy Sep 20 '12 at 15:42
  • NB at this stage you can cut out an anonymous object and all the `item.`s by moving the `OrderBy` after the final `Select` and initializing the date inside that `Select`. (Also your `item` and `stockcheckItem` disagree.) (But +1 because I like method syntax.) – Rawling Sep 20 '12 at 15:44
  • One conceptual line yes. I'm just pointing out the disadvantage of this solution over a (probably non-existent) ideal solution/syntax. – George Duckett Sep 20 '12 at 15:48
3

I would raise two points with the question:

  1. I really don't think there's any readability issue with introducing an extra variable here. In fact, I think it makes it more readable as it separates the "locally executing" code from the code executing on the database.
  2. To simply switch to LINQ-To-Objects, AsEnumerable is preferable to ToList.

That said, here's how you can stay in query-land all the way without an intermediate AsEnumerable() / ToList() on the entire query-expression : by tricking the C# compiler into using your custom extension methods rather than the BCL. This is possible since C# uses a "pattern-based" approach (rather than being coupled with the BCL) to turn query-expressions into method-calls and lambdas.

Declare evil classes like these:

public static class To
{
    public sealed class ToList { }

    public static readonly ToList List;

    // C# should target this method when you use "select To.List"
    // inside a query expression.
    public static List<T> Select<T>
        (this IEnumerable<T> source, Func<T, ToList> projector)
    {
        return source.ToList();
    }
}

public static class As
{
    public sealed class AsEnumerable { }

    public static readonly AsEnumerable Enumerable;

    // C# should target this method when you use "select As.Enumerable"
    // inside a query expression.
    public static IEnumerable<T> Select<T>
        (this IEnumerable<T> source, Func<T, AsEnumerable> projector)
    {
        return source;
    }
}

And then you can write queries like this:

List<int> list = from num in new[] { 41 }.AsQueryable()
                 select num + 1 into result
                 select To.List;

IEnumerable<int> seq = from num in new[] { 41 }.AsQueryable()
                       select num + 1 into result
                       select As.Enumerable into seqItem
                       select seqItem + 1; // Subsequent processing

In your case, your query would become:

StockcheckJobs =
     from stockcheckItem in MDC.StockcheckItems
     where distinctJobs.Contains(stockcheckItem.JobId)
     group stockcheckItem by new { stockcheckItem.JobId, stockcheckItem.JobData.EngineerId } into jobs
     select As.Enumerable into localJobs // MAGIC!
     let date = MJM.GetOrCreateJobData(localJobs.Key.JobId).CompletedJob.Value
     orderby date descending 
     select new StockcheckJobsModel.StockcheckJob()
     {
         JobId = localJobs.Key.JobId,
         Date = date,
         Engineer = new ThreeSixtyScheduling.Models.EngineerModel() { Number = localJobs.Key.EngineerId },
         MatchingLines = localJobs.Count(sti => sti.Quantity == sti.ExpectedQuantity),
         DifferingLines = localJobs.Count(sti => sti.Quantity != sti.ExpectedQuantity)
     };

I really don't see this as any sort of improvement, though. Rather, it's pretty heavy abuse of a language feature.

Ani
  • 111,048
  • 26
  • 262
  • 307
0

One option is to do all the SQL compatible work up front into an anonymous type,

var jobs = 
     (from job in (from stockcheckItem in MDC.StockcheckItems
        where distinctJobs.Contains(stockcheckItem.JobId)
        group stockcheckItem by new 
             { stockcheckItem.JobId, stockcheckItem.JobData.EngineerId } 
         into jobs
        select new 
             {
                JobId = job.Key.JobId,
                Engineer = (EngineerModel)job.Key.EngineerId,
                MatchingLines = 
                    job.Count(sti => sti.Quantity == sti.ExpectedQuantity),
                DifferingLines = 
                    job.Count(sti => sti.Quantity != sti.ExpectedQuantity)
             }
      ).AsEnumerable()

StockcheckJobs = jobs.Select(j => new StockcheckJobsModel.StockcheckJob
    {
         JobId = j.JobId,
         Date = MJM.GetOrCreateJobData(j.JobId).CompletedJob.Value,
         Engineer = j.EngineerId,
         MatchingLines = j.MatchingLines,
         DifferingLines = j.DifferingLines
    }).OrderBy(j => j.Date).ToList();

Obviously not tested, but you get the idea.

Jodrell
  • 34,946
  • 5
  • 87
  • 124