23

In SQL you can express multiple aggregates in a single database query like this:

SELECT MIN(p.x), MAX(p.x), MIN(p.y), MAX(p.y)
FROM   Places p JOIN Logs l on p.Id = l.PlaceId
WHERE  l.OwnerId = @OwnerId

Is it possible to do an equivalent thing using Linq-to-Entities? I found a similar question that suggests it's not possible for Linq-to-SQL but I would like to think I won't have to do the above using four round trips to the DB.

Community
  • 1
  • 1
Drew Noakes
  • 300,895
  • 165
  • 679
  • 742
  • +1 for interesting question, I haven't looked into it myself but it would be weird if it didn't work. Can't you use the `let` syntax? – Tomas Jansson Dec 13 '10 at 17:53
  • @Thomas, as far as I can see it's not possible to assign a set using `let`, but rather a single value for an item in the set. I have been using EF for a few months now and was surprised to hit this barrier today. I really don't think it's possible! – Drew Noakes Dec 13 '10 at 18:08
  • @hunter's deleted answer works. Did you try it? Also, `let` does indeed work with sets. Have you tried that? I have production code which does both. – Craig Stuntz Dec 15 '10 at 18:14
  • @Craig, actually @hunter's deleted answer doesn't achieve what the original question asks for which is to make a single DB call. Running that with SQL Profiler shows that four separate DB calls are made, which is reasonable given that the compiler emits code that invokes the IQueryable four times. Can you give an example of using `let` with sets that doesn't involve subqueries? – Drew Noakes Dec 15 '10 at 20:50
  • @Craig and @hunter, if you haven't already, you should check out Costas's answer. – Drew Noakes Jan 30 '11 at 23:56
  • @Drew, well, OK. Sure, it's a good solution. In fact, it's exactly the same idea as the second query I gave you in December. – Craig Stuntz Jan 31 '11 at 03:34
  • @Craig, your solution was similar but not quite the same. It missed the idea of grouping by a constant which was essential for my scenario as the only discriminators were present in the where clause, not as grouping keys. Hence it didn't seem a candidate at the time. – Drew Noakes Jan 31 '11 at 09:40
  • In that case I would choose to use the same column in the grouping as in the Where predicate. Although a constant will work, it looks odd to the reader. The results will be the same in either case, though. – Craig Stuntz Jan 31 '11 at 11:53

3 Answers3

27

Suppose you have the following SQL statement:

  select sum(A), max(B), avg(C) from TBL group by D

Try this in C#:

  from t in table
  group t by D
  into g
  select new {
     s = g.Sum(x => x.A),
     m = g.Max(x => x.B),
     a = g.Average(x => x.C)
  }

-- or in VB: --

  from t in TBL
  group t by key = D
  into g = group
  select s = g.Sum(function(x) x.A),
       m = g.Max(function(x) x.B),
       a = g.Average(function(x) x.C)

The obvious, which in VB would be:

  aggregate t in TBL into s = Sum(t.A), m = Max(t.B), a = Average(t.C)

though it will give the same results, it has a higher cost as it issues multiple SQL select statements, one for each aggregate function, i.e. it will run in multiple passes. The first syntax, gives a single (fairly complex, but efficient) SQL statement which does a single pass against the database.

PS. If you don't have a key by which to group by (i.e. you need a single row as the result, covering the whole data set), use a constant as in:

  from t in TBL
  group t by key = 0
  into g = group
  select s = g.Sum(function(x) x.A),
       m = g.Max(function(x) x.B),
       a = g.Average(function(x) x.C)
Costas
  • 286
  • 2
  • 3
  • This seems to work really well. Thanks for solving this with a much more elegant and robust solution than using Entity SQL. I found using `group t by 1 into g` worked well for me. Thanks again! – Drew Noakes Jan 30 '11 at 23:53
  • The proposed solution is using grouping, altough in the question grouping was not mentioned. . What if there is no grouping involved, I just want to execute this sql: select sum(A), max(B), avg(C) from TBL? – Goran Jun 20 '13 at 23:23
  • @Goran That's covered by the PS -- you group by a constant. (Which feels a bit weird, but seems to be the canonical answer. I've seen it done in raw SQL queries as well.) – starwed Mar 14 '16 at 12:50
  • Unfortunately, even with the more efficient option, I'm seeing it generate overly complex SQL that uses `SELECT sum(x) FROM (SELECT sum(y) FROM ...)` nested subselects, where each subselect aggregates just one column at a time instead of it all being a single grouping with all the needed aggregates. Still looks like there's no option that gives you kind of clean, simple SQL you'd write yourself. – Jacob Stamm Nov 10 '21 at 22:22
2

I don't have your DB, but this (using a "default" EDMX model of Northwind.mdb -- no changes after running the new model wizard) runs as one query in LINQPad:

var one = from c in Customers 
          where c.PostalCode == "12209"
          select new
          {
              Id = c.Orders.Max(o => o.OrderID),
              Country = c.Orders.Min(o => o.ShipCountry)
          };          

one.Dump();

Updated, per your comment:

var two = from c in Customers 
      where c.PostalCode == "12209"
      from o in c.Orders
      group o by o.Customer.PostalCode into g
      select new
      {
          PostalCode = g.Key,
          Id = g.Max(o => o.OrderID),
          Country = g.Min(o => o.ShipCountry)
      };          

two.Dump();
Craig Stuntz
  • 125,891
  • 12
  • 252
  • 273
  • Hi @Craig. I don't have Northwind on my machine, but by eyeball I think this query will produce one entry for each customer having that post code. What I'd like to do is the equivalent of determining the max OrderId for any customer having that post code. Is that possible? I'd simplified my original question perhaps too much. I'll update it to make it more closely match my actual schema. – Drew Noakes Dec 16 '10 at 03:20
  • OK, see updated answer. This could probably be simplified, but I chose to follow the pattern of the first query. – Craig Stuntz Dec 16 '10 at 14:00
  • in case you're interested, I found a solution that works, though it's not the prettiest. – Drew Noakes Jan 12 '11 at 08:17
0

Unfortunately the answer appears to be no.

If someone can prove otherwise I'll be happy to grant them the accepted answer.


EDIT

I found a way to do this using Entity SQL. I'm not sure it's the greatest way, but as it seems to be the only way then it may be the greatest by default :)

var cmdText = "SELECT MIN(p.x), MAX(p.x), MIN(p.y), MAX(p.y) " +
              "FROM Places AS p JOIN Logs AS l ON p.Id = l.PlaceId " +
              "WHERE l.OwnerId==123";
var results = CreateQuery<DbDataRecord>(cmdText)
var row = results.First();
var minX = (double)row[0];
var maxX = (double)row[1];
var minY = (double)row[2];
var maxY = (double)row[3];

The above isn't exactly the code I am working with. For a simpler case without a join, here's the generated SQL, showing that only one trip to the DB is made:

SELECT 
1 AS [C1], 
[GroupBy1].[A1] AS [C2], 
[GroupBy1].[A2] AS [C3], 
[GroupBy1].[A3] AS [C4], 
[GroupBy1].[A4] AS [C5]
FROM ( SELECT 
    MIN([Extent1].[X1]) AS [A1], 
    MAX([Extent1].[X1]) AS [A2], 
    MAX([Extent1].[Y1]) AS [A3], 
    MIN([Extent1].[Y1]) AS [A4]
    FROM [dbo].[Edges] AS [Extent1]
    WHERE [Extent1].[PlaceId] = 123
)  AS [GroupBy1]

If someone finds a more elegant solution to this problem, I'll grant them the accepted answer.


EDIT 2

Thanks to Costas who found a great solution to this problem that uses pure Linq.

Community
  • 1
  • 1
Drew Noakes
  • 300,895
  • 165
  • 679
  • 742