9

I have list with:

EMP_ID | UPDATED_DATE | MARK
------ | ------------ | ----
111    | 01/01/2015   | 99  
111    | 01/01/2013   | 85  
111    | 01/01/2017   | 80  
222    | 01/01/2011   | 70  
222    | 01/01/2015   | 55  
222    | 01/01/2002   | 60  

I have to select one row for each ID, with the latest UPDATED_DATE, In our etc:

EMP_ID | UPDATED_DATE | MARK
------ | ------------ | ----
111    | 01/01/2017   | 80  
222    | 01/01/2015   | 55  

This is the code for order:

empMarksList.OrderBy(x=>x.EMP_ID).ThenBy(y=>y.UPDATED_DATE)
user1012506
  • 2,048
  • 1
  • 26
  • 45

4 Answers4

11

Use GroupBy:

var items = empMarksList
                   .GroupBy(e => e.EMP_ID)
                   .Select(grp => grp.OrderByDescending(v => v.UPDATED_DATE).First());

Or if you want a Dictionary:

var dict = empMarksList
              .GroupBy(e => e.EMP_ID)
              .ToDictionary(grp => grp.Key,
                            grp => grp.OrderByDescending(v => v.UPDATED_DATE).First());
Amir Popovich
  • 29,350
  • 9
  • 53
  • 99
  • 7
    @arekzyla - In this code `.First()` is perfectly fine as the `.GroupBy` always guarantees at least one result. – Enigmativity Jul 30 '18 at 13:28
  • @Enigmativity I get an exception: 'The method 'First' can only be used as a final query operation. Consider using the method 'FirstOrDefault' in this instance instead'. For Linq-To-Entities it won't work. Additionally `GROUP BY` won't be generated from this query but only `OUTER APPLY` with sth like `WHERE inner.Id = outer.Id`. – arekzyla Jul 30 '18 at 13:29
  • @arekzyla The query shouldnt matter and [Here is a working example implementation](https://tio.run/##pZPRTsIwFIavt6c44WqL2LAZRCSYAENDhEgA44UaUsfJbBwdtgWzkD07tozoDKgXNGlP2v7n69/mNJSnYSJws1lKxiMYp1LhvGEXZ6TP@HvDtsOYSglDkUSCzmFtg26L5UvMQpCKKh1WCZvBgDLuSCU04PEZqIikq8XWigrA@WJAxZvsM6mawPEDtCIHmWaZlTV0B8NpL4AmeJ5Xhvth0Jp0g6kZIU8KqMIJm6PjV7xqGSqe6W4ZBq3RrZbU61n5KObZPvOieiSzdoBZ@YPp@/6/TG@fWTuWeeA9q9XjmBV/n3me@7QyXVfb0oi1XiqdRfVusU6@TyY3Ilku2qmD0LwCJLkHtyAYY4yhciKxMAodyJ2YoWinAcoQ@UyXpLMyWytStOySayakclyXTJKWEDR1XOOrk3CZxEgeBFPo2JZVGqH@LDPoJEuuLqEEJwXfpI88Uq96zSifeC/YU@yO2Tn/Uu6o5sl@Tfnht7G9c2Znm80n). – Marie Jul 30 '18 at 17:53
  • @AmirPopovich I am not 100% sure but checking Max should be faster than OrderByDescending shouldnt it? E.g. `grp.First(v => v.UPDATED_DATE == grp.Max(g => g.UPDATED_DATE))` – Marie Jul 30 '18 at 18:10
  • @Marie But your example uses *Linq-to-Objects* and OP uses EntityFramework (as you can see in tags) so *Linq-To-Entities* provider is used in which you can *only* use `First` as a final query operation. – arekzyla Jul 30 '18 at 21:49
3

I prefer this variant, but it's the same thing as Amir's answer:

var query =
    empMarksList
        .GroupBy(x => x.EMP_ID)
        .SelectMany(x => x.OrderByDescending(y => y.UPDATED_DATE).Take(1));
Enigmativity
  • 113,464
  • 11
  • 89
  • 172
  • Preferred, as simply Take 1 rather than executing complete result and then return 1st (first) item. – Incredible Jul 30 '18 at 13:18
  • 1
    @ItiTyagi - Both the `.First()` and the `.Take(1)` only return the first item - neither executes the complete result. `.Take(1)` is better because if the query before it returns an empty list then it won't throw. – Enigmativity Jul 30 '18 at 13:27
  • What do you mean by "because if the query before it returns"? – Incredible Jul 30 '18 at 13:29
  • 2
    If you are concerned about an empty list you can just use `FirstOrDefault` which has the same result but is more correct semantically. In this case your enumeration is returned by GroupBy and I am 99% sure that GroupBy cannot return an empty list. that wouldnt make any sense. – Marie Jul 30 '18 at 18:09
  • @ItiTyagi - I made the remark that "if the query before it returns an empty list" only as a hypothetical comment. In this case it can't, but if a `.Where` were inserted then it would be possible and it's then best to get in to a good habit. `.FirstOrDefault` would then provide a different semantic. – Enigmativity Jul 31 '18 at 02:05
  • @Marie - `.FirstOrDefault` would provide a different semantic than `.Take(1)` in my query if the part before the `.Take(1)` produced no values. It would just take a simple insertion of a `.Where` clause to break the code. – Enigmativity Jul 31 '18 at 02:06
3

Another option is:

var items = context.EmpMarks
    .GroupBy(e => e.EMP_ID, (k, g) => g
        .FirstOrDefault(e => g.Max(v => v.UPDATED_DATE) == e.UPDATED_DATE));

Which actually should generate GROUP BY in SQL.

arekzyla
  • 2,878
  • 11
  • 19
1

You can use something like this:

var result = empMarksList.GroupBy(x => x.Id)
    .Select(g => 
        g.Aggregate((a, x) => a == null || a.UPDATRED_DATE < x.UPDATRED_DATE ? x : a));

It's a bit more cumbersome than using OrderBy, but in this way you won't order all subcollections, which is a bit of an overkill here and uses more resources.

EDIT: After an answer by @arekzyla I realized that my option also could be written like this:

var items = empMarksList.GroupBy(
   x => x.Id,
   (k, g) => g.Aggregate((a, x) => a == null || a.UPDATRED_DATE < x.UPDATRED_DATE ? x : a));

It less readable but will have one collection walkthrough on sub collections instead of two, which is negligible in most cases.

I'm not sure in what case SQL generated would be more optimal so might be worth to check.

Andrey Tretyak
  • 3,043
  • 2
  • 20
  • 33