2

I have a large table with 6000000 record like this format(Acc,sDate,Serial,Amount,...) Acc,date,serial is PKey.

To show my problem, created small code

public class Cheque 
{
    public string Account{ get; set; }
    public string Serial{ get; set; }   
    public string StartDate { get; set; }
    // ... public string Amount { get; set; }    ...
}

var list = new List<Cheque>();
list.Add(new Cheque() { Account= "1", Serial = "1", StartDate = "20080120"});
list.Add(new Cheque() { Account= "1", Serial= "2", StartDate = "20080120" });
list.Add(new Cheque() { Account= "1", Serial= "3", StartDate = "20080120" }); 
list.Add(new Cheque() { Account= "1", Serial= "4", StartDate = "20080120" }); 
// each acc have 100 to 300 record per date ,for simplicity 3 obj added

list.Add(new Cheque() { Account= "1", Serial= "1", StartDate = "20110120" });
list.Add(new Cheque() { Account= "1", Serial= "2", StartDate = "20110120" });

list.Add(new Cheque() { Account= "1", Serial= "1", StartDate = "20120120" });
list.Add(new Cheque() { Account= "1", Serial= "2", StartDate = "20120120" });
list.Add(new Cheque() { Account= "1", Serial= "3", StartDate = "20120120" });

list.Add(new Cheque() { Account= "2", Serial= "1", StartDate = "20100417" });
list.Add(new Cheque() { Account= "2", Serial= "2", StartDate = "20100417" });

list.Add(new Cheque() { Account= "2", Serial= "1", StartDate = "20120314" });

list.Add(new Cheque() { Account= "2", Serial= "1", StartDate = "20070301" });
list.Add(new Cheque() { Account= "2", Serial= "1", StartDate = "20070301" });
list.Add(new Cheque() { Account= "2", Serial= "1", StartDate = "20070301" });

expected list only tow set with nearest date from each Account

Acc Serial Date

"1", "1", "20120120"   //first resultSet with Account= 1 
"1", "2", "20120120" 
"1", "3", "20120120"
"1", "1", "20110120"  //second resultset with Account=  1 
"1", "2", "20110120" 
"2", "1", "20120314"  //first resultSet with Account= 2 
"2", "1", "20100417" //second resultset with Account=  2 
"2", "2", "20100417" 

plz help me how can query this with linq how group by (or distinct ) and take tow first set, like this

AliKarimi
  • 1,849
  • 1
  • 13
  • 8

4 Answers4

2

The trick is to group by Account and Serial. Take the top two dates and then flatten the list again by SelectMany:

list.GroupBy(x => new {x.Account, x.Serial})
.Select(g => new { FirstTwo = g
                   .GroupBy(x => x.StartDate).Select(x => x.FirstOrDefault())
                   .OrderByDescending(x => x.StartDate).Take(2)
                 })
.SelectMany(g => g.FirstTwo)
.OrderBy(x => x.Account)
    .ThenByDescending(x => x.StartDate)
        .ThenBy(x => x.Serial)

Result:

1   1   20120120
1   2   20120120
1   3   20120120
1   1   20110120
1   2   20110120
1   3   20110120
2   1   20120314
2   2   20120314
2   1   20100417
2   2   20100417
Gert Arnold
  • 105,341
  • 31
  • 202
  • 291
  • but for large number record not work.I have 100 record for each account : 1 1 20120120 ; 1 2 20120120 ; ... 1 100 2012020; 2 1 20100101; 2 2 20100101 ... 2 100 20100101; ... for these your code result is diffrent ; – AliKarimi Sep 15 '13 at 15:46
  • Please show (by editing your question) what you expect to see with larger numbers. Specifically explain the words "nearest date". – Gert Arnold Sep 15 '13 at 19:35
  • ok just add this record to list (list.Add(new Cheque() { Account= "1", Serial= "4", StartDate = "20080120" }); ) --- > your result show it in output. – AliKarimi Sep 16 '13 at 04:35
  • OK, I see, I added a second GroupBy. – Gert Arnold Sep 16 '13 at 06:59
  • thank u @Arnold. your query produce this result : {1 1 20120120 ; 1 2 20120120 ; 1 3 20120120 ; 1 1 20110120 ; 1 2 20110120 ; 1 3 20080120 ; 1 4 20080120 ; 2 1 20120314 ; 2 1 20100417 ; 2 2 20100417 } but (1 3 20080120 1 4 20080120 ) is wrong – AliKarimi Sep 16 '13 at 10:23
  • I use below code and it work : group account & date then group them by acc and select tow first list (to take keys of group i want) then join it by orginal list make desire result. again thank u for your consult. – AliKarimi Sep 16 '13 at 10:35
  • 1
    I think it should be possible to do it in one statement, but I can't distill the logic from a restricted set of input and output rows only. – Gert Arnold Sep 16 '13 at 11:09
  • ok, this table have cheque books collection. each account has many recovered cheque book and each cheque book has 100 or 200or300 sheet. 4 example account 1 has many cheque books and each cheque book is recognized by created date. user want to watch only tow last cheque book of all accounts. tow cheque book of account 1 tow for acc 2 and... – AliKarimi Sep 16 '13 at 11:38
  • unfortunately the design of db is very bad (20 years ago designed and implemented) i have to develop a new program by these old data. each cheque book have not id for it, and only date and account is unique for each cheque book. – AliKarimi Sep 16 '13 at 11:50
0

After searching and reading stackoverflow, make desired result with this code.

    var groupedList = (from t in list
                       group t by new { t.Account, t.StartDate } into g
                       select new
                       {                              
                          g.Key.Account,
                          g.Key.StartDate
                        });

    var filteredList = groupedList.GroupBy(x => x.Account)
            .SelectMany(g => (from t in g orderby t.StartDate descending select t)
                     .Take(2).ToList() );

    var Result = (from c in list
                  join k in filteredList on
                  new { c.StartDate, c.Account } equals
                  new { k.StartDate, k.Account } //into j
                  select c).ToList();
        /*  or lambda method chain
        var groupedList = list.GroupBy(t => new {t.StartDate, t.Account})
            .Select(g => new { g.Key.StartDate,g.Key.Account})
            .GroupBy(x => x.Account)
            .SelectMany(g => (from t in g orderby t.StartDate descending select t)
                        .Take(2).ToList() );

          var result = (list.Join(inner: groupedList, 
            outerKeySelector: c => new {c.StartDate, c.Account}, 
            innerKeySelector: k => new {k.StartDate, k.Account},
            resultSelector: (c, k) => c))
            .OrderByDescending(e =>e.StartDate).OrderBy(e =>e.Account).ToList(); 

        */

      Console.WriteLine(Result);  

Thanks a lot LINQPAD(the best tool for linq) and all friends in stackoverflow (the best and professional developers in the world)

but i guess my code is very complex (3 level filtering) and have not best performance. :)

who have a better offer, please let me know.

I'd love to get some improvements!

AliKarimi
  • 1,849
  • 1
  • 13
  • 8
0

In order to get the top two from the group the query would be like this: Update But in this case, the combination of the Account ID and the Start Date must be unique.

.ToList().GroupBy(x=>new{x.Account,x.StartDate}).SelectMany(y=>y.OrderByDescending(z=>z.StartDate).Take(2));

I am using the similar one in my code and know this works fine.

Incredible
  • 3,495
  • 8
  • 49
  • 77
  • take(2) get only top two record of any group and your return result is like : {1 20080120 ; 1 2 20080120 ;1 1 20110120 ;1 2 20110120 ; 1 1 20120120 ;1 2 20120120 ;2 1 20090320 ;2 2 20090320 ;2 1 20100417 ;2 2 20100417 ;2 1 20120314 ;2 2 20120314;} but Desired list has "1", "3", "20120120" and has not "1 1 20080120 " – AliKarimi Sep 15 '13 at 12:34
  • Unable to get your returned result, And the combination of the Account ID and Start Date must be unique. – Incredible Sep 15 '13 at 12:36
  • There must be something in the records that would be unique, just like the primary key. – Incredible Sep 15 '13 at 12:39
0

At last i find one statement which produce desired result.

var result = (from cheque in list.OrderBy(a => a.Account).ThenByDescending(a => a.StartDate)
                            group cheque by new { cheque.Account, cheque.StartDate } into gr
                            //from ids in gr
                            group gr by gr.Key.Account into secondGrouping
                            from second in secondGrouping.Distinct().Take(2)
                                  from Cheque f in second
                                  select f 
                            ).ToList<Cheque>();
AliKarimi
  • 1,849
  • 1
  • 13
  • 8