39

How can I use LINQ to select the top value from each group?

Given a code segment like this:

var teams = new Team[]
{
    new Team{PlayerName="Ricky", TeamName="Australia", PlayerScore=234},
    new Team{PlayerName="Hussy", TeamName="Australia", PlayerScore=134},
    new Team{PlayerName="Clark", TeamName="Australia", PlayerScore=334},

    new Team{PlayerName="Sankakara", TeamName="SriLanka", PlayerScore=34},
    new Team{PlayerName="Udana", TeamName="SriLanka", PlayerScore=56},
    new Team{PlayerName="Jayasurya", TeamName="SriLanka", PlayerScore=433},

    new Team{PlayerName="Flintop", TeamName="England", PlayerScore=111},
    new Team{PlayerName="Hamirson", TeamName="England", PlayerScore=13},
    new Team{PlayerName="Colingwood", TeamName="England", PlayerScore=421}
};

Desired result:

Team Name         Player Name     Score
SriLanka          Jayasurya       433
England           Colingwood      421
Australia         Clark           334
Bridge
user190560

6 Answers

28

My answer is similar to Yuriy's, but using MaxBy from MoreLINQ, which doesn't require the comparison to be done by ints:

var query = from player in players
            group player by player.TeamName into team
            select team.MaxBy(p => p.PlayerScore);

foreach (Player player in query)
{
    Console.WriteLine("{0}: {1} ({2})",
        player.TeamName,
        player.PlayerName,
        player.PlayerScore);
}

Note that I've changed the type name from "Team" to "Player" as I believe it makes more sense - you don't start off with a collection of teams, you start off with a collection of players.
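On newer runtimes the same query can be written without MoreLINQ, since .NET 6 added a built-in Enumerable.MaxBy. The following is a minimal sketch under that assumption (.NET 6 or later, C# 9 or later for the record syntax). The Player record and the MaxByDemo class are hypothetical names introduced for illustration, and the extra OrderByDescending is only there to match the ordering shown in the question's desired result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type mirroring the question's data, named Player as suggested above.
public record Player(string PlayerName, string TeamName, int PlayerScore);

public static class MaxByDemo
{
    public static readonly Player[] Players =
    {
        new("Ricky", "Australia", 234),
        new("Hussy", "Australia", 134),
        new("Clark", "Australia", 334),
        new("Sankakara", "SriLanka", 34),
        new("Udana", "SriLanka", 56),
        new("Jayasurya", "SriLanka", 433),
        new("Flintop", "England", 111),
        new("Hamirson", "England", 13),
        new("Colingwood", "England", 421),
    };

    // One top scorer per team, teams listed by that score in descending order.
    public static List<Player> TopPerTeam() =>
        Players.GroupBy(p => p.TeamName)
               // Groups produced by GroupBy are never empty, so MaxBy cannot return null here.
               .Select(team => team.MaxBy(p => p.PlayerScore)!)
               .OrderByDescending(p => p.PlayerScore)
               .ToList();

    public static void Main()
    {
        foreach (var p in TopPerTeam())
            Console.WriteLine($"{p.TeamName}: {p.PlayerName} ({p.PlayerScore})");
    }
}
```

With the question's data this lists SriLanka/Jayasurya (433), then England/Colingwood (421), then Australia/Clark (334).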

Jon Skeet
27

The following code gets the desired value:

foreach (Team team in teams
    .GroupBy(t => t.TeamName)
    .Select(ig => ig.MaxValue(t => t.PlayerScore)))
{
    Console.WriteLine(team.TeamName + " " + 
        team.PlayerName + " " + 
        team.PlayerScore);
}

It requires the following extension that I wrote earlier today:

// Must be declared inside a static class, like any extension method.
public static T MaxValue<T>(this IEnumerable<T> e, Func<T, int> f)
{
    if (e == null) throw new ArgumentNullException("e");
    using (var en = e.GetEnumerator())
    {
        // An empty sequence has no maximum element.
        if (!en.MoveNext()) throw new ArgumentException("Sequence contains no elements.", "e");
        T maxValue = en.Current;
        int max = f(maxValue);
        while (en.MoveNext())
        {
            int possible = f(en.Current);
            // Strict comparison: on a tie, the earliest element wins.
            if (max < possible)
            {
                max = possible;
                maxValue = en.Current;
            }
        }
        return maxValue;
    }
}

The following gets the answer without the extension, but is slightly slower because it sorts each group instead of scanning it once:

foreach (Team team in teams
    .GroupBy(t => t.TeamName)
    .Select(ig => ig.OrderByDescending(t => t.PlayerScore).First()))
{
    Console.WriteLine(team.TeamName + " " + 
        team.PlayerName + " " + 
        team.PlayerScore);
}
Yuriy Faktorovich
13

This will require you to group by team name then select the max score.

The only tricky part is getting the corresponding player, but it's not too bad: just select the player with the max score. Of course, if more than one player can have the same score, use the First() function as shown below rather than Single(), which throws when more than one element matches.

var x =
    from t in teams
    group t by t.TeamName into groupedT
    select new
    {
        TeamName = groupedT.Key,
        MaxScore = groupedT.Max(gt => gt.PlayerScore),
        MaxPlayer = groupedT.First(gt2 => gt2.PlayerScore == 
                    groupedT.Max(gt => gt.PlayerScore)).PlayerName
    };

FYI - I did run this code against your data and it worked (after I fixed that one, little data mistake).
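If tied players should all be reported rather than only the first one, a Where filter can replace First(). This is a sketch only: the Team class below is a hypothetical stand-in for the question's type, TiesDemo is an illustrative name, and the sample data contains a tie that is not in the original question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Team type.
public class Team
{
    public string PlayerName { get; set; } = "";
    public string TeamName { get; set; } = "";
    public int PlayerScore { get; set; }
}

public static class TiesDemo
{
    // Sample data with an invented tie in the Australia group.
    public static Team[] SampleTeams() => new[]
    {
        new Team { PlayerName = "Ricky", TeamName = "Australia", PlayerScore = 334 },
        new Team { PlayerName = "Clark", TeamName = "Australia", PlayerScore = 334 },
        new Team { PlayerName = "Udana", TeamName = "SriLanka", PlayerScore = 56 },
        new Team { PlayerName = "Jayasurya", TeamName = "SriLanka", PlayerScore = 433 },
    };

    // Returns every player whose score equals the maximum of their team.
    public static List<Team> TopScorersWithTies(IEnumerable<Team> teams) =>
        teams.GroupBy(t => t.TeamName)
             .SelectMany(g =>
             {
                 int max = g.Max(t => t.PlayerScore); // computed once per group
                 return g.Where(t => t.PlayerScore == max);
             })
             .ToList();

    public static void Main()
    {
        foreach (var t in TopScorersWithTies(SampleTeams()))
            Console.WriteLine(t.TeamName + " " + t.PlayerName + " " + t.PlayerScore);
    }
}
```

Here both Australian players are returned, along with the single top scorer for SriLanka.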

Michael La Voie
9

I would use this lambda expression:

IEnumerable<Team> topsScores =
    teams.GroupBy(x => x.TeamName)
         .Select(t => t.OrderByDescending(c => c.PlayerScore).FirstOrDefault());
Romano Zumbé
chris castle
0

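The implementation proposed by The Lame Duck is great, but it computes the Max more than once per group: once for MaxScore, and again inside the First() predicate for every element it tests, each time with an O(n) pass over the grouped set. It would benefit from calculating MaxScore once and then reusing it. This is where the let keyword in C# comes in handy (the compiler translates it into a Select that carries the extra value along, not a SelectMany). Here is the optimized query: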

var x = from t in teams 
        group t by t.TeamName into groupedT 
        let maxScore = groupedT.Max(gt => gt.PlayerScore)
        select new 
        { 
           TeamName = groupedT.Key,
           MaxScore = maxScore, 
           MaxPlayer = groupedT.First(gt2 => gt2.PlayerScore == maxScore).PlayerName 
        };
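For readers who prefer method syntax, the let clause corresponds roughly to an intermediate Select that pairs each group with its precomputed maximum. The sketch below assumes that reading; the Team class is a hypothetical stand-in for the question's type, LetDemo is an illustrative name, and a tuple is returned instead of an anonymous type only so the result can leave the method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Team type.
public class Team
{
    public string PlayerName { get; set; } = "";
    public string TeamName { get; set; } = "";
    public int PlayerScore { get; set; }
}

public static class LetDemo
{
    public static Team[] SampleTeams() => new[]
    {
        new Team { PlayerName = "Ricky", TeamName = "Australia", PlayerScore = 234 },
        new Team { PlayerName = "Clark", TeamName = "Australia", PlayerScore = 334 },
        new Team { PlayerName = "Udana", TeamName = "SriLanka", PlayerScore = 56 },
        new Team { PlayerName = "Jayasurya", TeamName = "SriLanka", PlayerScore = 433 },
    };

    public static List<(string TeamName, int MaxScore, string MaxPlayer)> TopPerTeam(
        IEnumerable<Team> teams) =>
        teams.GroupBy(t => t.TeamName)
             // The "let" step: keep the group and its maximum together.
             .Select(g => new { Group = g, MaxScore = g.Max(t => t.PlayerScore) })
             // The final "select": reuse the stored maximum instead of recomputing it.
             .Select(x => (
                 TeamName: x.Group.Key,
                 MaxScore: x.MaxScore,
                 MaxPlayer: x.Group.First(t => t.PlayerScore == x.MaxScore).PlayerName))
             .ToList();

    public static void Main()
    {
        foreach (var row in TopPerTeam(SampleTeams()))
            Console.WriteLine(row.TeamName + " " + row.MaxPlayer + " " + row.MaxScore);
    }
}
```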
Drew Marsh
-1

I would suggest you first implement an extension method on IEnumerable<T> called Top. For example:

public static IEnumerable<T> Top<T, TKey>(this IEnumerable<T> target, Func<T, TKey> keySelector, int topCount)
{
    // Descending order, so the highest keys come first.
    return target.OrderByDescending(keySelector).Take(topCount);
}

Then you can write:

teams.GroupBy(team => team.TeamName).SelectMany(g => g.Top(team => team.PlayerScore, 1));

The extension method has to be declared in a static class for this to compile.
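One advantage of a Top-style helper is that it generalizes to the top N players per team. The following is a compilable sketch of that idea, assuming descending order by the key is what is wanted; TopExtensions, TopDemo and the Team class are hypothetical names for illustration, and the sample data is a subset of the question's.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Team type.
public class Team
{
    public string PlayerName { get; set; } = "";
    public string TeamName { get; set; } = "";
    public int PlayerScore { get; set; }
}

public static class TopExtensions
{
    // Returns the 'count' elements with the highest keys.
    public static IEnumerable<T> Top<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector, int count) =>
        source.OrderByDescending(keySelector).Take(count);
}

public static class TopDemo
{
    public static Team[] SampleTeams() => new[]
    {
        new Team { PlayerName = "Ricky", TeamName = "Australia", PlayerScore = 234 },
        new Team { PlayerName = "Hussy", TeamName = "Australia", PlayerScore = 134 },
        new Team { PlayerName = "Clark", TeamName = "Australia", PlayerScore = 334 },
        new Team { PlayerName = "Flintop", TeamName = "England", PlayerScore = 111 },
        new Team { PlayerName = "Hamirson", TeamName = "England", PlayerScore = 13 },
        new Team { PlayerName = "Colingwood", TeamName = "England", PlayerScore = 421 },
    };

    // Apply Top inside each group, then flatten the per-group results.
    public static List<Team> TopTwoPerTeam(IEnumerable<Team> teams) =>
        teams.GroupBy(t => t.TeamName)
             .SelectMany(g => g.Top(t => t.PlayerScore, 2))
             .ToList();

    public static void Main()
    {
        foreach (var t in TopTwoPerTeam(SampleTeams()))
            Console.WriteLine(t.TeamName + " " + t.PlayerName + " " + t.PlayerScore);
    }
}
```

Note that Top is applied to each group inside SelectMany; calling it directly on the result of GroupBy would rank the groups themselves rather than the players within them.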

Vitaliy