Thursday, December 22, 2011

C# LINQ To Objects: Using GroupBy with more control

The previous post demonstrated a simple use of GroupBy to collect aggregates on a property of an object. This post demonstrates another overload of the GroupBy method, one that lets us group the objects in a more flexible way.

For this example too, we will use the same List of Score objects that we used in the previous example. However, the objective of the grouping is quite different now.

So, the objective is to retrieve a table of the highest score in each subject, along with the name of the student who attained that score. The structure of the output can be visualized as below:

Subject | Top Score | Top Scorer's Name

To achieve this form of grouping, we need to use GroupBy in a way that lets us define the shape of the result it produces. One of the eight overloads of the GroupBy extension method accepts a delegate that builds the result for each group, so the result type is whatever that delegate returns. Below is the overload we are looking for:

public static IEnumerable<TResult> GroupBy<TSource, TKey, TElement, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    Func<TSource, TElement> elementSelector,
    Func<TKey, IEnumerable<TElement>, TResult> resultSelector)

Here are the various arguments and their meanings:

The first argument, this IEnumerable<TSource> source, is the input sequence itself. Because GroupBy is an extension method, we never pass this argument explicitly; it is the sequence on which the method is called.

The second argument, Func<TSource, TKey> keySelector, is a delegate applied to each element in the input sequence to obtain a key (type TKey). The key decides which group the element belongs to. We want our final output to be grouped on the SubjectName property of the Score object. So, our obvious choice of keySelector is:

groupingKey => groupingKey.SubjectName, //keySelector

The third argument, Func<TSource, TElement> elementSelector, is a delegate applied to each element of the input sequence to obtain the value (type TElement) that should be stored in the relevant group. Once we have grouped our list by SubjectName, we will need to identify the students and their marks in each group.

So, as an element selector we create an anonymous type with the properties Name and Marks:

elementSelector => new { Name = elementSelector.StudentName, Marks = elementSelector.MarksObtained },   //elementSelector 
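To see what these first two selectors produce on their own, here is a minimal sketch that uses the GroupBy overload taking only a keySelector and an elementSelector. The Score class below is a hypothetical stand-in for the one from the previous post (only the three properties used here are assumed), the sample data is made up, and the GroupedRows helper exists only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Score class from the previous post.
public class Score
{
    public string StudentName { get; set; }
    public string SubjectName { get; set; }
    public int MarksObtained { get; set; }
}

public static class ElementSelectorDemo
{
    // Made-up sample data, not the list from the previous post.
    public static List<Score> SampleScores()
    {
        return new List<Score>
        {
            new Score { StudentName = "Asha", SubjectName = "Maths",   MarksObtained = 91 },
            new Score { StudentName = "Ravi", SubjectName = "Maths",   MarksObtained = 78 },
            new Score { StudentName = "Asha", SubjectName = "Science", MarksObtained = 84 },
            new Score { StudentName = "Ravi", SubjectName = "Science", MarksObtained = 88 }
        };
    }

    // Illustrative helper: one line of text per group, listing its elements.
    public static List<string> GroupedRows(IEnumerable<Score> examResult)
    {
        // Each group has a Key (the subject) and holds the projected Name/Marks pairs.
        var groups = examResult.GroupBy(
            score => score.SubjectName,                                              // keySelector
            score => new { Name = score.StudentName, Marks = score.MarksObtained }); // elementSelector

        return groups
            .Select(g => g.Key + ": " + string.Join(", ", g.Select(e => e.Name + "=" + e.Marks)))
            .ToList();
    }

    public static void Main()
    {
        foreach (var row in GroupedRows(SampleScores()))
            Console.WriteLine(row);
    }
}
```

With this sample data the program prints one line per subject, "Maths: Asha=91, Ravi=78" followed by "Science: Asha=84, Ravi=88". These per-subject sequences are what the resultSelector receives as its second argument.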

Now the last argument, Func<TKey, IEnumerable<TElement>, TResult> resultSelector, is a delegate applied to each group to produce a final result (type TResult). It receives the group's key and the sequence of elements produced by the element selector for that group. As output, we want a sequence of objects holding the subject name, the highest score and the name of the student who attained it.

So, we create another anonymous type with the properties SubjectName, HighestScore and HighestScorerName. SubjectName is the key of the grouping operation. HighestScore is the maximum Marks among the elements of the corresponding group (the lambda parameter named elementSelector below is that sequence of elements), and HighestScorerName is the Name of the student whose Marks equal HighestScore. So, here is what our resultSelector looks like:

(groupingKey, elementSelector) => new { //resultSelector
                SubjectName = groupingKey,
                HighestScore = elementSelector.Max(t => t.Marks),
                // FirstOrDefault instead of SingleOrDefault, which throws when two students tie for the top score
                HighestScorerName = elementSelector.Where(t => t.Marks == elementSelector.Max(f => f.Marks)).Select(t => t.Name).FirstOrDefault()}

This is it. Let's assemble the GroupBy call on the List of Score objects from the previous example and see what we get:

var topScorers = examResult.GroupBy(
            groupingKey => groupingKey.SubjectName, //keySelector
            elementSelector => new { Name = elementSelector.StudentName, Marks = elementSelector.MarksObtained },   //elementSelector 
            (groupingKey, elementSelector) => new { //resultSelector
                SubjectName = groupingKey,
                HighestScore = elementSelector.Max(t => t.Marks),
                // FirstOrDefault instead of SingleOrDefault, which throws when two students tie for the top score
                HighestScorerName = elementSelector.Where(t => t.Marks == elementSelector.Max(f => f.Marks)).Select(t => t.Name).FirstOrDefault()
            }).ToList();
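For readers who do not have the previous post at hand, the query can be tried end to end with a sketch like the following. The Score class is a hypothetical stand-in with only the three properties used in this post, the sample data is made up, and TopScorerRows is an illustrative helper that formats each result as a tab-separated line. It uses FirstOrDefault so that a tie for the top score returns the first matching student instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Score class from the previous post.
public class Score
{
    public string StudentName { get; set; }
    public string SubjectName { get; set; }
    public int MarksObtained { get; set; }
}

public static class TopScorerDemo
{
    // Made-up sample data, not the list from the previous post.
    public static List<Score> SampleScores()
    {
        return new List<Score>
        {
            new Score { StudentName = "Asha", SubjectName = "Maths",   MarksObtained = 91 },
            new Score { StudentName = "Ravi", SubjectName = "Maths",   MarksObtained = 78 },
            new Score { StudentName = "Asha", SubjectName = "Science", MarksObtained = 84 },
            new Score { StudentName = "Ravi", SubjectName = "Science", MarksObtained = 88 }
        };
    }

    // Illustrative helper: runs the grouping and formats one line per subject.
    public static List<string> TopScorerRows(IEnumerable<Score> examResult)
    {
        var topScorers = examResult.GroupBy(
            groupingKey => groupingKey.SubjectName,                          // keySelector
            elementSelector => new { Name = elementSelector.StudentName,
                                     Marks = elementSelector.MarksObtained }, // elementSelector
            (groupingKey, elementSelector) => new {                          // resultSelector
                SubjectName = groupingKey,
                HighestScore = elementSelector.Max(t => t.Marks),
                HighestScorerName = elementSelector
                    .Where(t => t.Marks == elementSelector.Max(f => f.Marks))
                    .Select(t => t.Name)
                    .FirstOrDefault()
            }).ToList();

        return topScorers
            .Select(r => string.Format("{0}\t{1}\t{2}", r.SubjectName, r.HighestScore, r.HighestScorerName))
            .ToList();
    }

    public static void Main()
    {
        foreach (var row in TopScorerRows(SampleScores()))
            Console.WriteLine(row);
    }
}
```

With this sample data the output has two rows: Maths, 91, Asha and Science, 88, Ravi. Groups come out in the order in which each subject first appears in the input.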

When you look at the output sequence, notice that this overload of GroupBy does not return an IEnumerable of IGrouping. Instead, it returns an IEnumerable of the anonymous type defined in the last parameter (resultSelector) of the GroupBy method. That is the fun of it.
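The difference in return types can be made visible by writing the types out explicitly. The sketch below projects each subject to a plain string (rather than an anonymous type) so that the variables can be declared without var. The Score class, the sample data and both helper methods are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Score class from the previous post.
public class Score
{
    public string StudentName { get; set; }
    public string SubjectName { get; set; }
    public int MarksObtained { get; set; }
}

public static class ReturnTypeDemo
{
    // Made-up sample data, not the list from the previous post.
    public static List<Score> SampleScores()
    {
        return new List<Score>
        {
            new Score { StudentName = "Asha", SubjectName = "Maths",   MarksObtained = 91 },
            new Score { StudentName = "Ravi", SubjectName = "Maths",   MarksObtained = 78 },
            new Score { StudentName = "Asha", SubjectName = "Science", MarksObtained = 84 },
            new Score { StudentName = "Ravi", SubjectName = "Science", MarksObtained = 88 }
        };
    }

    public static List<string> ViaGroupings(IEnumerable<Score> examResult)
    {
        // Key-only overload: every item is an IGrouping that still needs a projection.
        IEnumerable<IGrouping<string, Score>> groups =
            examResult.GroupBy(score => score.SubjectName);

        return groups
            .Select(g => g.Key + ": " + g.Max(score => score.MarksObtained))
            .ToList();
    }

    public static List<string> ViaResultSelector(IEnumerable<Score> examResult)
    {
        // resultSelector overload: every item is already the projected value.
        IEnumerable<string> rows = examResult.GroupBy(
            score => score.SubjectName,                        // keySelector
            score => score.MarksObtained,                      // elementSelector
            (subject, marks) => subject + ": " + marks.Max()); // resultSelector

        return rows.ToList();
    }

    public static void Main()
    {
        foreach (var row in ViaGroupings(SampleScores()))
            Console.WriteLine(row);
        foreach (var row in ViaResultSelector(SampleScores()))
            Console.WriteLine(row);
    }
}
```

Both helpers produce the same rows ("Maths: 91" and "Science: 88" for this data). The resultSelector overload simply folds the follow-up Select into the GroupBy call.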
