Saturday, October 22, 2011

C# LINQ To Objects: Simple Grouping using GroupBy

We face number of scenarios in day to day programming when we need to group a number of records based on a key and calculate aggregates like SUM, AVG, MAX, MIN etc on these groups. We do it more frequently in SQL but we can also do it easily in C# using LINQ.

Static class Enumerable in System.Linq namespace defines an extension method- GroupBy with one (simpletest) of the available overloads
public static IEnumerable<IGrouping<TKey, TSource>> GroupBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector) 
This method does to an IEnumerable exactly what GROUP BY in SQL does to a number of records. Lets explore it by means of an example. First we need a class on IEnumerable of which we can apply GroupBy method.
public class Score {
    public string StudentName { get; set; }
    public string SubjectName { get; set; }
    public float MaxMarks { get; set; }
    public float MarksObtained { get; set; }
}
Class Score is a simple class which represents score of a student in a particular subject. There are two float type properties MaxMarks and MarksObtained. These two properties can be used to calculate aggregate on. Lets suppose we have a collection of objects of Score type, each represents the score of a student in a subject and we need to calculate Average marks obtained for students in a particular subject.
Lets first create a list of Score objects
class Program {
    static void Main(string[] args) {
        List<Score> examResult = new List<Score>();
        examResult.Add(new Score() { StudentName = "Steve", SubjectName = "Maths", MaxMarks = 100, MarksObtained = 90 });
        examResult.Add(new Score() { StudentName = "Steve", SubjectName = "Physics", MaxMarks = 100, MarksObtained = 86 });
        examResult.Add(new Score() { StudentName = "Steve", SubjectName = "Chemistry", MaxMarks = 100, MarksObtained = 72 });
        examResult.Add(new Score() { StudentName = "Steve", SubjectName = "Computer Science", MaxMarks = 100, MarksObtained = 91 });

        examResult.Add(new Score() { StudentName = "Sarah", SubjectName = "Maths", MaxMarks = 100, MarksObtained = 85 });
        examResult.Add(new Score() { StudentName = "Sarah", SubjectName = "Physics", MaxMarks = 100, MarksObtained = 76 });
        examResult.Add(new Score() { StudentName = "Sarah", SubjectName = "Chemistry", MaxMarks = 100, MarksObtained = 92 });
        examResult.Add(new Score() { StudentName = "Sarah", SubjectName = "Computer Science", MaxMarks = 100, MarksObtained = 92 });


        examResult.Add(new Score() { StudentName = "David", SubjectName = "Maths", MaxMarks = 100, MarksObtained = 74 });
        examResult.Add(new Score() { StudentName = "David", SubjectName = "Physics", MaxMarks = 100, MarksObtained = 82 });
        examResult.Add(new Score() { StudentName = "David", SubjectName = "Chemistry", MaxMarks = 100, MarksObtained = 85 });
        examResult.Add(new Score() { StudentName = "David", SubjectName = "Computer Science", MaxMarks = 100, MarksObtained = 89 });

        Console.ReadLine();
    }
}
Now we have is a list of type Score having 12 objects of Score for 3 students and 4 subjects and we are all set to determine the average marks obtained by these students in each subject. So, our key to group these records should be SubjectName property and we want to calculate average on MarksObtained property. Here is how it is done-
var avgResults = examResult.GroupBy(rec => rec.SubjectName).
            Select(rec => new { SubjectName = rec.Key, AVGMarks = rec.Average(t => t.MarksObtained) }).ToList();

foreach (var item in avgResults) {
     Console.WriteLine("Subject Name: {0},   Average Marks:{1}", item.SubjectName, item.AVGMarks);
}
Notice that in Select method, we create an Anonymous type with two properties SubjectName and AVGMarks. Finally we get a collection (avgResults) of this Anonymous type.

No comments:

Post a Comment