Saturday, October 22, 2011

C# LINQ To Objects: Simple Grouping using GroupBy

We often face scenarios in day-to-day programming where we need to group records by a key and calculate aggregates such as SUM, AVG, MAX and MIN on those groups. We do this more frequently in SQL, but it is just as easy in C# using LINQ.

The static class Enumerable in the System.Linq namespace defines an extension method, GroupBy. Here is the simplest of its available overloads:
public static IEnumerable<IGrouping<TKey, TSource>> GroupBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector) 
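This method does to an IEnumerable much the same thing that GROUP BY in SQL does to a set of records: it collects the elements that share a key into groups, each exposed as an IGrouping<TKey, TSource>. Let's explore it with an example. First we need a class whose instances we can collect into an IEnumerable and group.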
public class Score {
    public string StudentName { get; set; }
    public string SubjectName { get; set; }
    public float MaxMarks { get; set; }
    public float MarksObtained { get; set; }
}
Score is a simple class that represents a student's score in a particular subject. It has two float properties, MaxMarks and MarksObtained, on which we can calculate aggregates. Let's suppose we have a collection of Score objects, each representing one student's score in one subject, and we need to calculate the average marks obtained by students in each subject.
Let's first create a list of Score objects:
using System;
using System.Collections.Generic;
using System.Linq;

class Program {
    static void Main(string[] args) {
        List<Score> examResult = new List<Score>();
        examResult.Add(new Score() { StudentName = "Steve", SubjectName = "Maths", MaxMarks = 100, MarksObtained = 90 });
        examResult.Add(new Score() { StudentName = "Steve", SubjectName = "Physics", MaxMarks = 100, MarksObtained = 86 });
        examResult.Add(new Score() { StudentName = "Steve", SubjectName = "Chemistry", MaxMarks = 100, MarksObtained = 72 });
        examResult.Add(new Score() { StudentName = "Steve", SubjectName = "Computer Science", MaxMarks = 100, MarksObtained = 91 });

        examResult.Add(new Score() { StudentName = "Sarah", SubjectName = "Maths", MaxMarks = 100, MarksObtained = 85 });
        examResult.Add(new Score() { StudentName = "Sarah", SubjectName = "Physics", MaxMarks = 100, MarksObtained = 76 });
        examResult.Add(new Score() { StudentName = "Sarah", SubjectName = "Chemistry", MaxMarks = 100, MarksObtained = 92 });
        examResult.Add(new Score() { StudentName = "Sarah", SubjectName = "Computer Science", MaxMarks = 100, MarksObtained = 92 });


        examResult.Add(new Score() { StudentName = "David", SubjectName = "Maths", MaxMarks = 100, MarksObtained = 74 });
        examResult.Add(new Score() { StudentName = "David", SubjectName = "Physics", MaxMarks = 100, MarksObtained = 82 });
        examResult.Add(new Score() { StudentName = "David", SubjectName = "Chemistry", MaxMarks = 100, MarksObtained = 85 });
        examResult.Add(new Score() { StudentName = "David", SubjectName = "Computer Science", MaxMarks = 100, MarksObtained = 89 });

        Console.ReadLine();
    }
}
Now we have a list of 12 Score objects covering 3 students and 4 subjects, and we are all set to determine the average marks obtained by these students in each subject. Our grouping key should therefore be the SubjectName property, and we want to calculate the average of the MarksObtained property. Here is how it is done (the snippet goes inside Main, before Console.ReadLine()):
var avgResults = examResult
    .GroupBy(rec => rec.SubjectName)
    .Select(grp => new { SubjectName = grp.Key, AVGMarks = grp.Average(t => t.MarksObtained) })
    .ToList();

foreach (var item in avgResults) {
    Console.WriteLine("Subject Name: {0},   Average Marks:{1}", item.SubjectName, item.AVGMarks);
}
Notice that in the Select method we create an anonymous type with two properties, SubjectName and AVGMarks. The result (avgResults) is a list of this anonymous type, with one item per subject.
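The same grouping can feed the other aggregates mentioned at the start of this post (SUM, MAX, MIN). What follows is only a minimal sketch: the SubjectSummary class and the ScoreStats.Summarize helper are names made up for this illustration, and the list holds just a few of the Score records from above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Score {
    public string StudentName { get; set; }
    public string SubjectName { get; set; }
    public float MaxMarks { get; set; }
    public float MarksObtained { get; set; }
}

// Hypothetical result type for this sketch; an anonymous type would also work
// when the result does not have to leave the method.
public class SubjectSummary {
    public string SubjectName { get; set; }
    public float Highest { get; set; }
    public float Lowest { get; set; }
    public float Total { get; set; }
    public int Count { get; set; }
}

public static class ScoreStats {
    // Groups the scores by subject and computes several aggregates per group.
    public static List<SubjectSummary> Summarize(IEnumerable<Score> scores) {
        return scores
            .GroupBy(rec => rec.SubjectName)
            .Select(grp => new SubjectSummary {
                SubjectName = grp.Key,
                Highest = grp.Max(t => t.MarksObtained),
                Lowest = grp.Min(t => t.MarksObtained),
                Total = grp.Sum(t => t.MarksObtained),
                Count = grp.Count()
            })
            .ToList();
    }
}

public static class Program {
    public static void Main() {
        var examResult = new List<Score> {
            new Score { StudentName = "Steve", SubjectName = "Maths", MaxMarks = 100, MarksObtained = 90 },
            new Score { StudentName = "Steve", SubjectName = "Physics", MaxMarks = 100, MarksObtained = 86 },
            new Score { StudentName = "Sarah", SubjectName = "Maths", MaxMarks = 100, MarksObtained = 85 },
            new Score { StudentName = "Sarah", SubjectName = "Physics", MaxMarks = 100, MarksObtained = 76 },
            new Score { StudentName = "David", SubjectName = "Maths", MaxMarks = 100, MarksObtained = 74 },
            new Score { StudentName = "David", SubjectName = "Physics", MaxMarks = 100, MarksObtained = 82 }
        };

        foreach (var row in ScoreStats.Summarize(examResult)) {
            Console.WriteLine("{0}: Max={1}, Min={2}, Total={3}, Count={4}",
                row.SubjectName, row.Highest, row.Lowest, row.Total, row.Count);
        }
    }
}
```

In LINQ to Objects the groups come back in the order in which each key first appears in the source, so Maths is listed before Physics here.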

Sunday, October 16, 2011

Serializing .NET Classes into XML (C#)

Serialization is the process of converting an object into a stream of bytes. This stream of bytes can then be persisted, for example as a physical XML file.
In the .NET Framework, the System.Xml.Serialization namespace provides the functionality needed to convert a serializable object into a stream of bytes, and System.IO provides the tools to write that stream to a physical file.
Here is an example where we serialize an object into a stream and then save the stream to a physical XML file.

Model classes: First we will create a few simple classes that we want to represent as XML.

public class University {
    public University() { }

    public string Name { get; set; }
    public string Address { get; set; }
    public short Rating { get; set; }
    public List<Institute> AffiliatedInstitutes = new List<Institute>();
}

public class Institute {
    public Institute() { }

    public string Name { get; set; }
    public string Address { get; set; }
    public short Rating { get; set; }
    public List<Student> Students = new List<Student>();
}

public class Course {
    public Course() { }

    public string Name { get; set; }
    public short DurationInMonths { get; set; }
    public string CourseType { get; set; }
}

public class Student {
    public Student() { }

    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Address { get; set; }
    public string EnrollmentNumber { get; set; }
    public Course Course { get; set; }
}

In the example above, we are going to serialize the University class, whose object graph includes all the other types (Institute, Student, Course). To control how these classes are serialized, we can decorate them with attributes. Let's take a look at the same classes with the attributes added:
using System.Collections.Generic;
using System.Xml.Serialization;
using System.IO;
using System;
...
[XmlRoot()]
public class University {
    public University() { }

    public string Name { get; set; }
    public string Address { get; set; }
    public short Rating { get; set; }
    public List<Institute> AffiliatedInstitutes = new List<Institute>();
}

[XmlInclude(typeof(Institute))]
public class Institute {
    public Institute() { }

    public string Name { get; set; }
    public string Address { get; set; }
    public short Rating { get; set; }
    public List<Student> Students = new List<Student>();
}

[XmlInclude(typeof(Course))]
public class Course {
    public Course() { }

    public string Name { get; set; }
    public short DurationInMonths { get; set; }
    public string CourseType { get; set; }
}

[XmlInclude(typeof(Student))]
public class Student {
    public Student() { }

    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Address { get; set; }
    public string EnrollmentNumber { get; set; }
    public Course Course { get; set; }
}

In the class definitions above, there are a few things worth noticing.
XmlRoot() attribute: Marks the class as the root element of the serialized XML and lets us customize that element (for example its name or namespace). Here it is applied to University, which becomes the root node.
XmlInclude(TYPE) attribute: Tells XmlSerializer about a type it may encounter while serializing. It is mainly needed on a base class, to declare derived types that can appear in its place. The concrete types used here (Institute, Student, Course) are discovered through the member declarations, so these classes would also serialize without it.
By default, every unmarked public property or field of a class is serialized as a child element of its parent object (as if marked with XmlElement). If we want a member to appear as an attribute of its parent node instead, we need to decorate it with XmlAttribute("attributeName").
Finally we define a method in our University class that does the actual work: it converts the current object into a stream of bytes (i.e. serializes it) and then writes those bytes to a physical file.
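To illustrate the XmlAttribute remark, here is a small sketch based on a variation of the Course class. The attribute name "type" and the AttributeDemo.ToXml helper are assumptions made for this illustration only; the classes used in the rest of this post are left unchanged.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Course {
    public Course() { }

    // Written as an attribute on the <Course> element instead of a child element.
    // The attribute name "type" is an arbitrary choice for this sketch.
    [XmlAttribute("type")]
    public string CourseType { get; set; }

    // Unmarked public properties are still written as child elements.
    public string Name { get; set; }
    public short DurationInMonths { get; set; }
}

public static class AttributeDemo {
    // Hypothetical helper: serializes a Course into an XML string so we can inspect it.
    public static string ToXml(Course course) {
        var serializer = new XmlSerializer(typeof(Course));
        using (var writer = new StringWriter()) {
            serializer.Serialize(writer, course);
            return writer.ToString();
        }
    }

    public static void Main() {
        var course = new Course {
            CourseType = "PG-Degree",
            Name = "Masters of Business Administration",
            DurationInMonths = 24
        };
        Console.WriteLine(ToXml(course));
    }
}
```

In the printed XML, CourseType should show up as type="PG-Degree" on the Course element, while Name and DurationInMonths remain child elements.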
public class University {
    public University() { }

    public string Name { get; set; }
    public string Address { get; set; }
    public short Rating { get; set; }
    public List<Institute> AffiliatedInstitutes = new List<Institute>();

    public bool SaveToXML(string filePath) {
        try {
            // Instantiate an XmlSerializer, specifying the root object type (i.e. University)
            XmlSerializer serializer = new XmlSerializer(typeof(University));

            // The memory stream acts as a container for the serialized bytes
            using (MemoryStream ms = new MemoryStream()) {
                // Run Serialize on the current instance of the University class
                serializer.Serialize(ms, this);

                // Rewind the stream and read it into a string. This step is optional,
                // but it lets us inspect the XML in the debugger before we write it.
                ms.Position = 0;
                string data;
                using (StreamReader reader = new StreamReader(ms)) {
                    data = reader.ReadToEnd();
                }

                // Now write the string to the specified path
                File.WriteAllText(filePath, data);
                return true;
            }
        } catch (Exception) {
            // Any serialization or I/O failure is reported as false
            return false;
        }
    }
}

Now, let's test the code above:

static void Main(string[] args) {
    Course mastersBusiness = new Course() {
        CourseType = "PG-Degree",
        DurationInMonths = 24,
        Name = "Masters of Business Administration"
    };

    Course bachelorEngineering = new Course() {
        CourseType = "Graguate-Degree",
        DurationInMonths = 48,
        Name = "Bachelor of Engineering"
    };

    Student steveRichards = new Student() {
        FirstName = "Steve",
        LastName = "Richards",
        Course = bachelorEngineering,
        Address = string.Empty,
        EnrollmentNumber = "BE20111234"
    };

    Student davidBaker = new Student() {
        FirstName = "David",
        LastName = "Baker",
        Course = mastersBusiness,
        Address = string.Empty,
        EnrollmentNumber = "MB20111234"
    };

    Institute rafaelInstitute = new Institute() {
        Name = "St. Rafael Institute for Higher Studies",
        Address = "123, Orleans Dr., Santa Clara, CA 94902",
        Rating = 3
    };

    rafaelInstitute.Students.Add(steveRichards);
    rafaelInstitute.Students.Add(davidBaker);

    University testUniversity = new University() {
        Name = "State University of California",
        Address = "Palo Alto, CA, 92033",
        Rating = 5
    };

    testUniversity.AffiliatedInstitutes.Add(rafaelInstitute);
    testUniversity.SaveToXML(@"F:\" + testUniversity.Name + ".xml");
}

If everything went well so far, here is the XML that should now be written to the path you specified:

<?xml version="1.0"?>
<University xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <AffiliatedInstitutes>
    <Institute>
      <Students>
        <Student>
          <FirstName>Steve</FirstName>
          <LastName>Richards</LastName>
          <Address />
          <EnrollmentNumber>BE20111234</EnrollmentNumber>
          <Course>
            <Name>Bachelor of Engineering</Name>
            <DurationInMonths>48</DurationInMonths>
            <CourseType>Graguate-Degree</CourseType>
          </Course>
        </Student>
        <Student>
          <FirstName>David</FirstName>
          <LastName>Baker</LastName>
          <Address />
          <EnrollmentNumber>MB20111234</EnrollmentNumber>
          <Course>
            <Name>Masters of Business Administration</Name>
            <DurationInMonths>24</DurationInMonths>
            <CourseType>PG-Degree</CourseType>
          </Course>
        </Student>
      </Students>
      <Name>St. Rafael Institute for Higher Studies</Name>
      <Address>123, Orleans Dr., Santa Clara, CA 94902</Address>
      <Rating>3</Rating>
    </Institute>
  </AffiliatedInstitutes>
  <Name>State University of California</Name>
  <Address>Palo Alto, CA, 92033</Address>
  <Rating>5</Rating>
</University>
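As the comments in SaveToXML point out, reading the stream into a string is only there to help with debugging. If that step is not needed, the serializer can write straight to a file stream. The following is a rough sketch under that assumption: the XmlFileWriter.SaveToXml helper and the temp-folder path are made up for this illustration, and a small Course class stands in for the full University graph.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Course {
    public Course() { }

    public string Name { get; set; }
    public short DurationInMonths { get; set; }
    public string CourseType { get; set; }
}

public static class XmlFileWriter {
    // Hypothetical helper: serializes any XML-serializable object directly to a file,
    // skipping the intermediate MemoryStream and string.
    public static bool SaveToXml<T>(T instance, string filePath) {
        try {
            var serializer = new XmlSerializer(typeof(T));
            using (var stream = new FileStream(filePath, FileMode.Create, FileAccess.Write)) {
                serializer.Serialize(stream, instance);
            }
            return true;
        } catch (Exception ex) {
            // Report the reason instead of silently returning false
            Console.Error.WriteLine("Could not save XML: " + ex.Message);
            return false;
        }
    }
}

public static class Program {
    public static void Main() {
        var course = new Course {
            Name = "Bachelor of Engineering",
            DurationInMonths = 48,
            CourseType = "Graduate-Degree"
        };

        // The temp folder is used here only so the sketch runs on any machine
        string path = Path.Combine(Path.GetTempPath(), "course.xml");
        bool saved = XmlFileWriter.SaveToXml(course, path);
        Console.WriteLine(saved ? "Saved to " + path : "Save failed");
    }
}
```

Writing the failure reason to the error stream is one possible choice; rethrowing the exception to the caller would be another.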