Jakka .NET

I am Suresh Jakka, software architect at GeoCue Corp, Madison, Alabama. I have been reading several tech blogs and learning a lot. It is about time for me to start a blog on .NET in general so that others can learn from my experiences. Here I will be blogging about obstacles that I encounter in my day-to-day programming world and about the solutions I found.

Friday, March 26, 2010

Better performance with "yield"

I think “yield” is a hidden treasure in C#. It has been there since C# 2.0 (.NET 2.0), but we seldom use it. It can improve performance significantly when you build one huge collection by reading another, especially when the caller does not need all of the results.


If your application has to return a huge collection by reading another huge collection, it has to allocate the memory for the return collection, fill it, and return it to the caller. That memory stays alive until the caller is done with the list and the garbage collector reclaims it.

Here is how we do it normally.

Server method:

private static List<myobject> MyHugeList = new List<myobject>();

public List<myobject> GetMyHugeCollection()
{
    // The whole result list is built before anything is returned
    List<myobject> myReturnList = new List<myobject>();
    foreach (myobject obj in MyHugeList)
    {
        if (true) //Some condition
        {
            myReturnList.Add(obj);
        }
    }
    return myReturnList;
}

Client Method:

public void MyClientMethod()
{
    int index = 0;
    foreach (myobject obj in GetMyHugeCollection())
    {
        //Do something with obj, and stop after you have processed 10 objects
        index++;
        if (index == 10)
        {
            break;
        }
    }
}

This approach has a few performance issues:

• The caller has to wait until “GetMyHugeCollection()” has looped through every member of the huge list before it gets anything back.
• The callee (“GetMyHugeCollection()”) has to allocate the memory for myReturnList, which can be significant if you are dealing with huge collections.
• Even if the caller needs only a few items from the huge list, the callee still has to loop through all the members and allocate the memory for the full return list.

The above issues can be avoided by using “yield return” (and “yield break” when you want to end the sequence early). I will show you how to do it.

Server method:

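private static List<myobject> MyHugeList = new List<myobject>();

public IEnumerable<myobject> GetMyHugeCollection()
{
    foreach (myobject obj in MyHugeList)
    {
        if (true) //Some condition
        {
            // Hand this one object to the caller and pause here
            yield return obj;
        }
    }

    // Optional at this point: the sequence ends when the method ends anyway
    yield break;
}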

Client Method:

public void MyClientMethod()
{
    int index = 0;
    foreach (myobject obj in GetMyHugeCollection())
    {
        //Do something with obj, and stop after you have processed 10 objects
        index++;
        if (index == 10)
        {
            break;
        }
    }
}



In this version, calling “GetMyHugeCollection()” does not run the loop at all; it just hands back an object that can be enumerated. The body starts running when the caller's “foreach” asks for the first element, and it runs only until it reaches “yield return”. At that point the first object goes back to the caller, who can process it immediately. When “foreach” asks for the second element, the callee resumes where it left off and runs until it reaches the next “yield return”. The magic is that the C# compiler rewrites the method into a small state-machine class that saves the state of the callee at each “yield”, so at runtime it knows how to resume the method to get the next object. This is called deferred (or lazy) execution, and iterators have supported it since C# 2.0. The concept is not new; you could get the same effect by hand-writing your own IEnumerator class, it is just a lot more code.
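If you want to see this back-and-forth for yourself, here is a small sketch. The names (YieldTrace, GetNumbers, Log) are made up for this illustration and are not part of the example above; the log simply records who ran when.

```csharp
using System;
using System.Collections.Generic;

public static class YieldTrace
{
    // Records the order of events so it can be inspected afterwards.
    public static readonly List<string> Log = new List<string>();

    // Iterator method: its body does not run until the caller starts enumerating.
    public static IEnumerable<int> GetNumbers()
    {
        Log.Add("callee: started");
        for (int i = 1; i <= 3; i++)
        {
            Log.Add("callee: yielding " + i);
            yield return i;   // pause here until the caller asks for the next item
        }
        Log.Add("callee: finished");
    }

    public static void Run()
    {
        Log.Clear();
        IEnumerable<int> numbers = GetNumbers();
        Log.Add("caller: got the sequence");   // nothing in the callee has run yet
        foreach (int n in numbers)
        {
            Log.Add("caller: received " + n);
        }
    }

    public static void Main()
    {
        Run();
        foreach (string line in Log)
        {
            Console.WriteLine(line);
        }
    }
}
```

When I trace through it, the first line logged is "caller: got the sequence", and after that the callee and caller lines alternate one item at a time, which is exactly the pause-and-resume behavior described above.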

The advantages of this approach are

• You do not have to allocate the memory for “myReturnList” (there is no need for it), so there is no extra list for the garbage collector to clean up later.
• The client can start processing the elements as soon as it gets the first item, which is a big gain if you are dealing with really huge collections.
• If the client needs only 10 items from the list, the callee loops only far enough to find the first 10 matching elements; it does not have to loop through all the elements in the huge collection.
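To put a number on the last point, here is a rough sketch that counts how many source items the callee actually looks at when the caller stops after 10 results. Everything in it (YieldEarlyExit, the Visited counter, the even-number filter, the list of 100,000 integers) is an assumption made for the demonstration, not code from a real project.

```csharp
using System;
using System.Collections.Generic;

public static class YieldEarlyExit
{
    // How many source items the callee has examined so far.
    public static int Visited;

    // Stand-in for the "huge" list: the integers 0..99999.
    private static readonly List<int> Source = BuildSource(100000);

    private static List<int> BuildSource(int count)
    {
        var list = new List<int>(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
        }
        return list;
    }

    // Eager version: builds the whole result list before returning.
    public static List<int> GetEvenNumbersEager()
    {
        var result = new List<int>();
        foreach (int n in Source)
        {
            Visited++;
            if (n % 2 == 0) result.Add(n);
        }
        return result;
    }

    // Lazy version: produces one match at a time, on demand.
    public static IEnumerable<int> GetEvenNumbersLazy()
    {
        foreach (int n in Source)
        {
            Visited++;
            if (n % 2 == 0) yield return n;
        }
    }

    // Takes `wanted` items from the sequence, then stops.
    // Returns how many source items the callee examined to supply them.
    public static int CountVisited(Func<IEnumerable<int>> getItems, int wanted)
    {
        Visited = 0;
        int taken = 0;
        foreach (int n in getItems())
        {
            taken++;
            if (taken == wanted) break;
        }
        return Visited;
    }

    public static void Main()
    {
        Console.WriteLine("eager: " + CountVisited(() => GetEvenNumbersEager(), 10));
        Console.WriteLine("lazy:  " + CountVisited(() => GetEvenNumbersLazy(), 10));
    }
}
```

With this setup the eager version examines all 100,000 items, while the lazy version examines only 19 (the numbers 0 through 18, which contain the first 10 even numbers) before the caller's break stops it.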

This is a little bit confusing in the beginning, but if you run a small example, you will really appreciate the power of the “yield” keyword in C#.

Happy coding!!!
