Friday, January 21, 2011

Internals of loops (While, For and ForEach)

Practically speaking, a loop is the primary building block of a program. We use loop to repeat a set of actions for a certain interval. Now if you think of these intervals, it could be a traversal from one number (called as start index) to another number (called as end index). Very often or probably out of 10 such loop eight times you loop a collection such that you start from 0 and loop until you point to the end of the sequence.

In C# (and VB.NET) we use while, do-while, for and foreach loop to loop through a set of instructions. In this post I will try to demonstrate the basic loops for a while and later on take on a bit about foreach loop and its requirement and finally go deep into its internals.

If you want to see all my Internal Series, you are welcome to follow the link :
Internals to .NET



The Basics

C# comes with three basic types of loops.

1. While Loop : This kind of loop runs while the condition is satisfied.

while (x >= 0)
{
}

2. do-while : This loop runs at least once while the condition is satisfied.

do
{

} while (x >= 10);

3. for : This loop has three sections, index declaration, condition and increment/decrement section. For each call to the instruction the index is incremented and checked with the condition.
for (int i = 0; i < 10; i++)
 {
 }

These are the most basic loops and I hope you already know them and came across with it in your life. .NET introduces another loop called Foreach loop which specially works on an IEnumerable. Each collection in .NET is somehow implements IEnumerable and hence can be used as a part of foreach loop. Hence every sequence (or whatever class) when implements an IEnumerable will have the provision to enumerate through values using foreach.

var enumerable = Enumerable.Range(1, 100);
 foreach (var e in enumerable)
 {

 }

In the above code the instruction within the scope will loop through until the enumerable has value. Now you might wonder why this interface is at all required? Why we need to implement it ? What exactly happens inside these loops. Lets demonstrate these points.

The Internals

Before going further with the internals of loop let me clear out one fact which you always remember. Goto is a construct(keyword) in C# that allows you to unconditionally transfer the control from one place to another. The IL equivalent for goto statement is br.s which takes just an instruction line number to transfer the control to a place. Another small instruction brtrue.s which evaluates the loaded object and moves the control to the instruction line only when the evaluated value turns out to be true.

Hence to summarize :
br.s -> Unconditionally moves the control.
brtrue.s -> Moves the control if loaded value is true.

Now lets start looking at the internals one by one :


While, do - while Loop :
int x=10;
while (x >= 0)
{
}
If you look into the IL instruction set it creates an integer variable x (which we have declared in code) and a placeholder for boolean variable. The L_0004 instruction is actually a goto equivalent which moves the control over to the location specified. It evaluates and stores the result of equality of x with 0; clt and  ceq specifies the less than and equal respectively.
Based on the result,  L_0011 is evaluated and control moves to specified instruction line accordingly.

So basically while loop is a sequence of goto statements between an instruction statements to perform a loop. The instruction between L_0005 - L_0008 is the actual action (one inside curl braces of the loop).

Do-while loop is very similar to while loop, with the only difference being the absence of instruction L_0004, and hence lets it to pass through the action.

For Loop :
for (int i = 0; i < 10; i++)
 {
 }
The instruction set for the for loop is also very similar to that of while. The for loop creates the index in the first place and stores in zero'th local (L_0002). The goto statement ensures the condition is evaluated and hence moves to L_000b. After the condition is evaluated the L_0012 sends back the control to L_0005. The L_0005 - L_0006 represents the action block(one inside the curl braces) and eventually after that it increments the index  element by 1 and continues.

We kept the instruction set simple using no code inside the action block. The only difference of  for loop with that of while is the extra instruction to store local index and increment it after the execution of action each time.




For-Each Loop:
To understand foreach loop you first need to understand the basics of IEnumerable. In .NET, a collection implementing IEnumerable must implement a method called GetEnumerator which returns an IEnumerator.

IEnumerator actually represents the cursor which points to a certain logical index of the collection. Lets see the IEnumerator interface.

public interface IEnumerator<out T>
{
    T Current { get; }
    bool MoveNext();
    void Reset();
}

Hence the IEnumerator actually keeps track of the state of the collection, which allows you to use the method MoveNext to point to the next location, Current holds the current value of the collection and Reset allows you to re-initialize the cursor.

IEnumerable on the other hand wraps the enumerator into it such that any collection which implements it, can use the foreach loop. I will discuss in-depth about how to generate Enumerable in my next post.

Now let us take an example to demonstrate what is happening inside of a foreach loop :


IEnumerable<int> enumerable = Enumerable.Range(1, 100);
 foreach (int e in enumerable)
 {

 }

The above code actually generates a IEnumerable (collection) of 99 elements from 1 through 100. We loop through the enumerable to get each integer and perform the action on it. Now this is not that simple. Foreach loop is actually broken into the following code after it is been compiled to IL.

var enumerable = Enumerable.Range(1, 100);
IEnumerator<int> enumerator = enumerable.GetEnumerator();
try
{
    while (enumerator.MoveNext())
    {
        int element = enumerator.Current;
        //here goes your action instructions
    }
}
finally
{
    IDisposable disposable = enumerator as System.IDisposable;
    if (disposable != null) disposable.Dispose();
}

I think this looks great to you. Yaah, a foreach loop is actually an abstraction to what we see above. It first finds the enumerator using GetEnumerator from the enumerable, uses MoveNext to loop through the instructions and in the finally block, it tries to dispose the enumerator.


Similar to what I showed in the code above you can see the IL to correspond to the same rule. The try / finally ensures that enumerator is disposed after the foreach loop is complete or even an exception occurs.

Conclusion

So, as you saw most of the IL instructions are goto statements, it is not a good idea or not recommended to write goto statement in your code. As suggested, it is fine with goto as soon as the compiler generates this for you, but it will be horrible if the language does consists a large number of unconditional jumps. So beware.

I hope you like this post, I will continue with IEnumerable in my next post. Stay tune.

Thank you for reading.

16 comments:

  1. Is the code sample for while{} supposed to say "int x = 10;" instead of "int i = 10;"?

    ReplyDelete
  2. "but it will be horrible if the language does consists a large number of unconditional jumps" - Why? Its compiler produced code!

    ReplyDelete
  3. @Sohan

    No with that line I tried to mention, it is fine if unconditional jumps are written from the compiler and not written by us in languages.

    :)

    ReplyDelete
  4. You should inspect the IL without debugging info.

    ReplyDelete
  5. You said 1 "through" 100 is 99 elements. Actually that's 100 elements. 1 "TO" 100 is 99 elements. (Where "TO" in this case means "up to but not including".)
    "Through" is equivalent to
    for(int i = 1; i <= 100; ++i){}
    and "TO" is equvalent to replacing "<=" with "<"

    ReplyDelete
  6. Technically mate, there is another way of doing a loop. It's not one I'd recommend, but you could use a goto statement to simulate a loop. Obviously, anybody who does this deserves to be hunted down and slapped around with a mouldy old kipper, but you could do it.

    ReplyDelete
  7. @Pete O\'Hanlon

    Yes budd, GoTo generate br.s, and loops too. But if there is a condition, loops has a better option to use brtrue.s for loaded var.

    I dont mind to use GoTo, as most of the IL stuffs I see is goto equivalent. Even I saw the new Async CTP and it has lots of Goto inside it. Even iterators has goto..

    But as MSDN says, it is better to avoid goto as most of people are not aware of its pros and cons.

    :)

    ReplyDelete
  8. @Anonymous
    My bad. Thanks for letting me know. Will fix it shortly

    ReplyDelete
  9. It is interesting that the compiler generated GOTO is fine while programmers should avoid GOTO. Is the compiler programmed by programmers?

    ReplyDelete
  10. Another form of looping is recursion.

    ReplyDelete
  11. "Practically speaking, a loop is the primary building block of a program."

    That leaves out many programs that do something only once. Most programs I write do not have a loop because they need to do the job only once. The program can be called repeatedly, such as automating a job every day by use of a task scheduler. The task scheduler may have a loop, but the program I wrote does not.

    ReplyDelete
  12. @Anonymous

    Makes sense... Actually we theoretically take their code to be better than ours, and we are more prone to errors than the Compiler programmers.

    :)

    ReplyDelete
  13. @Anonymous
    By the line I tried to mean the primary unit of programming. Loop is one of them.

    :)

    ReplyDelete
  14. To add the the building blocks comment.

    All programs can be broken down into at most three things: Direction, Iteration, Decision. In other words, programs have a beginning, middle and end. They may have some form of looping (actually, in Windows they all will even if you aren't coding it directly - the internal message pump is handled by a honking great loop). They will have some form of decision making, e.g. if this condition then do this, else do that.

    ReplyDelete
  15. This is such a great resource that you are providing and you give it away for free. I enjoy seeing websites that understand the value of providing a prime resource for free. I truly loved reading your post. Thanks!

    ReplyDelete

Please make sure that the question you ask is somehow related to the post you choose. Otherwise you post your general question in Forum section.

Author's new book

Abhishek authored one of the best selling book of .NET. It covers ASP.NET, WPF, Windows 8, Threading, Memory Management, Internals, Visual Studio, HTML5, JQuery and many more...
Grab it now !!!