Tag Archives: Csharp

Indexers as Extension Methods?

So I’ve had this nagging issue for a little while. It’s not necessarily a huge issue because I have a workaround, but that said, it still nags at me now and again. That issue is that I cannot create an indexer as an extension method. This isn’t possible for a number of reasons that make sense, but I thought I’d blog about it anyway and solicit thoughts on the idea.

First, let’s talk about what we can do. I can create an extension method for any existing class. Let’s say I have the following class already created:

public class SomeClass{
   public int SomeProperty { get; set; }
}

 

I can then extend a List of SomeClass pretty easily like this:

public static class Extensions{
   public static string SomeListExtensionProperty(this List<SomeClass> classes)   {
      //… provide implementation here …
   }
}

 

Notice that I’ve scoped my extension to only extend a generic List of SomeClass. For instance I can do this:

var sc = new List<SomeClass>();
string s = sc.SomeListExtensionProperty();

 

but I cannot do this:

var soc = new List<SomeOtherClass>();
string s = soc.SomeListExtensionProperty();

 

So what would happen if I tried to extend the List of SomeClass with an indexer like this:

public static string this[int index](this List<SomeClass> classes){
   return classes[index].SomeProperty.ToString();
}

 

We wouldn’t get past compilation. First of all, the keywords “this” and “static” rarely make sense together. That is because “this” refers to an instance of a class while “static” refers to the type itself. That said, we already mix these two keywords when we create extension methods, so that isn’t the only reason this wouldn’t work. The next reason is that the generic List class already provides an indexer. Since instance members always take precedence over extension members, you can’t override existing members with extensions, and you are left without the ability to create an indexer with an extension.

Our only recourse is to write an extension method that provides the same functionality:

public static string GetSomeProperty(this List<SomeClass> classes, int index){
   // SomeProperty is an int, so convert it to match the string return type
   return classes[index].SomeProperty.ToString();
}

 

We can then call something like this:

var sc = new List<SomeClass>();
string s = sc.GetSomeProperty(index);

 

This isn’t quite as abbreviated as an indexer and in fact doesn’t save me anything over what I would get with the out-of-the-box generic List indexer:

var sc = new List<SomeClass>();
string s = sc[index].SomeProperty.ToString();

 

That said, the “nagging issue” is more of a request for a solution looking for a problem. Obviously indexers are great shorthand that ‘can’ provide a ‘default property’ so-to-speak. However, it is very easy to get what you want without much work.
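For anyone who wants to try the workaround end to end, here is a minimal, self-contained sketch. SomeClass and GetSomeProperty come from the snippets above. The SomeClassList wrapper and the IndexerDemo class are hypothetical names I made up for illustration; the wrapper is just one possible way to get indexer syntax when you are free to introduce your own type.

```csharp
using System;
using System.Collections.Generic;

public class SomeClass
{
    public int SomeProperty { get; set; }
}

public static class Extensions
{
    // Extension method from the post: scoped to List<SomeClass> only.
    public static string GetSomeProperty(this List<SomeClass> classes, int index)
    {
        return classes[index].SomeProperty.ToString();
    }
}

// Hypothetical wrapper (not from the post): an indexer has to be an
// instance member, so a wrapper type is one place to declare a custom one.
public class SomeClassList
{
    private readonly List<SomeClass> items = new List<SomeClass>();

    public void Add(SomeClass item) { items.Add(item); }

    public string this[int index]
    {
        get { return items[index].SomeProperty.ToString(); }
    }
}

public static class IndexerDemo
{
    public static void Main()
    {
        var sc = new List<SomeClass> { new SomeClass { SomeProperty = 42 } };
        Console.WriteLine(sc.GetSomeProperty(0));   // extension method call
        Console.WriteLine(sc[0].SomeProperty);      // built-in List<T> indexer

        var wrapped = new SomeClassList();
        wrapped.Add(new SomeClass { SomeProperty = 7 });
        Console.WriteLine(wrapped[0]);              // the wrapper's own indexer
    }
}
```

The trade-off is that the wrapper is a new type, so it does not extend an existing List the way an extension method does.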

Fun with default compilation: VB.NET vs C#

I’m not a language elitist by any means. I came from a Visual Basic background and avoided C++ when I could. These days, however, I’m more of a C# guy. I feel like there is sufficient gain in using the language without much, if any, development-time penalty associated with using it. While I don’t mind someone stating their reasons for using one language over the other, I will not tolerate the argument that these languages compile to the same IL and work just as well. The VB.NET and C# compilers were written by two different teams. By default they emit noticeably different MSIL, and with good reason on both sides. As such, there are some major differences in the IL that is output. I pointed one out the other day when talking about delegate implementation in C# vs VB.NET. Today, I’m noting another.


VB and C# Specific Language


Let’s take a look at the following scenario. I’ve created one project in each language and implemented the same code in both. In the form’s class, I have an array list with 100 items in it, a button control and a label control. For the button’s click event handler, I added the following code:

Private Sub Button1_Click(ByVal sender As System.Object, _
   ByVal e As System.EventArgs) _
   Handles Button1.Click

   For i As Int32 = 1 To list.Count
     Me.lblOutput.Text += list(i - 1).ToString()
   Next
End Sub

This code simply loops through all of the items in the array list and outputs them to the label. Notice that I’m not caching the result of list.Count first. Instead, I’m evaluating it each time in the loop. This raises the hairs on the backs of several people’s necks every time I talk about it. I’ll explain my decision to do this later though. In the meantime, let’s look at what the C# version looks like:

for( int i = 1; i <= list.Count ; i++ ) {
    this.lblOutput.Text += list[i-1].ToString();
}

Again, this whole section has the least optimal of coding decisions, but it is done in an attempt to make these two pieces of code do the same work.


The IL Comparison


These look about the same, but the problem is evident when you look at the IL. Yesterday I posted an example where the IL in VB.NET was using virtual dispatch for delegates whereas C# was using instance dispatch. Virtual dispatches are much more expensive. So what is the problem here? Let’s start by looking at the IL for the C# and VB.NET versions of this code. In C# we have a cool 34 lines of IL:

.method private hidebysig instance void button1_Click(object sender,
       class [mscorlib]System.EventArgs e) cil managed
{
  // Code size  64 (0x40)
  .maxstack  5
  .locals init ([0] int32 i)
  IL_0000:  ldc.i4.1
  IL_0001:  stloc.0
  IL_0002:  br.s  IL_0031
  IL_0004:  ldarg.0
  IL_0005:  ldfld class [System.Windows.Forms]System.Windows.Forms.Label CSharpLoop.Form1::lblOutput
  IL_000a:  dup
  IL_000b:  callvirt
            instance string [System.Windows.Forms]System.Windows.Forms.Control::get_Text()
  IL_0010:  ldarg.0
  IL_0011:  ldfld class [mscorlib]System.Collections.ArrayList CSharpLoop.Form1::list
  IL_0016:  ldloc.0
  IL_0017:  ldc.i4.1
  IL_0018:  sub
  IL_0019:  callvirt instance object [mscorlib]System.Collections.ArrayList::get_Item(int32)
  IL_001e:  callvirt instance string [mscorlib]System.Object::ToString()
  IL_0023:  call  string [mscorlib]System.String::Concat(string, string)
  IL_0028:  callvirt instance void [System.Windows.Forms]System.Windows.Forms.Control::set_Text(string)
  IL_002d:  ldloc.0
  IL_002e:  ldc.i4.1
  IL_002f:  add
  IL_0030:  stloc.0
  IL_0031:  ldloc.0
  IL_0032:  ldarg.0
  IL_0033:  ldfld  class [mscorlib]System.Collections.ArrayList CSharpLoop.Form1::list
  IL_0038:  callvirt  instance int32 [mscorlib]System.Collections.ArrayList::get_Count()
  IL_003d:  ble.s  IL_0004
  IL_003f:  ret
} // end of method Form1::button1_Click

But in VB.NET, take a look at the IL hit here — 44 lines of code.

.method private instance void  Button1_Click(object sender,
          class [mscorlib]System.EventArgs e) cil managed
{
  // Code size  72 (0x48)
  .maxstack  5
  .locals init ([0] int32 i,
      [1] class [System.Windows.Forms]System.Windows.Forms.Label _Vb_t_ref_0,
      [2] int32 _Vb_t_i4_0)
  IL_0000:  nop
  IL_0001:  ldc.i4.1
  IL_0002:  ldarg.0
  IL_0003:  ldfld  class [mscorlib]System.Collections.ArrayList VBLoop.Form1::list
  IL_0008:  callvirt  instance int32 [mscorlib]System.Collections.ArrayList::get_Count()
  IL_000d:  stloc.2
  IL_000e:  stloc.0
  IL_000f:  br.s  IL_0042
  IL_0011:  ldarg.0
  IL_0012:  callvirt instance class [System.Windows.Forms]System.Windows.Forms.Label VBLoop.Form1::get_lblOutput()
  IL_0017:  stloc.1
  IL_0018:  ldloc.1
  IL_0019:  ldloc.1
  IL_001a:  callvirt instance string [System.Windows.Forms]System.Windows.Forms.Control::get_Text()
  IL_001f:  ldarg.0
  IL_0020:  ldfld  class [mscorlib]System.Collections.ArrayList VBLoop.Form1::list
  IL_0025:  ldloc.0
  IL_0026:  ldc.i4.1
  IL_0027:  sub.ovf
  IL_0028:  callvirt instance object [mscorlib]System.Collections.ArrayList::get_Item(int32)
  IL_002d:  callvirt  instance string [mscorlib]System.Object::ToString()
  IL_0032:  call  string [mscorlib]System.String::Concat(string,string)
  IL_0037:  callvirt  instance void [System.Windows.Forms]System.Windows.Forms.Control::set_Text(string)
  IL_003c:  nop
  IL_003d:  nop
  IL_003e:  ldloc.0
  IL_003f:  ldc.i4.1
  IL_0040:  add.ovf
  IL_0041:  stloc.0
  IL_0042:  ldloc.0
  IL_0043:  ldloc.2
  IL_0044:  ble.s  IL_0011
  IL_0046:  nop
  IL_0047:  ret
} // end of method Form1::Button1_Click

There are 10 additional IL lines for the same exact code, four of which appear to be unnecessary “nop” instructions. The MSDN help for nop states: “Fills space if opcodes are patched. No meaningful operation is performed although a processing cycle can be consumed.” So despite the fact that these instructions do nothing, they can waste processing cycles, which, if executed in a loop such as ours, can add up over the long run.

Let’s break this down a bit further to see exactly what’s happening in these functions. Let’s evaluate the C# side first so I can pick on VB.NET’s IL later and show you where it’s going, well, wrong 🙂


C# IL Code Analysis


First off, the C# header is pretty standard.

.method private hidebysig instance void button1_Click(object sender,
        class [mscorlib]System.EventArgs e) cil managed
{
  // Code size  64 (0x40)
  .maxstack   5

Next we see our locally scoped declarations. In this instance, we only have one variable that is initialized as an int32. This will be the variable that we use in our loop counter.

  .locals init ([0] int32 i)

We then push an int32 value of 1 onto the evaluation stack and immediately pop it back off into the local variable we declared above.

  IL_0000:  ldc.i4.1
  IL_0001:  stloc.0

The next step is fairly interesting for anyone that hasn’t seen IL before. We are transferring control to the instruction at the IL_0031 label. We’ll cover this in a minute.

  IL_0002:  br.s  IL_0031

We are now going to load argument 0 onto the evaluation stack (stack size +1). In an instance method, argument 0 is the “this” reference (our form), not the “sender” parameter of the click event. We then use the ldfld instruction to replace it with the form’s lblOutput field, which leaves the label on the stack. The IL is just preparing the label whose text value will be concatenated in the loop.

  IL_0004:  ldarg.0
  IL_0005:  ldfld  class [System.Windows.Forms]System.Windows.Forms.Label CSharpLoop.Form1::lblOutput

Having fun so far? Good, let’s keep on trucking. Next, we are doing something fairly interesting. We are going to duplicate the topmost item on the evaluation stack, so the label reference ends up on the stack twice. Conceptually the value is popped off the stack, duplicated, and both copies are pushed back, but all of that happens within the single dup instruction. Remember our += operator in the code? It needs the label twice: once to read its Text and once to write the new value back. Sound like something a += operator might want to do? I hope so :). Next, we are going to use callvirt to call the getter of the label’s Text property, which consumes one copy of the label and pushes the returned string onto the evaluation stack.

  IL_000a:  dup
  IL_000b:  callvirt  instance string [System.Windows.Forms]System.Windows.Forms.Control::get_Text()
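To see at the source level why the compiler wants two copies of the same reference, here is a small sketch. FakeLabel and GetLabel are hypothetical stand-ins for the real Label control, used only so that we can count how often the target of the += is evaluated.

```csharp
using System;

// Hypothetical stand-in for a Label: only the Text property matters here.
public class FakeLabel
{
    public string Text { get; set; }
}

public static class CompoundAssignmentDemo
{
    public static int Lookups;
    private static readonly FakeLabel label = new FakeLabel();

    // Counts how many times the target expression is evaluated.
    public static FakeLabel GetLabel()
    {
        Lookups++;
        return label;
    }

    public static void Main()
    {
        // Compound assignment: the target is evaluated once, then its
        // Text getter and setter are both called on that one reference.
        GetLabel().Text += "a";
        Console.WriteLine(Lookups);   // 1

        // Written out by hand, the target is evaluated twice.
        GetLabel().Text = GetLabel().Text + "b";
        Console.WriteLine(Lookups);   // 3
    }
}
```

The single evaluation in the first statement is the behavior that the dup instruction supports in the IL above.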

We’ve seen the next series before so I’ll spare you the detailed explanation. We are pushing our array list onto the stack. We need it so we can call its indexer in a moment.

  IL_0010:  ldarg.0
  IL_0011:  ldfld  class [mscorlib]System.Collections.ArrayList CSharpLoop.Form1::list

Remember way back at the start of this method, we initialized a local variable? Well now we are going to load it. We then push an int32 value of 1 onto the evaluation stack again.

  IL_0016:  ldloc.0
  IL_0017:  ldc.i4.1

We do our subtraction that is done in the array indexer (“i-1”).

  IL_0018:  sub

Next we use callvirt again to get the item at the index produced by the sub instruction. Without delay, we call that object’s ToString method, which leaves the returned string on the stack.

  IL_0019:  callvirt  instance object [mscorlib]System.Collections.ArrayList::get_Item(int32)
  IL_001e:  callvirt  instance string [mscorlib]System.Object::ToString()

Of course, our code is concatenating strings, so we call the String.Concat method with the label’s current text and the string we just produced. Notice that when we called Concat we used “call” instead of callvirt? This is because String.Concat is a static method, so there is no instance to dispatch on. We used callvirt against our controls and our list because those are instance calls. The callvirt instruction calls the method against the runtime type of the object, not the compile-time type, and it also checks the object reference for null.

  IL_0023:  call  string [mscorlib]System.String::Concat(string,string)
  IL_0028:  callvirt  instance void [System.Windows.Forms]System.Windows.Forms.Control::set_Text(string)
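The difference between the two kinds of call is easier to see in source code. This is only a sketch: Animal, Dog and DispatchDemo are hypothetical types invented for the example, not anything from the form above.

```csharp
using System;

public class Animal
{
    public virtual string Speak() { return "..."; }
}

public class Dog : Animal
{
    public override string Speak() { return "Woof"; }
}

public static class DispatchDemo
{
    public static string Describe(Animal a)
    {
        // Instance call on a virtual method: resolved against the
        // runtime type of the object, not the declared type.
        return a.Speak();
    }

    public static void Main()
    {
        Animal pet = new Dog();   // compile-time type Animal, runtime type Dog
        Console.WriteLine(Describe(pet));             // Woof

        // Static method: there is no instance, so nothing to dispatch on.
        Console.WriteLine(string.Concat("a", "b"));   // ab
    }
}
```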

Here we go again with this sequence! This time we are loading the value of our loop counter from local variable 0 so we can increment it with the “add” instruction. This adds the value 1 to the value loaded from location 0. Make sense? Good, because we then have to call stloc again to store the result back into location 0.

  IL_002d:  ldloc.0
  IL_002e:  ldc.i4.1
  IL_002f:  add
  IL_0030:  stloc.0

Now we have to load the loop counter from location 0 again so we can do our “for(;;)” comparison. We compare this value against the value returned from the ArrayList’s Count property (get_Count), which is called on every pass through the loop.

  IL_0031:  ldloc.0
  IL_0032:  ldarg.0
  IL_0033:  ldfld  class [mscorlib]System.Collections.ArrayList CSharpLoop.Form1::list
  IL_0038:  callvirt  instance int32 [mscorlib]System.Collections.ArrayList::get_Count()

Remember back near the beginning of all of this, we transferred control to the instruction at IL_0031? Well now ble.s transfers control back to IL_0004 whenever the counter is less than or equal to the count. If this sounds like loop control instructions to you, you’d be right. Lastly, this method ends with a ret instruction. Since this method is void, we don’t return any values.

  IL_003d:  ble.s  IL_0004
  IL_003f:  ret
  } // end of method Form1::button1_Click
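If the jump-to-the-bottom-first shape of the IL is hard to picture, it can be written out in C# with goto statements. This is a hand-written sketch of the control flow, not decompiler output. The LoopShapeDemo class is hypothetical, and a plain string stands in for the label’s Text property.

```csharp
using System;
using System.Collections;

public static class LoopShapeDemo
{
    public static string Concatenate(ArrayList list)
    {
        string output = "";
        int i = 1;                           // ldc.i4.1, stloc.0
        goto Check;                          // br.s IL_0031
    Body:
        output += list[i - 1].ToString();    // loop body, IL_0004 to IL_0028
        i = i + 1;                           // IL_002d to IL_0030
    Check:
        if (i <= list.Count) goto Body;      // IL_0031 to IL_003d (ble.s IL_0004)
        return output;                       // ret
    }

    public static void Main()
    {
        var list = new ArrayList { 1, 2, 3 };
        Console.WriteLine(Concatenate(list));   // 123
    }
}
```

Testing the condition at the bottom means an empty list falls straight through without ever entering the body.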

Was this code really necessary for a simple loop!? Sadly, yes. But before you get upset, let’s remember that we have 10 extra lines in our VB.NET compiled IL! I won’t rehash all of the VB.NET code like we did with C#, but I will concentrate on the differences because that’s what causes our performance hit in VB.NET.


VB.NET IL Code Analysis


Take a look at the first difference: the initialized variables. Notice that we have 3 variables declared here instead of 1. The first is the variable that we use in the loop, and the second is a reference to the label that we will use in the loop. The third is used to cache the value of the Count property. Rather than evaluate this within the loop, the VB.NET code caches this value in a local variable and uses it in the loop comparison.

 .locals init ([0] int32 i,
      [1] class [System.Windows.Forms]System.Windows.Forms.Label _Vb_t_ref_0,
      [2] int32 _Vb_t_i4_0)
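The cached upper bound is a visible behavioral difference, not just a matter of extra locals. Here is a small C# sketch of both strategies. CountingList and LoopBoundDemo are hypothetical helpers I wrote only to count how often Count is read.

```csharp
using System;

// Hypothetical collection stand-in that records every read of Count.
public class CountingList
{
    private readonly int size;
    public int CountReads;

    public CountingList(int size) { this.size = size; }

    public int Count
    {
        get { CountReads++; return size; }
    }
}

public static class LoopBoundDemo
{
    // C#-style loop: the condition, including Count, runs before every pass.
    public static int ReadsWhenReevaluated(int size)
    {
        var list = new CountingList(size);
        for (int i = 1; i <= list.Count; i++) { }
        return list.CountReads;
    }

    // What the VB.NET IL does: read Count once into a temporary.
    public static int ReadsWhenCached(int size)
    {
        var list = new CountingList(size);
        int limit = list.Count;
        for (int i = 1; i <= limit; i++) { }
        return list.CountReads;
    }

    public static void Main()
    {
        Console.WriteLine(ReadsWhenReevaluated(100));   // 101
        Console.WriteLine(ReadsWhenCached(100));        // 1
    }
}
```

With 100 items the re-evaluating loop reads Count 101 times: once for each of the 100 passes plus the final check that ends the loop.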

The next thing I’ll touch on is those nasty nop instructions. These are annoying but are there for debugging support. They allow you to place breakpoints on lines of code that have no executable instructions. These are just placeholders, but as I said, they do waste instructions. Try compiling in release mode instead of debug mode to get rid of them. You should be doing this anyway when you are testing the performance of your code.

The IL that follows matches what we’ve already said. The initial loop value of 1 and the result of get_Count are pushed onto the stack and then stored in locations 0 and 2 respectively. Because the count was pushed last, it is popped first (stloc.2), followed by the counter (stloc.0).

  IL_0001:  ldc.i4.1
  IL_0002:  ldarg.0
  IL_0003:  ldfld  class [mscorlib]System.Collections.ArrayList VBLoop.Form1::list
  IL_0008:  callvirt  instance int32 [mscorlib]System.Collections.ArrayList::get_Count()
  IL_000d:  stloc.2
  IL_000e:  stloc.0

We do the same loop control we had before with the br.s instruction and then move in to load argument 0 (the form itself). VB.NET then calls the get_lblOutput property accessor instead of reading a field, stores the label in location 1 (stloc.1), and loads it from there twice (ldloc.1). Notice this is different than the C# version, which used “dup”.

  IL_0011:  ldarg.0
  IL_0012:  callvirt instance class [System.Windows.Forms]System.Windows.Forms.Label VBLoop.Form1::get_lblOutput()
  IL_0017:  stloc.1
  IL_0018:  ldloc.1
  IL_0019:  ldloc.1

Moving forward we get the text value of the output label and load the array list so we can index into it.

  IL_001a:  callvirt instance string [System.Windows.Forms]System.Windows.Forms.Control::get_Text()
  IL_001f:  ldarg.0
  IL_0020:  ldfld  class [mscorlib]System.Collections.ArrayList VBLoop.Form1::list

We load our first local variable and do our subtraction again. But notice our subtraction call is different.

  IL_0025:  ldloc.0
  IL_0026:  ldc.i4.1
  IL_0027:  sub.ovf

The sub.ovf instruction performs overflow checking on the subtraction operation. Overflow checking is turned on by default in VB.NET compilation, but turned off in C# compilation.
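C# can opt in to the same behavior with the checked keyword, which makes the difference easy to see without reading IL. A minimal sketch follows; OverflowDemo and its two methods are hypothetical names used for illustration.

```csharp
using System;

public static class OverflowDemo
{
    public static int AddUnchecked(int a, int b)
    {
        // Unchecked arithmetic (plain add): overflow wraps around silently.
        return unchecked(a + b);
    }

    public static int AddChecked(int a, int b)
    {
        // Checked arithmetic (add.ovf): overflow throws OverflowException.
        return checked(a + b);
    }

    public static void Main()
    {
        Console.WriteLine(AddUnchecked(int.MaxValue, 1) == int.MinValue);   // True

        try
        {
            AddChecked(int.MaxValue, 1);
        }
        catch (OverflowException)
        {
            Console.WriteLine("overflow detected");
        }
    }
}
```

The checked version buys safety at the cost of an extra overflow test on every arithmetic operation.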

We take the same steps as in the C# version to load the item in our array list, get the result of the ToString call, concatenate the value, and then set the text value of our label.

  IL_0028:  callvirt instance object [mscorlib]System.Collections.ArrayList::get_Item(int32)
  IL_002d:  callvirt  instance string [mscorlib]System.Object::ToString()
  IL_0032:  call  string [mscorlib]System.String::Concat(string,string)
  IL_0037:  callvirt  instance void [System.Windows.Forms]System.Windows.Forms.Control::set_Text(string)

As we draw near the end of our IL, we again go through the steps to increment our loop counter, right after some more of those annoying little nop instructions. Just as the subtraction did overflow checking, the addition does overflow checking, hence the change from add to add.ovf.

  IL_003c:  nop
  IL_003d:  nop
  IL_003e:  ldloc.0
  IL_003f:  ldc.i4.1
  IL_0040:  add.ovf
  IL_0041:  stloc.0

We end our IL nicely by reloading our variables in positions 0 and 2 and comparing them before returning to the front of our loop.

  IL_0042:  ldloc.0
  IL_0043:  ldloc.2
  IL_0044:  ble.s  IL_0011
  IL_0046:  nop
  IL_0047:  ret
}

Summary


This has really just been an informational look at how the compilers differ by default. It’s entirely possible to change a few parameters in the compilers to get the output to look a bit more alike. It’s important to know these differences, though, so that you can make educated decisions on how you should optimize your code based on what language you are working with. Happy hacking.

Delegates in C# vs VB.NET down to the IL

So I found a post on a blog the other day from one of my friends. He is a recent convert from VB.NET to C# and he asked where his event handlers were. Anyone that has been coding in both VB.NET and C# knows exactly what the problem is. Many people tend to misunderstand where the .NET runtime ends and language syntax begins. Let’s remember that both languages must compile to Intermediate Language (IL). The way each language chooses to implement .NET’s features in syntax is totally up to the language designer. This post will attempt to show the differences in these languages and break down how this all works behind the scenes. I’ll show you how events are hooked up in VB.NET, then in C#. I’ll show you how these are compiled down to IL and then provide references where you can find more information if you really want it.

First off, let’s look at how we implement event handlers for controls. When I double click on a control in a VB.NET Windows Forms project, the development environment drops me into VB.NET code and a newly created event handler. The event handler is hooked up as follows:

Private Sub Button1_Click(ByVal sender As System.Object, _
  ByVal e As System.EventArgs) Handles Button1.Click
End Sub

VB.NET knows that this method (Button1_Click) is handling the Click event of the Button1 object by the very aptly named “Handles” keyword followed by the dotted notation of Object.EventName. The method, of course, follows the EventHandler delegate method signature. When I compile this code, the VB.NET compiler outputs the following IL:

IL_0037:  ldarg.0
IL_0038:  dup
IL_0039:  ldvirtftn
          instance void VBHandlers.Form1::Button1_Click(object,
             class [mscorlib]System.EventArgs)
IL_003f:  newobj
          instance void [mscorlib]System.EventHandler::.ctor(object,
             native int)
IL_0044:  callvirt
          instance void [System.Windows.Forms]System.Windows.Forms.Control::add_Click(
             class [mscorlib]System.EventHandler)

This IL is fairly simple. The ldarg.0 and dup instructions push the form instance (“this”) onto the stack twice. The ldvirtftn instruction then pops one copy and pushes a pointer to the Button1_Click function’s implementation, resolved against the object’s runtime type. Next the event handler is created with the newobj instruction, which takes the remaining instance and the function pointer, and finally the callvirt instruction calls add_Click to add the handler to the Click event. Notice that the Click event is exposed as an add_Click method. This is because of a little-known feature of .NET events: event accessors. Much like you can use Get and Set accessors for properties, you can implement add and remove accessors for events. You might also recognize that the prefix on the event is similar to the way that operator overloading looks in IL (e.g. op_Equality).
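Event accessors can also be written by hand in C#, which makes the add_Click and remove_Click methods in the IL less mysterious. The following is a minimal sketch. Publisher and EventAccessorDemo are hypothetical classes, and the Subscriptions counter exists only so the accessor calls can be observed.

```csharp
using System;

public class Publisher
{
    private EventHandler clicked;
    public int Subscriptions;

    // Explicit accessors: the compiler turns these into the
    // add_Click and remove_Click methods.
    public event EventHandler Click
    {
        add { Subscriptions++; clicked += value; }
        remove { Subscriptions--; clicked -= value; }
    }

    public void RaiseClick()
    {
        EventHandler handler = clicked;
        if (handler != null) handler(this, EventArgs.Empty);
    }
}

public static class EventAccessorDemo
{
    public static int Calls;

    private static void OnClick(object sender, EventArgs e) { Calls++; }

    public static void Main()
    {
        var p = new Publisher();
        p.Click += OnClick;   // runs the add accessor
        p.RaiseClick();
        p.Click -= OnClick;   // runs the remove accessor
        p.RaiseClick();       // no handlers left, nothing happens

        Console.WriteLine(Calls);             // 1
        Console.WriteLine(p.Subscriptions);   // 0
    }
}
```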

In C# we have the same convenience of double clicking a control to quickly implement the default event for that control. When we double click the same button in a C# Windows Forms project, we are also presented with a C# code window that has the following code:

private void button1_Click(object sender, System.EventArgs e) {
}

You can see that we have our method implemented that matches the delegate method signature, but where is our handler? How does the button know to hook this up? What happens if I rename my button? Does it just know to handle this method when the event is raised based on the method name? Not hardly. The development environment also adds a handler to this method, but it does it in a more object oriented manner. That is, it hooks up the event by an assignment operator. If you look at your C# windows forms code again, look for the “Windows Form Designer generated code” code region and expand it. You’ll notice in the InitializeComponent method, there is a section of this method dedicated to setting the values for the button1 button instance. At the end of this section is code similar to the following:

this.button1.Click += new System.EventHandler(this.button1_Click);

This code tells the event to append an EventHandler to the Click event. That event handler just happens to be our button1_Click method. When we compile our C# code, we get nearly the same IL output:

IL_005a:  ldarg.0
IL_005b:  ldftn
              instance void CSharpHandlers.Form1::button1_Click(object,
             class [mscorlib]System.EventArgs)
IL_0061:  newobj
              instance void [mscorlib]System.EventHandler::.ctor(object,
              native int)
IL_0066:  callvirt
              instance void [System.Windows.Forms]System.Windows.Forms.Control::add_Click(
              class [mscorlib]System.EventHandler)

Did you catch the difference between the C# and VB.NET IL output? We are no longer using ldvirtftn, but instead, we are simply using ldftn. This is the difference between VB.NET using a virtual dispatch sequence and C# using an instance dispatch sequence. Both are valid according to the ECMA-335 standard. My guess is that if you don’t understand delegates, you won’t get the difference between these two instructions. The main thrust of what most need to see is that both languages implement the same functionality differently through their language syntax. It’s up to the compiler to interpret the syntax and output the appropriate IL code, which more often than not is nearly identical regardless of what language you use. For some languages it may make more sense to use a different IL construct within the same category, but the output is essentially the same. If you attempt to use a different .NET compatible language and you don’t see a feature that you were expecting to see, don’t give up. The feature is most likely implemented in a different way.
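As far as I can tell, the choice between the two instructions only changes observable behavior when the target method is virtual. The sketch below shows what virtual resolution means for a delegate: the delegate is created through a base-typed reference, yet the override runs. BaseForm, DerivedForm and DelegateDispatchDemo are hypothetical types invented for the example, and the comments about which instruction the C# compiler picks reflect my understanding rather than anything verified in this post.

```csharp
using System;

public class BaseForm
{
    public virtual string OnClick() { return "base"; }

    // Non-virtual: there is only one possible target.
    public string OnLoad() { return "load"; }
}

public class DerivedForm : BaseForm
{
    public override string OnClick() { return "derived"; }
}

public static class DelegateDispatchDemo
{
    public static string InvokeClick(BaseForm form)
    {
        // Virtual target: the method pointer is looked up on the
        // runtime type of 'form' (the job ldvirtftn does).
        Func<string> handler = form.OnClick;
        return handler();
    }

    public static string InvokeLoad(BaseForm form)
    {
        // Non-virtual target: a direct method pointer is enough (ldftn).
        Func<string> handler = form.OnLoad;
        return handler();
    }

    public static void Main()
    {
        Console.WriteLine(InvokeClick(new DerivedForm()));   // derived
        Console.WriteLine(InvokeLoad(new DerivedForm()));    // load
    }
}
```

For a non-virtual handler like button1_Click, both instructions end up at the same method, which is why the two compilers can make different choices and still produce the same behavior.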