Author Archives: Tobint

Thread-Safe Generic Dictionary

The following question was posted on MSDN forums last night and I thought I’d take a little time to answer this considering my apparent threading affinity (*cough*). Here was the question:

“Could someone give me a pointer as to how I might implement a thread safe wrapper around a genric dictionary? I’ve written thread-safe dictionaries in C# 1.x by inheriting from the DictionaryBase but I’m a bit stumped as to how to acheive this using Generics.”

I can certainly understand the complaint here. It comes from the fact that Microsoft apparently didn’t think anyone would need thread-safe generic collections — or if you did, you should do your own work on this. Think I’m wrong? Consider this link, which states:

A System.Collections.Generic.Dictionary<,> can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread safe procedure. To guarantee thread safety during enumeration, you can lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.

What this means is that you have some limited options: 1) create a subclass of System.Collections.Generic.Dictionary<,>, 2) create a utility class for modifying the collection safely, or 3) create your own thread-safe dictionary from scratch.

Let’s consider these one by one.

The first option: Sub-Class
This seems a bit ridiculous. Your methods might look something like the following:

using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;

namespace ThreadedGenerics {
  class ThreadSafeDictionary<T, U> : Dictionary<T, U> {
    public new void Add(T key, U value) {
      // Add your preferred thread locking mechanism here
       base.Add(key, value);
      // unlocking here
    }
    // TODO : Remaining Method Implementations
  }
}

or in VB.NET :

Imports System.Threading
Imports System.Collections.Generic
Public Class ThreadSafeDictionary(Of T, U)
    Inherits Dictionary(Of T, U)
    Public Shadows Sub Add(ByVal key As T, ByVal value As U)
        ' Add your preferred thread locking mechanism here
        MyBase.Add(key, value)
        ' unlocking here
    End Sub
    ' TODO : Remaining Method Implementations
End Class

This is obviously silly because we have to redeclare every method anyway, only to wrap the call to the base class in a critical section. You should lock the underlying collection as well; which locking mechanism you prefer is up to you.

Additionally, you have to qualify the method ‘overrides’ with “new” (Shadows in VB.NET) instead of override because Dictionary<,> doesn’t mark its methods as virtual. That means the hiding is not polymorphic: any caller holding the object through a Dictionary<,> or IDictionary<,> reference binds to the base-class method and skips your lock entirely. Let’s go ahead and scratch this option off the list. It just doesn’t make a lot of sense and violates OO.
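To make the hiding problem concrete, here is a minimal sketch. The class and member names (CountingDictionary, GuardedAdds, HidingDemo) are made up for illustration, and the plain lock is just one possible locking mechanism. It counts how many Add calls actually pass through the guarded method.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical subclass: counts the calls that really go through the lock.
public class CountingDictionary<TKey, TValue> : Dictionary<TKey, TValue>
{
    private readonly object sync = new object();
    public int GuardedAdds;

    // "new" hides Dictionary<,>.Add; it does not override it.
    public new void Add(TKey key, TValue value)
    {
        lock (sync)
        {
            GuardedAdds++;
            base.Add(key, value);
        }
    }
}

public static class HidingDemo
{
    // Returns how many of the three adds went through the guarded method.
    public static int Run()
    {
        CountingDictionary<string, int> safe = new CountingDictionary<string, int>();
        safe.Add("a", 1);                       // bound to the subclass method: locked

        Dictionary<string, int> plain = safe;   // same object, base-class reference
        plain.Add("b", 2);                      // bound to Dictionary<,>.Add: lock skipped

        IDictionary<string, int> iface = safe;  // same object, interface reference
        iface.Add("c", 3);                      // interface maps to the base method: lock skipped

        return safe.GuardedAdds;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 1: only the first call was guarded
    }
}
```

All three items end up in the dictionary, but only one of the calls was protected. Any code that receives the object as a Dictionary<,> or IDictionary<,> silently loses the thread safety.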

The second option: Create a Utility Class

This method seems to make a bit more sense. You simply create a utility class that holds the dictionary and wraps each call for you. It’s cleaner than the “inheritance” method. (I hesitate to call anything inheritance when you can’t override the functionality but rather follow a hide-recall pattern.) It would look something like this:

using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;

namespace NewWinFormsFeatures
{
    public class ThreadSafeDictionaryUtility<T, U>
    {
       Dictionary<T, U> dict;
       public ThreadSafeDictionaryUtility() {
            dict = new Dictionary<T, U>();
        }
        public void SafeAdd(T key, U value) {
            // Add your preferred thread locking mechanism here
            dict.Add(key, value);
            // unlocking here
        }
        // TODO : Remaining Method Implementations
    }
}

or in VB.NET

Imports System.Threading
Imports System.Collections.Generic
Public Class ThreadSafeDictionaryUtility(Of T, U)
    Private dict As Dictionary(Of T, U)
    Public Sub New()
        Me.dict = New Dictionary(Of T, U)
    End Sub
    Public Sub SafeAdd(ByVal key As T, ByVal value As U)
        ' Add your preferred thread locking mechanism here
        Me.dict.Add(key, value)
        ' unlocking here
    End Sub
End Class

This should work pretty well, but it isn’t the most efficient way to get what you want. It does, however, spare you from writing much implementation. You provide only the methods you want and simply wrap each one in a critical section or object lock.
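For what it’s worth, here is a slightly fuller sketch of the utility-class idea with the placeholders filled in. It assumes a plain lock on a private object is an acceptable locking mechanism. The names (LockedDictionary, SafeSnapshot and so on) are mine, not part of any framework API. Enumeration is handled by copying the contents under the lock, which is one way to follow the documentation’s advice about locking during enumeration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper: every operation takes the same private lock.
public class LockedDictionary<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> dict = new Dictionary<TKey, TValue>();
    private readonly object sync = new object();

    public void SafeAdd(TKey key, TValue value)
    {
        lock (sync) { dict.Add(key, value); }
    }

    public bool SafeTryGetValue(TKey key, out TValue value)
    {
        lock (sync) { return dict.TryGetValue(key, out value); }
    }

    public bool SafeRemove(TKey key)
    {
        lock (sync) { return dict.Remove(key); }
    }

    public int SafeCount
    {
        get { lock (sync) { return dict.Count; } }
    }

    // Copies the contents while holding the lock; callers enumerate the copy.
    public List<KeyValuePair<TKey, TValue>> SafeSnapshot()
    {
        lock (sync) { return new List<KeyValuePair<TKey, TValue>>(dict); }
    }
}

public static class LockedDictionaryDemo
{
    // Four threads each add 1000 distinct keys; returns the final count.
    public static int Run()
    {
        LockedDictionary<int, string> map = new LockedDictionary<int, string>();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.Length; t++)
        {
            int offset = t * 1000; // fresh variable per iteration, so keys never collide
            workers[t] = new Thread(delegate()
            {
                for (int i = 0; i < 1000; i++) map.SafeAdd(offset + i, "v");
            });
            workers[t].Start();
        }
        foreach (Thread w in workers) w.Join();
        return map.SafeCount;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints 4000
    }
}
```

Because the inner dictionary is private, there is no base-class reference a caller can use to sneak around the lock, which is the main advantage over the subclass approach.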

The third option: Rolling your own

This last option requires you to create your own implementation from scratch. You would create a class that implements the IDictionary<,> interface (and any others that you want, such as ISerializable). I won’t get into details here because to do this right you really need to do a lot of work, and it’s already 3am here 🙂

What I will say is that Microsoft does state that DictionaryBase is not thread safe. What it does use is the BeginCriticalRegion and EndCriticalRegion methods to let the host know that exceptions in portions of the “add” code may be damaging to other code in the AppDomain.

After waking up this morning and reviewing this code (per a very kind individual who bluntly pointed out my error), I feel it’s necessary to write an explanation of what those methods do and when they are not appropriate. Wrapping BeginCriticalRegion and EndCriticalRegion around the ENTIRE Dictionary.Add method would be inappropriate. Where it would be appropriate to use these methods is in this third option, rolling your own. You want them around the very small portion of the code that does the array shifting and modifying of the underlying value. Otherwise, in the case of wrapping them around the Dictionary.Add method, something as simple as an invalid argument passed into your thread-safe class may cause the entire AppDomain to shut down.

HTH.

Moving Tomorrow — Feedmap Changing

Tomorrow, I am moving from Charlotte, NC to Greenville, SC. I’ll be relocating my feedmap as well. As such, I want to wish my fellow Charlotte bloggers a fond farewell, but not a goodbye. I may be moving, but I’ll be watching you all, so no funny business!

On another note, welcome Greenville bloggers! It’s good to be back 🙂

Mike Downen has started blogging!!!

YES! Another good resource for security information has just started blogging. Mike Downen is the program manager for security in the CLR. Man, I really wish he had come with the other CLR team members to Atlanta. In any case, point your aggregators to this blog. It has the potential to provide very valuable security information.

Thread Switching Overhead

I was reading a post by Sue Loh today. It gives a very cool insight into some threading problems with an expedited thread quantum on CE. Essentially, since threads have half the time to do their actual work, the thread context switching overhead takes a proportionally bigger bite out of performance. To put this in a real-world situation: I am currently driving 1.5 hours each way to work, so I spend a tremendous amount of time on the road. Of course, I put in 9-hour days or so. Most people feel this is a long way to drive, and I agree (hence why I’m moving to Greenville, SC this weekend). However, imagine if I had to drive to work in the morning, put in 4 or 5 hours of work, then drive home and back and put in another 4 or 5 hours of work. I’d be spending just about as much time going back and forth to work as I did actually working. This is the same problem with the expedited thread quantum. Sue cites the difference between the 100 milliseconds that is set by default in CE and the 50 milliseconds that an OEM might change this value to. Keep in mind that if you reduce the amount of work that can be done in a single pass, the fixed overhead of each context switch becomes much more evident because the actual work cycles are truncated.
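The arithmetic behind this is simple enough to sketch. The snippet below assumes one context switch of fixed cost per quantum. The 1 ms switch cost is a made-up figure used purely for illustration, not a measured CE number, and the names (QuantumOverhead, OverheadFraction) are mine.

```csharp
using System;

public static class QuantumOverhead
{
    // Fraction of wall-clock time spent switching rather than working,
    // assuming exactly one switch of fixed cost per quantum.
    public static double OverheadFraction(double quantumMs, double switchCostMs)
    {
        return switchCostMs / (quantumMs + switchCostMs);
    }

    public static void Main()
    {
        const double switchCostMs = 1.0; // hypothetical cost, for illustration only

        Console.WriteLine("100 ms quantum: {0:P2}", OverheadFraction(100, switchCostMs));
        Console.WriteLine(" 50 ms quantum: {0:P2}", OverheadFraction(50, switchCostMs));
    }
}
```

Whatever the real switch cost is, halving the quantum roughly doubles the share of time lost to switching, just as splitting my workday in two would double my time on the road.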

Beating the dead horse – Wireless networks again

I hate to keep beating a dead horse here, but I want to get my point across concerning wireless networking. SANS has now issued a paper about the basic insecurities of wireless. Check it out and at least use this paper as input for threat modeling your networks and wireless applications.

Singularity

Here’s another quick post. Microsoft Research is writing an operating system in C#! I’ve been talking about wanting to do this for a while. There was even some mention of this from JasonZ when the CLR team came to Atlanta a while ago. Definitely looks like a fun project to keep up with.

Who’s reading my blog? Wall Street Journal staff?

A few days ago, I blogged about wireless communications and the fact that “Friends don’t let friends use wireless”. Not two days after my post, the Wall Street Journal posted an article about the very same thing (requires WSJ.com subscription). Granted, they actually did the reporting well by researching the topic. They probably haven’t ever seen my blog, but it’s nice to see a national publication renowned for reporting precisely about most topics (other than political opinion) confirm my thoughts on the matter.

.NET Code Access Security – The fast version

I spoke at this month’s GSP Developers Guild meeting as the “short presenter”. We typically have two presentations each month: one short and one long. I had to cut my Code Camp slides in half, but I managed to only overrun a 30 minute presentation by, say, 15 minutes 🙂 Glen Gordon was our “long presenter” today. He gave a great presentation on ASP.NET Mobile Controls. I knew the presentation material, but it’s always great to see people respond to the technology like they did. Glen is a beast: he gave a 4 hour long presentation at the Greenville MSDN event today too. From 1pm to 5pm today he discussed Web Services, SQL Server 2005 with end point registration, InfoPath consumption of the web services and end points, ClickOnce deployment and more. He then took a quick drive over to the guild meeting to give another hour-long talk on mobile web development. It was a geek decathlon!

Thanks for the great day guys. I had a blast giving my presentation again. And I really enjoyed watching your presentations again Glen.

Fun with default compilation: VB.NET vs C#

I’m not a language elitist by any means. I came from a Visual Basic background and avoided C++ when I could. These days, however, I’m more of a C# guy. I feel like there is sufficient gain in using the language without much, if any, development-time penalty associated with using it. While I don’t mind someone stating their reasons for using one language over the other, I will not tolerate the argument that these languages compile to the same IL and work just as well. The VB.NET and C# compilers were written by two different teams. By default they emit noticeably different MSIL, and with good reason on both sides. As such, there are some major differences in the IL that is output. I pointed one out the other day when talking about delegate implementation in C# vs VB.NET. Today, I’m noting another.


VB and C# Specific Language


Let’s take a look at the following scenario. I’ve created two projects in both languages and implemented the same code in both. In the form’s class, I have an array list with 100 items in it, a button control and a label control. For the button’s click event handler, I added the following code:

Private Sub Button1_Click(ByVal sender As System.Object, _
   ByVal e As System.EventArgs) _
   Handles Button1.Click

   For i As Int32 = 1 To list.Count
     Me.lblOutput.Text += list(i - 1).ToString()
   Next
End Sub

This code simply loops through all of the items in the array list and appends them to the label. Notice that I’m not caching the results of list.Count first. Instead, I’m evaluating this each time in the loop. This raises the hairs on the backs of several people’s necks every time I talk about it. I’ll explain my decision to do this later though. In the meantime, let’s look at what the C# version looks like:

for( int i = 1; i <= list.Count ; i++ ) {
    this.lblOutput.Text += list[i-1].ToString();
}

Again, this whole section has the least optimal of coding decisions, but it is done in an attempt to make these two pieces of code do the same work.


The IL Comparison


These look about the same, but the problem is evident when you look at the IL. Yesterday I posted an example where the IL in VB.NET was using virtual dispatch for delegates whereas C# was using instance dispatch. Virtual dispatches are much more expensive. So what is the problem here? Let’s start by looking at the IL for the C# and VB.NET versions of this code. In C# we have a cool 34 lines of IL:

.method private hidebysig instance void button1_Click(object sender,
       class [mscorlib]System.EventArgs e) cil managed
{
  // Code size  64 (0x40)
  .maxstack  5
  .locals init ([0] int32 i)
  IL_0000:  ldc.i4.1
  IL_0001:  stloc.0
  IL_0002:  br.s  IL_0031
  IL_0004:  ldarg.0
  IL_0005:  ldfld class [System.Windows.Forms]System.Windows.Forms.Label CSharpLoop.Form1::lblOutput
  IL_000a:  dup
  IL_000b:  callvirt
            instance string [System.Windows.Forms]System.Windows.Forms.Control::get_Text()
  IL_0010:  ldarg.0
  IL_0011:  ldfld class [mscorlib]System.Collections.ArrayList CSharpLoop.Form1::list
  IL_0016:  ldloc.0
  IL_0017:  ldc.i4.1
  IL_0018:  sub
  IL_0019:  callvirt instance object [mscorlib]System.Collections.ArrayList::get_Item(int32)
  IL_001e:  callvirt instance string [mscorlib]System.Object::ToString()
  IL_0023:  call  string [mscorlib]System.String::Concat(string, string)
  IL_0028:  callvirt instance void [System.Windows.Forms]System.Windows.Forms.Control::set_Text(string)
  IL_002d:  ldloc.0
  IL_002e:  ldc.i4.1
  IL_002f:  add
  IL_0030:  stloc.0
  IL_0031:  ldloc.0
  IL_0032:  ldarg.0
  IL_0033:  ldfld  class [mscorlib]System.Collections.ArrayList CSharpLoop.Form1::list
  IL_0038:  callvirt  instance int32 [mscorlib]System.Collections.ArrayList::get_Count()
  IL_003d:  ble.s  IL_0004
  IL_003f:  ret
} // end of method Form1::button1_Click

But in VB.NET, take a look at the IL hit here — 44 lines of code.

.method private instance void  Button1_Click(object sender,
          class [mscorlib]System.EventArgs e) cil managed
{
  // Code size  72 (0x48)
  .maxstack  5
  .locals init ([0] int32 i,
      [1] class [System.Windows.Forms]System.Windows.Forms.Label _Vb_t_ref_0,
      [2] int32 _Vb_t_i4_0)
  IL_0000:  nop
  IL_0001:  ldc.i4.1
  IL_0002:  ldarg.0
  IL_0003:  ldfld  class [mscorlib]System.Collections.ArrayList VBLoop.Form1::list
  IL_0008:  callvirt  instance int32 [mscorlib]System.Collections.ArrayList::get_Count()
  IL_000d:  stloc.2
  IL_000e:  stloc.0
  IL_000f:  br.s  IL_0042
  IL_0011:  ldarg.0
  IL_0012:  callvirt instance class [System.Windows.Forms]System.Windows.Forms.Label VBLoop.Form1::get_lblOutput()
  IL_0017:  stloc.1
  IL_0018:  ldloc.1
  IL_0019:  ldloc.1
  IL_001a:  callvirt instance string [System.Windows.Forms]System.Windows.Forms.Control::get_Text()
  IL_001f:  ldarg.0
  IL_0020:  ldfld  class [mscorlib]System.Collections.ArrayList VBLoop.Form1::list
  IL_0025:  ldloc.0
  IL_0026:  ldc.i4.1
  IL_0027:  sub.ovf
  IL_0028:  callvirt instance object [mscorlib]System.Collections.ArrayList::get_Item(int32)
  IL_002d:  callvirt  instance string [mscorlib]System.Object::ToString()
  IL_0032:  call  string [mscorlib]System.String::Concat(string,string)
  IL_0037:  callvirt  instance void [System.Windows.Forms]System.Windows.Forms.Control::set_Text(string)
  IL_003c:  nop
  IL_003d:  nop
  IL_003e:  ldloc.0
  IL_003f:  ldc.i4.1
  IL_0040:  add.ovf
  IL_0041:  stloc.0
  IL_0042:  ldloc.0
  IL_0043:  ldloc.2
  IL_0044:  ble.s  IL_0011
  IL_0046:  nop
  IL_0047:  ret
} // end of method Form1::Button1_Click

There are 10 additional IL lines for the same exact code, four of which appear to be unnecessary “nop” instructions. The MSDN help for nop states: “Fills space if opcodes are patched. No meaningful operation is performed although a processing cycle can be consumed.” So despite the fact that these instructions do nothing, they can waste processing cycles, which, if executed in a loop such as ours, can add up over the long run.

Let’s break this down a bit further to see exactly what’s happening in these functions. Let’s evaluate the C# side first so I can pick on VB.NET’s IL later and show you where it’s going, well, wrong 🙂


C# IL Code Analysis


First off, the C# header is pretty standard.

.method private hidebysig instance void button1_Click(object sender,
        class [mscorlib]System.EventArgs e) cil managed
{
  // Code size  64 (0x40)
  .maxstack   5

Next we see our locally scoped declarations. In this instance, we only have one variable that is initialized as an int32. This will be the variable that we use in our loop counter.

  .locals init ([0] int32 i)

We then push an int32 value of 1 onto the evaluation stack and immediately pop it back off into the local variable we declared above.

  IL_0000:  ldc.i4.1
  IL_0001:  stloc.0

The next step is fairly interesting for anyone that hasn’t seen IL before. We are transferring control to the instruction at the IL_0031 label. We’ll cover this in a minute.

  IL_0002:  br.s  IL_0031

We are now going to load argument 0 onto the evaluation stack (stack size +1). In an instance method, argument 0 is the “this” reference (our form), not the “sender” parameter, which is argument 1. We then use the ldfld instruction to replace that reference on the stack with the form’s lblOutput field, our label. The IL is just preparing the text value to be concatenated in the loop.

  IL_0004:  ldarg.0
  IL_0005:  ldfld  class [System.Windows.Forms]System.Windows.Forms.Label CSharpLoop.Form1::lblOutput

Having fun so far? Good, let’s keep on trucking. Next, we are doing something fairly interesting. We are going to duplicate the topmost item on the evaluation stack and push the copy onto the stack as well, so the stack now holds two references to the label. Remember our += operator in the code? It needs the label twice: once to read the Text property and once to write it back. Sound like something a += operator might want to do? I hope so :). Next, we are going to use callvirt to call the label’s Text getter, which pops one of those references and pushes the returned string onto the evaluation stack.

  IL_000a:  dup
  IL_000b:  callvirt  instance string [System.Windows.Forms]System.Windows.Forms.Control::get_Text()

We’ve seen the next series before so I’ll spare you the detailed explanation. We are pushing our array list onto the stack. We need this so we can call its indexer in a moment.

  IL_0010:  ldarg.0
  IL_0011:  ldfld  class [mscorlib]System.Collections.ArrayList CSharpLoop.Form1::list

Remember way back at the start of this method, we initialized a local variable? Well, now we are going to load it. We then push an int32 value of 1 onto the evaluation stack again.

  IL_0016:  ldloc.0
  IL_0017:  ldc.i4.1

We do our subtraction that is done in the array indexer (“i-1”).

  IL_0018:  sub

Next we use callvirt again to get the item at the index produced by the sub instruction. Without delay, we call that object’s ToString method, which leaves its return value on the stack.

  IL_0019:  callvirt  instance object [mscorlib]System.Collections.ArrayList::get_Item(int32)
  IL_001e:  callvirt  instance string [mscorlib]System.Object::ToString()

Of course, our code is concatenating strings, so we call the String.Concat method with our control’s text value and the new string. Notice that when we called Concat we used “call” instead of callvirt? This is because String.Concat is a static method: there is no instance to dispatch on, so a direct call is all that is needed. We used callvirt for the property accessors and ToString because those are instance methods. The callvirt instruction calls the method against the runtime type of the object, not the compile time type, and checks for a null reference along the way.

  IL_0023:  call  string [mscorlib]System.String::Concat(string,string)
  IL_0028:  callvirt  instance void [System.Windows.Forms]System.Windows.Forms.Control::set_Text(string)

Here we go again with this sequence! This time we are loading the value of our local variable so we can increment it with the “add” instruction. This adds the value 1 to the value stored at location 0. Make sense? Good, because we then have to call stloc again to store the result back into location 0.

  IL_002d:  ldloc.0
  IL_002e:  ldc.i4.1
  IL_002f:  add
  IL_0030:  stloc.0

Now we have to load the value stored in our local variable again so we can do our “for(;;)” comparison. We compare this value against the value returned from the ArrayList’s Count property.

  IL_0031:  ldloc.0
  IL_0032:  ldarg.0
  IL_0033:  ldfld  class [mscorlib]System.Collections.ArrayList CSharpLoop.Form1::list
  IL_0038:  callvirt  instance int32 [mscorlib]System.Collections.ArrayList::get_Count()

Remember back near the beginning of all of this, we transferred control to an instruction at IL_0031? Well, now we are transferring control back to IL_0004, as long as the counter is less than or equal to the count (that is what ble.s tests). If this sounds like loop control instructions to you, you’d be right. Lastly, this method ends with a ret instruction. Since this method is void, we don’t return any values.

  IL_003d:  ble.s  IL_0004
  IL_003f:  ret
  } // end of method Form1::button1_Click

Was this code really necessary for a simple loop!? Sadly, yes. But before you get upset, let’s remember that we have 10 extra lines in our VB.NET compiled IL! I won’t rehash all of the VB.NET code like we did with C#, but I will concentrate on the differences because that’s what causes our performance hit in VB.NET.


VB.NET IL Code Analysis


Take a look at the first difference: the initialized variables. Notice that we have 3 variables declared here instead of 1. The first is the variable that we use in the loop, and the second is a reference to the label that we will use in the loop. The third is used to cache the value of the Count property later on. Rather than evaluate this within the loop, we will cache this value in a local variable and use this value in our loop comparison.

 .locals init ([0] int32 i,
      [1] class [System.Windows.Forms]System.Windows.Forms.Label _Vb_t_ref_0,
      [2] int32 _Vb_t_i4_0)
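To see what that third local buys the VB.NET version, here is a small C# sketch of the two loop shapes. CountedList is a hypothetical stand-in for the ArrayList that simply counts how often its Count property is read. It is not what either compiler generates, just a C# rendering of the two behaviors.

```csharp
using System;

// Hypothetical stand-in for a collection: counts how often Count is read.
public class CountedList
{
    private readonly int size;
    public int CountReads;

    public CountedList(int size) { this.size = size; }

    public int Count
    {
        get { CountReads++; return size; }
    }
}

public static class LoopBoundDemo
{
    // The C# loop as written in the post: the bound is re-read on every comparison.
    public static int ReadsWhenReevaluated(int size)
    {
        CountedList list = new CountedList(size);
        for (int i = 1; i <= list.Count; i++) { }
        return list.CountReads;
    }

    // The shape the VB.NET IL takes: the bound is read once into a temporary.
    public static int ReadsWhenCached(int size)
    {
        CountedList list = new CountedList(size);
        int limit = list.Count;
        for (int i = 1; i <= limit; i++) { }
        return list.CountReads;
    }

    public static void Main()
    {
        Console.WriteLine(ReadsWhenReevaluated(100)); // prints 101
        Console.WriteLine(ReadsWhenCached(100));      // prints 1
    }
}
```

The two shapes also differ in meaning, not just cost: if the list changes size inside the loop, the re-evaluating version notices and the cached version does not.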

The next thing I’ll touch on is those nasty nop instructions. These are annoying but are there for debugging support. They allow you to place break points on lines that generate no executable code. These are just placeholders, but as said, they do waste instructions. Try compiling in release mode instead of debug mode to get rid of them. You should be doing this anyway when you are testing the performance of your code.

The IL that follows is representative of what we’ve already said. The constant 1 (the loop’s starting value) and the result of the Count call are pushed onto the stack. They are then popped into locations 2 and 0: stloc.2 stores the cached count, and stloc.0 stores the 1 into our loop counter.

  L_0001: ldc.i4.1
  L_0002: ldarg.0
  L_0003: ldfld [mscorlib]System.Collections.ArrayList VBLoop.Form1::list
  L_0008: callvirt instance int32 [mscorlib]System.Collections.ArrayList::get_Count()
  L_000d: stloc.2
  L_000e: stloc.0

We do the same loop control we had before with the br.s instruction and then move in to load argument 0 (the form’s “this” reference) and put it on the stack. Notice that VB.NET reaches the label through a get_lblOutput property call rather than a direct field load. The result is stored in location 1, and then we load the value stored in location 1 twice (ldloc.1). Notice this is different than the C# version’s “dup”.

  L_0011: ldarg.0
  L_0012: callvirt instance [System.Windows.Forms]System.Windows.Forms.Label VBLoop.Form1::get_lblOutput()
  L_0017: stloc.1
  L_0018: ldloc.1
  L_0019: ldloc.1

Moving forward, we get the text value of the output label and load our array list.

  L_001a: callvirt instance string [System.Windows.Forms]System.Windows.Forms.Control::get_Text()
  L_001f: ldarg.0
  L_0020: ldfld [mscorlib]System.Collections.ArrayList VBLoop.Form1::list

We load our first local variable and do our subtraction again. But notice our subtraction call is different.

  L_0025: ldloc.0
  L_0026: ldc.i4.1
  L_0027: sub.ovf

This version of sub.ovf is used to do overflow checking on the subtraction operation. This is turned on by default in VB.NET compilation, but turned off in C# compilation.
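You can get either behavior in C# explicitly with the checked and unchecked keywords. Here’s a minimal sketch of the difference. The class and method names are made up for illustration.

```csharp
using System;

public static class OverflowDemo
{
    // checked arithmetic compiles to add.ovf, the same instruction VB.NET
    // emits by default: overflow raises an OverflowException.
    public static bool TryCheckedIncrement(int value, out int result)
    {
        try
        {
            result = checked(value + 1);
            return true;
        }
        catch (OverflowException)
        {
            result = 0;
            return false;
        }
    }

    // unchecked arithmetic compiles to a plain add: overflow wraps around silently.
    public static int UncheckedIncrement(int value)
    {
        return unchecked(value + 1);
    }

    public static void Main()
    {
        int result;
        Console.WriteLine(TryCheckedIncrement(int.MaxValue, out result)); // prints False
        Console.WriteLine(UncheckedIncrement(int.MaxValue) == int.MinValue); // prints True
    }
}
```

So the extra safety in the VB.NET IL is a real behavioral difference, not just noise: the same arithmetic that throws in one language can wrap around in the other.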

We take the same steps to load the item in our array list, get the result of the ToString call, concatenate the value, and then set the text value of our label that we did in the C# version.

  L_0028: callvirt instance object [mscorlib]System.Collections.ArrayList::get_Item(int32)
  L_002d: callvirt instance string object::ToString()
  L_0032: call string string::Concat(string, string)
  L_0037: callvirt instance void [System.Windows.Forms]System.Windows.Forms.Control::set_Text(string)

As we draw near the end of our IL, we again go through the steps to increment our loop counter, right after some more of those annoying little nop instructions. Just as the subtraction did overflow checking, the addition does overflow checking, hence the change from add to add.ovf.

  L_003c: nop
  L_003d: nop
  L_003e: ldloc.0
  L_003f: ldc.i4.1
  L_0040: add.ovf
  L_0041: stloc.0

We end our IL nicely by reloading our variables in positions 0 and 2 and comparing them before returning to the front of our loop.

  L_0042: ldloc.0
  L_0043: ldloc.2
  L_0044: ble.s L_0011
  L_0046: nop
  L_0047: ret
}

Summary


This has really just been an informational look at how the compilers differ by default. It’s entirely possible to change a few parameters in the compilers to get the output to look a bit more alike. It’s important to know these differences, though, so that you can make educated decisions on how you should optimize your code based on what language you are working with. Happy hacking.

Where’s the liberal/conservative outrage now?

Things I haven’t heard on Air America and America Right that I thought for sure I would:

“Newsweek lied, people died”

“Guns don’t kill people, Newsweek Kills People”

I guess I shouldn’t be surprised. Both republicans and democrats have tricked most of the nation into taking one side or the other without question. Citizens don’t seem to be able to find fault when there is definite blame to be placed. They can’t give credit when there are definite kudos to be delivered. There is no co-operation for fear that the “other side” might appear to be making progress.

Why in the world can we not disagree civilly anymore? Why does everything have to be a political tool? As intelligent citizens of this nation, can we not at least shame ourselves into seeing things for what they are? Newsweek went to print with this story because of this political hatred that has been spread from one end of the nation to the other. There was a day and age when common sense dictated that although there might be a story somewhere, some stories just shouldn’t be told for the sake of the nation. In this case, it was a story that was just flat out wrong, and all because an entire news organization is hell bent on killing this nation in favor of increasing a subscriber base.

This nation belongs to every citizen herein. Can we not find common ground and attack those areas that we agree on and forgive those that we cannot? I’m not asking you to drop your leftist/rightwing agendas, but why does everything have to be a federal law? Why can’t we let the states and local municipalities decide what is right for their communities? It leaves far more people happy than not. It’s what the framers of our government intended.

I feel like I’m rambling, and believe me, I could ramble forever on this topic. But I won’t. I’ll just post this message as a prayer to the nation’s citizenry that we stop the bickering, and start fixing things that desperately need our attention.