It was DevWeek 2005 and I was in a session with Jeff Richter about low-level .NET things, and I foolishly asked him a question. He looked at me as if I had asked the dumbest question ever asked by anyone, so I decided not to follow up, even though I didn't feel he'd really given me the answer I was looking for.
So what was the question? He was describing boxing of value types and how it can cause performance problems, so it's best avoided where possible, even though it's not always clear when boxing is occurring. I asked whether boxing didn't need to occur for any method call on a value type. On reflection I'm not sure this was a particularly dumb question, but I've never really got to the bottom of it, mainly because it's never been much of an issue for me.
But here's my take on it, which may or may not be accurate. Boxing is only going to happen if the call is to a virtual method where the value type doesn't override the base object implementation. (A non-virtual method inherited from object, such as GetType, needs a box too.) Now it might be that boxing would also be required even if the value type did override the base method (assuming boxing is needed to get at the virtual method table), if value types could be inherited from. But they can't, so the discussion is kind of irrelevant. This may well be why value types can't be inherited from, but this is all frankly getting way too complicated for me to understand, so I'll quickly move on.
Anyway, to illustrate the point, here's a little test C# application.
using System;

namespace ConsoleApplication2
{
    struct ValTypeTest
    {
        int val;

        public ValTypeTest(int val)
        {
            this.val = val;
        }

        // ToString is overridden; GetHashCode is deliberately left alone.
        public override string ToString()
        {
            return val.ToString();
        }
    }

    class Class1
    {
        [STAThread]
        static void Main(string[] args)
        {
            ValTypeTest thing = new ValTypeTest(34);
            Console.WriteLine(thing.ToString());
            Console.WriteLine(thing.GetHashCode());

            int number = 34;
            Console.WriteLine(number.ToString());
            Console.WriteLine(number.GetHashCode());

            Console.ReadLine();
        }
    }
}
If you look at the IL code for this in Reflector using .NET 1.1 (I'll explain why I'm using .NET 1.1 shortly), you'll see this:
.method private hidebysig static void Main(string[] args) cil managed
{
    .custom instance void [mscorlib]System.STAThreadAttribute::.ctor()
    .entrypoint
    .maxstack 2
    .locals init (
        [0] valuetype ConsoleApplication2.ValTypeTest thing,
        [1] int32 number)
    L_0000: ldloca.s thing
    L_0002: ldc.i4.s 0x22
    L_0004: call instance void ConsoleApplication2.ValTypeTest::.ctor(int32)
    L_0009: ldloca.s thing
    L_000b: call instance string ConsoleApplication2.ValTypeTest::ToString()
    L_0010: call void [mscorlib]System.Console::WriteLine(string)
    L_0015: ldloc.0
    L_0016: box ConsoleApplication2.ValTypeTest
    L_001b: callvirt instance int32 [mscorlib]System.ValueType::GetHashCode()
    L_0020: call void [mscorlib]System.Console::WriteLine(int32)
    L_0025: ldc.i4.s 0x22
    L_0027: stloc.1
    L_0028: ldloca.s number
    L_002a: call instance string [mscorlib]System.Int32::ToString()
    L_002f: call void [mscorlib]System.Console::WriteLine(string)
    L_0034: ldloca.s number
    L_0036: call instance int32 [mscorlib]System.Int32::GetHashCode()
    L_003b: call void [mscorlib]System.Console::WriteLine(int32)
    L_0040: call string [mscorlib]System.Console::ReadLine()
    L_0045: pop
    L_0046: ret
}
As you can see, the call to GetHashCode causes the value type to be boxed, whereas the call to ToString doesn't, because ToString has been overridden whereas GetHashCode hasn't. But if we look at the IL code in .NET 2, it looks like this:
.method private hidebysig static void Main(string[] args) cil managed
{
    .custom instance void [mscorlib]System.STAThreadAttribute::.ctor()
    .entrypoint
    .maxstack 2
    .locals init (
        [0] valuetype ConsoleApplication2.ValTypeTest thing,
        [1] int32 number)
    L_0000: nop
    L_0001: ldloca.s thing
    L_0003: ldc.i4.s 0x22
    L_0005: call instance void ConsoleApplication2.ValTypeTest::.ctor(int32)
    L_000a: nop
    L_000b: ldloca.s thing
    L_000d: constrained ConsoleApplication2.ValTypeTest
    L_0013: callvirt instance string [mscorlib]System.Object::ToString()
    L_0018: call void [mscorlib]System.Console::WriteLine(string)
    L_001d: nop
    L_001e: ldloca.s thing
    L_0020: constrained ConsoleApplication2.ValTypeTest
    L_0026: callvirt instance int32 [mscorlib]System.Object::GetHashCode()
    L_002b: call void [mscorlib]System.Console::WriteLine(int32)
    L_0030: nop
    L_0031: ldc.i4.s 0x22
    L_0033: stloc.1
    L_0034: ldloca.s number
    L_0036: call instance string [mscorlib]System.Int32::ToString()
    L_003b: call void [mscorlib]System.Console::WriteLine(string)
    L_0040: nop
    L_0041: ldloca.s number
    L_0043: call instance int32 [mscorlib]System.Int32::GetHashCode()
    L_0048: call void [mscorlib]System.Console::WriteLine(int32)
    L_004d: nop
    L_004e: call string [mscorlib]System.Console::ReadLine()
    L_0053: pop
    L_0054: ret
}
Now it's no longer clear from the IL whether boxing occurs, because both calls use the constrained prefix. As I understand it, at run time the prefix lets the CLR call the method directly when the value type implements it, and box only when it doesn't, so the boxing for GetHashCode presumably still happens. It would appear this opcode was added for a variety of reasons, but one of them is to help with binary compatibility: if a value type later adds or removes an override of a virtual method, the calling app still works without any changes. The downside is that boxing is even harder to spot than it was before.
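Since the IL no longer gives the game away, one rough way to see from C# whether a call works on the value itself or on a boxed copy is to make the method change the struct's state. This is only a sketch: Counter and BoxingDemo are made-up names for illustration, and a ToString that mutates its struct is something you'd never want in real code.

```csharp
using System;

// Hypothetical struct for illustration only: ToString is overridden and
// changes state, so we can tell which copy of the value a call acted on.
struct Counter
{
    private int calls;

    public override string ToString()
    {
        calls++;
        return calls.ToString();
    }
}

static class BoxingDemo
{
    public static string[] Run()
    {
        Counter c = new Counter();

        // Overridden method called on the local: it works on c in place.
        string first = c.ToString();
        string second = c.ToString();

        // Explicit boxing copies the current value of c onto the heap.
        object boxed = c;
        string viaBox = boxed.ToString();

        // The call through the box changed the copy, not c itself.
        string afterBox = c.ToString();

        return new string[] { first, second, viaBox, afterBox };
    }

    static void Main()
    {
        // Prints: 1, 2, 3, 3
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

If the call through the box had touched the original value, the last number would have been 4 rather than 3. It doesn't measure the cost of boxing, but it does show that a box is a separate copy.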
That said, worrying about boxing is often not really worth the trouble. It smells of premature optimization and in most cases isn't likely to cause problems. Still, it does suggest that if you're writing your own value types, you'll probably want to override most of object's virtual methods, particularly GetHashCode, which is used in quite a lot of places.
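For what it's worth, here's a minimal sketch of what that might look like. The Point type, its fields and the hash formula are all my own assumptions for the sake of the example, not anything prescribed by the framework. Implementing IEquatable<T> is an extra that goes beyond the overrides discussed above, but it lets generic collections such as Dictionary compare keys without boxing them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type for illustration; names and fields are assumptions.
struct Point : IEquatable<Point>
{
    private readonly int x;
    private readonly int y;

    public Point(int x, int y)
    {
        this.x = x;
        this.y = y;
    }

    // Strongly typed Equals: the argument doesn't need to be boxed.
    public bool Equals(Point other)
    {
        return x == other.x && y == other.y;
    }

    public override bool Equals(object obj)
    {
        return obj is Point && Equals((Point)obj);
    }

    // Overriding GetHashCode means the ValueType implementation is never used.
    public override int GetHashCode()
    {
        // Arbitrary but simple way of combining the two fields.
        unchecked { return (x * 397) ^ y; }
    }

    public override string ToString()
    {
        return "(" + x.ToString() + ", " + y.ToString() + ")";
    }
}

static class PointDemo
{
    static void Main()
    {
        Dictionary<Point, string> names = new Dictionary<Point, string>();
        names[new Point(1, 2)] = "a";

        // An equal key finds the same entry.
        Console.WriteLine(names[new Point(1, 2)]);       // prints: a
        Console.WriteLine(new Point(1, 2).ToString());   // prints: (1, 2)
    }
}
```

Note the explicit ToString() in the last line: passing the struct straight to Console.WriteLine would pick the object overload and box it, which rather defeats the point.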
2 comments:
Hello. I've read the MSDN article about boxing and the constrained opcode, but in my opinion it wasn't explained clearly there. You provided a really nice explanation of this subtle difference, thank you.
Yevgen
Actually, if you read Mr Richter's book CLR via C#, it's explained there quite decently. Brilliant book - big Richter fan.