
The Case of the Null Coalescence Illusion


Nothing marks you out as a rookie programmer faster than blaming the compiler when your code doesn’t work, but sometimes, just sometimes, the compiler actually is at fault. Today, I discovered one such situation.

Consider this little console application. The meat of this reproduction was provided to me by Matt Johnson, in response to my question on Stack Overflow.

C#

   1: namespace NullCoalescenceIllusion
   2:     {
   3:     internal class Program
   4:         {
   5:         static void Main(string[] args)
   6:             {
   7:             var test = new FooHarness();
   8:             test.Test();
   9:             }
  10:         }
  11:     public class FooHarness
  12:         {
  13:         public void Test()
  14:             {
  15:             TestFoo();
  16:             }
  17:         Foo _foo;
  18:         void TestFoo(Foo foo = null)
  19:             {
  20:             _foo = foo ?? new Foo();
  21:             //Foo something = _foo;
  22:             }
  23:         }
  24:     public class Foo {}
  25:     }
 

Compile this as ‘Debug, x86’ and set a breakpoint on line 20, just before _foo gets a value. Step over that line. What value does _foo have? As you might expect, it has a reference to an instance of class Foo.

Nothing unusual so far. Just in case you haven’t seen the Null Coalescing Operator (??) before, it works like this: If the value on the left side is non-null, then return it; otherwise return the value on the right side. So the code above might read as: Set _foo to the value passed in the foo parameter, unless it is null, in which case set it to a new instance of class Foo. The point being, after line 20, _foo cannot be null. OK? So now let’s try something interesting.
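For anyone who wants to see that rule spelled out, here is a minimal sketch that writes the operator in longhand and compares it with the real thing. The names NullCoalescingDemo and Coalesce are made up for this illustration and are not part of the reproduction above.

```csharp
using System;

public class Foo { }

public static class NullCoalescingDemo
{
    // Hypothetical helper: longhand equivalent of `candidate ?? new Foo()`.
    public static Foo Coalesce(Foo candidate)
    {
        if (candidate != null)
        {
            return candidate;   // left side is non-null, so it is the result
        }
        return new Foo();       // left side is null, so the right side is evaluated
    }

    public static void Main()
    {
        var existing = new Foo();
        Foo missing = null;

        // A non-null left operand comes back unchanged (same reference).
        Console.WriteLine(object.ReferenceEquals(existing ?? new Foo(), existing));
        Console.WriteLine(object.ReferenceEquals(Coalesce(existing), existing));

        // A null left operand is replaced by the right operand, so the result is never null.
        Console.WriteLine((missing ?? new Foo()) != null);
        Console.WriteLine(Coalesce(missing) != null);
    }
}
```

All four lines should print True, which is the whole point: whichever branch is taken, the result of the expression is a non-null Foo.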

Set your build target to x64 and repeat the above debugging exercise. Break at line 20, step over it. What’s the value of _foo? It can’t be null, surely? The debugger thinks otherwise though.

A further interesting experiment is to uncomment line 21 and repeat the debug session. This time, as you step over line 20, _foo will still be null. Then, as you step over line 21, both _foo and something magically get a value! WTF?

Well, it is very unlikely that the compiler is actually generating bad code, or there would be a lot of very buggy programs out there. So what is actually going on here? Using ILDASM.exe, one can examine the MSIL instructions – the intermediate language – that the compiler actually emits. Running ILDASM on both the x86 and x64 versions of the assembly produces identical MSIL for the above code, as follows.

MSIL or CIL

.method private hidebysig instance void  TestFoo([opt] class NullCoalescenceIllusion.Foo foo) cil managed
{
  .param [1] = nullref
  // Code size       18 (0x12)
  .maxstack  8
  IL_0000:  nop
  IL_0001:  ldarg.0
  IL_0002:  ldarg.1
  IL_0003:  dup
  IL_0004:  brtrue.s   IL_000c
  IL_0006:  pop
  IL_0007:  newobj     instance void NullCoalescenceIllusion.Foo::.ctor()
  IL_000c:  stfld      class NullCoalescenceIllusion.Foo NullCoalescenceIllusion.FooHarness::_foo
  IL_0011:  ret
} // end of method FooHarness::TestFoo

This is easy enough to follow if you’re used to looking at code, without really needing to understand the innards of MSIL. The dup and brtrue.s pair is the null test: if foo is non-null, the branch skips straight to stfld; otherwise the null is popped and newobj supplies a fresh Foo. Either way, exactly one value is stored into _foo, just as expected. So it doesn’t look as though the C# compiler is the issue here. The only other culprit could be the JIT (just-in-time) compiler that generates native machine code. Well, it’s been a very long time since I looked at any assembly language, so let’s have a go! Assembly language can be viewed right in Visual Studio’s debugger. Here are the two versions:

x86 (32-bit)

   1: --- c:\Users\Tim\Documents\Visual Studio 2012\Projects\NullCoalescenceIllusion\NullCoalescenceIllusion\Program.cs 
   2:             {
   3: 00000000  push        ebp 
   4: 00000001  mov         ebp,esp 
   5: 00000003  push        edi 
   6: 00000004  push        esi 
   7: 00000005  push        ebx 
   8: 00000006  sub         esp,48h 
   9: 00000009  mov         esi,ecx 
  10: 0000000b  lea         edi,[ebp-54h] 
  11: 0000000e  mov         ecx,12h 
  12: 00000013  xor         eax,eax 
  13: 00000015  rep stos    dword ptr es:[edi] 
  14: 00000017  mov         ecx,esi 
  15: 00000019  xor         eax,eax 
  16: 0000001b  mov         dword ptr [ebp-1Ch],eax 
  17: 0000001e  mov         dword ptr [ebp-3Ch],ecx 
  18: 00000021  mov         dword ptr [ebp-40h],edx 
  19: 00000024  cmp         dword ptr ds:[055F0B84h],0 
  20: 0000002b  je          00000032 
  21: 0000002d  call        6F1354C6 
  22: 00000032  nop 
  23:             _foo = foo ?? new Foo();
  24: 00000033  mov         eax,dword ptr [ebp-3Ch] 
  25: 00000036  mov         dword ptr [ebp-44h],eax 
  26: 00000039  mov         eax,dword ptr [ebp-40h] 
  27: 0000003c  mov         dword ptr [ebp-48h],eax 
  28: 0000003f  mov         eax,dword ptr [ebp-44h] 
  29: 00000042  mov         dword ptr [ebp-4Ch],eax 
  30: 00000045  mov         eax,dword ptr [ebp-48h] 
  31: 00000048  mov         dword ptr [ebp-50h],eax 
  32: 0000004b  cmp         dword ptr [ebp-40h],0 
  33: 0000004f  jne         00000072 
  34: 00000051  mov         ecx,55F1CE8h 
  35: 00000056  call        FFE5F9E0 
  36: 0000005b  mov         dword ptr [ebp-54h],eax 
  37: 0000005e  mov         ecx,dword ptr [ebp-54h] 
  38: 00000061  call        FFE7A078 
  39: 00000066  mov         eax,dword ptr [ebp-4Ch] 
  40: 00000069  mov         dword ptr [ebp-4Ch],eax 
  41: 0000006c  mov         eax,dword ptr [ebp-54h] 
  42: 0000006f  mov         dword ptr [ebp-50h],eax 
  43: 00000072  mov         edx,dword ptr [ebp-4Ch] 
  44: 00000075  mov         eax,dword ptr [ebp-50h] 
  45: 00000078  lea         edx,[edx+4] 
  46: 0000007b  call        6EDFE960 
  47:             //Foo something = _foo;
  48:            }
  49: 00000080  nop
  50: 00000081  lea         esp,[ebp-0Ch] 
  51: 00000084  pop         ebx 
  52: 00000085  pop         esi 
  53: 00000086  pop         edi 
  54: 00000087  pop         ebp 
  55: 00000088  ret 

 

After stepping over the line with the breakpoint, the program counter comes to rest at line 49 (address 00000080) in the above code, on the NOP instruction. When running in debug mode, the code generator inserts NOP instructions at the start of each source line, so when you step over a line of C# code, you’d expect to arrive at a NOP. Park that thought for a moment. Let’s look at the 64-bit code.

x64 (64-bit)

   1: --- c:\Users\Tim\Documents\Visual Studio 2012\Projects\NullCoalescenceIllusion\NullCoalescenceIllusion\Program.cs 
   2:             {
   3: 00000000  mov         qword ptr [rsp+10h],rdx 
   4: 00000005  mov         qword ptr [rsp+8],rcx 
   5: 0000000a  push        rbp 
   6: 0000000b  sub         rsp,40h 
   7: 0000000f  mov         rbp,rsp 
   8: 00000012  mov         rax,7FFAB71235F8h 
   9: 0000001c  mov         eax,dword ptr [rax] 
  10: 0000001e  test        eax,eax 
  11: 00000020  je          0000000000000027 
  12: 00000022  call        000000005FA0953C 
  13: 00000027  nop 
  14:             _foo = foo ?? new Foo();
  15: 00000028  mov         rax,qword ptr [rbp+58h] 
  16: 0000002c  mov         qword ptr [rbp+20h],rax 
  17: 00000030  mov         rax,qword ptr [rbp+50h] 
  18: 00000034  mov         qword ptr [rbp+28h],rax 
  19: 00000038  cmp         qword ptr [rbp+58h],0 
  20: 0000003d  jne         0000000000000068 
  21: 0000003f  lea         rcx,[00051380h] 
  22: 00000046  call        000000005F62DB60 
  23:             //Foo something = _foo;
  24:            }
  25: 0000004b  mov         qword ptr [rbp+30h],rax
  26: 0000004f  mov         rax,qword ptr [rbp+30h] 
  27: 00000053  mov         qword ptr [rbp+38h],rax 
  28: 00000057  mov         rcx,qword ptr [rbp+38h] 
  29: 0000005b  call        FFFFFFFFFFEE8940 
  30: 00000060  mov         r11,qword ptr [rbp+38h] 
  31: 00000064  mov         qword ptr [rbp+20h],r11 
  32: 00000068  mov         rcx,qword ptr [rbp+28h] 
  33: 0000006c  add         rcx,8 
  34: 00000070  mov         rdx,qword ptr [rbp+20h] 
  35: 00000074  call        000000005F62D1C0 
  36: 00000079  jmp         000000000000007B 
  37: 0000007b  nop
  38: 0000007c  lea         rsp,[rbp+40h] 
  39: 00000080  pop         rbp 
  40: 00000081  ret 

Wow! That looks completely different. The x64 JIT compiler was written much later than the x86 JIT compiler, so I suppose it’s not surprising that it works differently. Now, here’s the rub. Stepping over the line with the breakpoint brings the program counter to rest on line 25 (address 0000004b). It’s not a NOP instruction. The next NOP is all the way down at line 37 (address 0000007b). It looks like the problem here is that the debugging information is somehow wrong, so that the debugger actually stops partway through the instructions for that statement, before the value has been stored into the field (the store appears to happen in the call at line 35). This explains why, when we uncommented that extra line, the value magically appears after stepping over the next line.

So, I’m not sure if the problem is in the 64-bit jitter, or if the C# compiler is producing dicey debugging metadata, but the gist of this problem is that the metadata and the generated code don’t agree. The generated code works, but the debugger gets confused.
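If you would rather not take the debugger’s word for it either way, one option is to let the program check itself. The following is only a sketch under my own assumptions: the Current property and the explicit null check are additions for illustration and are not in the original reproduction. It does nothing to correct what the debugger displays mid-step; it simply confirms at run time that the field holds a value once the assignment statement has completed.

```csharp
using System;

public class Foo { }

public class FooHarness
{
    private Foo _foo;

    // Hypothetical accessor, added only so the caller can inspect the field.
    public Foo Current { get { return _foo; } }

    public void TestFoo(Foo foo = null)
    {
        _foo = foo ?? new Foo();

        // Runtime check: this read executes after the store has completed,
        // regardless of where the debugger decided to pause.
        if (_foo == null)
        {
            throw new InvalidOperationException("_foo was null after the assignment");
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var harness = new FooHarness();
        harness.TestFoo();                            // no argument, so foo defaults to null
        Console.WriteLine(harness.Current != null);   // reports whether the field was set
    }
}
```

Because the check is a real statement following the assignment, it also plays the same role as uncommenting line 21 in the original listing: it gives the debugger another source line to step over before you inspect _foo.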

