Boxing and unboxing - which is faster?

    Having become interested in the speed of boxing and unboxing operations in .NET, I decided to publish my small and highly subjective observations and measurements on the topic.


    The sample code is available on GitHub, so I invite everyone to report their measurement results in the comments.



    Theory


    Boxing allocates memory on the managed heap for a copy of a value-type object and then assigns a reference to that memory to a variable on the stack.
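
    As a small illustration of the copy semantics described above, here is a minimal sketch (the class and method names are hypothetical). It shows that the boxed object holds its own copy of the value, so changing the original local afterwards does not affect it.

```csharp
using System;

public static class BoxingCopyDemo
{
    public static (int Original, int Boxed) Run()
    {
        int value = 1000;      // a local of a value type
        object boxed = value;  // boxing: a new heap object receives a copy of the value
        value = 2000;          // changing the local does not touch the boxed copy
        return (value, (int)boxed);
    }

    public static void Main()
    {
        var (original, boxed) = Run();
        Console.WriteLine($"original = {original}, boxed = {boxed}");
        // prints "original = 2000, boxed = 1000"
    }
}
```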


    Unboxing, by contrast, allocates memory on the execution stack for the value obtained from the managed heap through that reference.
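
    The following sketch (hypothetical names again) illustrates two properties of unboxing in C#: the cast checks that the box holds exactly the requested type, and the result is an independent copy of the boxed value.

```csharp
using System;

public static class UnboxingDemo
{
    // Returns true if unboxing the object to long throws InvalidCastException.
    public static bool UnboxToLongFails(object boxed)
    {
        try
        {
            _ = (long)boxed;   // unboxing requires the exact boxed type
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        object boxed = 42;              // a boxed System.Int32
        int copy = (int)boxed;          // unboxing: type check, then the value is copied out
        copy++;                         // modifies only the local copy
        Console.WriteLine(copy);        // prints 43
        Console.WriteLine((int)boxed);  // prints 42
        Console.WriteLine(UnboxToLongFails(boxed)); // prints True: a boxed int is not a boxed long
    }
}
```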


    It would seem that memory is allocated in both cases and there should not be much difference, were it not for one extremely important detail: the memory area involved.


    Recalling that the garbage collector (GC) is responsible for allocating memory on the managed heap in .NET, it is important to note that it does so nonlinearly, because the heap may be fragmented (contain free regions) and a free region of the required size has to be found.


    Update:


    As blanabrother noted in the comments, allocating memory and copying the value on the managed heap involves neither a search for a free block nor fragmentation, because the heap is allocated by incrementing a pointer and is later compacted by the GC. However, based on the following measurements of memory allocation speed in C++, I venture to assume that the memory area (type) is the main reason for this difference in performance.


    In the case of unboxing, memory is allocated on the execution stack, which keeps a pointer to its end; that pointer also marks the start of the memory for a new object.


    The conclusion I draw from this is that boxing should take much longer than unboxing, because of possible GC side effects and the slower allocation and copying of values on the managed heap.
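
    One way to check the allocation side of this reasoning without any benchmarking library is to count the bytes allocated on the managed heap. The sketch below is only an illustration with hypothetical names. It assumes a runtime that provides `GC.GetAllocatedBytesForCurrentThread()` (.NET Core, or .NET Framework 4.8), and it stores results in static fields so that the boxes cannot be optimized away.

```csharp
using System;

public static class AllocationProbe
{
    private static object _sink;   // keeps every box reachable from a field
    private static int _intSink;   // keeps the unboxed value observable

    public static long BytesAllocatedByBoxing(int iterations)
    {
        int value = 1000;
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            _sink = value;          // boxing: one heap allocation per iteration
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static long BytesAllocatedByUnboxing(int iterations)
    {
        object boxed = 1000;        // the single box is created before measuring starts
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            _intSink = (int)boxed;  // unboxing: a type check and a copy, no heap allocation
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        Console.WriteLine($"boxing:   {BytesAllocatedByBoxing(1_000_000)} bytes");
        Console.WriteLine($"unboxing: {BytesAllocatedByUnboxing(1_000_000)} bytes");
    }
}
```

    The exact numbers depend on the runtime and the platform, so none are quoted here; the point is only that the boxing loop allocates in proportion to the iteration count while the unboxing loop does not.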


    Practice


    To verify this assumption, I sketched four small functions: two for boxing and two for unboxing, covering the int and struct types.


    using BenchmarkDotNet.Attributes;

    public class BoxingUnboxingBenchmark {
        private long LoopCount = 1000000;
        private object BoxedInt = 1;
        private object BoxedStruct = new ExampleStruct {
            Amount = 1000,
            Currency = "RUB"
        };
        [Benchmark]
        public object BoxingInt() {
            int unboxed = 1000;
            for (var i = 0; i < LoopCount; i++) {
                BoxedInt = (object) unboxed;
            }
            return BoxedInt;
        }
        [Benchmark]
        public int UnboxingInt() {
            int unboxed = 1000;
            for (var i = 0; i < LoopCount; i++) {
                unboxed = (int)BoxedInt;
            }
            return unboxed;
        }
        [Benchmark]
        public object BoxingStruct() {
            ExampleStruct unboxed = new ExampleStruct()
            {
                Amount = 1000,
                Currency = "RUB"
            };
            for (var i = 0; i < LoopCount; i++) {
                BoxedStruct = (object) unboxed;
            }
            return BoxedStruct;
        }
        [Benchmark]
        public ExampleStruct UnBoxingStruct() {
            ExampleStruct unboxed = new ExampleStruct();
            for (var i = 0; i < LoopCount; i++) {
                unboxed = (ExampleStruct) BoxedStruct;
            }
            return unboxed;
        }
    }
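
    The listing above shows neither `ExampleStruct` nor the entry point. The sketch below fills these in under stated assumptions: the field names come from the listing, but the field types are my guess (the actual definition lives in the author's repository), and the `Program` class is hypothetical. `BenchmarkRunner.Run<T>()` is the usual BenchmarkDotNet entry point; the sketch is meant to be compiled together with the benchmark class above.

```csharp
using BenchmarkDotNet.Running;

// Hypothetical definition: only the field names are taken from the listing,
// the field types are assumed.
public struct ExampleStruct
{
    public int Amount;
    public string Currency;
}

public static class Program
{
    public static void Main()
    {
        // Runs every [Benchmark] method of the class; build in Release configuration.
        BenchmarkRunner.Run<BoxingUnboxingBenchmark>();
    }
}
```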

    To measure performance, I used the BenchmarkDotNet library in Release mode (I would be glad if DreamWalker could tell me how to make these measurements more objective). The measurement results follow:


    [Images: BenchmarkDotNet measurement results]

    I should say right away that I cannot be fully certain that the compiler did not optimize the final code; however, judging by the IL code, each function contains exactly one operation under test.


    The measurements were carried out on several machines with different values of LoopCount; nevertheless, unboxing was consistently 3-8 times faster than boxing.
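
    Varying LoopCount by hand can be delegated to BenchmarkDotNet itself. The following is a minimal sketch, not the author's code: the class name and private field names are hypothetical, while `[Params]` and `[MemoryDiagnoser]` are standard BenchmarkDotNet attributes. `[Params]` reruns each benchmark for every listed value, and `[MemoryDiagnoser]` adds allocation columns to the report, which makes the heap cost of boxing directly visible.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser]                            // adds allocated-memory and GC columns to the report
public class BoxingUnboxingParamsBenchmark   // hypothetical class name
{
    [Params(1_000, 1_000_000)]               // each value produces its own result rows
    public int LoopCount;

    private object _boxedInt = 1;
    private int _unboxedInt = 1000;

    [Benchmark]
    public object BoxingInt()
    {
        for (int i = 0; i < LoopCount; i++)
        {
            _boxedInt = _unboxedInt;         // box on every iteration
        }
        return _boxedInt;
    }

    [Benchmark]
    public int UnboxingInt()
    {
        int unboxed = 0;
        for (int i = 0; i < LoopCount; i++)
        {
            unboxed = (int)_boxedInt;        // unbox on every iteration
        }
        return unboxed;
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<BoxingUnboxingParamsBenchmark>();
}
```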


    Example IL code for boxing an int
    .method public hidebysig instance object
            BoxingInt() cil managed
    {
        .custom instance void [BenchmarkDotNet.Core]BenchmarkDotNet.Attributes.BenchmarkAttribute::.ctor() = ( 01 00 00 00 )
        // Code size 43 (0x2b)
        .maxstack 2
        .locals init ([0] int32 unboxed,
                      [1] int32 i)
        IL_0000: ldc.i4 0x3e8
        IL_0005: stloc.0
        IL_0006: ldc.i4.0
        IL_0007: stloc.1
        IL_0008: br.s IL_001a
        IL_000a: ldarg.0
        IL_000b: ldloc.0
        IL_000c: box [mscorlib]System.Int32
        IL_0011: stfld object ConsoleApp1.BoxingUnboxingBenchmark::BoxedInt
        IL_0016: ldloc.1
        IL_0017: ldc.i4.1
        IL_0018: add
        IL_0019: stloc.1
        IL_001a: ldloc.1
        IL_001b: conv.i8
        IL_001c: ldarg.0
        IL_001d: ldfld int64 ConsoleApp1.BoxingUnboxingBenchmark::LoopCount
        IL_0022: blt.s IL_000a
        IL_0024: ldarg.0
        IL_0025: ldfld object ConsoleApp1.BoxingUnboxingBenchmark::BoxedInt
        IL_002a: ret
    } // end of method BoxingUnboxingBenchmark::BoxingInt


    Example IL code for unboxing a struct
    .method public hidebysig instance valuetype ConsoleApp1.ExampleStruct
            UnBoxingStruct() cil managed
    {
        .custom instance void [BenchmarkDotNet.Core]BenchmarkDotNet.Attributes.BenchmarkAttribute::.ctor() = ( 01 00 00 00 )
        // Code size 40 (0x28)
        .maxstack 2
        .locals init ([0] valuetype ConsoleApp1.ExampleStruct unboxed,
                      [1] int32 i)
        IL_0000: ldloca.s unboxed
        IL_0002: initobj ConsoleApp1.ExampleStruct
        IL_0008: ldc.i4.0
        IL_0009: stloc.1
        IL_000a: br.s IL_001c
        IL_000c: ldarg.0
        IL_000d: ldfld object ConsoleApp1.BoxingUnboxingBenchmark::BoxedStruct
        IL_0012: unbox.any ConsoleApp1.ExampleStruct
        IL_0017: stloc.0
        IL_0018: ldloc.1
        IL_0019: ldc.i4.1
        IL_001a: add
        IL_001b: stloc.1
        IL_001c: ldloc.1
        IL_001d: conv.i8
        IL_001e: ldarg.0
        IL_001f: ldfld int64 ConsoleApp1.BoxingUnboxingBenchmark::LoopCount
        IL_0024: blt.s IL_000c
        IL_0026: ldloc.0
        IL_0027: ret
    } // end of method BoxingUnboxingBenchmark::UnBoxingStruct

