Improving performance: avoidable boxing in .NET

In our project we are developing a server in C#. The server must withstand very high loads, so we try to write the code as efficiently as possible. C# is rarely associated with high performance, but with a sensible approach to development you can achieve very good results.

Boxing and unboxing are among the more expensive operations in terms of performance. As a reminder, boxing wraps a value-type instance in a heap-allocated object, and unboxing copies the value back out. Recently I decided to go through all the IL code of our projects and look for box and unbox instructions. There turned out to be many places where boxing could be avoided with a flick of the wrist. All the cases that lead to unnecessary boxing are obvious; they slip in through carelessness when you are concentrating on functionality rather than on optimization. I decided to write down the most common cases so as not to forget them, and later to automate fixing them. This article lists those cases.

One remark up front: most performance problems sit at a higher level, and before you remove every unnecessary box you need to bring the code to a state where doing so is worthwhile. Thinking seriously about things like boxing makes sense only if you really want to get the most out of C#.

A note on terminology: throughout the article I write boxing as an ordinary word rather than as a code identifier, so that the eye does not stop on it looking for a line of code.

Let's get started.

1. Passing value-type variables to String.Format, String.Concat, and similar methods

String operations take first place by number of boxing sites. Fortunately, in our code this mostly occurred in formatting exception messages. The basic rule for avoiding boxing is to call ToString() on a value-type variable before passing it to String.Format or concatenating it with a string.

The same thing in code. Instead of:

var id = Guid.NewGuid();
var str1 = String.Format("Id {0}", id);
var str2 = "Id " + id;


IL_0000: call valuetype [mscorlib]System.Guid [mscorlib]System.Guid::NewGuid()
IL_0005: stloc.0
IL_0006: ldstr "Id {0}"
IL_000b: ldloc.0
IL_000c: box [mscorlib]System.Guid
IL_0011: call string [mscorlib]System.String::Format(string, object)
IL_0016: pop
IL_0017: ldstr "Id "
IL_001c: ldloc.0
IL_001d: box [mscorlib]System.Guid
IL_0022: call string [mscorlib]System.String::Concat(object, object)


You should write:

var id = Guid.NewGuid();
var str1 = String.Format("Id {0}", id.ToString());
var str2 = "Id " + id.ToString();


IL_0000: call valuetype [mscorlib]System.Guid [mscorlib]System.Guid::NewGuid()
IL_0005: stloc.0
IL_0006: ldstr "Id {0}"
IL_000b: ldloca.s id
IL_000d: constrained. [mscorlib]System.Guid
IL_0013: callvirt instance string [mscorlib]System.Object::ToString()
IL_0018: call string [mscorlib]System.String::Format(string, object)
IL_001d: pop
IL_001e: ldstr "Id "
IL_0023: ldloca.s id
IL_0025: constrained. [mscorlib]System.Guid
IL_002b: callvirt instance string [mscorlib]System.Object::ToString()
IL_0030: call string [mscorlib]System.String::Concat(string, string)


As we can see, the constrained. prefix appears instead of box. The specification of this prefix says that the callvirt that follows is made directly on the variable, provided that thisType is a value type and that type implements the method. If the value type has no implementation of the method, boxing happens anyway.
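To illustrate the last point, here is a minimal sketch with two hypothetical structs (the names Celsius, Raw and ConstrainedDemo are invented for this example and are not from our code base). Celsius overrides ToString(), so a constrained call can go straight to it. Raw does not, so the call resolves to the inherited System.ValueType implementation, which needs a boxed receiver.

```csharp
using System;
using System.Globalization;

// Hypothetical struct that overrides ToString(): the constrained call
// is dispatched directly to this method, so no box is needed.
public struct Celsius
{
    public readonly double Degrees;
    public Celsius(double degrees) { Degrees = degrees; }

    public override string ToString()
    {
        return Degrees.ToString("F1", CultureInfo.InvariantCulture) + " C";
    }
}

// Hypothetical struct without an override: ToString() resolves to
// System.ValueType.ToString(), which requires a boxed receiver.
public struct Raw
{
    public readonly double Value;
    public Raw(double value) { Value = value; }
}

public static class ConstrainedDemo
{
    public static string Describe(Celsius c) { return "Temp " + c.ToString(); }
    public static string Describe(Raw r) { return "Raw " + r.ToString(); }
}
```

So for your own structs that end up in messages and logs, it is worth overriding ToString() explicitly.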

The annoying part is that almost everyone has ReSharper installed, and it flags the ToString() call as redundant.

One more thing about strings, or rather about concatenating them. Occasionally I came across code like this:

var str2 = str1 + '\t';


It is tempting to think that the char will be concatenated with the string without any problems, but char is a value type, so boxing happens here as well. In this case it is better to write:

var str2 = str1 + "\t";


2. Calling methods on generic variables

Generic methods take second place by number of boxing sites. The point is that operations on a variable of a generic type parameter, such as the comparison below, emit a box instruction in IL even when the class constraint is set.

Example:

public static Boolean Equals<T>(T x, T y)
	where T : class
{
	return x == y;
}


Turns into:

IL_0000: ldarg.0
IL_0001: box !!T
IL_0006: ldarg.1
IL_0007: box !!T
IL_000c: ceq


In practice this is not so bad, because the JIT optimizes this IL away, but the case is interesting.

The good news is that method calls on generic variables use the already familiar constrained. prefix, which lets you call methods on value types without boxing. If a method has to work with both value types and reference types, then a null comparison, for example, is best written like this:

if (!typeof(T).IsValueType && value == null)
	// Do something
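As a sketch, the check above can be wrapped in a small helper. The names Guard and IsNullReference are hypothetical and used only for illustration.

```csharp
using System;

public static class Guard
{
    // Hypothetical helper: the typeof(T).IsValueType test comes first,
    // so for value types the null comparison is never evaluated.
    public static bool IsNullReference<T>(T value)
    {
        return !typeof(T).IsValueType && value == null;
    }
}
```

Note that Nullable<T> is itself a value type, so this helper returns false for a null int?.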


The as operator is also a problem. A common practice is to cast with as right away instead of checking the type and then casting to it. But if the value may be of a value type, it is better to check the type first and cast afterwards, because as works only with reference types: boxing happens first, and only then the isinst call.

3. Calling methods on enumerations

Enumerations in C# are a sad story. The problem is that any method call on an enumeration causes boxing:

[Flags]
public enum Flags
{
	First = 1 << 0,
	Second = 1 << 1,
	Third = 1 << 2
}
public Boolean Foo(Flags flags)
{
	return flags.HasFlag(Flags.Second);
}


IL_0000: ldarg.1
IL_0001: box HabraTests.Flags
IL_0006: ldc.i4.2
IL_0007: box HabraTests.Flags
IL_000c: call instance bool [mscorlib]System.Enum::HasFlag(class [mscorlib]System.Enum)
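One way to avoid this particular box, as a sketch: replace HasFlag with a bitwise test on the enumeration itself. The enumeration is the one from the example above; the class FlagChecks and the method HasSecond are illustrative names.

```csharp
using System;

[Flags]
public enum Flags
{
    First = 1 << 0,
    Second = 1 << 1,
    Third = 1 << 2
}

public static class FlagChecks
{
    // The bitwise test compiles to plain integer instructions (and, ceq).
    // No method is called on the enum, so no box instruction is emitted.
    public static bool HasSecond(Flags flags)
    {
        return (flags & Flags.Second) == Flags.Second;
    }
}
```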


Moreover, even calling GetHashCode() causes boxing. So if you ever need the hash code of an enumeration value, cast it to its underlying type first. Likewise, if you use an enumeration as a Dictionary key, write your own IEqualityComparer; otherwise boxing occurs on every call to GetHashCode().
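A minimal sketch of such a comparer, assuming an enumeration whose underlying type is int. The names Color, ColorComparer and ColorTable are hypothetical. Whether the default comparer actually boxes depends on the runtime version, so it is worth checking the IL or measuring before adding this.

```csharp
using System;
using System.Collections.Generic;

public enum Color { Red, Green, Blue }

// Hypothetical comparer for one concrete enum. Casting to the underlying
// type (int here) turns both operations into plain integer code.
public sealed class ColorComparer : IEqualityComparer<Color>
{
    public static readonly ColorComparer Instance = new ColorComparer();

    public bool Equals(Color x, Color y) { return (int)x == (int)y; }

    public int GetHashCode(Color value) { return (int)value; }
}

public static class ColorTable
{
    public static Dictionary<Color, string> Create()
    {
        // Pass the comparer explicitly so the dictionary does not
        // fall back to the default one.
        var names = new Dictionary<Color, string>(ColorComparer.Instance);
        names[Color.Red] = "red";
        names[Color.Blue] = "blue";
        return names;
    }
}
```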

4. Enumerations in generic methods

A logical continuation of sections 2 and 3 is to see how an enumeration behaves in a generic method. On the one hand, when a value type implements a method itself, generic methods can call interface methods on structs without boxing. On the other hand, all the method implementations live on the base class Enum, not on the enumerations we declare. Let's write a short test to see what happens inside.

Test code
public static void Main()
{
	Double intAverageGrow, enumAverageGrow;
	Int64 intMinGrow, intMaxGrow, enumMinGrow, enumMaxGrow;
	var result1 = Test(() => GetUlong(10), out intAverageGrow, out intMinGrow, out intMaxGrow);
	var result2 = Test(() => GetUlong(Flags.Second), out enumAverageGrow, out enumMinGrow, out enumMaxGrow);
	Console.WriteLine("Int32 memory change. Avg: {0}, Min: {1}, Max: {2}", intAverageGrow, intMinGrow, intMaxGrow);
	Console.WriteLine("Enum  memory change. Avg: {0}, Min: {1}, Max: {2}", enumAverageGrow, enumMinGrow, enumMaxGrow);
	Console.WriteLine(result1 + result2);
	Console.ReadKey(true);
}
public static UInt64 GetUlong<T>(T value)
	where T : struct, IConvertible
{
	return value.ToUInt64(CultureInfo.InvariantCulture);
}
public static UInt64 Test(Func<UInt64> testedMethod, out Double averageGrow, out Int64 minGrow, out Int64 maxGrow)
{
	GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;
	var previousTotalMemory = GC.GetTotalMemory(false);
	Int64 growSum = 0;
	minGrow = 0;
	maxGrow = 0;
	UInt64 sum = 0;
	for (var i = 0; i < 100000; i++)
	{
		sum += testedMethod();
		var currentTotalMemory = GC.GetTotalMemory(false);
		var grow = currentTotalMemory - previousTotalMemory;
		growSum += grow;
		if (minGrow > grow)
			minGrow = grow;
		if (maxGrow < grow)
			maxGrow = grow;
		previousTotalMemory = currentTotalMemory;
	}
	averageGrow = growSum / 100000.0;
	return sum;
}



Result:

Int32 memory change. Avg: 0, Min: 0, Max: 0
Enum  memory change. Avg: 3,16756, Min: -2079476, Max: 8192


As we can see, things are no better for enumerations here: boxing occurs on every call to ToUInt64(). On the other hand, it is clearly visible that calling an interface method on Int32 causes no boxing at all.

Finally, and partly as a conclusion: value types help a great deal with performance, but you need to watch carefully how they are used, otherwise boxing will negate their main advantage.
In the next article I would like to talk about places where global synchronization points are not obvious and how to work around them. Stay tuned.
