Interesting sorting behavior in .NET
It all started with the fact that tests began to fall at a colleague’s work. And only she has one. I won’t go into details, but the point is that there were two List objects in the test. The first was the reference, and the second was returned by the test method. Then the sheets were compared element by element.
The reason for the test crash was quickly found out: for a colleague, the order of the elements in the resulting array was the reverse of the order in the reference array. The code of the test method used the standard List.Sort with our comparator, which always returns 0. It was on this test that the elements returned in the same order for the whole team, and exactly the opposite for one employee. It was quickly found out that the colleague had not had any updates for a long time and the version of mscorlib.dll she was very different from the one that the others had. One could calm down on this, but it became interesting to me, I decided to dig further and that's what I found out.
The contents of the sheet after sorting and before with the same elements may differ, because according to msdnlist.sort uses QuickSort, which, as we all know, is unstable. However, there is one interesting feature. About her further.
Take the following code:
In principle, nothing special. There is a Smth structure in which there are two fields. There is a comparer to this structure. We saturate the sheet with three structures identical to the comparator, display the contents of the sheet before sorting, then sort, then display the number of comparisons that were made, then display the contents of the sheet after sorting.
Now let's see what three different versions of the .NET Framework produce.
As you can see, in the older version of .NET, sorting is stable and only three comparisons are performed. In the latest version of .NET 4.0, sorting is unstable and seven (!) Comparisons are performed. In .NET 4.5, sorting is again robust and 3 comparisons are performed again.
Those. the new version of .NET 4.0 processes slower identical elements. To confirm this, I increased the number of identical elements to 100 and got 813 comparisons in the latest version of .NET 4.0, and 390 in .NET 4.0 of the old version and .NET 4.5.
Thus, it turns out that Microsoft did everything right, then incorrectly , then again right. Moreover, this problem can come back to me as an olympiad somewhere.
ZY: I wrote my own simplest quick sorting with the calculation of the number of comparisons:
And even she betrayed that there were only 744 comparisons. less than comparer calls for the latest version of .NET 4.0
The reason for the test crash was quickly found out: for a colleague, the order of the elements in the resulting array was the reverse of the order in the reference array. The code of the test method used the standard List.Sort with our comparator, which always returns 0. It was on this test that the elements returned in the same order for the whole team, and exactly the opposite for one employee. It was quickly found out that the colleague had not had any updates for a long time and the version of mscorlib.dll she was very different from the one that the others had. One could calm down on this, but it became interesting to me, I decided to dig further and that's what I found out.
The contents of the sheet after sorting and before with the same elements may differ, because according to msdnlist.sort uses QuickSort, which, as we all know, is unstable. However, there is one interesting feature. About her further.
Take the following code:
The code
using System;
using System.Collections.Generic;
namespace fkComparer {
internal struct Smth {
public int X;
public int Y;
}
internal class SmthComparer : IComparer {
#region Implementation of IComparer
public int Compare(Smth x, Smth y) {
Result.Count++;
if(x.Y < y.Y)
return -1;
if(x.Y > y.Y)
return 1;
return 0;
}
#endregion
}
internal static class Result {
public static int Count;
}
internal class Program {
static void Main() {
List list = new List {new Smth {X = 4, Y = 4}, new Smth {X = 5, Y = 4}, new Smth {X = 6, Y = 4}};
SmthComparer comparrer = new SmthComparer();
for (int i = 0; i < aaa.Count; i++) {
Console.WriteLine(list[i].X);
}
Console.WriteLine("***************");
aaa.Sort(comparrer);
Console.WriteLine(Result.Count);
Console.WriteLine("***************");
for(int i = 0; i < aaa.Count; i++) {
Console.WriteLine(list[i].X);
}
Console.ReadLine();
}
}
}
In principle, nothing special. There is a Smth structure in which there are two fields. There is a comparer to this structure. We saturate the sheet with three structures identical to the comparator, display the contents of the sheet before sorting, then sort, then display the number of comparisons that were made, then display the contents of the sheet after sorting.
Now let's see what three different versions of the .NET Framework produce.
.NET Framework 4.0 Version mscorlib.dll 4.0.30319.17929
4
5
6
****************
3
****************
4
5
6
5
6
****************
3
****************
4
5
6
.NET Framework 4.0 Version mscorlib.dll 4.0.30319.18047
4
5
6
****************
7
***************
6
5
4
5
6
****************
7
***************
6
5
4
.NET Framework 4.5
4
5
6
****************
3
****************
4
5
6
5
6
****************
3
****************
4
5
6
As you can see, in the older version of .NET, sorting is stable and only three comparisons are performed. In the latest version of .NET 4.0, sorting is unstable and seven (!) Comparisons are performed. In .NET 4.5, sorting is again robust and 3 comparisons are performed again.
Those. the new version of .NET 4.0 processes slower identical elements. To confirm this, I increased the number of identical elements to 100 and got 813 comparisons in the latest version of .NET 4.0, and 390 in .NET 4.0 of the old version and .NET 4.5.
Thus, it turns out that Microsoft did everything right, then incorrectly , then again right. Moreover, this problem can come back to me as an olympiad somewhere.
ZY: I wrote my own simplest quick sorting with the calculation of the number of comparisons:
the code
static void Sort(int left, int right) {
int l = left;
int r = right;
Smth m = list[left + (right - left) / 2];
while(true) {
Result.Count++;
while(list[l].Y < m.Y) {
l++;
Result.Count++;
}
Result.Count++;
while(list[r].Y > m.Y) {
r--;
Result.Count++;
}
if(l < r) {
Smth t = list[l];
list[l] = list[r];
list[r] = t;
l++;
r--;
}
if(l >= r)
break;
}
if(left < r)
Sort(left, r);
if(right > l)
Sort(l, right);
}
And even she betrayed that there were only 744 comparisons. less than comparer calls for the latest version of .NET 4.0