 August 25, 2016 at 15:13
 August 25, 2016 at 15:13Checking MSBuild source code using PVS-Studio
 While working on the development of the PVS-Studio static source code analyzer, we often encounter the need to check for errors of large open projects from eminent developers. The fact that even in such projects it is possible to find errors makes our work much more meaningful. Unfortunately, everyone makes mistakes. No matter how competently the quality control system of the released program code is built up, there is absolutely no way to avoid the features of the “human factor”. As long as people are developing programs, the relevance of using tools for finding errors in code, such as PVS-Studio, will not decrease. Today we will look for errors in the source code of MSBuild, which, alas, is also not ideal.
While working on the development of the PVS-Studio static source code analyzer, we often encounter the need to check for errors of large open projects from eminent developers. The fact that even in such projects it is possible to find errors makes our work much more meaningful. Unfortunately, everyone makes mistakes. No matter how competently the quality control system of the released program code is built up, there is absolutely no way to avoid the features of the “human factor”. As long as people are developing programs, the relevance of using tools for finding errors in code, such as PVS-Studio, will not decrease. Today we will look for errors in the source code of MSBuild, which, alas, is also not ideal.Introduction
Microsoft Build Engine ( MSBuild ) is a platform for automating the assembly of applications from Microsoft. MSBuild is usually used in conjunction with Microsoft Visual Studio, but is not dependent on it. MSBuild provides an XML schema for the project file (* .csproj, * .vbproj, * .vcxproj) that controls the processing and assembly of applications by the build platform. MSBuild is part of the .NET platform and is developed in the C # programming language.
Microsoft provides MSBuild source codes for free download on GitHub. Given the high standards of quality application development adopted by Microsoft, the task of finding errors in the source code of MSBuild is not easy even for a high-quality static analyzer. But the road will be overpowered by the walking one. Let's check the source code of MSBuild using PVS-Studio version 6.07.
Initial data and general verification statistics
The MSBuild solution consists of 14 projects, which, in turn, contain a total of 1256 files with source code in the C # programming language. The approximate number of lines of code is 440,000.
After checking MSBuild with the PVS-Studio static analyzer, 262 warnings were received. The general statistics of the check with a distinction according to the importance levels of the received warnings is:

The chart shows that 73 warnings were issued high, 107 medium and 82 low. The main emphasis should be on the study of messages with levels of High and Medium. Potentially dangerous constructs and real errors are contained here. Low level warnings also indicate errors, but they have a high percentage of false positives, and when writing articles we usually do not study them.
The analysis of the warnings received revealed that at the High and Medium levels there are about 45% of erroneous constructions (81 errors). The remaining warnings are not errors, but are simply constructions suspicious from the point of view of PVS-Studio and false positives. Some warnings were received for Unit tests or in those parts of the code where comments are present explaining that the construction is knowingly unsafe and is used to check for throwing exceptions. Nevertheless, the remaining analyzer warnings require close attention of the developers, since only the authors really know their code and are able to give an adequate assessment of the correctness of a warning.
With this in mind, PVS-Studio's error detection rate at High and Medium per thousand lines of code (error density) is only 0.184 (errors / 1 KLoc). This is not surprising for a product developed by Microsoft, and testifies to the high quality of MSBuild code.
Let us describe the results obtained in more detail. We note that the analysis of all warnings issued by the analyzer is beyond the scope of this article. We restrict ourselves to a description of the most interesting messages, grouping similar cases.
Validation Results
Error checking for null equality
PVS-Studio analyzer warning: V3019 Possibly an incorrect variable is compared to null after type conversion using 'as' keyword. Check variables 'obj', 'name'. AssemblyRemapping.cs 64
Perhaps it has already become a classic mistake that occurs in almost every project we check. After casting with the as operator , the wrong variable is checked for null :
AssemblyNameExtension name = obj as AssemblyNameExtension;
if (obj == null)  // <=
{
  return false;
}In this case, it is necessary to check the null variable for name . The correct version of the code:
AssemblyNameExtension name = obj as AssemblyNameExtension;
if (name == null)
{
  return false;
}Untrusted null equality check
PVS-Studio analyzer warning: V3095 The 'diskRoots' object was used before it was verified against null. Check lines: 2656, 2659. ToolLocationHelper.cs 2656
Pay attention to the diskRoots parameter . This is an object of class List, and its value may be null . However, this fact is checked only in the second if block after the diskRoots variable was used to insert values into the list:
private static void ExtractSdkDiskRootsFromEnvironment
(List diskRoots, string directoryRoots)
{
  if (!String.IsNullOrEmpty(directoryRoots))
  {
    ....
    diskRoots.AddRange(splitRoots);  // <=
  }
  if (diskRoots != null)             // <=
  ....
} In the MSBuild code, 8 more similar potentially unsafe constructs were found:
- V3095 The 'propertyValue' object was used before it was verified against null. Check lines: 2760, 2799. Expander.cs 2760
- V3095 The 'publicKeyToken' object was used before it was verified against null. Check lines: 232, 236. GenerateBindingRedirects.cs 232
- V3095 The 'searchLocation' object was used before it was verified against null. Check lines: 170, 178. Resolver.cs 170
- V3095 The 'assemblyName' object was used before it was verified against null. Check lines: 176, 194. Resolver.cs 176
- V3095 The 'searchLocation' object was used before it was verified against null. Check lines: 249, 264. Resolver.cs 249
- V3095 The 'ReferenceInfo' object was used before it was verified against null. Check lines: 87, 97. AxReference.cs 87
- V3095 The 'packageFileName' object was used before it was verified against null. Check lines: 1448, 1457. BootstrapperBuilder.cs 1448
- V3095 The 'metadataNames' object was used before it was verified against null. Check lines: 243, 253. CommandLineBuilderExtension.cs 243
Erroneous assumption about string length
PVS-Studio analyzer warning: V3057 The 'Substring' function could receive the '-1' value while non-negative value is expected. Inspect the second argument. Utilities.cs 328
The condition for entering the if block is a string of one or more characters. At the same time, an attempt is made to obtain a substring of the source string inside the block. Obviously, in a single-character string, the second parameter of the Substring method will be negative, and the method will throw an
ArgumentOutOfRangeException :
if (toolsVersionList.Length > 0)
{
  toolsVersionList = toolsVersionList.Substring(0,
    toolsVersionList.Length - 2);
}A safe version of this piece of code might look like this:
if (toolsVersionList.Length > 1)
{
  toolsVersionList = toolsVersionList.Substring(0,
    toolsVersionList.Length - 2);
}Similar errors in the code:
- V3057 The 'Substring' function could receive the '-1' value while non-negative value is expected. Inspect the second argument. SolutionFile.cs 1217
- V3057 The 'Substring' function could receive the '-1' value while non-negative value is expected. Inspect the second argument. XMake.cs 2929
- V3057 The 'Remove' function could receive the '-1' value while non-negative value is expected. Inspect the first argument. BootstrapperBuilder.cs 767
Cast with a loss of precision
Caution-Studio analyzer of PVS: V3041 of The expression WAS implicitly cast from 'long' of the type to 'float' of the type. Consider utilizing an explicit type cast to avoid the loss of a fractional part. An example: double A = (double) (X) / Y ;. CommunicationsUtilities.cs 593 The
variables now and s_lastLoggedTicks are of type long . Calculations are performed, the result of which should be a value of type float . However, since the division operation is performed on variables of the long type , and only after that the result of the expression is converted to the float type , accuracy will be lost:
float millisecondsSinceLastLog =
  (float)((now – s_lastLoggedTicks)/10000L);The correct version of this design:
float millisecondsSinceLastLog =
  (float)(now – s_lastLoggedTicks)/10000;You should always be careful about calculations that share integer values and floating point values.
Method that always returns true
PVS-Studio analyzer warning: V3009 It's odd that this method always returns one and the same value of 'true'. ComReference.cs 304
The GetTypeLibNameForITypeLib method returns true under any conditions:
internal static bool GetTypeLibNameForITypeLib(....)
{
  ....
  if (typeLib2 == null)
  {
    ....
    return true;  // <=
  }
  ....
  try
  {
    if (data == null || ...)
    {
      ....
      return true;  // <=
    }
    ....
  }
  catch (COMException ex)
  {
    ....
    return true;  // <=
  }
  return true;  // <=
}In this case, the value of the bool type returned by the GetTypeLibNameForITypeLib method is checked in the calling code. This behavior can lead to unpredictable consequences. It is necessary to refactor the code and fix the error.
Pointless comparison
PVS-Studio analyzer warning: V3022 Expression 'itemsAndMetadataFound.Metadata.Values.Count> 0' is always true. Evaluator.cs 1752
After the itemsAndMetadataFound.Metadata.Values.Count> 0 is checked in the if block , the same check, which is already meaningless, is performed inside the block:
if (itemsAndMetadataFound.Metadata != null && 
    itemsAndMetadataFound.Metadata.Values.Count > 0)
{
  ....
  if (itemsAndMetadataFound.Metadata.Values.Count > 0)  // <=
  {
    needToProcessItemsIndividually = true;
  }
  ....
}In addition to this, 7 more similar errors were found in the MSBuild code:
- V3022 Expression 'fixedPathInfo! = Null' is always true. FrameworkLocationHelper.cs 794
- V3022 Expression '_shutdownException! = Null' is always false. InProcNode.cs 527
- V3022 Expression 'proj! = Null' is always true. SolutionFile.cs 817
- V3022 Expression '_directMetadata == null' is always false. ProjectItem.cs 755
- V3022 Expression 'Constants.defaultToolsVersion == "2.0"' is always true. ToolsetReader.cs 194
- V3022 Expression '! IsQuotedTransform && functionCapture == null' is always true. ExpressionShredder.cs 281
- V3022 Expression '! IsQuotedTransform && functionCapture == null' is always true. ExpressionShredder.cs 414
Mutually exclusive comparisons
PVS-Studio analyzer warning: V3011 Two opposite conditions were encountered. The second condition is always false. Check lines: 2840, 2838. XMake.cs 2840
The condition for entering the if block is the null equality of the logger variable . However, already inside the block, the VerifyThrow method uses the null inequality check of the same variable. Thus, the verification performed for the VerifyThrow method will always be false:
if (logger == null)
{
  InitializationException.VerifyThrow(logger != null,  // <=
    "LoggerNotFoundError", unquotedParameter);
}It’s hard to say what the correct code should look like, but definitely not. Perhaps the use of the if statement in this case is not required at all.
Unused arguments when formatting a string.
PVS-Studio analyzer warning: V3025 Incorrect format. A different number of format items is expected while calling 'WriteLine' function. Arguments not used: 1st. Scheduler.cs 2216 The
error is contained in the second line of code. Apparently, it was obtained by copying the first line (the notorious copy-paste) and replacing the first parameter in it. At the same time, the second _schedulingData.EventTime.Ticks parameter, which became unnecessary , was forgotten to be deleted:
file.WriteLine("Scheduler state at timestamp {0}:",
  _schedulingData.EventTime.Ticks);
file.WriteLine("------------------------------------------------",
  _schedulingData.EventTime.Ticks);  // <=Thus, the second line of code mistakenly uses an overload of the WriteLine method (string format, object arg0) instead of the correct one.
Similar errors found:
- V3025 Incorrect format. A different number of format items is expected while calling 'Format' function. Arguments not used: resource. XmlUtil.cs 75
- V3025 Incorrect format. A different number of format items is expected while calling 'Format' function. Arguments not used: resource. XmlUtil.cs 82
- V3025 Incorrect format. A different number of format items is expected while calling 'Format' function. Arguments not used: resource. XmlUtil.cs 91
- V3025 Incorrect format. A different number of format items is expected while calling 'Format' function. Arguments not used: resource. XmlUtil.cs 112
Unused parameter
PVS-Studio analyzer warning: V3061 Parameter 'numericValue' is always rewritten in method body before being used. NodePacketTranslator.cs 320 The
list of formal method parameters contains the variable numericValue , the value of which is never used, since it is immediately replaced by a new value:
public void TranslateEnum(ref T value, int numericValue)
{
  numericValue = _reader.ReadInt32();  // <=
  Type enumType = value.GetType();
  value = (T)Enum.ToObject(enumType, numericValue);
} Perhaps the code was refactored, but the method signature (unlike its body) was impossible to change. Otherwise, it makes sense to adjust this method:
public void TranslateEnum(ref T value)
{
  int numericValue = _reader.ReadInt32();
  Type enumType = value.GetType();
  value = (T)Enum.ToObject(enumType, numericValue);
} Another similar warning:
- V3061 Parameter 'defaultToolsVersion' is always rewritten in method body before being used. ToolsetProvider.cs 118
Extra Assignment
PVS-Studio analyzer warning: V3005 The '_nextProjectId' variable is assigned to itself. LoggingService.cs 325
The analyzer detected a construct in which excess assignment is made for the _nextProjectId field . First, the value MaxCPUCount + 2 is calculated , which is added to the value _nextProjectId and assigned to it by the + = operator . And then the resulting value is once again assigned to the _nextProjectId field :
public int NextProjectId
{
  get
  {
    lock (_lockObject)
    {
      _nextProjectId = _nextProjectId += MaxCPUCount + 2;  // <=
      return _nextProjectId;
    }
  }
}In this case, there is no error, but the code looks strange. It makes sense to simplify this construction:
public int NextProjectId
{
  get
  {
    lock (_lockObject)
    {
      _nextProjectId += MaxCPUCount + 2;
      return _nextProjectId;
    }
  }
}Conclusion
In conclusion, I would like to note how useful it can be to regularly use static code analyzers, such as PVS-Studio, to search for potential and real errors, even in such high-quality projects as MSBuild.
You can always repeat the error search examples given in this article, as well as check your own projects using the demo version of PVS-Studio analyzer .

If you want to share this article with an English-speaking audience, then please use the link to the translation: Sergey Khrenov. Checking the Source Code of MSBuild with PVS-Studio .
Have you read the article and have a question?
Often our articles are asked the same questions. We collected the answers here: Answers to questions from readers of articles about PVS-Studio, version 2015 . Please see the list.