First article about checking a C # project
Now the PVS-Studio team is actively developing a static analyzer of C # code. We plan to release the first version of the analyzer by the end of 2015. In the meantime, my task is to write several articles in order to interest C # programmers in advance with the PVS-Studio tool. Today I was given an updated installer. Now it has become possible to install PVS-Studio with C # support and even check something. I was not slow to take advantage of this and decided to check the first thing that turns up by the arm. The Umbraco project was the first to turn up. Of course, while the current version of the analyzer can do little, but this is already enough for me to write a small article.
Umbraco
Umbraco is an open source content management system platform used to publish content worldwide. It is written in C #. Since the release of 4.5, the entire system has become available under the MIT license.
The project is medium in size. However, the part written in C # is not very large. Most of the project is written in JavaScript. In total, the project has about 3200 files with the extension ".cs", with a total size of 15 megabytes. Number of lines of C # code: 400 KLOC.
About PVS-Studio 6.00
Verification will be carried out using the alpha version of PVS-Studio 6.00 analyzer. There are two big changes in this version:
- Analysis of C # projects will be supported.
- The analyzer will no longer support VS2005, VS2008. We suggest a small number of users working with VS2005 / 2008 to continue to use version 5.31 or higher versions 5.xx if any errors are fixed.
Pricing policy will remain the same. We do not make a new product, but expand the capabilities of the existing one. We just support one more programming language. Previously, you could buy PVS-Studio and use it to test projects written in the languages: C, C ++, C ++ / CLI , C ++ / CX . Now it’s additionally possible to check C # projects. This will not affect the price in any way. Those who have already purchased the analyzer for C ++ can check C # code at the same time.
Why c #?
At conferences, I often said that making a C # analyzer is not very interesting. Many errors that are present in C ++ in C # are simply impossible. It really is. For example, in C # there are no functions like memset (), respectively, and there are no mass problems (see examples related to memset (): V511 , V512 , V575 , V579 , V597 , V598 ).
However, I gradually changed my mind. A large number of errors detected by PVS-Studio are associated not with some features of the programming language, but with the carelessness of the programmers. I mean typos and a failed code change after copy-paste. This is where the PVS-Studio analyzer for C ++ is strong, and we decided that these developments can also be applied to C #.
The C # language does not protect against confusing the variable name or from the " last line effect " associated with loss of attention.
Another important factor that pushed to decide to create a C # analyzer is the appearance of Roslyn. Without it, the work of creating an analyzer would be too expensive for us.
RoslynIs an open compilation platform for C # and Visual Basic. Roslyn performs two main actions: builds a parsing tree and compiles it. Additionally, it allows you to analyze the source code, recursively bypass it, work with Visual Studio projects, and execute code on the fly.
What was interesting?
For C ++, I have a favorite V501 diagnostic . Now there is its counterpart for C #: V3001. Let's start with this diagnosis.
Code snippet N1
The code has the “focal point” property:
[DataMember(Name = "focalPoint")]
public ImageCropFocalPoint FocalPoint { get; set; }
This property is of type 'ImageCropFocalPoint', the definition of which is given below:
public class ImageCropFocalPoint
{
[DataMember(Name = "left")]
public decimal Left { get; set; }
[DataMember(Name = "top")]
public decimal Top { get; set; }
}
It would seem that it is impossible to make a mistake when working with such a property. But no. There is an annoying typo in the HasFocalPoint () method:
public bool HasFocalPoint()
{
return FocalPoint != null &&
FocalPoint.Top != 0.5m && FocalPoint.Top != 0.5m;
}
'Top' is checked twice, but they forgot about 'Left'.
Corresponding warning of PVS-Studio: V3001 There are identical sub-expressions 'FocalPoint.Top! = 0.5m' to the left and to the right of the '&&' operator. ImageCropDataSet.cs 58
N2 Code Snippet
protected virtual void OnBeforeNodeRender(ref XmlTree sender,
ref XmlTreeNode node,
EventArgs e)
{
if (node != null && node != null)
{
if (BeforeNodeRender != null)
BeforeNodeRender(ref sender, ref node, e);
}
}
PVS-Studio Warning: V3001 There are identical sub-expressions 'node! = Null' to the left and to the right of the '&&' operator. BaseTree.cs 503 The
'node' link is double checked. Most likely they also wanted to check the 'sender' link.
Code snippet N3
public void Set (ExifTag key, string value)
{
if (items.ContainsKey (key))
items.Remove (key);
if (key == ExifTag.WindowsTitle || <<<<----
key == ExifTag.WindowsTitle || <<<<----
key == ExifTag.WindowsComment ||
key == ExifTag.WindowsAuthor ||
key == ExifTag.WindowsKeywords ||
key == ExifTag.WindowsSubject) {
items.Add (key, new WindowsByteString (key, value));
....
}
PVS-Studio Warning: V3001 There are identical sub-expressions 'key == ExifTag.WindowsTitle' to the left and to the right of the '||' operator. ExifPropertyCollection.cs 78 The
key is compared twice with the constant 'ExifTag.WindowsTitle'. It’s hard for me to judge how serious this mistake is. Perhaps one of the checks is just superfluous, and it can be deleted. But perhaps there should be a comparison with some other constant.
Code snippet N4
Consider another case where I am not at all sure of an error. However, this code is worth checking again.
There is an enumeration with 4 named constants:
public enum DBTypes
{
Integer,
Date,
Nvarchar,
Ntext
}
But the SetProperty () method for some reason considers only 3 options. Once again, I’m not saying that this is a mistake. However, the analyzer suggests paying attention to this piece of code and I completely agree with it.
public static Content SetProperty(....)
{
....
switch (((DefaultData)property.PropertyType.
DataTypeDefinition.DataType.Data).DatabaseType)
{
case DBTypes.Ntext:
case DBTypes.Nvarchar:
property.Value = preValue.Id.ToString();
break;
case DBTypes.Integer:
property.Value = preValue.Id;
break;
}
....
}
Warning PVS-Studio: V3002 The switch statement does not cover all values of the 'DBTypes' enum: Date. ContentExtensions.cs 286
N5 Code Snippet
public TinyMCE(IData Data, string Configuration)
{
....
if (p.Alias.StartsWith("."))
styles += p.Text + "=" + p.Alias;
else
styles += p.Text + "=" + p.Alias;
....
}
PVS-Studio Warning: V3004 The 'then' statement is equivalent to the 'else' statement. TinyMCE.cs 170
Code snippet N6, N7
At the beginning of the article I said that C # does not protect against the “ last line effect ”. And so we got to the corresponding example:
public void SavePassword(IMember member, string password)
{
....
member.RawPasswordValue = result.RawPasswordValue;
member.LastPasswordChangeDate = result.LastPasswordChangeDate;
member.UpdateDate = member.UpdateDate;
}
PVS-Studio Warning: V3005 The 'member.UpdateDate' variable is assigned to itself. MemberService.cs 114
The programmer copied class members from the 'result' object to the 'member' object. But at the last moment, the person relaxed and copied the member 'member.UpdateDate' to himself.
It is also alarming that the SavePassword () method works with passwords, which means that it requires increased attention to itself.
Exactly the same code fragment can be observed in the UserService.cs file (see line 269). Most likely he was simply copied there without checking.
Code snippet N8
private bool ConvertPropertyValueByDataType(....)
{
if (string.IsNullOrEmpty(string.Format("{0}", result)))
{
result = false;
return true;
}
....
return true;
....
return true;
....
return true;
....
return true;
....
....
return true;
}
PVS-Studio Warning: V3009 It's odd that this method always returns one and the same value of 'true'. DynamicNode.cs 695
The method uses many 'if' statements and many 'return' statements. It is alarming that all the 'return' statements return the value 'true'. Is there any mistake in this? Perhaps after all somewhere it was necessary to return 'false'?
Fragment of code N9 I
suggest readers to check attentiveness and to find an error in the following code fragment. Learn the method for this, but do not read the text below. In order not to do this by accident, I inserted a separator (unicorn :).
public static string GetTreePathFromFilePath(string filePath)
{
List treePath = new List();
treePath.Add("-1");
treePath.Add("init");
string[] pathPaths = filePath.Split('/');
pathPaths.Reverse();
for (int p = 0; p < pathPaths.Length; p++)
{
treePath.Add(
string.Join("/", pathPaths.Take(p + 1).ToArray()));
}
string sPath = string.Join(",", treePath.ToArray());
return sPath;
}
Figure 1. Separates the code from the explanation of what the error is.
PVS-Studio Warning: V3010 The return value of function 'Reverse' is required to be utilized. DeepLink.cs 19
By invoking the Reverse () method, the programmer planned to modify the array 'pathPaths'. Perhaps he was confused by the fact that such an operation is completely correct when it comes, for example, to lists ( List
It will be correct to write:
string[] pathPaths = filePath.Split('/');
pathPaths = pathPaths.Reverse().ToArray();
Or even like this:
string[] pathPaths = filePath.Split('/').Reverse().ToArray();
Snippet of code N10
PVS-Studio analyzer issued several warnings V3013 about the suspicious coincidence of the bodies of some methods. In my opinion, all these warnings are false. It seems to me that only one of these warnings deserves attention:
public void GetAbsolutePathDecoded(string input, string expected)
{
var source = new Uri(input, UriKind.RelativeOrAbsolute);
var output = source.GetSafeAbsolutePathDecoded();
Assert.AreEqual(expected, output);
}
public void GetSafeAbsolutePathDecoded(string input, string expected)
{
var source = new Uri(input, UriKind.RelativeOrAbsolute);
var output = source.GetSafeAbsolutePathDecoded();
Assert.AreEqual(expected, output);
}
PVS-Studio Warning: V3013 It is odd that the body of 'GetAbsolutePathDecoded' function is fully equivalent to the body of 'GetSafeAbsolutePathDecoded' function. UriExtensionsTests.cs 141
Perhaps inside the GetAbsolutePathDecoded () method, you should not use
source.GetSafeAbsolutePathDecoded()
a
source. GetAbsolutePathDecoded()
I'm not sure I'm right, but this place is worth checking.
Answers on questions
The article is designed for a new audience. Therefore, I foresee a number of issues that may arise. I’ll try to answer some of them in advance.
Did you notify the project developers about the defects found?
Yes, we always try to do it.
Do you use PVS-Studio to verify the code of PVS-Studio itself?
Yes.
PVS-Studio supports Mono?
No.
More detailed answers to these and other questions are given in the note: " Answers to questions from readers of articles about PVS-Studio ."
Conclusion
There were not many errors in the project. Our readers from the C ++ audience already know why this is happening. But since we only have to conquer and seduce C # programmers, I will list a number of important points:
- Static analyzer is a regular use tool. Its meaning is to find mistakes at an early stage. There is no point in one-time checks of the project. As a rule, errors that do not significantly affect the operation of the program or located in rarely used sections of the code are detected this way. The reason is that real mistakes all this time were corrected with sweat and blood. Programmers found them, debugging the code for hours , testers found them, and even worse, users. Many of these errors could be fixed right away if you regularly use a static code analyzer. Consider PVS-Studio as an extension of the C # compiler warnings. After all, I hope you see the list of warnings issued by the compiler more than once a year? In more detail, all this is discussed in the article: "Leo Tolstoy and static code analysis . "
- In the articles we mention only those code fragments that seemed interesting to us. As a rule, we do not write about code that the analyzer quite honestly considered suspicious, but it is clear that there is no real error. We call this "smell code." When you use PVS-Studio, this code is worth checking. But to talk about such places in the article is inappropriate.
- This item is not for C ++, but it exists for C # so far. We have not yet implemented so many diagnostics, but we are moving fast. Give our C # unicorn a bit of growth. Then he will show you!
Thank you all for your attention and wish you all hopeless programs.
If you want to share this article with an English-speaking audience, then please use the link to the translation: Andrey Karpov. The First C # Project Analyzed .