Static analysis in Go: how we save time when checking code

Hi, Habr. My name is Sergey Rudachenko, I am a technical member of the Roistat company. For the last two years, our team has been transferring various parts of the project to microservices on Go. They are developed by several teams, so we needed to set a hard bar for code quality. To do this, we use several tools, in this article we will discuss one of them - a static analysis.

Static analysis is the process of automatic verification of source code using special utilities. This article will talk about its benefits, briefly describe popular tools and give instructions for implementation. It is worth reading if you have not encountered such tools at all or use them spontaneously.

In articles on this topic, the term "linter" is often found. For us, this is a convenient name for simple static analysis tools. The linter's task is to search for simple errors and incorrect design.

Why do we need linters?

Working as a team, you are likely to review the code. Errors missed on the review are potential bugs. Missed unprocessed error- do not receive an informative message and will search for a problem blindly. Wrong in casting or turning to nil map - even worse, the binary will fall from panic.

The errors described above can be added to Code Conventions , but finding them when reading pull requests is not so easy, because the reviewer will have to read the code. If there is no compiler in your head, some of the problems will still go to battle. In addition, the search for minor errors distracts from the verification of logic and architecture. At a distance, the support of such a code will become more expensive. We write in a statically typed language, it is strange not to use it.

Popular tools

Most static analysis tools use packages go/astand go/parser. They provide functions for parsing the syntax of .go files. The standard execution flow (for example, for the golint utility) is:

load a list of files from the required packages
for each file is performed parser.ParseFile(...) (*ast.File, error)
starts checking the supported rules for each file or package
The check passes on each instruction, for example, like this:

f, err := parser.ParseFile(/* ... */)
ast.Walk(func(n *ast.Node) {
    switch v := node.(type) {
    case *ast.FuncDecl:
        if strings.Contains(v.Name, "_") {
            panic("wrong function naming")
        }
    }
}, f)

In addition to AST, there is a Single Static Assignment (SSA). This is a more complex way of analyzing code that works with a thread of execution, rather than syntactic constructs. In this article we will not discuss it in detail, you can read the documentation and take a look at the example of the stackcheck utility .

Further, only popular utilities that perform useful checks for us will be considered.

gofmt

This is a standard utility from the go package, which checks the style and can automatically fix it. Compliance with the style for us is a mandatory requirement, so checking gofmt is included in all our projects.

typecheck

Typecheck checks the type conformity in the code and supports the vendor (unlike gotype). Its launch is required to check compilability, but does not give full guarantees.

go vet

The go vet utility is part of the standard package and is recommended for use by the Go command. Checks for a number of common errors, for example:

incorrect use of printf and similar functions
incorrect build tags
function and nil comparison

golint

Golint is developed by the Go team and checks the code based on the Effective Go and CodeReviewComments documents . Unfortunately, there is no detailed documentation, but you can see from the code that the following is checked:

f.lintPackageComment()
f.lintImports()
f.lintBlankImports()
f.lintExported()
f.lintNames()
f.lintVarDecls()
f.lintElses()
f.lintRanges()
f.lintErrorf()
f.lintErrors()
f.lintErrorStrings()
f.lintReceiverNames()
f.lintIncDec()
f.lintErrorReturn()
f.lintUnexportedReturn()
f.lintTimeNames()
f.lintContextKeyTypes()
f.lintContextArgs()

staticcheck

The developers themselves represent staticcheck as an improved go vet. There are a lot of checks, they are divided into groups:

incorrect uses of standard libraries
problems with multithreading
test problems
useless code
performance issues
doubtful designs

gosimple

Specializes in finding constructions that should be simplified, for example:

Up to ( source code golint )

func(f *file)isMain()bool {
    if f.f.Name.Name == "main" {
        returntrue
    }
    returnfalse
}

After

func(f *file)isMain()bool {
    return f.f.Name.Name == "main"
}

Documentation is similar to staticcheck and includes detailed examples.

errcheck

Errors returned by functions cannot be ignored. The reasons are explained in detail in the Effective Go document . Errcheck will not miss the following code:

json.Unmarshal(text, &val)
f, _ := os.OpenFile(/* ... */)

gas

Finds vulnerabilities in the code: hard access, sql injections and the use of unsafe hash functions.

Examples of errors:

// доступ со всех IP адресов
l, err := net.Listen("tcp", ":2000")
// потенциальная sql инъекция
q := fmt.Sprintf("SELECT * FROM foo where name = '%s'", name)
q := "SELECT * FROM foo where name = " + name
// используйте другой хэш алгоритмimport"crypto/md5"

maligned

In Go, the order of fields in structures affects memory consumption. Maligned finds non-optimal sorting. With this order of fields:

struct {
    a bool
    b string
    c bool
}

The structure will take 32 bits in memory due to the addition of empty bits after the a and c fields.

If we change the sorting and put two bool fields together, then the structure takes only 24 bits:

The original image on stackoverflow

goconst

Magic variables in the code do not reflect the meaning and complicate reading. Goconst finds literals and numbers that occur 2 times or more in the code. Please note, often even a single use may be a mistake.

gocyclo

We consider the cyclomatic complexity of the code to be an important metric. Gocycle shows the difficulty for each function. You can only display functions that exceed the specified value.

gocyclo -over7 package/name

We chose a threshold value of 7 because we did not find a code with higher complexity that did not require refactoring.

Dead code

There are several utilities for finding unused code, their functionality may overlap in part.

ineffassign: checks useless assignments

funcfoo()error {
    var res interface{}
    log.Println(res)
    res, err := loadData() //  переменная res дальше не используетсяreturn err
}

deadcode: find unused functions
unused: finds unused functions, but does it better than deadcode

funcunusedFunc() {
    formallyUsedFunc()
}
funcformallyUsedFunc() {
}

As a result, unused will point at once to both functions, and deadcode only at unusedFunc. Due to this, the extra code is deleted in one pass. Also unused finds unused variables and fields of structures.

varcheck: find unused variables
unconvert: finds useless type conversions

var res intreturnint(res) // unconvert error

If there is no task to save on the launch time of checks, it is better to run them all together. If optimization is needed, I recommend using unused and unconvert.

How is it all convenient to set up

It is inconvenient to run the tools listed above consistently: errors are issued in a different format, execution takes a lot of time. Testing one of our services, ~ 8000 lines of code, took more than two minutes. Install utilities also have separately.

To solve this problem, there are aggregator utilities, for example, goreporter and gometalinter . Goreporter renders the report in html, and gometalinter writes to the console.

Gometalinter is still used in some large projects (for example, docker ). He is able to install all utilities with one command, run them in parallel and format errors according to a pattern. The execution time in the service mentioned above was reduced to one and a half minutes.

Aggregation works only on exact coincidence of the error text, therefore repeated errors are inevitable at the output.

In May 2018, the golangci-lint project appeared on the githaba , which greatly surpasses the gometalinter in convenience:

execution time on the same project was reduced to 16 seconds (8 times)
almost no duplicate errors
clear yaml config
nice error output with a line of code and a pointer to the problem
no need to install additional utilities

Now the increase in speed is provided by the reuse of SSA and loader.Program , in the future it is also planned to reuse the AST tree, which I wrote about at the beginning of the Tools section.

At the time of this writing on hub.docker.com there was no image with the documentation, so we made our own, customized according to our ideas of convenience. In the future, the config will change, so for production we recommend replacing it with your own. To do this, simply add the .golangci.yaml file to the project root directory ( an example is in the golangci-lint repository).

PACKAGE=package/name docker run --rm -t \
    -v $(GOPATH)/src/$(PACKAGE):/go/src/$(PACKAGE) \
    -w /go/src/$(PACKAGE) \
    roistat/golangci-lint

This team can check the entire project. For example, if it is in ~/go/src/project, then change the value of the variable to PACKAGE=project. Verification works recursively on all internal packages.

Please note that this command works correctly only when using vendor.

Implementation

In all our services, we use docker for development. Any project is started without the go environment set. To run the commands, use the Makefile and add the lint command to it:

lint:
    @docker run --rm -t -v $(GOPATH)/src/$(PACKAGE):/go/src/$(PACKAGE) -w /go/src/$(PACKAGE) roistat/golangci-lint

Now the check is started with this command:

make lint

There is an easy way to block code with errors from getting into the master - create a pre-receive-hook. It is suitable if:

You have a small project and few dependencies (or they are in the repository)
It's not a problem for you to wait for the command to complete for git pushseveral minutes.

Instructions for configuring hooks: Gitlab , Bitbucket Server , Github Enterprise .

In other cases, it is better to use CI and prohibit merge code, in which there is at least one error. We do just that by adding the launch of the linter before the tests.

Conclusion

The introduction of systematic checks markedly shortened the review period. However, more importantly: now we can discuss the big picture and architecture most of the time. This allows you to think about the development of the project instead of plugging holes.

Tags: