Go code generation

Original author: Rob Pike
A translation of Rob Pike's article from the official Go blog on automatic code generation with go generate. The article is somewhat dated (it was written before the release of Go 1.4, in which go generate first appeared), but it explains well how go generate works.

A property of universal computation, Turing completeness, is that a program can write another program. This is a powerful idea that is not appreciated as often as it deserves, even though it comes up frequently. It is a large part of the definition of what compilers do, for example. The go test command works on the same principle: it scans the packages to be tested, writes out a new Go program containing a test harness for them, then compiles and runs it. Modern computers are so fast that this seemingly expensive sequence can complete in a fraction of a second.

There are many other examples of programs that write programs. Yacc, for example, reads a description of a grammar and emits a program that parses that grammar. The Protocol Buffers "compiler" reads an interface description and emits structure definitions, methods, and other supporting code. Configuration tools of all sorts work in a similar way, extracting metadata from the environment and emitting custom scaffolding.

Programs that write programs are therefore an important element of software engineering, but programs like Yacc that produce source code must be integrated into the build process so that their output can be passed to the compiler. With an external build tool like Make, this is usually easy to do. But in Go, where the go tool gets all the necessary build information from the source code itself, it is a problem. There is simply no mechanism to run Yacc from the go tool alone.

Until now, that is.

Go's latest release, 1.4, includes a new command that makes it easier to run such tools. It is called go generate, and it works by scanning Go source code for special comments that identify commands to run. It is important to understand that go generate is not part of go build. It does no dependency analysis and must be run explicitly before go build. It is intended for the author of a Go package, not for its users.

The go generate command is very easy to use. To warm up, here is how to use it to generate a parser from a Yacc grammar. Say you have a Yacc input file called gopher.y that defines the grammar of your new language. To produce the Go source file that parses this grammar, you would normally invoke the standard Go version of yacc like this:
go tool yacc -o gopher.go -p parser gopher.y

The -o option here indicates the name of the resulting file, and -p the package name.

To have go generate drive this process, add the following comment to any regular (not auto-generated) .go file in the same directory:
//go:generate go tool yacc -o gopher.go -p parser gopher.y

This text is the same command, prefixed with a special comment that go generate recognizes. The comment must start at the beginning of the line and must have no spaces between // and go:generate. After that marker, the rest of the line specifies the command that go generate should run.
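To make the placement rules concrete, here is a minimal sketch of how a tool could pick such directives out of a source file. It is not the go tool's actual implementation: findDirectives is a hypothetical helper, and the argument splitting is deliberately naive (the real tool also understands double-quoted arguments).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// directivePrefix is the marker described above: it must start at the
// beginning of the line, with no space between "//" and "go:generate".
const directivePrefix = "//go:generate "

// findDirectives is a hypothetical helper, not part of the go tool.
// It returns the command words of every directive found in src.
func findDirectives(src string) [][]string {
	var cmds [][]string
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, directivePrefix) {
			continue // indented or "// go:generate" lines are ordinary comments
		}
		// Assumption: a plain whitespace split is enough for this sketch.
		cmds = append(cmds, strings.Fields(line[len(directivePrefix):]))
	}
	return cmds
}

func main() {
	src := "package parser\n" +
		"\n" +
		"//go:generate go tool yacc -o gopher.go -p parser gopher.y\n" +
		"// go:generate this line has a space, so it is ignored\n"
	for _, cmd := range findDirectives(src) {
		fmt.Println(strings.Join(cmd, " "))
	}
}
```

Run as written, the sketch prints only the yacc command; the second comment is skipped because of the space after the slashes.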

Now run it. Go to the source directory and run go generate, then go build and so on:
$ cd $GOPATH/myrepo/gopher
$ go generate
$ go build
$ go test

And that's all it takes. If there are no errors, go generate will invoke yacc to create gopher.go. At that point the directory holds the full set of Go source files, so we can build, test, and work with them as usual. Every time gopher.y changes, just rerun go generate to regenerate the parser.

For more detail on how go generate works, including its options, environment variables, and so on, see the design document.

Go generate does nothing that could not be done with Make or another build mechanism, but it comes with the go tool (no extra installation is required) and it fits well with the Go ecosystem. Above all, remember that it is for package authors, not for users, if only because the program it invokes may not be available on the user's machine. Also, if the package is meant to be fetched with go get, do not forget to check the generated files into version control so that they are available to users.

Now let's use it for something new. As a very different example of how go generate can help, there is a new program called stringer in the golang.org/x/tools repository. It automatically writes String() string methods for sets of integer constants. It is not part of the standard Go distribution, but it is easy to install:
$ go get golang.org/x/tools/cmd/stringer

Here is an example from the documentation for stringer. Imagine we have some code containing a set of integer constants that define different types of pills:
package painkiller
type Pill int
const (
    Placebo Pill = iota
    Aspirin
    Ibuprofen
    Paracetamol
    Acetaminophen = Paracetamol
)

For debugging, we would like these constants to print their own names nicely. In other words, we want a method with the following signature:
func (p Pill) String() string

It is easy to write by hand, for example, something like this:
func (p Pill) String() string {
    switch p {
    case Placebo:
        return "Placebo"
    case Aspirin:
        return "Aspirin"
    case Ibuprofen:
        return "Ibuprofen"
    case Paracetamol: // == Acetaminophen
        return "Paracetamol"
    }
    return fmt.Sprintf("Pill(%d)", p)
}

There are other ways to write this function, of course. We could use a slice of strings indexed by Pill, or a map, or some other technique. Whichever we choose, we must maintain it every time we change the set of pills, and we must verify that the code is correct. (The two names for paracetamol, for example, make this code a little trickier than it would otherwise be.) On top of that, the choice of implementation depends on the types and values involved: signed or unsigned, dense or sparse, zero-based or not, and so on.
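As an illustration of one of those alternatives, here is a sketch of the slice-indexed variant written by hand. The name pillNames is an assumption made for this example; the point is that the slice has to be kept in step with the constants manually, which is exactly the maintenance burden described above.

```go
package main

import "fmt"

type Pill int

const (
	Placebo Pill = iota
	Aspirin
	Ibuprofen
	Paracetamol
	Acetaminophen = Paracetamol
)

// pillNames (a name chosen for this sketch) must list the names in the
// same order as the constants above; nothing checks that automatically.
var pillNames = []string{"Placebo", "Aspirin", "Ibuprofen", "Paracetamol"}

// String looks the name up by index and falls back to a numeric form
// for values outside the known range.
func (p Pill) String() string {
	if p < 0 || int(p) >= len(pillNames) {
		return fmt.Sprintf("Pill(%d)", int(p))
	}
	return pillNames[p]
}

func main() {
	fmt.Println(Aspirin, Acetaminophen, Pill(9)) // Aspirin Paracetamol Pill(9)
}
```

Note that Acetaminophen prints as Paracetamol, because both names denote the same value and only one name can be stored per index.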

The stringer program takes care of this. Although it can be launched manually, it is intended to be launched through go generate. To use it, add a comment to the source, most likely in the code with the type definition:
//go:generate stringer -type=Pill
This rule indicates that go generate should run the stringer command to generate the String method for type Pill. The output will automatically be written to the pill_string.go file (the output can be overridden with the -output flag).

Let's run it:
$ go generate
$ cat pill_string.go
// generated by stringer -type Pill pill.go; DO NOT EDIT
package painkiller
import "fmt"
const _Pill_name = "PlaceboAspirinIbuprofenParacetamol"
var _Pill_index = [...]uint8{0, 7, 14, 23, 34}
func (i Pill) String() string {
    if i < 0 || i+1 >= Pill(len(_Pill_index)) {
        return fmt.Sprintf("Pill(%d)", i)
    }
    return _Pill_name[_Pill_index[i]:_Pill_index[i+1]]
}
$

Every time we change the definition of Pill or constants, all we have to do is run
$ go generate

to update the String method. And of course, if we have several types in one package that need to be updated, go generate will update them all.

It goes without saying that the generated code is ugly. That is OK, though, because people will not work on this code; machine-generated code is often ugly. It works hard to be efficient. All the names are packed together into a single string, which saves memory (only one string for all the names, even if there are a great many of them). The array _Pill_index then maps each value to its name using a simple and efficient technique. Note, too, that _Pill_index is an array (not a slice; one header fewer) of uint8 values, the smallest integer type able to hold the necessary offsets. If there are more values, or some of them are negative, the type of the generated _Pill_index array may change to uint16 or int8, whichever works best.
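The offsets in _Pill_index are simply the running total of the name lengths. The following sketch recomputes them to show where the numbers come from; buildTable and lookup are hypothetical helpers written for this illustration and are not taken from stringer's source.

```go
package main

import (
	"fmt"
	"strings"
)

// buildTable is a hypothetical helper: it joins the names into one string
// and records the offset where each name starts, plus a final end offset.
func buildTable(names []string) (string, []int) {
	index := make([]int, 0, len(names)+1)
	offset := 0
	for _, n := range names {
		index = append(index, offset)
		offset += len(n)
	}
	index = append(index, offset) // end of the last name
	return strings.Join(names, ""), index
}

// lookup returns the i'th name by slicing the joined string between
// two neighbouring offsets, as the generated String method does.
func lookup(joined string, index []int, i int) string {
	return joined[index[i]:index[i+1]]
}

func main() {
	joined, index := buildTable([]string{"Placebo", "Aspirin", "Ibuprofen", "Paracetamol"})
	fmt.Println(joined)                   // PlaceboAspirinIbuprofenParacetamol
	fmt.Println(index)                    // [0 7 14 23 34]
	fmt.Println(lookup(joined, index, 2)) // Ibuprofen
}
```

The table has one more entry than there are names, which is why the generated bounds check compares i+1 against the length of _Pill_index.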

The approach used in the methods that stringer generates varies with the properties of the constant set. For example, if the constants are sparse, it may use a map. Here is a simple example based on a set of constants representing powers of two:

const _Power_name = "p0p1p2p3p4p5..."
var _Power_map = map[Power]string{
    1:    _Power_name[0:2],
    2:    _Power_name[2:4],
    4:    _Power_name[4:6],
    8:    _Power_name[6:8],
    16:   _Power_name[8:10],
    32:   _Power_name[10:12],
    ...,
}
func (i Power) String() string {
    if str, ok := _Power_map[i]; ok {
        return str
    }
    return fmt.Sprintf("Power(%d)", i)
}
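Because the listing above is abbreviated, here is a self-contained sketch of the same map-based pattern that can be compiled and run. It is hand-written rather than real stringer output, and it assumes a Power type with only four constants named p0 to p3.

```go
package main

import "fmt"

// Power and its four constants are assumptions made for this sketch.
type Power int

const (
	p0 Power = 1 << iota // 1
	p1                   // 2
	p2                   // 4
	p3                   // 8
)

// powerName holds all names in one string; the map slices into it.
const powerName = "p0p1p2p3"

var powerMap = map[Power]string{
	p0: powerName[0:2],
	p1: powerName[2:4],
	p2: powerName[4:6],
	p3: powerName[6:8],
}

// String returns the constant's name, or a numeric form for any value
// that is not one of the declared powers of two.
func (i Power) String() string {
	if str, ok := powerMap[i]; ok {
		return str
	}
	return fmt.Sprintf("Power(%d)", int(i))
}

func main() {
	fmt.Println(Power(4), Power(3)) // p2 Power(3)
}
```

A map suits this case because an index array covering every value up to the largest constant would be mostly empty.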

In short, generating the method automatically lets us do a better job than we would expect a person to do by hand.

There are many other uses of go generate already in the Go source tree. Examples include generating Unicode tables in the unicode package, creating efficient methods for encoding and decoding arrays in encoding/gob, producing time zone data in the time package, and so on.

Please use go generate creatively. It is there to encourage experimentation.

And even if you don't, use stringer to write the String methods for your integer constants. Let the computer do the work for you.
