Practical Go: Real-World Advice for Writing Maintainable Go Programs

Original author: Dave Cheney
This article focuses on best practices for writing Go code. It is written in the style of a presentation, but without the usual slides. We will try to go through each item briefly and clearly.

First we need to agree on what best practices for a programming language actually mean. Here it helps to recall the words of Russ Cox, technical lead of Go:

“Software engineering is what happens to programming when you add time and other programmers.”

Thus, Russ distinguishes between programming and software engineering. In the first case, you write a program for yourself; in the second, you create a product that other programmers will work on over time. Engineers come and go. Teams grow or shrink. New features are added and bugs are fixed. This is the nature of software engineering.

1. Fundamental principles


I may be one of the earliest Go users among you, but what follows is not just my personal opinion. These basic principles underlie Go itself:

  1. Simplicity
  2. Readability
  3. Productivity

Note. I did not mention “performance” or “concurrency”. There are languages faster than Go, but they certainly cannot compare with it in simplicity. There are languages that make concurrency their top priority, but they cannot compare with it in readability or programmer productivity.

Performance and concurrency are important attributes, but not as important as simplicity, readability, and productivity.


1.1. Simplicity


“Simplicity is a prerequisite for reliability”  - Edsger Dijkstra

Why strive for simplicity? Why is it important that Go programs are simple?

Each of us has come across code we could not understand, right? Code you are afraid to change because the change might break another part of the program, one that you don't quite understand and don't know how to fix. That is complexity.

“There are two ways to design software: the first is to make it so simple that there are obviously no flaws, and the second is to make it so complex that there are no obvious flaws. The first is much more difficult.” - C. A. R. Hoare

Complexity turns reliable software into unreliable software. Complexity is what kills software projects. Therefore, simplicity is Go's ultimate goal. Whatever programs we write, they should be simple.

1.2. Readability


“Readability is essential for maintainability” - Mark Reinhold, JVM Language Summit, 2018

Why is it important that the code is readable? Why should we strive for readability?

“Programs must be written for people to read, and only incidentally for machines to execute” - Hal Abelson and Gerald Sussman, “Structure and Interpretation of Computer Programs”

Not only Go programs, but generally all software is written by people for people. The fact that machines also process code is secondary.

Code, once written, will be read by people again and again: hundreds, if not thousands, of times.

“The most important skill for a programmer is the ability to communicate ideas effectively.” - Gastón Jorquera

Readability is the key to understanding what a program does. If you cannot understand the code, how can you maintain it? If software cannot be maintained, it will be rewritten, and that may be the last time your company uses Go.

If you are writing a program for yourself, do what works for you. But if it is part of a joint project, or the program will be used long enough for its requirements, features, or operating environment to change, then your goal is to make the program maintainable.

The first step to writing maintainable software is to make sure the code is clear.

1.3. Productivity


“Design is the art of arranging code so that it works today and can be changed forever.” - Sandi Metz

The last basic principle I want to name is developer productivity. This is a big topic, but it comes down to a ratio: how much time you spend on useful work versus how much you spend waiting on tools or wandering hopelessly through an incomprehensible code base. Go programmers should feel that they can get a lot done.

There is a joke that Go was designed while waiting for a C++ program to compile. Fast compilation is a key feature of Go and a key factor in attracting new developers. Although compilers keep improving, in general what takes minutes to compile in other languages takes seconds in Go. So Go developers feel as productive as programmers in dynamic languages, but without the reliability problems of those languages.

More fundamentally, Go programmers understand that reading code matters more than writing it. Following this logic, Go even goes so far as to enforce, through tooling, that all code is formatted in one style. This removes the friction of learning a project-specific dialect and helps to spot mistakes, because they simply look wrong compared to regular code.

Go programmers do not spend days debugging strange compilation errors, complex build scripts, or deployments to production. And most importantly, they do not waste time trying to understand what a colleague wrote.

When Go developers talk about scalability, they mean productivity.

2. Identifiers


The first topic we will discuss is identifiers, which is another word for names: the names of variables, functions, methods, types, packages, and so on.

“Poor naming is symptomatic of poor design” - Dave Cheney

Given Go's limited syntax, the names we choose have a huge impact on program readability. Readability is a key factor in good code, so choosing good names is crucial.

2.1. Name identifiers based on clarity rather than brevity


“Obvious code is important. What you can do in one line, you should do in three.” - Ukiah Smith

Go is not optimized for clever one-liners or for the fewest lines in a program. We optimize neither the size of the source code on disk nor the time it takes to type the program into an editor.

“A good name is like a good joke. If you have to explain it, it's no longer funny.” - Dave Cheney

The key to maximum clarity is the names we choose for the identifiers in our programs. What qualities does a good name have?

  • A good name is concise. It does not have to be the shortest possible, but it contains nothing superfluous. It has a high signal-to-noise ratio.
  • A good name is descriptive. It describes the use of a variable or constant, not its contents. A good name describes the result of a function or the behavior of a method, not the implementation, and the purpose of a package, not its contents. The more accurately a name describes the thing it identifies, the better.
  • A good name is predictable. From the name alone you should be able to infer how the thing will be used. Names should be descriptive, but it is also important to follow tradition. That is what Go programmers mean when they say “idiomatic”.

Let us consider each of these properties in more detail.

2.2. Identifier length


Go's style is sometimes criticized for short variable names. As Rob Pike said, “Go programmers want identifiers of the right length.”

Andrew Gerrand suggests using longer identifiers to signal importance.

“The greater the distance between a name's declaration and its uses, the longer the name should be” - Andrew Gerrand

Thus, some recommendations can be made:

  • Short variable names work well when the distance between the declaration and the last use is small.
  • Long variable names need to justify themselves; the longer they are, the more value they need to provide. Verbose names carry little signal relative to their weight on the page.
  • Do not include the type name in the variable name.
  • Constant names should describe the value they hold, not how that value is used.
  • Prefer single-letter variables for loops and branches, single words for parameters and return values, and multiple words for functions and package-level declarations.
  • Prefer single words for methods, interfaces, and packages.
  • Remember that the package name is part of the name the caller uses to refer to an identifier.

Consider an example.

type Person struct {
	Name string
	Age  int
}

// AverageAge returns the average age of people.
func AverageAge(people []Person) int {
	if len(people) == 0 {
		return 0
	}

	var count, sum int
	for _, p := range people {
		sum += p.Age
		count++
	}

	return sum / count
}

The range variable p is declared in the for statement and referenced only once, on the very next line. In other words, the variable lives on the page for a very short time. A reader who wants to know the role of p in the program only needs to read two lines.

By comparison, people is declared in the function parameters and lives for seven lines. The same goes for sum and count, so they justify their longer names. The reader has to scan more code to find them, which justifies the more distinctive names.

You could choose s for sum and c (or n) for count, but that would reduce every variable in the program to the same level of importance. You could replace people with p, but then there is the problem of what to call the for ... range iteration variable. The singular person would look odd, because the short-lived iteration variable would get a longer name than the slice of values it is derived from.

Tip. Separate the flow of a function with blank lines, just as blank lines between paragraphs break up the flow of text. In AverageAge we have three operations occurring in sequence: first, the check that guards against division by zero; then the accumulation of the total age and the number of people; and finally the computation of the average age.

2.2.1. The main thing is context


It is important to understand that most naming tips are context specific. I like to say that this is a principle, not a rule.

What is the difference between the identifiers i and index? For example, you cannot say unequivocally that this code

for index := 0; index < len(s); index++ {
	//
}

fundamentally more readable than

for i := 0; i < len(s); i++ {
	//
}

I believe the second option is no worse, because in this case the scope of i, or index, is limited to the body of the for loop, and the extra verbosity adds little to the understanding of the program.

But which of these functions is more readable?

func (s *SNMP) Fetch(oid []int, index int) (int, error)

or

func (s *SNMP) Fetch(o []int, i int) (int, error)

In this example, oid is the abbreviation for SNMP Object ID, and abbreviating it further to o forces the reader to switch from the documented notation to a shorter one in the code. Similarly, shortening index to i makes the code harder to understand, because in SNMP messages the sub-value of each OID is called an index.

Tip. Do not mix long and short formal parameters in the same declaration.

2.3. Do not name variables by type


You don’t call your pets “dog” and “cat”, right? For the same reason, you should not include the type name in the variable name. It should describe the content, not its type. Consider an example:

var usersMap map[string]*User

What is good about this declaration? We can see that it is a map and that it has something to do with the *User type, which is probably good. But usersMap really is a map, and Go, being a statically typed language, will not let us accidentally use it where a scalar variable is required, so the Map suffix is redundant.

Consider a situation where other variables are added:

var (
	companiesMap map[string]*Company
	productsMap  map[string]*Products
)

Now we have three map-typed variables: usersMap, companiesMap, and productsMap, all mapping strings to different types. We know they are maps, and we also know that the compiler will report an error if we try to use companiesMap where the code expects a map[string]*User. In this situation it is clear that the Map suffix does not improve the clarity of the code; it is just extra characters.

I suggest avoiding any suffix that resembles the type of the variable.

Tip. If users is not descriptive enough, then usersMap won't be either.
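As a rough sketch, here are the same kinds of declarations with the suffix dropped. The User, Company, and Product types, the sample data, and the lookupUser helper are all hypothetical, introduced only for illustration.

```go
package main

import "fmt"

// User, Company, and Product are hypothetical types used only for illustration.
type User struct{ Name string }
type Company struct{ Name string }
type Product struct{ Name string }

// The names say what the maps hold; the declared types already say they are maps.
var (
	users     = map[string]*User{"u1": {Name: "Alice"}}
	companies = map[string]*Company{"c1": {Name: "Acme"}}
	products  = map[string]*Product{"p1": {Name: "Widget"}}
)

// lookupUser returns the name of the user stored under id, or "" if there is none.
func lookupUser(users map[string]*User, id string) string {
	if u, ok := users[id]; ok {
		return u.Name
	}
	return ""
}

func main() {
	fmt.Println(lookupUser(users, "u1"))
	fmt.Println(len(companies), len(products))
	// lookupUser(companies, "c1") would not compile: the compiler, not the
	// variable name, enforces that a map[string]*Company is not a map[string]*User.
}
```

Nothing is lost by dropping the suffix: a mix-up between the maps is still caught at compile time.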

This tip also applies to function parameters. For instance:

type Config struct {
	//
}
func WriteConfig(w io.Writer, config *Config)

Naming the *Config parameter config is redundant. We already know it is a *Config; it says so right there.

In this case, consider conf or c if the lifetime of the variable is short enough.

If at some point there is more than one *Config in scope, the names conf1 and conf2 are less meaningful than original and updated, as the latter are harder to mix up.

Note. Do not let package names steal good variable names.

The name of an imported identifier includes the name of its package. For example, the Context type in the context package is known as context.Context. A file that imports context therefore cannot declare a package-level variable or type named context, and a local variable or parameter with that name hides the package.

func WriteLog(context context.Context, message string)

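This signature compiles, but inside the function body the parameter context shadows the package, so nothing else from the context package can be used there. This is why variables and parameters of type context.Context are traditionally named ctx.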

func WriteLog(ctx context.Context, message string)
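To see why ctx is the safer name, here is a minimal sketch. The writeLog and logGreeting functions and the one-second timeout are assumptions made for illustration; the point is only that the body can still reach the context package.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// writeLog is a hypothetical stand-in for WriteLog. Because the parameter is
// named ctx, the context package stays reachable inside the function body.
func writeLog(ctx context.Context, message string) error {
	// If the parameter were named context, this call would not compile:
	// context would then refer to the parameter, which has no WithTimeout.
	ctx, cancel := context.WithTimeout(ctx, time.Second)
	defer cancel()

	select {
	case <-ctx.Done():
		return ctx.Err() // the caller gave up or the deadline passed
	default:
		fmt.Println(message)
		return nil
	}
}

// logGreeting calls writeLog with a fresh background context.
func logGreeting() error {
	return writeLog(context.Background(), "hello")
}

func main() {
	if err := logGreeting(); err != nil {
		fmt.Println("error:", err)
	}
}
```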

2.4. Use a single naming style


Another property of a good name is that it should be predictable. The reader should understand it at first sight. If it is a common name, the reader is entitled to assume that its meaning has not changed since the last time they saw it.

For example, if your code passes around a database handle, make sure the parameter has the same name every time it appears. Rather than a mix of d *sql.DB, dbase *sql.DB, DB *sql.DB, and database *sql.DB, settle on one form:

db *sql.DB

This makes the code easier to understand. If you see db, you know that it is a *sql.DB and that it was either declared locally or provided by the caller.

The same advice applies to method receivers: use the same receiver name for every method of a type. This makes it easier for the reader to follow how the receiver is used across the type's methods.

Note. The Go convention of short receiver names contradicts the recommendations given earlier. This is one of those cases where a choice made early on became the standard style, like using CamelCase instead of snake_case.

Tip. Go style dictates that receivers have a single-letter name or an abbreviation derived from their type. You may find that the receiver name sometimes conflicts with the name of a parameter in a method. In that case, consider making the parameter name slightly longer, and remember to use it consistently.
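The receiver advice can be sketched as follows. Queue and its methods are hypothetical and exist only to show one receiver name used throughout, plus a slightly longer parameter name where a clash could arise.

```go
package main

import "fmt"

// Queue is a hypothetical FIFO of strings, used only to illustrate receiver naming.
type Queue struct {
	items []string
}

// Push appends item. Every method uses the same one-letter receiver, q,
// derived from the type name.
func (q *Queue) Push(item string) {
	q.items = append(q.items, item)
}

// Pop removes and returns the oldest item; ok is false if the queue is empty.
func (q *Queue) Pop() (item string, ok bool) {
	if len(q.items) == 0 {
		return "", false
	}
	item, q.items = q.items[0], q.items[1:]
	return item, true
}

// Len reports the number of queued items.
func (q *Queue) Len() int { return len(q.items) }

// Merge appends the items of another queue. The parameter gets the slightly
// longer name other so that it cannot be confused with the receiver q.
func (q *Queue) Merge(other *Queue) {
	q.items = append(q.items, other.items...)
}

func main() {
	var q Queue
	q.Push("a")
	q.Merge(&Queue{items: []string{"b"}})
	first, _ := q.Pop()
	fmt.Println(first, q.Len())
}
```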

Finally, some single-letter variables are traditionally associated with loops and counting. For example, i, j, and k are commonly the induction variables of for loops; n is usually associated with a counter or accumulator; v is a common shorthand for a value in a generic encoding function; k is commonly used for the key of a map; and s is often used as shorthand for parameters of type string.

As with the db example above, programmers expect i to be a loop induction variable. If they see it in the code, they expect to see a loop shortly.

Tip. If you have so many nested loops that you run out of i, j, and k variables, it is probably time to break the function into smaller units.

2.5. Use a single declaration style


Go has at least six different ways to declare a variable.

  • var x int = 1
  • var x = 1
  • var x int; x = 1
  • var x = int(1)
  • x := 1

I'm sure there are more that I haven't thought of. The Go designers probably consider this a mistake, but it is too late to change it now. With all these options, how do we ensure a uniform style?

I want to propose a style of declaring variables that I myself try to use wherever possible.

  • When declaring a variable without initializing it, use var.

    var players int    // 0
    var things []Thing // a nil slice of Things (length 0)
    var thing Thing    // empty Thing struct
    json.Unmarshal(data, &thing) // data is a []byte holding JSON

    The var acts as a hint that the variable has been deliberately declared as the zero value of the indicated type. This is also consistent with the requirement to declare variables at the package level using var rather than the short declaration syntax, although I will argue later that package-level variables should not be used at all.
  • When declaring and initializing, use :=. This makes it clear to the reader that the variable on the left-hand side of := is being deliberately initialized.

    To explain why, let's look at the previous example, but this time deliberately initializing each variable:

    var players int = 0
    var things []Thing = nil
    var thing *Thing = new(Thing)
    json.Unmarshal(data, thing)

Since Go does not have automatic conversions from one type to another, in the first and third examples the type on the left side of the assignment operator must be identical to the type on the right side. The compiler can infer the type of the declared variable from the type on the right, so the example can be written more concisely:

var players = 0
var things []Thing = nil
var thing = new(Thing)
json.Unmarshal(data, thing)

Here players is explicitly initialized to 0, which is redundant, because the initial value of players is zero in any case. So it is better to make it clear that we want the zero value:

var players int

What about the second statement? We cannot elide the type and write

var things = nil

because nil has no type. Instead, we have a choice: either we use the zero value for the slice ...

var things []Thing

... or create a slice with zero elements?

var things = make([]Thing, 0)

In the second case the slice is not the zero value (it is non-nil), so we make that clear to the reader by using the short declaration form:

things := make([]Thing, 0)

This tells the reader that we decided to explicitly initialize things.
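The difference between the two choices is observable. The following sketch uses a hypothetical empty Thing type and an encode helper, and picks encoding/json as one example of a consumer that distinguishes a nil slice from an empty one.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Thing is a hypothetical element type.
type Thing struct{}

// encode returns the JSON encoding of things as a string.
func encode(things []Thing) string {
	b, err := json.Marshal(things)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	var zero []Thing          // zero value: a nil slice
	empty := make([]Thing, 0) // explicitly initialized: non-nil, zero length

	fmt.Println(zero == nil, len(zero))   // true 0
	fmt.Println(empty == nil, len(empty)) // false 0

	// Both can be appended to and ranged over, but encoding/json
	// encodes them differently.
	fmt.Println(encode(zero))  // null
	fmt.Println(encode(empty)) // []
}
```

So the choice of declaration form is not purely cosmetic: it tells the reader which of the two values was intended.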

So we come to the third declaration:

var thing = new(Thing)

Here we have both an explicit initialization of the variable and the uncommon built-in function new, which some Go programmers dislike. Applying the recommended short syntax yields

thing := new(Thing)

This makes it clear that thing is explicitly initialized to the result of new(Thing), but it still leaves us with the unusual new. We could solve that with a composite literal:

thing := &Thing{}

This does the same as new(Thing), and that duplication bothers some Go programmers. However, it means that we explicitly initialize thing with a pointer to Thing{}, which is the zero value of Thing.

It is better, though, to acknowledge that thing is being declared as its zero value, and to use the address-of operator to pass the address of thing to json.Unmarshal:

var thing Thing
json.Unmarshal(data, &thing)
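Put together as a runnable sketch, the recommended form looks like this. The Name field, its JSON key, and the decodeThing helper are assumptions for illustration; the standard library function is spelled json.Unmarshal and takes a []byte.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Thing is a hypothetical type; the field and JSON key are assumptions.
type Thing struct {
	Name string `json:"name"`
}

// decodeThing declares thing with var, signalling that it starts out as the
// zero value, and passes its address so json.Unmarshal can fill it in.
func decodeThing(data []byte) (Thing, error) {
	var thing Thing
	err := json.Unmarshal(data, &thing)
	return thing, err
}

func main() {
	thing, err := decodeThing([]byte(`{"name":"gopher"}`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(thing.Name)
}
```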

Note. Of course, there are exceptions to any rule. For example, sometimes two variables are closely related, so it would look odd to write

var min int
max := 1000

More readable declaration:

min, max := 0, 1000

In summary:

  • When declaring a variable without initialization, use the var syntax.
  • When declaring and explicitly initializing a variable, use :=.

Tip. Make tricky things obvious.

var length uint32 = 0x80

Here length may be used with a library that requires a specific numeric type, and this form states more explicitly that the type of length was deliberately chosen to be uint32 than the short declaration does:

length := uint32(0x80)

In the first example I deliberately break my own rule by using a var declaration with an explicit initializer. Departing from the usual form is a clue to the reader that something unusual is going on.
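As one possible illustration of a library that cares about the exact numeric type, the sketch below uses encoding/binary, which only accepts fixed-size values. The choice of that package and the encodeLength helper are assumptions; the text does not say which library is meant.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodeLength writes v to a buffer in big-endian order and returns the bytes.
// binary.Write accepts only fixed-size values, so the type of v matters.
func encodeLength(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.BigEndian, v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	var length uint32 = 0x80 // the type is a deliberate choice, so spell it out

	b, err := encodeLength(length)
	fmt.Println(b, err) // [0 0 0 128] <nil>

	// An untyped constant in a short declaration defaults to int, which
	// binary.Write rejects because int has no fixed size.
	n := 0x80
	_, err = encodeLength(n)
	fmt.Println(err != nil) // true
}
```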

2.6. Be a team player


I have already said that the essence of software engineering is producing readable, maintainable code. You will probably spend most of your career working on joint projects. My advice in that situation is to follow the style adopted by the team.

Changing styles in the middle of a file is jarring. Consistency matters, even at the expense of personal preference. My rule of thumb: if the code fits through gofmt, the issue is usually not worth discussing.

Tip. If you want to do a rename across the whole code base, do not mix it with other changes. Anyone using git bisect will not enjoy wading through thousands of renames to find the code you changed as well.

3. Comments


Before we move on to bigger topics, I want to spend a couple of minutes on comments.

“Good code has lots of comments, bad code requires lots of comments.” - Dave Thomas and Andrew Hunt, The Pragmatic Programmer

Comments are very important to the readability of a program. Each comment should do one, and only one, of three things:

  1. Explain what the code does.
  2. Explain how it does it.
  3. Explain why it is the way it is.

The first form is ideal for comments on public symbols:

// Open opens the named file for reading.
// If successful, methods on the returned file can be used for reading.

The second is ideal for comments inside a method:

// queue all dependent actions
var results []chan error
for _, dep := range a.Deps {
	results = append(results, execute(seen, dep))
}

The third form (“why”) is unique in that it does not supplant or replace the first two. Such comments explain the external factors that led to the writing of the code in its current form. Often without this context, it is difficult to understand why the code is written in this way.

return &v2.Cluster_CommonLbConfig{
	// Disable HealthyPanicThreshold
	HealthyPanicThreshold: &envoy_type.Percent{
		Value: 0,
	},
}

In this example, it may not be immediately clear what happens when HealthyPanicThreshold is set to zero percent. The comment is intended to clarify that a value of 0 disables the panic threshold.

3.1. Comments on variables and constants should describe their contents, not their purpose


I said earlier that the name of a variable or constant should describe its purpose. A comment on a variable or constant, however, should describe its contents, not its purpose.

const randomNumber = 6 // determined from an unbiased die

In this example, the comment describes why randomNumber is assigned the value 6 and where that value came from. The comment does not describe where randomNumber will be used. Here are some more examples:

const (
	StatusContinue           = 100 // RFC 7231, 6.2.1
	StatusSwitchingProtocols = 101 // RFC 7231, 6.2.2
	StatusProcessing         = 102 // RFC 2518, 10.1
	StatusOK                 = 200 // RFC 7231, 6.3.1
)

In the context of HTTP, the number 100 is known as StatusContinue, as defined in RFC 7231, section 6.2.1.

Tip. For variables without an initial value, the comment should describe who is responsible for initializing the variable.

// sizeCalculationDisabled indicates whether it is safe
// to calculate the widths and alignments of types. See dowidth.
var sizeCalculationDisabled bool

Here the comment tells the reader that the dowidth function is responsible for maintaining the state of sizeCalculationDisabled.

Tip. Hide in plain sight. This tip comes from Kate Gregory. Sometimes the best name for a variable is hiding in its comment.

// registry of SQL drivers
var registry = make(map[string]*sql.Driver)

The author added the comment because the name registry does not explain its purpose well enough: it is a registry, but a registry of what?

If we rename the variable to sqlDrivers, it becomes clear that it holds SQL drivers.

var sqlDrivers = make(map[string]*sql.Driver)

Now the comment is redundant and can be removed.

3.2. Always document public symbols


The documentation for your package is generated by godoc, so you should add a comment to each public symbol declared in the package: variable, constant, function, and method.

Here are two guidelines from the Google Style Guide:

  • Any public function that is not both obvious and concise should be commented on.
  • Any function in the library should be commented, regardless of length or complexity.


package ioutil

// ReadAll reads from r until an error or end of file (EOF) and returns
// the data it read. A successful call returns err == nil, not err == EOF.
// Because ReadAll is defined to read until end of file, it does not treat
// EOF as an error.
func ReadAll(r io.Reader) ([]byte, error)

There is one exception to this rule: you do not need to document methods that implement the interface. Specifically, do not do this:

// Read implements the io.Reader interface
func (r *FileReader) Read(buf []byte) (int, error)

This comment says nothing. It does not tell you what the method does; worse, it sends you off somewhere else to look for the documentation. In this situation I suggest removing the comment entirely.

Here is an example from the package io.

// LimitReader returns a Reader that reads from r
// but stops with EOF after n bytes.
// The underlying implementation is a *LimitedReader.
func LimitReader(r Reader, n int64) Reader { return &LimitedReader{r, n} }

// A LimitedReader reads from R but limits the amount of
// data returned to just N bytes. Each call to Read
// updates N to reflect the new amount remaining.
// Read returns EOF when N <= 0 or when the underlying R returns EOF.
type LimitedReader struct {
	R Reader // underlying reader
	N int64  // max bytes remaining
}
func (l *LimitedReader) Read(p []byte) (n int, err error) {
	if l.N <= 0 {
		return 0, EOF
	}
	if int64(len(p)) > l.N {
		p = p[0:l.N]
	}
	n, err = l.R.Read(p)
	l.N -= int64(n)
	return
}

Note that the LimitedReader declaration is immediately preceded by the function that uses it, and the declaration of LimitedReader.Read follows the declaration of LimitedReader itself. Even though LimitedReader.Read has no documentation of its own, it is clear that it is an implementation of io.Reader.

Tip. Before writing a function, write the comment that describes it. If you find the comment hard to write, that is a sign that the code you are about to write will be hard to understand.

3.2.1. Do not comment on bad code, rewrite it


“Don't Comment Bad Code - Rewrite It” - Brian Kernighan

It is not enough to point out in a comment that a piece of code is difficult. If you come across one of these comments, you should raise an issue as a reminder to refactor the code later. It is fine to live with technical debt as long as the amount of debt is known.

In the standard library, it is customary to leave comments in the TODO style with the name of the user who noticed the problem.

// TODO(dfc) this is O(N^2), find a faster way to do this.

This is not an obligation to fix the problem, but the indicated user may be the best person to ask a question. Other projects accompany TODO with a date or ticket number.

3.2.2. Rather than commenting a block of code, refactor it


“Good code is its own best documentation. As you're about to add a comment, ask yourself, ‘How can I improve the code so that this comment isn't needed?’ Improve the code and then document it to make it even clearer.” - Steve McConnell

Functions should do one thing only. If you find yourself writing a comment because some fragment is unrelated to the rest of the function, consider extracting that fragment into a function of its own.

Smaller functions are not only easier to understand, they are also easier to test in isolation. Once you have isolated the code into its own function, the function's name can take the place of the comment.
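A small sketch of this refactoring, under invented names: normalizeEmail and register are hypothetical, and the scenario (an inline fragment that used to sit under a "normalize the email" comment) is assumed for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeEmail trims surrounding space and lower-cases the address.
// Before the extraction this was an inline fragment inside register, sitting
// under a comment that read "normalize the email"; the function name now
// carries that information.
func normalizeEmail(email string) string {
	return strings.ToLower(strings.TrimSpace(email))
}

// register is a hypothetical caller that records a normalized address and
// reports whether the address was new.
func register(seen map[string]bool, email string) bool {
	email = normalizeEmail(email)
	if seen[email] {
		return false // already registered
	}
	seen[email] = true
	return true
}

func main() {
	seen := make(map[string]bool)
	fmt.Println(register(seen, " Gopher@Example.com ")) // true
	fmt.Println(register(seen, "gopher@example.com"))   // false
}
```

The extracted function can now be tested on its own, without going through register.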

4. Package structure


“Write shy code: modules that don't reveal anything unnecessary to other modules and that don't rely on other modules' implementations” - Dave Thomas

Each package is essentially its own small Go program. Just as the implementation of a function or method does not matter to the caller, the implementation of the functions, methods, and types that make up your package's public API does not matter to the code that uses the package.

A good Go package strives for loose coupling with other packages at the source level, so that as the project grows, changes to one package do not cascade across the code base. Such cascades severely slow down the programmers working on that code base.

In this section, we’ll talk about package design, including its name and tips for writing methods and functions.

4.1. A good package starts with a good name


A good Go package starts with a good name. Think of the name as an elevator pitch limited to a single word.

As with the variable names in the previous section, the package name is very important. Rather than thinking about the data types in the package, ask the question: “What service does this package provide?” Usually the answer is not “This package provides type X” but “This package lets you speak HTTP.”

Tip. Name a package for what it provides, not for what it contains.

4.1.1. Good package names must be unique


Each package should have a unique name within the project. This is not difficult if you have followed the advice to name packages for their purpose. If you find that two packages need the same name, it is most likely that either:

  1. The package name is too generic.
  2. The package overlaps with another package of a similar name. In that case you should either review your design or consider merging the packages.

4.2. Avoid package names like base, common, or util


A common cause of poor names is so-called utility packages, where assorted helpers and utility code accumulate over time. Because such a package holds a grab bag of unrelated functions, it is hard to find a unique name for it, and the name often ends up derived from what the package contains: utilities.

Names like utils or helpers are commonly found in larger projects that have developed deep package hierarchies and want to share helper functions without running into import cycles. Extracting the helpers into a new package breaks the cycle, but because the package stems from a design problem in the project, its name reflects not its purpose but only its role in breaking the import cycle.

In such situations, I recommend analyzing where the utils and helpers packages are called from and, if possible, moving the relevant functions into the calling package. Even if that means duplicating some helper code, it is better than introducing an import dependency between two packages.

“[A little] duplication is far cheaper than the wrong abstraction” - Sandi Metz

If utility functions are used in many places, instead of one monolithic package with utility functions, it is better to make several packages, each of which focuses on one aspect.

Tip. Use the plural for utility packages. For example, strings for string-handling utilities.

Packages with names like base or common are often found when functionality shared by two or more implementations, or types shared by a client and a server, has been factored out into a separate package. I believe the solution in such cases is to reduce the number of packages by combining the client, the server, and the common code into one package named after its function.

For example, net/http does not have client and server sub-packages. Instead it has files named client.go and server.go, each holding their respective types, and transport.go for the common message transport code.

Tip. It is important to remember that an identifier's name includes its package name.

  • The Get function from the net/http package becomes http.Get when referenced from another package.
  • The Reader type from the strings package becomes strings.Reader when imported into other packages.
  • The Error interface from the net package is clearly related to network errors.

4.3. Return early rather than nesting deeply


Since Go does not use exceptions for control flow, there is no need to indent your code deeply just to provide a top-level structure for try and catch blocks. Rather than a multi-level hierarchy, Go code goes down the screen as the function progresses. My friend Mat Ryer calls this practice “line of sight” coding.

This is achieved using guard clauses: conditional blocks that check preconditions on entry to the function. Here is an example from the bytes package:

func (b *Buffer) UnreadRune() error {
	if b.lastRead <= opInvalid {
		return errors.New("bytes.Buffer: UnreadRune: previous operation was not a successful ReadRune")
	}
	if b.off >= int(b.lastRead) {
		b.off -= int(b.lastRead)
	}
	b.lastRead = opInvalid
	return nil
}

On entering UnreadRune, the state of b.lastRead is checked, and if the previous operation was not ReadRune an error is returned immediately. The rest of the function proceeds on the assumption that b.lastRead is greater than opInvalid.

Compare this with the same function written without a guard clause:

func (b *Buffer) UnreadRune() error {
	if b.lastRead > opInvalid {
		if b.off >= int(b.lastRead) {
			b.off -= int(b.lastRead)
		}
		b.lastRead = opInvalid
		return nil
	}
	return errors.New("bytes.Buffer: UnreadRune: previous operation was not a successful ReadRune")
}

The body of the more likely, successful branch is nested inside the first if condition, and the successful exit condition, return nil, has to be discovered by carefully matching closing braces. The last line of the function now returns an error, and you have to trace the execution of the function back to the matching opening brace to find out how control reaches this point.

This version is harder to read, which degrades the quality of programming and code maintenance, so Go prefers to use guard clauses and to return errors early.

4.4. Make the zero value useful


Every variable declaration, assuming no explicit initializer is provided, is automatically initialized to a value that matches the contents of zeroed memory. This is the type's zero value. The zero value depends on the type: for numeric types it is zero, for pointer types it is nil, and the same goes for slices, maps, and channels.

The ability to always rely on a known default value is important for the safety and correctness of your program, and can make your Go programs simpler and more compact. This is what Go programmers have in mind when they say, “Give your structs a useful zero value.”

Consider the sync.Mutex type, which contains two integer fields representing the mutex's internal state. These fields are automatically set to their zero values whenever a sync.Mutex is declared. The code is written with this in mind, so the type is usable without explicit initialization.

type MyInt struct {
	mu  sync.Mutex
	val int
}
func main() {
	var i MyInt
	// i.mu is usable without explicit initialisation.
	i.mu.Lock()
	i.val++
	i.mu.Unlock()
}

Another example of a type with a useful zero value is bytes.Buffer. You can declare it and start writing to it without explicit initialization.

func main() {
	var b bytes.Buffer
	b.WriteString("Hello, world!\n")
	io.Copy(os.Stdout, &b)
}

Slices have a useful zero value too. The zero value of a slice header means that len and cap are both 0, and that array, the pointer to the memory holding the slice's backing array, is nil. This means you do not need to make a slice explicitly; you can simply declare it.

func main() {
	// s := make([]string, 0)
	// s := []string{}
	var s []string
	s = append(s, "Hello")
	s = append(s, "world")
	fmt.Println(strings.Join(s, " "))
}

Note. var s []string is similar to the two commented-out lines above it, but not identical to them. There is a difference between a slice value that is nil and a slice value that has zero length. The following code will print false.

func main() {
	var s1 = []string{}
	var s2 []string
	fmt.Println(reflect.DeepEqual(s1, s2))
}
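
The difference between a nil slice and an empty slice can be observed without reflection. The sketch below is illustrative only; the helper describe is a hypothetical name introduced here. It shows that both slices have length zero, but they compare differently against nil and, with encoding/json, marshal to different output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// describe (a hypothetical helper) reports three properties of s: its
// length, whether it is nil, and how encoding/json marshals it.
func describe(s []string) string {
	out, err := json.Marshal(s)
	if err != nil {
		return err.Error()
	}
	return fmt.Sprintf("len=%d nil=%t json=%s", len(s), s == nil, out)
}

func main() {
	var nilSlice []string    // zero value: no backing array
	emptySlice := []string{} // non-nil, zero length

	fmt.Println(describe(nilSlice))   // len=0 nil=true json=null
	fmt.Println(describe(emptySlice)) // len=0 nil=false json=[]
}
```

For most code the two are interchangeable, since len, range, and append treat them alike. The distinction tends to matter only at boundaries such as serialization or deep comparison.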

A useful, albeit surprising, property of uninitialized pointer variables (nil pointers) is that you can call methods on types that have a nil value. This can be used to provide default values simply.

type Config struct {
	path string
}
func (c *Config) Path() string {
	if c == nil {
		return "/usr/home"
	}
	return c.path
}
func main() {
	var c1 *Config
	var c2 = &Config{
		path: "/export",
	}
	fmt.Println(c1.Path(), c2.Path())
}

4.5. Avoid package level state


The key to writing maintainable programs is that they should be loosely coupled: a change to one package should have a low probability of affecting another package that does not directly depend on the first.

There are two excellent ways to achieve loose coupling in Go:

  1. Use interfaces to describe the behavior your functions or methods require.
  2. Avoid the use of global state.

In Go, we can declare variables at function or method scope, and also at package scope. When a variable is public, with an identifier starting with a capital letter, its scope is effectively global to the entire program: any package may observe the type and contents of that variable at any time.

Mutable global state introduces tight coupling between independent parts of your program, as global variables become an invisible parameter to every function in the program! Any function that relies on a global variable can be broken if that variable's type changes. Any function that relies on the state of a global variable can be broken if another part of the program changes that variable.

If you want to reduce the coupling a global variable creates:

  1. Move the relevant variables as fields onto the structs that need them.
  2. Use interfaces to reduce the coupling between the behavior and the implementation of that behavior.
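
As a minimal sketch of both steps, assume a program that used to keep a request count in a package-level variable. The names Counter, Incrementer, and handle are hypothetical and exist only for this illustration. The count becomes a field on a struct, and the function that needs it asks for the behavior through a small interface.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter owns its state. Nothing outside the value can change n except
// through its methods, and the zero value is ready to use.
type Counter struct {
	mu sync.Mutex
	n  int
}

// Inc adds one to the counter and returns the new total.
func (c *Counter) Inc() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
	return c.n
}

// Incrementer describes the only behavior handle requires.
type Incrementer interface {
	Inc() int
}

// handle receives its dependency as an explicit parameter instead of
// reaching for a package-level variable.
func handle(hits Incrementer) string {
	return fmt.Sprintf("request #%d", hits.Inc())
}

func main() {
	var a, b Counter // two independent counters, no shared state

	fmt.Println(handle(&a)) // request #1
	fmt.Println(handle(&a)) // request #2
	fmt.Println(handle(&b)) // request #1
}
```

Because each Counter is an ordinary value, a test can create a fresh one per case, and two parts of the program cannot interfere with each other through it by accident.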

5. Project structure


Let's talk about how packages are combined into a project. This is usually a single Git repository.

Like a package, each project should have a clear purpose. If it is a library, it should do one thing, for example XML parsing or logging. Avoid combining several purposes in one project; this will help you avoid the dreaded common library.

Tip. In my experience, the common repository ends up tightly coupled to its biggest consumer, and that makes it hard to back-port fixes without upgrading both common and the consumer in lock step, bringing in a lot of unrelated changes and API breakage along the way.

If you have an application (web application, Kubernetes controller, etc.), the project may have one or more main packages. For example, the Kubernetes controller I work on has a single cmd/contour package that serves both as the server deployed to a Kubernetes cluster and as a client for debugging.

5.1. Fewer, larger packages


In code review I have noticed a typical mistake made by programmers coming to Go from other languages: they tend to overuse packages.

Go does not provide elaborate ways of controlling visibility: the language lacks access modifiers like Java's (public, protected, private, and the implicit default). There is no equivalent of C++'s friend classes.

In Go, we have only two access modifiers, public and private, indicated by the case of the first letter of the identifier. If an identifier is public, its name starts with an uppercase letter and it can be referenced by any other Go package.

Note. You may hear the words “exported” and “not exported” used as synonyms for public and private.

Given the limited access control features, what practices can be used to avoid overly complicated package hierarchies?

Tip. Every package, with the exception of cmd/ and internal/, should contain some source code.

I have said repeatedly that it is better to prefer fewer, larger packages. Your default position should be to not create a new package. Creating one causes too many types to become public, giving your package a wide, shallow API surface. Below we consider this thesis in more detail.

Tip. Coming from Java?

If you come from the Java or C# world, remember this rule of thumb: a Java package is equivalent to a single .go source file. A Go package is equivalent to a whole Maven module or .NET assembly.

5.1.1. Arrange code into files by import statements


If you arrange your packages by what they provide, should you do the same for the files within a package? How do you know when to split one .go file into several? How do you know when you have gone too far and should consider merging files?

Here are the guidelines I use:

  • Start each package with one .go file. Give that file the same name as the directory. For example, package http should be placed in a file called http.go in a directory named http.
  • As the package grows, you may split its various responsibilities into several files. For example, messages.go contains the Request and Response types, client.go contains the Client type, and server.go contains the Server type.
  • If you find that your files have similar import declarations, consider combining them. Alternatively, identify the differences between the import sets and move the code responsible.
  • Different files should be responsible for different areas of the package. messages.go may be responsible for marshalling HTTP requests and responses on and off the network, http.go may contain the low-level network handling logic, client.go and server.go implement HTTP request construction or routing logic, and so on.

Tip. Prefer nouns for source file names.

Note. The Go compiler compiles each package in parallel. Within a package, each function (methods are just fancy functions in Go) is compiled in parallel. Changing the layout of code within a package should not affect compilation time.

5.1.2. Prefer internal tests to external


The go tool supports writing tests for a package in two places. If your package is called http2, you can write an http2_test.go file and use the package http2 declaration. Doing so compiles the code in http2_test.go as if it were part of the http2 package. This is known colloquially as an internal test.

The go tool also supports a special package declaration ending in _test, for example package http_test. This allows your test files to live alongside your code in the same directory, but when such tests are compiled they are not part of your package's code; they live in their own package. This lets you write tests as if you were another package calling into your code. This is known as an external test.

I recommend using internal tests when writing unit tests for your package. This allows you to test each function or method directly, avoiding the bureaucracy of external testing.

However, you should place your Example test functions in an external test file. This ensures that when viewed in godoc, the examples have the appropriate package prefix and can be easily copied.

Tip. Avoid elaborate package hierarchies; resist the desire to apply taxonomy.

With one exception, which we will discuss next, the hierarchy of Go packages has no meaning to the go tool. For example, the net/http package is not a child or sub-package of the net package.

If your project has intermediate directories that contain no .go files, you may have failed to follow this advice.

5.1.3. Use internal packages to reduce your public API surface


If your project contains multiple packages, you may find exported functions that are intended to be used by other packages in your project but are not meant to be part of its public API. For this situation, the go tool recognizes a special folder name, internal/, that can be used to hold code that is public to your project but private to others.

To create such a package, place it in a directory named internal/ or in a subdirectory of such a directory. When the go command sees an import of a package with internal in its path, it verifies that the importing package lies within the tree rooted at the parent of the internal/ directory.

For example, a package .../a/b/c/internal/d/e/f can be imported only by code in the directory tree rooted at .../a/b/c. It cannot be imported by code in .../a/b/g or in any other repository (see the documentation).

5.2. Keep package main as small as possible


The main function and the main package should do as little as possible, because main.main acts as a singleton: there can be only one main function in a program, including tests.

Because main.main is a singleton, many assumptions are built into the things it calls: they will only be called during main.main or main.init, and only once. This makes it hard to write tests for code written in main.main. You should therefore aim to move as much logic as possible out of the main function and, ideally, out of the main package.

Tip. func main() should parse flags, open connections to databases, loggers, and so on, and then hand off execution to a high-level object.
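
One way this tip might look in practice is sketched below. The App type, the newApp constructor, and the flag names are all hypothetical; a real program would more likely place them in a package other than main. The point is only that main itself contains nothing worth testing.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
	"strings"
)

// App is a hypothetical high-level object holding everything main set up.
type App struct {
	name   string
	repeat int
}

// newApp parses args into an App. It takes args explicitly instead of
// reading os.Args, so a test can call it as often as it likes.
func newApp(args []string) (*App, error) {
	fs := flag.NewFlagSet("greet", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // main reports the error itself
	name := fs.String("name", "world", "who to greet")
	repeat := fs.Int("repeat", 1, "how many times to greet")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if *repeat < 1 {
		return nil, fmt.Errorf("repeat must be at least 1, got %d", *repeat)
	}
	return &App{name: *name, repeat: *repeat}, nil
}

// Greeting holds the program's real logic and has no dependency on main.
func (a *App) Greeting() string {
	return strings.Repeat("hello, "+a.name+"\n", a.repeat)
}

// main only wires things together and hands off to App.
func main() {
	app, err := newApp(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Print(app.Greeting())
}
```

With this split, flag parsing and the greeting logic can each be exercised from a test by calling newApp and Greeting directly, without starting the program.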

6. API structure


The last piece of project design advice is the one I consider the most important.

All the advice so far is, in principle, non-binding. These are just recommendations based on personal experience. I do not push them too hard in code review.

The API is another matter. Here mistakes are taken more seriously, because everything else can be fixed without breaking backward compatibility: for the most part, those things are just implementation details.

When it comes to public APIs, it pays to think seriously about the design from the very beginning, because later changes will be disruptive for users.

6.1. Design APIs that are hard to misuse


“APIs should be easy to use and hard to misuse”  - Josh Bloch

Josh Bloch's advice is perhaps the most valuable in this article. If an API is hard to use for simple things, then every call to that API will be more complicated than necessary. When an API call is complicated and unobvious, it is likely to be overlooked.

6.1.1. Be careful with functions that accept multiple parameters of the same type.


A good example of an API that looks simple but is hard to use correctly is one that takes two or more parameters of the same type. Compare two function signatures:

func Max(a, b int) int
func CopyFile(to, from string) error

What is the difference between these two functions? Obviously, one returns a maximum of two numbers, and the other copies the file. But this is not the point.

Max(8, 10) // 10
Max(10, 8) // 10

Max is commutative: the order of its parameters does not matter. The maximum of eight and ten is ten, regardless of whether you compare eight with ten or ten with eight.

But this is not true of CopyFile.

CopyFile("/tmp/backup", "presentation.md")
CopyFile("presentation.md", "/tmp/backup")

Which of these statements makes a backup of your presentation, and which overwrites it with last week's version? You cannot tell without consulting the documentation. A code reviewer cannot tell whether the order of the arguments is correct without consulting the documentation either.

One possible solution is to introduce an auxiliary type that is responsible for the correct call CopyFile.

type Source string
func (src Source) CopyTo(dest string) error {
	return CopyFile(dest, string(src))
}
func main() {
	var from Source = "presentation.md"
	from.CopyTo("/tmp/backup")
}

This way CopyFile is always called correctly (this can be asserted with a unit test) and can possibly be made private, further reducing the likelihood of misuse.

Tip. An API with multiple parameters of the same type is hard to use correctly.

6.2. Design an API for a Basic Use Case


A few years ago, I made a presentation on using functional options to make the API easier by default.

The essence of the presentation was that you should develop an API for the main use case. In other words, the API should not require the user to provide extra parameters that do not interest him.
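
For readers who have not seen the functional options pattern, here is a minimal sketch. The Server type, its fields, the default of 100 connections, and the option names are assumptions made for illustration and are not taken from the presentation. The common case needs only the one required argument, and callers name only the settings they care about.

```go
package main

import "fmt"

// Server is a hypothetical type used only for this sketch.
type Server struct {
	addr     string
	maxConns int
	tls      bool
}

// Option changes one optional setting on a Server.
type Option func(*Server)

// WithMaxConns overrides the default connection limit.
func WithMaxConns(n int) Option {
	return func(s *Server) { s.maxConns = n }
}

// WithTLS enables TLS.
func WithTLS() Option {
	return func(s *Server) { s.tls = true }
}

// NewServer requires only the address. Everything else is optional and
// applied in order on top of the defaults.
func NewServer(addr string, opts ...Option) *Server {
	s := &Server{addr: addr, maxConns: 100} // defaults for the common case
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	basic := NewServer("localhost:8080")
	tuned := NewServer("localhost:8443", WithMaxConns(10), WithTLS())

	fmt.Println(basic.maxConns, basic.tls) // 100 false
	fmt.Println(tuned.maxConns, tuned.tls) // 10 true
}
```

The caller of the default case never sees the optional settings, and new options can be added later without changing the signature of NewServer.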

6.2.1. Using nil as a parameter is not recommended


I started by saying that you should not force the user to provide API parameters that do not interest him. This means designing the APIs for the main use case (default option).

Here is an example from the net/http package.

package http
// ListenAndServe listens on the TCP network address addr and then calls
// Serve with handler to handle requests on incoming connections.
// Accepted connections are configured to enable TCP keep-alives.
//
// The handler is typically nil, in which case the DefaultServeMux is used.
//
// ListenAndServe always returns a non-nil error.
func ListenAndServe(addr string, handler Handler) error

ListenAndServe takes two parameters: a TCP address on which to listen for incoming connections, and an http.Handler to handle the incoming HTTP requests. ListenAndServe allows the second parameter to be nil. The comment notes that the caller usually will pass nil, indicating that they want to use http.DefaultServeMux as an implicit parameter.

The caller of ListenAndServe now has two ways to do the same thing.

http.ListenAndServe("0.0.0.0:8080", nil)
http.ListenAndServe("0.0.0.0:8080", http.DefaultServeMux)

Both calls do the same thing.

This nil behavior is viral. The http package also has a helper, http.Serve, which lets you imagine how ListenAndServe is built:

func ListenAndServe(addr string, handler Handler) error {
	l, err := net.Listen("tcp", addr)
	if err != nil {
		return err
	}
	defer l.Close()
	return Serve(l, handler)
}

Because ListenAndServe lets the caller pass nil for the second parameter, http.Serve also supports this behavior. In fact, http.Serve is where the logic “if the handler is nil, use DefaultServeMux” is implemented. Accepting nil for one parameter may lead the caller to think that nil can be passed for both parameters. But calling Serve like this

http.Serve(nil, nil)

results in an ugly panic.

Tip. Do not mix nil-able and non-nil-able parameters in the same function signature.

The author of http.ListenAndServe was trying to make life easier for API users in the default case, but safety suffered as a result.

In fact, there is no difference in line count between using DefaultServeMux explicitly and using it implicitly via nil.

	const root = http.Dir("/htdocs")
	http.Handle("/", http.FileServer(root))
	http.ListenAndServe("0.0.0.0:8080", nil)

compared with

	const root = http.Dir("/htdocs")
	http.Handle("/", http.FileServer(root))
	http.ListenAndServe("0.0.0.0:8080", http.DefaultServeMux)

Was this confusion really worth saving one line?

	const root = http.Dir("/htdocs")
	mux := http.NewServeMux()
	mux.Handle("/", http.FileServer(root))
	http.ListenAndServe("0.0.0.0:8080", mux)

Tip. Think seriously about how much time helper functions will actually save the programmer. Clarity is better than brevity.

Tip. Avoid public APIs with parameters that only tests need. Avoid exporting APIs with parameters whose values differ only during testing. Instead, export wrapper functions that hide those parameters, and in tests use similar helper functions that pass the values the test needs.
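
A minimal sketch of that last tip follows, assuming a hypothetical retry helper whose sleep function only a test would want to replace. Retry, retry, retryNoSleep, and errTransient are all names invented here. The exported function hides the test-only parameter, and the test-side helper supplies a no-op sleep.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a hypothetical error used to simulate a flaky operation.
var errTransient = errors.New("transient failure")

// Retry is the exported API. Callers cannot see or set the sleep
// function or the attempt count.
func Retry(op func() error) error {
	return retry(op, 3, time.Sleep)
}

// retry is the unexported implementation. The sleep parameter differs
// only under test, so it stays out of the public signature.
func retry(op func() error, attempts int, sleep func(time.Duration)) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i < attempts-1 {
			sleep(10 * time.Millisecond)
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

// retryNoSleep would normally live in a _test.go file. It mirrors Retry
// but passes a sleep function that returns immediately.
func retryNoSleep(op func() error) error {
	return retry(op, 3, func(time.Duration) {})
}

func main() {
	calls := 0
	err := retryNoSleep(func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println(calls, err) // 3 <nil>
}
```

The public surface stays at one function with one parameter, while tests still run instantly.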

6.2.2. Use variable-length arguments instead of []T


Very often, a function or method takes a slice of values.

func ShutdownVMs(ids []string) error

This is just a made-up example, but it is very common. The problem is that such signatures presume they will be called with more than one entry. Experience shows that they are often called with only one argument, which has to be “boxed” inside a slice just to satisfy the function signature.

In addition, because the ids parameter is a slice, you can pass an empty slice or nil to the function and the compiler will be happy. This adds extra testing load, as your tests should cover these cases.

To give an example of this class of API, I recently refactored a piece of logic that required some extra fields to be set if at least one of a set of parameters was non-zero. The logic looked something like this:

if svc.MaxConnections > 0 || svc.MaxPendingRequests > 0 || svc.MaxRequests > 0 || svc.MaxRetries > 0 {
	// apply the non zero parameters
}

Since the if statement was getting very long, I wanted to pull the check out into its own function. This is what I came up with:

// anyPositive indicates if any value is greater than zero.
func anyPositive(values ...int) bool {
	for _, v := range values {
		if v > 0 {
			return true
		}
	}
	return false
}

This let me state clearly the condition under which the inner block would be executed:

if anyPositive(svc.MaxConnections, svc.MaxPendingRequests, svc.MaxRequests, svc.MaxRetries) {
	// apply the non zero parameters
}

However, there is a problem with anyPositive: someone could accidentally call it like this:

if anyPositive() { ... }

In that case, anyPositive returns false. That is not the worst outcome. It would be worse if anyPositive returned true when passed no arguments.

Still, it would be better to change the signature of anyPositive so that the caller is forced to pass at least one argument. This can be done by combining normal and variable-length (variadic) parameters:

// anyPositive indicates if any value is greater than zero.
func anyPositive(first int, rest ...int) bool {
	if first > 0 {
		return true
	}
	for _, v := range rest {
		if v > 0 {
			return true
		}
	}
	return false
}

Now anyPositive cannot be called with fewer than one argument.
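
This signature has a cost worth knowing about: a caller that already holds a slice has to split it at the call site. The sketch below repeats the function so that it runs on its own; the limits slice is a hypothetical caller-side value.

```go
package main

import "fmt"

// anyPositive indicates if any value is greater than zero.
// It cannot be called with zero arguments.
func anyPositive(first int, rest ...int) bool {
	if first > 0 {
		return true
	}
	for _, v := range rest {
		if v > 0 {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(anyPositive(0))        // false
	fmt.Println(anyPositive(0, -1, 3)) // true

	// A caller holding a slice must pass the head and the tail
	// separately, and must decide for itself what an empty slice means.
	limits := []int{0, 0, 7}
	if len(limits) > 0 {
		fmt.Println(anyPositive(limits[0], limits[1:]...)) // true
	}
}
```

The empty case has not disappeared; it has moved to the caller, where the compiler and the reviewer can both see it.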

6.3. Let functions define the behavior they require


Suppose I have been given the task of writing a function that saves a Document structure to disk.

// Save writes the contents of doc to the file f.
func Save(f *os.File, doc *Document) error

I could write this function, Save, which writes a Document to an *os.File. But there are a few problems.

The signature of Save rules out writing the data to a network location. If that becomes a requirement in the future, the signature of the function will have to change, affecting all its callers.

Save is also unpleasant to test, because it works directly with files on disk. To verify its operation, the test has to read the contents of the file back after it has been written.

And I have to make sure that f is written to a temporary location and always removed afterwards.

*os.File also defines many methods that are not relevant to Save, such as reading directories and checking whether a path is a symbolic link. It would be useful if the signature of Save could describe only the parts of *os.File that are relevant.

What can be done?

// Save writes the contents of doc to the supplied
// ReadWriteCloser.
func Save(rwc io.ReadWriteCloser, doc *Document) error

Using io.ReadWriteCloser, we can apply the interface segregation principle and redefine Save in terms of an interface that describes more general file-shaped things.

With this change, any type that implements the io.ReadWriteCloser interface can be substituted for the previous *os.File.

This both widens the applicability of Save and makes clear to the caller which methods of the *os.File type are relevant to its operation.

And the author of Save can no longer call those unrelated methods on *os.File, because it is hidden behind the io.ReadWriteCloser interface.

But we can take the interface segregation principle even further.

Firstly, if Save follows the single responsibility principle, it is unlikely that it will read the file it has just written to verify its contents; other code should be responsible for that.

// Save writes the contents of doc to the supplied
// WriteCloser.
func Save(wc io.WriteCloser, doc *Document) error

So we can narrow the interface we pass to Save to just writing and closing.

Secondly, giving Save a mechanism to close its stream is a leftover from the time when it worked with a file. This raises the question of under what circumstances wc will be closed.

Perhaps Save calls Close unconditionally, or perhaps only on success.

This is a problem for the caller, who may want to write additional data to the stream after the document has been written.

// Save writes the contents of doc to the supplied
// Writer.
func Save(w io.Writer, doc *Document) error

A better option is to redefine Save to take only an io.Writer, stripping it of every responsibility other than writing data to a stream.

After applying the interface segregation principle, the function has become both more specific in terms of its requirements (it only needs something that can be written to) and more general in its function, since we can now use Save to save data to anything that implements io.Writer.
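
To make the payoff concrete, here is a small sketch of the final signature in use. The article never defines Document, so its fields and the output format below are assumptions, and render is a hypothetical helper. The same Save writes to standard output and to an in-memory buffer, which is what makes it easy to test without touching the disk.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// Document is a hypothetical type; its fields are assumed for this sketch.
type Document struct {
	Title string
	Body  string
}

// Save writes the contents of doc to the supplied Writer.
func Save(w io.Writer, doc *Document) error {
	_, err := fmt.Fprintf(w, "%s\n\n%s\n", doc.Title, doc.Body)
	return err
}

// render saves doc into memory and returns what was written. A test can
// use it to check Save's output without creating any files.
func render(doc *Document) (string, error) {
	var sb strings.Builder
	if err := Save(&sb, doc); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	doc := &Document{Title: "Practical Go", Body: "Prefer small interfaces."}

	// The same Save works for a terminal, a file, or a network connection.
	if err := Save(os.Stdout, doc); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// It works equally well for an in-memory buffer.
	out, err := render(doc)
	fmt.Printf("%q %v\n", out, err)
}
```

Closing the destination, if it needs closing at all, is left to whoever opened it.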

7. Error handling


I gave several presentations and wrote a lot on this topic on the blog, so I won’t repeat it. Instead, I want to cover two other areas related to error handling.



7.1. Eliminate the need for error handling by removing the errors themselves


I have made many suggestions for improving the error handling syntax, but the best option is not to have to handle errors at all.

Note. I am not saying “remove your error handling”. I am suggesting that you change your code so that there are no errors to handle.

This suggestion was inspired by John Ousterhout's recent book, “A Philosophy of Software Design”. One of its chapters is called “Define Errors Out of Existence”. Let's try to apply this advice.

7.1.1. Counting lines


We will write a function to count the number of lines in a file.

func CountLines(r io.Reader) (int, error) {
	var (
		br    = bufio.NewReader(r)
		lines int
		err   error
	)
	for {
		_, err = br.ReadString('\n')
		lines++
		if err != nil {
			break
		}
	}
	if err != io.EOF {
		return 0, err
	}
	return lines, nil
}

Because we are following the advice from the previous sections, CountLines takes an io.Reader, not an *os.File; it is the caller's job to provide the io.Reader whose contents we want to count.

We construct a bufio.Reader and then call its ReadString method in a loop, incrementing a counter until we reach the end of the file, then we return the number of lines read.

At least, that is the code we want to write, but instead the function is burdened with error handling. For example, there is this strange construction:

		_, err = br.ReadString('\n')
		lines++
		if err != nil {
			break
		}

We increment the line count before checking the error, which looks odd.

The reason we have to write it this way is that ReadString returns an error if it reaches the end of the file before it finds a newline character. This can happen if there is no final newline in the file.

To try to handle this, we rearrange the logic to increment the line count first and only then check whether we need to exit the loop.

Note. This logic is still not perfect; can you spot the bug?

But we are not done checking errors yet. ReadString returns io.EOF when it hits the end of the file. This is expected; ReadString needs some way of saying “stop, there is nothing more to read.” So before we return the error to the caller of CountLines, we need to check that the error is not io.EOF and, in that case, propagate it; otherwise we return nil to say that everything worked fine.

I think this is a good example of Russ Cox's observation that error handling can obscure the operation of a function. Let's look at an improved version.

func CountLines(r io.Reader) (int, error) {
	sc := bufio.NewScanner(r)
	lines := 0
	for sc.Scan() {
		lines++
	}
	return lines, sc.Err()
}

This improved version uses bufio.Scanner instead of bufio.Reader.

Under the hood bufio.Scanner uses bufio.Reader, but it adds a nice layer of abstraction that helps remove the error handling.

Note. bufio.Scanner can scan for any pattern, but by default it looks for newlines.

The sc.Scan() method returns true if the scanner has matched a line of text and has not encountered an error. So the body of the for loop is executed only when there is a line of text in the scanner's buffer. This means the new CountLines correctly handles the case where there is no trailing newline, as well as the case where the file is empty.

Secondly, because sc.Scan returns false once an error is encountered, the for loop exits when the end of the file is reached or an error occurs. The bufio.Scanner type remembers the first error it encountered, and we can recover that error once we have exited the loop by calling sc.Err().

Finally, sc.Err() takes care of handling io.EOF and converts it to nil if the end of the file was reached without encountering another error.
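
The two versions can be run side by side to see where they disagree. In the sketch below both functions are copied from the text and renamed countWithReader and countWithScanner so they can coexist; compare is a hypothetical helper added for the demonstration.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// countWithReader is the first version from the text, unchanged apart
// from its name.
func countWithReader(r io.Reader) (int, error) {
	var (
		br    = bufio.NewReader(r)
		lines int
		err   error
	)
	for {
		_, err = br.ReadString('\n')
		lines++
		if err != nil {
			break
		}
	}
	if err != io.EOF {
		return 0, err
	}
	return lines, nil
}

// countWithScanner is the improved version from the text.
func countWithScanner(r io.Reader) (int, error) {
	sc := bufio.NewScanner(r)
	lines := 0
	for sc.Scan() {
		lines++
	}
	return lines, sc.Err()
}

// compare runs both versions over the same input. The errors are dropped
// because reading from a string cannot fail.
func compare(input string) (reader, scanner int) {
	reader, _ = countWithReader(strings.NewReader(input))
	scanner, _ = countWithScanner(strings.NewReader(input))
	return reader, scanner
}

func main() {
	for _, in := range []string{"", "a", "a\n", "a\nb"} {
		r, s := compare(in)
		fmt.Printf("%-8q reader=%d scanner=%d\n", in, r, s)
	}
}
```

For empty input the first version reports 1 line where the scanner reports 0, and for "a\n" it reports 2 where the scanner reports 1; the two agree on "a" and "a\nb". One caveat on the scanner: by default it rejects lines longer than bufio.MaxScanTokenSize and reports that through sc.Err(), so inputs with very long lines may need a larger buffer.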

Tip . If you encounter excessive error handling, try extracting some operations into a helper type.

7.1.2. WriteResponse


My second example is inspired by the blog post “Errors are values”.

Earlier we saw examples of opening, writing to, and closing files. The error handling is there, but it is not overwhelming, because the operations can be encapsulated in helpers such as ioutil.ReadFile and ioutil.WriteFile. However, when dealing with low-level network protocols, it becomes necessary to build the response directly using I/O primitives. In that case the error handling can become repetitive. Consider this fragment of an HTTP server that constructs an HTTP response.

type Header struct {
	Key, Value string
}
type Status struct {
	Code   int
	Reason string
}
func WriteResponse(w io.Writer, st Status, headers []Header, body io.Reader) error {
	_, err := fmt.Fprintf(w, "HTTP/1.1 %d %s\r\n", st.Code, st.Reason)
	if err != nil {
		return err
	}
	for _, h := range headers {
		_, err := fmt.Fprintf(w, "%s: %s\r\n", h.Key, h.Value)
		if err != nil {
			return err
		}
	}
	if _, err := fmt.Fprint(w, "\r\n"); err != nil {
		return err
	}
	_, err = io.Copy(w, body)
	return err
}

First we construct the status line using fmt.Fprintf and check the error. Then for each header we write the header key and value, checking the error each time. Lastly we terminate the header section with an additional \r\n, check the error, and copy the response body to the client. Finally, although we do not need to check the error from io.Copy, we do need to translate its two return values into the single return value that WriteResponse returns.

This is a lot of repetitive work. But we can make it easier on ourselves by introducing a small wrapper type, errWriter.

errWriter satisfies the io.Writer contract, so it can be used to wrap an existing io.Writer. errWriter passes writes through to its underlying writer until an error is detected. From that point on, it discards any writes and returns the previous error.

type errWriter struct {
	io.Writer
	err error
}
func (e *errWriter) Write(buf []byte) (int, error) {
	if e.err != nil {
		return 0, e.err
	}
	var n int
	n, e.err = e.Writer.Write(buf)
	return n, nil
}
func WriteResponse(w io.Writer, st Status, headers []Header, body io.Reader) error {
	ew := &errWriter{Writer: w}
	fmt.Fprintf(ew, "HTTP/1.1 %d %s\r\n", st.Code, st.Reason)
	for _, h := range headers {
		fmt.Fprintf(ew, "%s: %s\r\n", h.Key, h.Value)
	}
	fmt.Fprint(ew, "\r\n")
	io.Copy(ew, body)
	return ew.err
}

Applying errWriter to WriteResponse dramatically improves the clarity of the code. Each operation no longer needs to be followed by its own error check. Reporting the error is moved to the end of the function by inspecting the ew.err field, which also avoids the awkward translation of io.Copy's return values.
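
To see the "discard writes after the first error" behavior in action, the sketch below drives errWriter with a writer that starts failing part-way through. errWriter is copied from the text; flakyWriter and writeLines are hypothetical helpers introduced only for this demonstration.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errWriter is the wrapper type from the text.
type errWriter struct {
	io.Writer
	err error
}

func (e *errWriter) Write(buf []byte) (int, error) {
	if e.err != nil {
		return 0, e.err
	}
	var n int
	n, e.err = e.Writer.Write(buf)
	return n, nil
}

// flakyWriter is a hypothetical writer that accepts a fixed number of
// writes and then fails. It counts how often it is called.
type flakyWriter struct {
	budget int // successful writes remaining
	calls  int // total calls to Write
}

func (f *flakyWriter) Write(p []byte) (int, error) {
	f.calls++
	if f.budget == 0 {
		return 0, errors.New("connection reset")
	}
	f.budget--
	return len(p), nil
}

// writeLines writes each line through an errWriter and checks the error
// once, at the end.
func writeLines(w io.Writer, lines ...string) error {
	ew := &errWriter{Writer: w}
	for _, l := range lines {
		fmt.Fprintln(ew, l)
	}
	return ew.err
}

func main() {
	fw := &flakyWriter{budget: 1}
	err := writeLines(fw, "one", "two", "three")
	fmt.Println(err, fw.calls) // connection reset 2
}
```

The underlying writer is called twice, not three times: the first write succeeds, the second fails and is remembered, and the third never reaches it.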

7.2. Handle the error only once


Finally, I want to mention that you should handle an error only once. Handling an error means inspecting the error value and making a single decision.

// WriteAll writes the contents of buf to the supplied writer.
func WriteAll(w io.Writer, buf []byte) {
        w.Write(buf)
}

If you make less than one decision, you are ignoring the error. As we see here, the error from w.Write is discarded.

But making more than one decision in response to a single error is also wrong. Below is code that I come across frequently.

func WriteAll(w io.Writer, buf []byte) error {
	_, err := w.Write(buf)
	if err != nil {
		log.Println("unable to write:", err) // annotated error goes to log file
		return err                           // unannotated error returned to caller
	}
	return nil
}

In this example, if an error occurs during w.Write, a line is written to the log file, and the error is also returned to the caller, who may well log it too and return it, all the way up to the top level of the program.

Most likely, the caller does the same:

func WriteConfig(w io.Writer, conf *Config) error {
	buf, err := json.Marshal(conf)
	if err != nil {
		log.Printf("could not marshal config: %v", err)
		return err
	}
	if err := WriteAll(w, buf); err != nil {
		log.Printf("could not write config: %v", err)
		return err
	}
	return nil
}

So you get a stack of duplicate lines in the log.

unable to write: io.EOF
could not write config: io.EOF

But at the top of the program you get an original error without any context.

err := WriteConfig(f, &conf)
fmt.Println(err) // io.EOF

I want to dig into this a little further, because I do not see the problem of logging and returning the same error as merely a matter of personal preference.

func WriteConfig(w io.Writer, conf *Config) error {
	buf, err := json.Marshal(conf)
	if err != nil {
		log.Printf("could not marshal config: %v", err)
		// oops, forgot to return
	}
	if err := WriteAll(w, buf); err != nil {
		log.Printf("could not write config: %v", err)
		return err
	}
	return nil
}

A problem I often see is a programmer forgetting to return after handling an error. As we said earlier, Go style is to use guard clauses: check preconditions as the function progresses and return early.

In this example, the author checked the error and logged it, but forgot to return. This causes a subtle problem.

The Go error handling contract says that, in the presence of an error, no assumptions can be made about the contents of the other return values. Since JSON marshaling failed, the contents of buf are unknown: it may contain nothing, or worse, it may contain a half-written JSON fragment.

Because the programmer forgot to return after checking and logging the error, the damaged buffer is passed to WriteAll. That call will probably succeed, so the configuration file will be written incorrectly. Yet the function returns normally, and the only sign that anything went wrong is a single log line complaining that JSON marshaling failed, not that writing the configuration failed.

7.2.1. Adding context to errors


The bug occurred because the author was trying to add context to the error message. They were trying to leave a marker pointing to the source of the error.

Let's look at another way to do the same through fmt.Errorf.

func WriteConfig(w io.Writer, conf *Config) error {
	buf, err := json.Marshal(conf)
	if err != nil {
		return fmt.Errorf("could not marshal config: %v", err)
	}
	if err := WriteAll(w, buf); err != nil {
		return fmt.Errorf("could not write config: %v", err)
	}
	return nil
}
func WriteAll(w io.Writer, buf []byte) error {
	_, err := w.Write(buf)
	if err != nil {
		return fmt.Errorf("write failed: %v", err)
	}
	return nil
}

By combining the annotation of the error and the return into one line, it becomes much harder to forget to return, which avoids continuing by accident.

If an I/O error occurs while writing the file, the error's Error() method will produce something like this:

could not write config: write failed: input/output error

7.2.2. Error wrapping with github.com/pkg/errors


The fmt.Errorf pattern works well for annotating the error message, but it obscures the type of the original error. I have argued that treating errors as opaque values is important for loosely coupled software, so the type of the original error should not matter if the only things you do with the error value are:

  1. Check that it is not nil.
  2. Print it or log it.

However, there are cases where you need to recover the original error. To annotate errors in a way that allows this, you can use something like my errors package:

func ReadFile(path string) ([]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, errors.Wrap(err, "open failed")
	}
	defer f.Close()
	buf, err := ioutil.ReadAll(f)
	if err != nil {
		return nil, errors.Wrap(err, "read failed")
	}
	return buf, nil
}
func ReadConfig() ([]byte, error) {
	home := os.Getenv("HOME")
	config, err := ReadFile(filepath.Join(home, ".settings.xml"))
	return config, errors.WithMessage(err, "could not read config")
}
func main() {
	_, err := ReadConfig()
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
}

Now the reported message is a nice K&D-style error:

could not read config: open failed: open /Users/dfc/.settings.xml: no such file or directory

and the error value retains a reference to the original cause.

func main() {
	_, err := ReadConfig()
	if err != nil {
		fmt.Printf("original error: %T %v\n", errors.Cause(err), errors.Cause(err))
		fmt.Printf("stack trace:\n%+v\n", err)
		os.Exit(1)
	}
}

Thus, you can recover the original error and print a stack trace:

original error: *os.PathError open /Users/dfc/.settings.xml: no such file or directory
stack trace:
open /Users/dfc/.settings.xml: no such file or directory
open failed
main.ReadFile
        /Users/dfc/devel/practical-go/src/errors/readfile2.go:16
main.ReadConfig
        /Users/dfc/devel/practical-go/src/errors/readfile2.go:29
main.main
        /Users/dfc/devel/practical-go/src/errors/readfile2.go:35
runtime.main
        /Users/dfc/go/src/runtime/proc.go:201
runtime.goexit
        /Users/dfc/go/src/runtime/asm_amd64.s:1333
could not read config

The errors package lets you add context to error values in a form that both a person and a machine can inspect. As I said in a recent presentation, wrapping of this kind is due to arrive in the standard library in an upcoming Go release.

8. Concurrency


Go is often chosen because of its concurrency capabilities. The developers have done a lot to make it efficient (in terms of hardware resources) and performant, but Go's concurrency features can also be used to write code that is neither performant nor reliable. To close the article, I want to give a couple of tips on how to avoid some of the pitfalls of Go's concurrency features.

Go's first-class concurrency support is provided by channels and by the select and go statements. If you learned Go formally, from a textbook or a course, you may have noticed that the concurrency section is always one of the last. This article is no different: I decided to talk about concurrency at the end, as if it were something additional to the usual skills a Go programmer should master.

There is a dichotomy here, because Go's headline feature is its simple, lightweight concurrency model. As a product, the language almost sells itself on this one feature. On the other hand, concurrency is actually not that easy to use; otherwise authors would not make it the last chapter of their books, and we would not look back on our own code with regret.

This section discusses some of the pitfalls of naive use of Go's concurrency features.

8.1. Keep yourself busy or do the work yourself


What is the problem with this program?

package main
import (
	"fmt"
	"log"
	"net/http"
)
func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "Hello, GopherCon SG")
	})
	go func() {
		if err := http.ListenAndServe(":8080", nil); err != nil {
			log.Fatal(err)
		}
	}()
	for {
	}
}

The program does what we intended: it serves a simple web server. At the same time, it wastes CPU time in an infinite loop. The for{} on the last line of main blocks the main goroutine, because it performs no I/O, does not wait on a lock, does not send or receive on a channel, and does not otherwise communicate with the scheduler.

Since the Go runtime is mostly cooperatively scheduled, this program will spin fruitlessly on a CPU and may eventually end up live-locked.

How to fix it? Here is one option.

package main
import (
	"fmt"
	"log"
	"net/http"
	"runtime"
)
func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "Hello, GopherCon SG")
	})
	go func() {
		if err := http.ListenAndServe(":8080", nil); err != nil {
			log.Fatal(err)
		}
	}()
	for {
		runtime.Gosched()
	}
}

It may look silly, but this is a common solution that I come across in real code. It is a symptom of not understanding the underlying problem.

If you are a little more experienced with Go, you might write something like this instead.

package main
import (
	"fmt"
	"log"
	"net/http"
)
func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "Hello, GopherCon SG")
	})
	go func() {
		if err := http.ListenAndServe(":8080", nil); err != nil {
			log.Fatal(err)
		}
	}()
	select {}
}

An empty select statement blocks forever. This is useful, because now we are not spinning a whole CPU just to call runtime.Gosched(). However, we are treating only the symptom, not the cause.

I want to show you another solution, one that has hopefully already occurred to you. Instead of running http.ListenAndServe in a goroutine, which leaves us with the problem of what to do with the main goroutine, simply run http.ListenAndServe on the main goroutine itself.

Tip. If main.main returns, the Go program exits unconditionally, no matter what any other goroutines started during the program's lifetime are doing.

package main
import (
	"fmt"
	"log"
	"net/http"
)
func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "Hello, GopherCon SG")
	})
	if err := http.ListenAndServe(":8080", nil); err != nil {
		log.Fatal(err)
	}
}

So this is my first piece of advice: if a goroutine cannot make progress until it receives a result from another, it is often simpler to do the work yourself rather than delegate it.

This often eliminates a lot of the state tracking and channel manipulation required to pass a result back from a goroutine to its originator.

Tip. Many Go programmers overuse goroutines, especially at first. As with everything in life, moderation is the key to success.
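As a small illustration of the difference, the sketch below computes the same result twice: once by handing the work to a goroutine and immediately blocking on its result, and once by simply doing the work on the calling goroutine. The function names are hypothetical.

```go
package main

import "fmt"

// sum adds up the values in xs.
func sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

// sumDelegated starts a goroutine and then blocks until it reports back.
// The caller makes no progress in the meantime, so the goroutine and the
// channel add machinery without adding any concurrency.
func sumDelegated(xs []int) int {
	result := make(chan int)
	go func() { result <- sum(xs) }()
	return <-result
}

// sumDirect does the same work on the calling goroutine.
func sumDirect(xs []int) int {
	return sum(xs)
}

func main() {
	xs := []int{1, 2, 3, 4}
	fmt.Println(sumDelegated(xs), sumDirect(xs)) // 10 10
}
```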

8.2. Leave concurrency to the caller


What is the difference between the two APIs?

// ListDirectory returns the contents of dir.
func ListDirectory(dir string) ([]string, error)

// ListDirectory returns a channel over which
// directory entries will be published. When the list
// of entries is exhausted, the channel will be closed.
func ListDirectory(dir string) chan string

Let's start with the obvious differences. The first example reads the directory into a slice and then returns the entire slice, or an error if something went wrong. This happens synchronously: the caller of ListDirectory blocks until all directory entries have been read. Depending on how large the directory is, this can take a long time and potentially use a lot of memory.

Consider the second example. It looks a bit more like classic Go: here ListDirectory returns a channel over which directory entries will be passed. When the channel is closed, that is the sign that there are no more directory entries. Since the channel is populated after ListDirectory returns, we can assume that ListDirectory starts a goroutine to populate it.

Note. In the second version it is not strictly necessary to use a goroutine: you could allocate a channel large enough to hold all the directory entries without blocking, fill it, close it, and then return the channel to the caller. But this is unlikely, since it would have the same problem of using a large amount of memory to buffer all the results in the channel.

The channel version of ListDirectory has two further problems:

  • By using a closed channel as the signal that there are no more items to process, ListDirectory cannot tell the caller that the set of items is incomplete because of an error. The caller has no way of telling the difference between an empty directory and an error. In both cases, the channel simply appears to be closed immediately.
  • The caller must keep reading from the channel until it is closed, because this is the only way to know that the goroutine filling the channel has stopped. This is a serious limitation on the use of ListDirectory: the caller has to spend time reading from the channel even after it has received all the data it needs. This version is probably more efficient in terms of memory for medium and large directories, but it is no faster than the original slice-based method.

In both cases, the solution is to use a callback: a function that is called for each directory entry as it is read.

func ListDirectory(dir string, fn func(string))

Unsurprisingly, this is how the filepath.WalkDir function works.

Tip. If your function starts a goroutine, you must provide the caller with a way to explicitly stop that goroutine. It is often easiest to leave the decision to execute asynchronously to the caller.

8.3. Never start a goroutine without knowing when it will stop


In the previous example, a goroutine was used unnecessarily. But one of Go's main strengths is its first-class concurrency support. Indeed, in many cases concurrent work is entirely appropriate, and then goroutines are the right tool.

This simple application serves HTTP traffic on two different ports: port 8080 for application traffic and port 8001 for access to the /debug/pprof endpoint.

package main
import (
	"fmt"
	"net/http"
	_ "net/http/pprof"
)
func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(resp http.ResponseWriter, req *http.Request) {
		fmt.Fprintln(resp, "Hello, QCon!")
	})
	go http.ListenAndServe("127.0.0.1:8001", http.DefaultServeMux) // debug
	http.ListenAndServe("0.0.0.0:8080", mux)                       // app traffic
}

Although the program is simple, it is the foundation of a real application.

The application in its current form has several problems that will show up as it grows, so let's address some of them right away.

func serveApp() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(resp http.ResponseWriter, req *http.Request) {
		fmt.Fprintln(resp, "Hello, QCon!")
	})
	http.ListenAndServe("0.0.0.0:8080", mux)
}
func serveDebug() {
	http.ListenAndServe("127.0.0.1:8001", http.DefaultServeMux)
}
func main() {
	go serveDebug()
	serveApp()
}

By breaking the serveApp and serveDebug handlers out into separate functions, we have decoupled them from main.main. We have also followed the earlier advice and made sure that serveApp and serveDebug leave concurrency to the caller.

But there are some operability problems with this program. If serveApp returns, then main.main returns, and the program shuts down and is restarted by whatever process manager you use.

Tip. Just as functions in Go leave concurrency to the caller, applications should leave the job of monitoring their status and restarting them to the program that invoked them. Do not make your applications responsible for restarting themselves: this is best handled from outside the application.

However, serveDebug runs in a separate goroutine, and if it returns, only that goroutine exits while the rest of the program carries on. Your operations staff will not be happy to find that they cannot get statistics out of the application because the /debug handler stopped working long ago.

What we need to ensure is that the application shuts down if any of the goroutines serving it stops.

func serveApp() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(resp http.ResponseWriter, req *http.Request) {
		fmt.Fprintln(resp, "Hello, QCon!")
	})
	if err := http.ListenAndServe("0.0.0.0:8080", mux); err != nil {
		log.Fatal(err)
	}
}
func serveDebug() {
	if err := http.ListenAndServe("127.0.0.1:8001", http.DefaultServeMux); err != nil {
		log.Fatal(err)
	}
}
func main() {
	go serveDebug()
	go serveApp()
	select {}
}

Now serveApp and serveDebug check the error returned by ListenAndServe and call log.Fatal if necessary. Since both handlers run in goroutines, we park the main goroutine in select{}.

This approach has a number of problems:

  1. If ListenAndServe returns a nil error, log.Fatal will not be called, and the HTTP service on that port will shut down without stopping the application.
  2. log.Fatal calls os.Exit, which unconditionally exits the program; deferred calls will not run, other goroutines will not be notified to shut down, and the program simply stops. This makes it difficult to write tests for these functions.

Tip. Only use log.Fatal from main.main or init functions.

What we really want is to pass any error that occurs back to the originator of the goroutine, so that it can find out why the goroutine stopped and can shut down the process cleanly.

func serveApp() error {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(resp http.ResponseWriter, req *http.Request) {
		fmt.Fprintln(resp, "Hello, QCon!")
	})
	return http.ListenAndServe("0.0.0.0:8080", mux)
}
func serveDebug() error {
	return http.ListenAndServe("127.0.0.1:8001", http.DefaultServeMux)
}
func main() {
	done := make(chan error, 2)
	go func() {
		done <- serveDebug()
	}()
	go func() {
		done <- serveApp()
	}()
	for i := 0; i < cap(done); i++ {
		if err := <-done; err != nil {
			fmt.Printf("error: %v\n", err)
		}
	}
}

The return status of a goroutine can be communicated through a channel. The capacity of the channel equals the number of goroutines we want to manage, so a send to the done channel never blocks; a blocked send would prevent the goroutine from exiting and cause it to leak.

Since the done channel cannot be closed safely, we cannot use the for range idiom to loop over the channel until all the goroutines have reported in. Instead, we loop once for each goroutine we started, which equals the capacity of the channel.

Now we have a way to wait for each goroutine to exit cleanly and to log any error it encounters. All that remains is a way to forward the shutdown signal from the first goroutine that exits to all the others.

It turns out that asking an http.Server to shut down is a little involved, so I have wrapped that logic in a helper function. The serve helper takes an address and an http.Handler, just like http.ListenAndServe, plus a stop channel that we use to trigger the Shutdown method.

func serve(addr string, handler http.Handler, stop <-chan struct{}) error {
	s := http.Server{
		Addr:    addr,
		Handler: handler,
	}
	go func() {
		<-stop // wait for stop signal
		s.Shutdown(context.Background())
	}()
	return s.ListenAndServe()
}
func serveApp(stop <-chan struct{}) error {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(resp http.ResponseWriter, req *http.Request) {
		fmt.Fprintln(resp, "Hello, QCon!")
	})
	return serve("0.0.0.0:8080", mux, stop)
}
func serveDebug(stop <-chan struct{}) error {
	return serve("127.0.0.1:8001", http.DefaultServeMux, stop)
}
func main() {
	done := make(chan error, 2)
	stop := make(chan struct{})
	go func() {
		done <- serveDebug(stop)
	}()
	go func() {
		done <- serveApp(stop)
	}()
	var stopped bool
	for i := 0; i < cap(done); i++ {
		if err := <-done; err != nil {
			fmt.Printf("error: %v\n", err)
		}
		if !stopped {
			stopped = true
			close(stop)
		}
	}
}

Now, when the first value arrives on the done channel, we close the stop channel, which causes every goroutine waiting on that channel to shut down its http.Server. This in turn causes all the remaining ListenAndServe calls to return. Once all the goroutines we started have stopped, main.main returns and the process stops cleanly.

Tip. Writing this logic yourself is repetitive and error-prone. Consider using something like this package, which will do most of the work for you.
