Errorx - library for working with errors in Go

What errorx is and why it is useful

Errorx is a library for working with errors in Go. It provides tools for solving problems related to the error mechanism in large projects, and a single syntax for working with them.

Most Joom server components have been written in Go since the company was founded. This choice paid off in the early stages of development and throughout the life of the service, and in light of the announcements about the prospects for Go 2, we are confident we will not regret it in the future. One of the main virtues of Go is simplicity, and its approach to errors demonstrates this principle like nothing else. Not every project grows large enough for the capabilities of the standard library to fall short and prompt a search for its own solutions in this area. We went through an evolution in our approach to working with errors, and the errorx library is the result of that evolution. We are convinced that it may be useful to many, including those who do not yet feel serious discomfort working with errors in their projects.

Errors in Go

Before turning to errorx itself, some explanation is in order. After all, what is wrong with errors?

type error interface {
   Error() string
}

Very simple, isn't it? In practice, an implementation often carries nothing but a string description of the error. Such minimalism follows from the view that an error does not necessarily mean something “exceptional”. The most commonly used function, errors.New() from the standard library, is true to this idea:

func New(text string) error {
    return &errorString{text}
}

If we recall that errors in the language have no special status and are ordinary objects, the question arises: what is special about working with them?

Errors are not exceptions. It is no secret that many people getting acquainted with Go meet this distinction with some resistance. There are many publications that explain and support the approach chosen in Go, and many that criticize it. Either way, errors in Go serve many purposes, and at least one of them is exactly the same as that of exceptions in some other languages: troubleshooting. As a consequence, it is natural to expect the same expressive power from them, even if the approach and syntax associated with their use are very different.

What's wrong?

Many projects use Go errors as they are and do not have the slightest difficulty with them. However, as the complexity of the system grows, a number of problems begin to show, and they attract attention even when expectations are not high. A good illustration is a line like this in the log of your service:

Error: duplicate key

Here the first problem is immediately obvious: unless you take deliberate care, in any large system it is almost impossible to understand what went wrong from the initial message alone. The message lacks the details and the broader context of the problem. This is a programmer's mistake, but it happens too often to be neglected. Code on the "positive" branches of the control flow graph in practice always gets more attention and better test coverage than "negative" code associated with interrupted execution or external problems. The frequency with which the mantra if err != nil { return err } is repeated in Go programs makes this oversight even more likely.

As a small digression, consider the following example:

func (m *Manager) ApplyToUsers(action func(User) (*Data, error), ids []UserID) error {
    users, err := m.LoadUsers(ids)
    if err != nil {
        return err
    }
    var actionData []*Data
    for _, user := range users {
        data, err := action(user)
        if err != nil {
            return err
        }
        ok, err := m.validateData(data)
        if err != nil {
            return nil
        }
        if !ok {
            log.Error("Validation failed for %v", data)
            continue
        }
        actionData = append(actionData, data)
    }
    return m.Apply(actionData)
}

How quickly did you spot the error in this code? Probably every Go programmer has made it at least once. Hint: the mistake is in the expression if err != nil { return nil }.

If we go back to the problem of a vague message in the log, everyone has been in that situation too. Starting to fix error handling code once the problem has already occurred is very unpleasant; besides, from the data in the log it is not at all clear where to start looking for the part of the code that actually needs to be improved. This may seem far-fetched for projects that are small in code size and number of external dependencies. For large-scale projects, however, it is a completely real and painful problem.
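
The effect of that mistake is easy to reproduce in isolation. The sketch below is a reduced illustration rather than the code above; validate, processBuggy and processFixed are hypothetical names. The buggy variant detects the error and then reports success, so the caller has no way to learn that anything failed.

```go
package main

import (
	"errors"
	"fmt"
)

var errInvalid = errors.New("validation backend unavailable")

// validate is a hypothetical check that always fails with an error.
func validate(int) (bool, error) { return false, errInvalid }

// processBuggy repeats the mistake: the error is detected, but nil is returned.
func processBuggy(v int) error {
	_, err := validate(v)
	if err != nil {
		return nil // bug: the caller never learns that something failed
	}
	return nil
}

// processFixed propagates the error to the caller.
func processFixed(v int) error {
	_, err := validate(v)
	if err != nil {
		return err
	}
	return nil
}

func main() {
	fmt.Println(processBuggy(1)) // <nil>
	fmt.Println(processFixed(1)) // validation backend unavailable
}
```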

Suppose a programmer with bitter experience wants to add context in advance to the error he returns. The naive way to do this looks something like this:

func InsertUser(u *User) error {
    err := usersTable.Insert(u)
    if err != nil {
        return errors.New(fmt.Sprintf("failed to insert user %s: %v", u.Name, err))
    }
    return nil
}

This is better. The broader context is still unclear, but now it is much easier to find at least the code in which the error occurred. However, having solved one problem, we have inadvertently created another. The error created here keeps the original diagnostic message, but everything else, including its type and any additional content, is lost.

To see how dangerous it is, consider the following code in the database driver:

var ErrDuplicateKey = errors.New("duplicate key")

func (t *Table) Insert(entity interface{}) error {
    // returns ErrDuplicateKey if a unique constraint is violated by insert
}

func IsDuplicateKeyError(err error) bool {
    return err == ErrDuplicateKey
}

Now the IsDuplicateKeyError() check is broken, although when we added our text to the error we had no intention of changing its semantics. This, in turn, breaks the code that relies on the check:

func RegisterUser(u *User) error {
    err := InsertUser(u)
    if db.IsDuplicateKeyError(err) {
        // find existing user, handle conflict
    } else {
        return err
    }
    return nil
}
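
The breakage can be reproduced with a self-contained sketch. Here insert and insertUser are simplified stand-ins for the driver call and for InsertUser above (an assumption made to keep the example runnable without a database). The check works on the driver's error and silently stops working once the error has been rebuilt with extra text.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrDuplicateKey = errors.New("duplicate key")

func IsDuplicateKeyError(err error) bool { return err == ErrDuplicateKey }

// insert stands in for the driver call and always reports a duplicate.
func insert() error { return ErrDuplicateKey }

// insertUser adds context the naive way, by building a brand-new error.
func insertUser(name string) error {
	if err := insert(); err != nil {
		return errors.New(fmt.Sprintf("failed to insert user %s: %v", name, err))
	}
	return nil
}

func main() {
	fmt.Println(IsDuplicateKeyError(insert()))          // true
	fmt.Println(IsDuplicateKeyError(insertUser("bob"))) // false
	fmt.Println(insertUser("bob"))                      // failed to insert user bob: duplicate key
}
```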

If we try to be smarter and add our own error type that stores the original error and can return it, say, through a Cause() error method, we still solve the problem only in part:

- At the point of error handling you now need to know that the true cause lies in Cause().
- There is no way to teach external libraries this, so the helper functions written in them become useless.
- Our implementation may expect Cause() to return the immediate cause of the error (or nil if there is none), while an implementation in another library expects the method to return the non-nil root cause; the lack of standard tools or a generally accepted contract leads to very unpleasant surprises.

However, this partial solution is used in many error libraries, including, to some extent, ours. There are plans to popularize this approach in Go 2; if that happens, the problems described above should become easier to deal with.
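
A minimal sketch of such a wrapper shows where the ambiguity comes from. The wrapped type, the causer interface and the rootCause helper are hypothetical names; this version follows the "immediate cause" convention, so a caller that expects Cause() to return the root cause gets the wrong answer.

```go
package main

import (
	"errors"
	"fmt"
)

// causer is the informal contract discussed above; it is not part of the
// standard library.
type causer interface {
	Cause() error
}

// wrapped is a hypothetical error type that keeps the error it was built from.
type wrapped struct {
	msg   string
	cause error
}

func (w *wrapped) Error() string { return w.msg + ": " + w.cause.Error() }

// Cause returns the immediate cause, which is one of the two possible conventions.
func (w *wrapped) Cause() error { return w.cause }

// rootCause follows Cause() links until it reaches an error without one.
func rootCause(err error) error {
	for err != nil {
		c, ok := err.(causer)
		if !ok || c.Cause() == nil {
			return err
		}
		err = c.Cause()
	}
	return nil
}

var ErrDuplicateKey = errors.New("duplicate key")

func main() {
	var err error = &wrapped{msg: "register", cause: &wrapped{msg: "insert user", cause: ErrDuplicateKey}}

	fmt.Println(err == ErrDuplicateKey)                  // false: direct comparison no longer works
	fmt.Println(err.(causer).Cause() == ErrDuplicateKey) // false: the immediate cause is another wrapper
	fmt.Println(rootCause(err) == ErrDuplicateKey)       // true: only walking the whole chain helps
}
```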

Errorx

Below we describe the solutions errorx offers, but first we will try to formulate the considerations that underlie the library.

- Diagnostics are more important than saving resources. The performance of creating and displaying errors matters. However, errors represent the negative path, not the positive one, and in most cases signal a problem, so the presence of diagnostic information in an error matters even more.
- Stack trace by default. Getting an error with full diagnostics should take no effort. On the contrary, it is excluding part of the information (for brevity or for performance reasons) that may require additional action.
- Error semantics. There should be a simple and reliable way to check what an error means: its type, kind, and properties.
- Ease of addition. Adding diagnostic information to an error passing through should be simple and should not break checks of its semantics.
- Simplicity. Error handling code is written often and routinely, so the syntax for basic manipulations should be simple and concise. This reduces the number of bugs and makes the code easier to read.
- Less is more. Clarity and uniformity of the code are more important than optional features and opportunities for extension (which, perhaps, no one will ever use).
- Error semantics are part of the API. Errors that require separate handling in the calling code are de facto part of the package's public API. There is no need to hide this or make it less obvious, but you can make the handling more convenient and the external dependencies less fragile.
- Most errors are opaque. The more kinds of errors are indistinguishable from each other for the external user, the better. Burdening the API with error types that require special handling, or burdening the errors themselves with the data needed to handle them, is a design defect that should be avoided.

The hardest question for us was extensibility: should errorx provide primitives for building arbitrarily different user-defined error types, or an implementation that gives you everything you need out of the box? We chose the second option. First, errorx solves a very practical problem, and our experience of using it shows that for this purpose it is better to have a ready solution than spare parts for building one. Second, simplicity matters a great deal: since errors receive less attention, the code should be designed so that it is harder to introduce a bug when working with them. Practice has shown that for this it is important that all such code looks and works the same way.

TL;DR of the main library features:

- Stack trace creation in all errors by default
- Type checks on errors, in several varieties
- Ability to add information to an existing error without breaking anything
- Type visibility management, if you want to hide the original cause from the caller
- A mechanism for generalizing error handling code (type hierarchy, traits)
- Custom dynamic error properties
- Standard error types
- Syntax utilities to improve the readability of error handling code

Introduction

If we rework the example we analyzed above using errorx, we get the following:

var (
   DBErrors        = errorx.NewNamespace("db")
   ErrDuplicateKey = DBErrors.NewType("duplicate_key")
)
func (t *Table) Insert(entity interface{}) error {
   // ...
   return ErrDuplicateKey.New("violated constraint %s", details)
}

func IsDuplicateKeyError(err error) bool {
   return errorx.IsOfType(err, ErrDuplicateKey)
}

func InsertUser(u *User) error {
   err := usersTable.Insert(u)
   if err != nil {
      return errorx.Decorate(err, "failed to insert user %s", u.Name)
   }
   return nil
}

The calling code that uses IsDuplicateKeyError() does not change at all.

What has changed in this example?

- ErrDuplicateKey became a type rather than an error instance; checking for it is resistant to copying the error, and there is no fragile dependence on exact equality.
- A namespace for database errors appeared; other errors will most likely live there as well, and such grouping is useful for readability and in some cases can be used in code.
- Insert returns a new error on each call:
  - The error contains more details; this is, of course, possible without errorx, but not if the same error instance is returned each time, which IsDuplicateKeyError() previously required.
  - These errors can carry different stack traces, which is useful because not every call to Insert is one where such a situation is acceptable.
- InsertUser() adds to the text of the error but attaches the original error, which is preserved in full for subsequent operations.
- IsDuplicateKeyError() now works: it cannot be broken either by copying the error or by any number of Decorate() layers.
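
The idea behind these changes can be modeled with the standard library alone. The sketch below is a toy model and not the errorx implementation: errType, typedErr, newErr, decorate and isOfType are hypothetical names, and there are no stack traces, namespaces or traits. It only shows why a check against a type object survives both fresh instances and added text.

```go
package main

import "fmt"

// errType is a toy stand-in for an error type; it is NOT errorx.Type.
type errType struct{ name string }

// typedErr carries a pointer to its type, a message and an optional cause.
type typedErr struct {
	typ   *errType
	msg   string
	cause error
}

func (e *typedErr) Error() string {
	if e.cause != nil {
		return e.msg + ", cause: " + e.cause.Error()
	}
	return e.typ.name + ": " + e.msg
}

// newErr creates a fresh error instance of the given type on every call.
func newErr(t *errType, format string, args ...interface{}) error {
	return &typedErr{typ: t, msg: fmt.Sprintf(format, args...)}
}

// decorate adds text but inherits the type of the error it decorates.
func decorate(err error, format string, args ...interface{}) error {
	d := &typedErr{msg: fmt.Sprintf(format, args...), cause: err}
	if te, ok := err.(*typedErr); ok {
		d.typ = te.typ
	}
	return d
}

// isOfType compares types, not instances, so new instances and decorated
// errors still match.
func isOfType(err error, t *errType) bool {
	te, ok := err.(*typedErr)
	return ok && te.typ == t
}

var errDuplicateKey = &errType{name: "db.duplicate_key"}

func main() {
	base := newErr(errDuplicateKey, "violated constraint %s", "users_name_key")
	err := decorate(base, "failed to insert user %s", "bob")

	fmt.Println(err)
	fmt.Println(isOfType(err, errDuplicateKey)) // true
}
```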

It is not necessary to always follow exactly this pattern:

- The error type is not always unique: the same types can be used in many places.
- If desired, stack trace collection can be disabled, and instead of creating a new error every time you can return the same one, as in the original example. These are the so-called sentinel errors; we do not recommend them, but they can be useful if the error is used only as a marker in the code and you want to save on object creation.
- There is a way to make the errorx.IsOfType(err, ErrDuplicateKey) check stop working if you want to hide the semantics of the root cause from prying eyes.
- For type checks themselves, there are other ways besides comparison against the exact type.

Godoc contains detailed information about all of this. Below we look in a little more detail at the main features, which are enough for everyday work.

Types

Every errorx error belongs to some type. The type matters because inherited error properties can be passed through it; it is through the type or its traits that semantics are checked when necessary. In addition, an expressive type name complements the error message and in some cases can replace it.

AuthErrors = errorx.NewNamespace("auth")
ErrInvalidToken    = AuthErrors.NewType("invalid_token")

return ErrInvalidToken.NewWithNoMessage()

The error message will contain auth.invalid_token. The error declaration might look different:

ErrInvalidToken    = AuthErrors.NewType("invalid_token").ApplyModifiers(errorx.TypeModifierOmitStackTrace)

In this variant, a type modifier disables stack trace collection. The error has marker semantics: its type is exposed to the external user of the service, and a call stack in the logs would not be useful, since this is not a problem that needs fixing.

Errors have a dual nature in several respects. The contents of an error are used both for diagnostics and, sometimes, as information for an external user: an API client, a library user, and so on. Errors in code serve both as a means of communicating the semantics of what happened and as a mechanism for transferring control. This should be kept in mind when using error types.

Creating an error

return MyType.New("fail")

It is completely unnecessary to create your own type for every error. Any project can have its own package of general-purpose errors, and a certain set ships with errorx as part of the common namespace. It contains errors that in most cases are not meant to be handled in code and are suitable for "exceptional" situations when something went wrong.

return errorx.IllegalArgument.New("negative value %d", value)

In a typical case, the call chain is arranged so that the error is created at the very end of the chain and handled at the very beginning. In Go it is, with good reason, considered bad form to handle an error twice, for example to write it to the log and also return it up the stack. You can, however, add information to the error itself before passing it on:

return errorx.Decorate(err, "failed to upload '%s' to '%s'", filename, location)

The text added to the error will appear in the log, but it does not interfere with checking the type of the original error.

Sometimes the opposite need arises: whatever the nature of the error, the external user of the package should not know it. Given the opportunity, the user could create a fragile dependency on a part of the implementation.

return service.ErrBadRequest.Wrap(err, "failed to load user data")

An important difference that makes Wrap the preferred alternative to New is that the original error is fully reflected in the logs. In particular, it brings with it the useful original call stack.

Another useful technique that allows you to save all possible information about the call stack looks like this:

return errorx.EnhanceStackTrace(err, "operation fail")

If the original error came from another goroutine, the result of this call will contain the stack traces of both goroutines, which greatly increases its usefulness. The need to make this call explicitly is dictated by performance: the case is relatively rare, and a convenience that detected it automatically would slow down the ordinary Wrap, where it is not needed at all.

Godoc contains more information and also describes additional features such as DecorateMany.

Error handling

Ideally, error handling comes down to the following:

log.Error("Error: %+v", err)

The less that has to be done with an error beyond printing it to the log at the system layer of the project, the better. In reality this is sometimes not enough, and you have to do this:

if errorx.IsOfType(err, MyType) { /* handle */ }

This check succeeds both for an error of type MyType and for its child types, and it is resistant to errorx.Decorate(). Here, however, there is a direct dependence on the error type, which is quite normal within a package but can cause trouble if used outside it. In some cases the type of such an error is part of a stable external API, and sometimes we would like to replace this check with a check of a property rather than of the exact error type.

In classic Go errors, this would be done through an interface, with a type assertion to it serving as an indicator of the kind of error. Errorx types do not support this kind of extension, but the Trait mechanism can be used instead. For example:

func IsTemporary(err error) bool {
   return HasTrait(err, Temporary())
}

This function, built into errorx, checks whether an error has the standard Temporary property, i.e. whether it is temporary. Marking error types with traits is the responsibility of the source of the error, and through them it can transmit a useful signal without making specific internal types part of the external API.

return errorx.IgnoreWithTrait(err, errorx.NotFound())

This syntax is useful when a particular kind of error is needed to interrupt the control flow but should not be passed on to the calling function.

Despite the abundance of processing tools, not all of which are listed here, it is important to remember that working with errors should remain as simple as possible. An example of the rules that we try to follow:

- Code that receives an error should always log it in full; if some of the information is redundant, let the code that generates the error take care of that.
- Never use the error text or the result of Error() to handle an error in code; only type and trait checks are suitable for this, or a type assertion in the case of non-errorx errors.
- User code should not break if some kind of error is not handled in a special way, even if such handling is possible and gives it additional capabilities.
- Errors that are checked by properties are better than so-called sentinel errors, since such checks are less fragile.

Beyond errorx

Here we have described what is available to the user of the library out of the box, but at Joom code related to errors reaches much further. The logging module explicitly accepts errors in its signature and prints them itself, to rule out incorrect formatting and to extract optional contextual information from the error chain. The module responsible for panic-safe work with goroutines unpacks the error if it arrives with a panic, and also knows how to present a panic as an error without losing the original stack trace. Perhaps we will publish some of this as well.

Compatibility issues

Although we are very pleased with the way errorx lets us work with errors, the situation with library code devoted to this topic is far from ideal. At Joom we solve quite specific practical problems with errorx, but from the point of view of the Go ecosystem it would be preferable to have this entire set of tools in the standard library. An error whose source actually or potentially belongs to another paradigm has to be treated as foreign, i.e. as potentially not carrying information in the form that is customary in the project.

However, some things were done in such a way as not to conflict with other existing solutions.

The '%+v' format is used to print an error along with its stack trace, if present. This is the de facto standard in the Go ecosystem and is even included in the draft design for Go 2.

The Cause() error method makes errorx errors, in theory, compatible with libraries based on the Causer interface, even though their approach violates the errorx contract about the possibility of opaque wrapping through Wrap().

Future

A recent announcement described the changes planned for Go 2, and the topic of errors is widely represented there. Its description of the problems with type checks can be a useful supplement to this article.

We should note that the current state of errorx reflects the state of errors in Go 1. We do not rule out a version of the library that is friendlier to Go 2 once the changes in syntax arrive. At first glance, the differences in that version, although significant for everyday work with errors in code, do not conflict with the niche and the features of errorx.

The check-handle idiom in no way contradicts how errorx is used today, and Unwrap() error can be supported both with the errorx semantics of Wrap() preserved (i.e., so that it refuses to unfold the error chain below the point where Wrap was called) and without it. At the moment, this decision, like the decision on the accompanying syntax, seems premature.

If the syntax from the current design draft is preserved in Go 2, it would be appropriate to add errorx.Is() and errorx.As() with the same semantics, in case the standard functions from the errors package turn out to be insufficient.

Conclusion

We invite everyone who recognizes the problems described in this article and has found something useful here to get acquainted with the library and use it. We are open to suggestions and changes, so the current version of the API cannot be considered finally stable: proposals may well appear that convince us to revise some features of the syntax. Version 1.0 can be expected within a few months, but the current version has already been used and polished inside Joom for quite a long time. It is unlikely that we will want to remove any of the currently available features.

Repository: https://github.com/joomcode/errorx

Thank you for your attention, and always handle errors!
