Experience with Go in production at Yandex

    I want to share our experience of using the Go language in Yandex production systems. In general, we are quite conservative about which languages to use for real systems, which makes the experience we gained this time all the more useful.

    We started developing in Go last summer, when the Go framework for the Cocaine cloud platform appeared. Before that, the server-side applications of the Browser API were written primarily in C++ and Python. At the time, the server API was just moving to the cloud platform, and for the most part we were still deciding which technologies to use for it in the future. The API does the following: it receives data, processes it, sends it to an internal Yandex service, processes the result, and sends it back to the Browser. In other words, it is a set of simple applications.



    The drawback of C++ was that it was obvious overkill for our purposes: development took a lot of time, and the big problem was that the C++ framework for Cocaine offered no way to work asynchronously other than callbacks. We made many calls to various services, so the code soon turned into one big bowl of callback spaghetti. Scaling and debugging it was very difficult.

    Python had problems of a different kind. First, it was very slow, even with PyPy; second, dynamic typing could lead to errors, so we had to write tests even where we could otherwise have done without them. It is worth noting, though, that developing such applications in Python was quite simple: development was faster than in C++ and generally easier. There were no callbacks, the framework supported generators, and you could write asynchronous code as if it were synchronous.

    And so at some point we decided to try Go, having studied it beforehand. Go is a compiled, multithreaded programming language with strict static typing (and duck typing for interfaces) and a garbage collector. It is developed by Google. Initial development of Go began in September 2007, with Robert Griesemer, Rob Pike, and Ken Thompson directly involved in the design. The language was officially introduced in November 2009.

    The three of us (Vyacheslav Bakhmutov, Anton Tyurin, and I) assumed that Go would work better. Looking ahead, I can say that our expectations were confirmed. Below I go into more detail about what is better and why.

    Development speed


    In Go, small programs can be written faster than in C++ and about as fast as in Python. The feel of the language is about the same, too.

    Go has a good standard library with almost everything you need. You rarely have to use external libraries, but when it comes to that, in most cases you can just run go get github.com/library/libit and it will be installed.

    Go makes it very easy to write asynchronous code: it looks the same as synchronous code, but the Go runtime executes it asynchronously. There are, of course, no callbacks. All of this is built on top of goroutines. Rob Pike describes goroutines as "something like threads, but more lightweight." They are similar to what other languages often call “green threads” or “fibers”.
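
    A minimal sketch of the idea, not code from our services: each call is written as ordinary blocking code, and goroutines plus a channel let several calls run concurrently without callbacks. The lookup function is a hypothetical stand-in for a remote call, and the service names are used only as labels.

```go
package main

import (
	"fmt"
	"time"
)

// lookup is a hypothetical stand-in for a call to a remote service;
// it sleeps briefly to simulate network latency.
func lookup(service string) string {
	time.Sleep(10 * time.Millisecond)
	return "reply from " + service
}

// fetchAll queries every service in its own goroutine and returns the
// replies in the same order as the services were listed.
func fetchAll(services []string) []string {
	type result struct {
		index int
		reply string
	}
	// Buffered so that no goroutine blocks on send.
	results := make(chan result, len(services))
	for i, s := range services {
		go func(i int, s string) {
			results <- result{i, lookup(s)}
		}(i, s)
	}
	replies := make([]string, len(services))
	for range services {
		r := <-results
		replies[r.index] = r.reply
	}
	return replies
}

func main() {
	for _, reply := range fetchAll([]string{"geobase", "langdetect", "urlfetcher"}) {
		fmt.Println(reply)
	}
}
```

    The body of lookup reads like plain synchronous code; the runtime parks the goroutine while it waits, so the three calls overlap instead of running one after another.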

    There are plenty of good IDEs and IDE plugins for Go. I personally use the plugin for IntelliJ IDEA; others use Sublime. A colleague who works on Go with me uses vim quite successfully.



    This is what typical Go code looks like: almost every method called here actually does asynchronous work. region.GetRegionByLbs asynchronously goes to the geobase and Lbs services, findLanguage goes to langdetect, and the third method, the one with a long name, goes through urlfetcher to Yandex Suggest. As you can see, the code looks synchronous, and it is convenient to write and debug.

    Ease of testing and error detection


    We try to cover the code heavily with tests; sometimes we write tests before writing the code, though this is still quite rare. Here Go shows its best side. Testing works out of the box, and so does test coverage measurement. The coverage tool can also render, as HTML, the places in the code that are not covered by tests, and you can get overall coverage figures per module. On top of the standard testing infrastructure you can use a third-party library with richer functionality, which is what we do.



    Typically, tests are run with a command like this: go test suggest... This command tests all the modules that lie inside the suggest module.
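
    For readers who have not seen the standard testing package, here is a small sketch of a table-driven test. normalizeQuery is a hypothetical helper invented for this illustration, not a function from our code. Everything is kept in one runnable file for brevity; in a real project the test function would live in a file whose name ends in _test.go, where go test picks it up.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// normalizeQuery is a hypothetical helper that exists only to give the
// test something to check: it trims whitespace and lowercases the input.
func normalizeQuery(q string) string {
	return strings.ToLower(strings.TrimSpace(q))
}

// TestNormalizeQuery follows the naming convention that go test expects:
// a function named TestXxx that takes *testing.T.
func TestNormalizeQuery(t *testing.T) {
	cases := []struct{ in, want string }{
		{"  Yandex ", "yandex"},
		{"GO", "go"},
		{"", ""},
	}
	for _, c := range cases {
		if got := normalizeQuery(c.in); got != c.want {
			t.Errorf("normalizeQuery(%q) = %q, want %q", c.in, got, c.want)
		}
	}
}

func main() {
	fmt.Println(normalizeQuery("  Yandex "))
}
```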

    Profiling also works out of the box: you can view the function call graph and execution times through the built-in web service. The graph looks the same as in Google Performance Tools.
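
    A sketch of how the profiling pages can be exposed, assuming that the "built-in web service" refers to the handlers in the standard net/http/pprof package. The newDebugMux and indexStatus names and the localhost:6060 address are choices made for this example. To keep the sketch self-terminating, main probes the index page in-process instead of starting a real listener.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newDebugMux returns a mux that serves the standard profiling pages.
func newDebugMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	return mux
}

// indexStatus requests the profiling index page from h without opening
// a network socket and returns the HTTP status code.
func indexStatus(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest("GET", "/debug/pprof/", nil))
	return rec.Code
}

func main() {
	// In a real service one would instead serve the mux, for example:
	//   log.Fatal(http.ListenAndServe("localhost:6060", newDebugMux()))
	// (the address is an assumption made for this sketch).
	fmt.Println("index status:", indexStatus(newDebugMux()))
}
```

    With the mux served on a real port, go tool pprof can be pointed at the /debug/pprof/profile URL to collect a CPU profile and draw the call graph.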



    There is a built-in thread sanitizer; one of the people who develops the sanitizers at Google also works on Go. You can also get a stack trace for an error without any rocket science.
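
    To illustrate what the sanitizer is for, here is a small sketch, assuming that "thread sanitizer" means the race detector enabled with the -race flag of go run, go build and go test. countConcurrently is a made-up function. The version shown is correct; removing the mutex would create the kind of data race the detector reports.

```go
package main

import (
	"fmt"
	"sync"
)

// countConcurrently increments a shared counter from n goroutines.
// The mutex makes the increments safe. Without it, counter++ would be
// a data race, which a build with -race is designed to report.
func countConcurrently(n int) int {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		counter int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countConcurrently(1000))
}
```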

    Memory errors of the kind that are common in C++ are essentially ruled out in Go, which helps a lot, since not everyone manages to be careful all the time. We have an application called suggest-data that loads 300 megabytes of data at startup. When it was written in C++, it sometimes crashed, which caused some discomfort for our admin. The first crash was caused by the Cocaine framework. That was fixed, but the crashes continued; the second time we never fully understood the reasons, and it may well be that we had written something wrong ourselves. The problem was further complicated by the fact that the stack was often corrupted, so finding the cause of a crash was possible but difficult. In the end we decided not to bother and rewrote the application in Go (it was only 200 lines of code). The crashes disappeared immediately. After the switch to Go, memory is no longer corrupted.

    This is what the log looks like:

    Wrong format of region (lr)
    /home/lamerman/work/omnibox/.../inside.go:109       (*Inside).getRegionData
    /home/lamerman/work/omnibox/.../inside.go:194       (*Inside).Call
    /home/lamerman/work/omnibox/.../main/main.go:57     *Inside.Call·fm
    /usr/local/go/src/pkg/net/http/server.go:1221                                   HandlerFunc.ServeHTTP
    /home/lamerman/work/go/src/github.com/.../httpreq.go:124 func·006
    /home/lamerman/work/go/src/github.com/.../worker.go:219  func·015
    /usr/local/go/src/pkg/runtime/proc.c:1394                                       goexit
    

    Performance


    To avoid wasting money on servers, the language has to be fast enough. Here is a comparison of Go with C++ and Python on standard benchmarks. The result for Python:



    As you can see, Go is on average tens of times faster. The same comparison with C++:



    On average, Go is two to three times slower. Given that the language is young, one can expect it to get considerably faster in the future.

    I would also like to share our own observations about speed. Cocaine has a service for retrieving content by URL called urlfetcher. For various reasons we currently use our own version of it, and it exists in two implementations, pyurlfetcher and gofetcher. As you can easily guess, they differ in the language they are written in; they implement the same interface. Let's load-test them. For 10,000 calls, gofetcher spent 2.52 seconds of CPU time and pyurlfetcher spent 19.5 seconds; in fairness, under PyPy the latter runs about twice as fast, at roughly 10 seconds. The bottom line is that Go is about 4 times faster than Python under PyPy and about 8 times faster than CPython.

    We can also compare Go with C++ on one of our applications, suggest. The application serves the smart line in the browser, taking data from Yandex Suggest and the wizards; that is, it mostly does JSON processing and calls to various network services.

    For 1,000 requests to suggest, the Go version spends 1.10 seconds of CPU time and the C++ version 0.57 seconds, so Go is roughly two times slower than C++ on this application. The application itself is about 6 thousand lines of code.

    Memory


    This is what the memory usage of our handlers looks like on a production server:

    7855 cocaine   20   0  262m 9620 3744 S    1  0.0 249:35.82 barnavig
    8590 cocaine   20   0  324m  11m 3604 S    1  0.0  87:05.82 umaproto
    

    You can see that memory consumption is modest on average: about 10 megabytes per handler, provided there are no memory leaks or caches. Memory leaks are relatively easy to debug with the built-in tooling.

    Now compare two identical applications, one in C++ and one in Go. Both applications store a lot of data, 300 megabytes each. As you can see, consumption is almost identical right after the applications start:

    14742 cocaine   20   0 1071m 388m 3376 S    0  0.3  26:04.85 suggestdata (suggest data in Go)
    2734 cocaine   20   0  825m 345m 3388 S    0  0.5  23:47.80 suggest-data-pr (suggest data in C++)
    

    Memory can also be profiled in case of leaks; more about profiling can be found here.
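
    As a rough illustration of memory profiling, the sketch below takes a heap snapshot with the standard runtime/pprof package. The heapProfile helper and the 64 MB "cache" are made up for the example; in practice the snapshot would be written to a file and inspected with go tool pprof.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// heapProfile returns a snapshot of the current heap profile in the
// format understood by go tool pprof.
func heapProfile() ([]byte, error) {
	runtime.GC() // run a collection first so the statistics are up to date
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Simulate a cache that keeps memory alive (made-up workload).
	cache := make([][]byte, 0, 64)
	for i := 0; i < 64; i++ {
		cache = append(cache, make([]byte, 1<<20))
	}

	data, err := heapProfile()
	if err != nil {
		fmt.Fprintln(os.Stderr, "heap profile failed:", err)
		os.Exit(1)
	}
	fmt.Printf("cache entries: %d, profile size: %d bytes\n", len(cache), len(data))
}
```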

    Miscellaneous


    You can debug applications with gdb and all the wonderful things it gives people. We had no problems debugging these programs under gdb; everything seems to work.

    It is worth adding that every Go application is built together with all the libraries it uses into one big binary (static linking). This is usually convenient for deployment: you don't need to think about dependencies, and the same binary can be dropped onto both lucid and precise regardless of which packages are installed there. The only dependency, as far as I remember, is libc. This approach also has some disadvantages.

    Conclusion

    I think Go can be useful first of all to those who write in Python but are dissatisfied with the speed of their applications. Writing in Go is about as easy as writing in Python, but you can save a lot of machine resources. For people who write in C++, Go can be useful where simple applications are needed. Personally, my productivity increased greatly after the switch.

    Go, of course, is not perfect, and it is possible that I simply have not fully understood it yet. What I miss most is generics. There are plans to introduce them at some point, but their prospects are not yet entirely clear. Rob Pike himself said the following about this: "Go has generics, they are called interfaces there." Indeed, some generic code can be written using interfaces, but in some cases this is not enough. The lack of generics is partially offset by the presence of reflection. Still, the FAQ and Pike assure us that there will be generics.
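
    To show what "generic code through interfaces" can look like, here is a sketch built on the standard sort package: the sorting algorithm is written once against sort.Interface, and any type that provides Len, Less and Swap can be sorted. The byLength type and sortedByLength function are invented for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// byLength adapts a slice of strings to sort.Interface, ordering
// the strings from shortest to longest.
type byLength []string

func (s byLength) Len() int           { return len(s) }
func (s byLength) Less(i, j int) bool { return len(s[i]) < len(s[j]) }
func (s byLength) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

// sortedByLength returns a sorted copy of words and leaves the input
// slice untouched. sort.Stable keeps equal-length words in their
// original relative order.
func sortedByLength(words []string) []string {
	out := append([]string(nil), words...)
	sort.Stable(byLength(out))
	return out
}

func main() {
	fmt.Println(sortedByLength([]string{"cocaine", "go", "yandex"}))
}
```

    The limits show up as soon as the code has to return values rather than just rearrange them: a container written against interface{} hands back untyped values, and the caller needs type assertions or reflection to use them.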

    Our Go applications have been running under Cocaine in production for about a year now, and nothing fatal has happened so far. Go works, and I think it works well.

    Go is actively developing: there are many conferences devoted to the language, and new versions that improve performance are released regularly. Go is used internally by Google, Facebook, Docker, disqus.com ( http://blog.disqus.com/post/51155103801/trying-out-this-go-thing ) and many other large companies. The list can be viewed here.
