
Kubernetes will take over the world. When and how?
In anticipation of DevOpsConf, Vitaliy Khabarov interviewed Dmitry Stolyarov (distol), technical director and co-founder of Flant. Vitaliy asked Dmitry what Flant does, and about Kubernetes, ecosystem development, and support. They discussed why Kubernetes is needed and whether it is needed at all, and also talked about microservices, Amazon AWS, the "I'll get lucky" approach to DevOps, the future of Kubernetes itself (why, when, and how it will take over the world), the prospects of DevOps, and what engineers should prepare for in a bright and near future of simplification and neural networks.
You can listen to the original interview as a podcast on DevOps Deflope, a Russian-language podcast about DevOps; below is the text version.

Hereinafter, the questions are asked by Vitaliy Khabarov, an engineer at Express42.
About Flant
- Hello, Dima! You are the technical director of Flant and also one of its founders. Please tell us: what does the company do, and what do you do in it?
Dmitry: From the outside it looks like we are the guys who go around installing Kubernetes for everyone and then doing something with it. But that is not quite so. We started as a company that deals with Linux, but for a very long time now our main activity has been servicing production and turnkey highload projects. Usually we build the entire infrastructure from scratch and are then responsible for it for a long time. So the main work Flant does, and gets paid for, is taking responsibility for production and delivering it turnkey.
As technical director and one of the founders of the company, I work around the clock on ways to increase the availability of production, simplify its operation, make life easier for admins, and make life more pleasant for developers.
About Kubernetes
- Lately I have been seeing a lot of talks and articles about Kubernetes from Flant. How did you come to it?
Dmitry: I have already talked about this many times, but I don't mind repeating it at all. I believe it is right to keep repeating this topic, because there is confusion between cause and effect here.
We really needed a tool. We faced a lot of problems, struggled, overcame them with various crutches, and felt the need for an instrument. We went through many different options, built our own bicycles, and gained experience. Gradually we got to the point of using Docker almost as soon as it appeared, around 2013. By the time it appeared we already had a lot of experience with containers; we had already written our own analogue of Docker, a set of crutches of ours in Python. With the advent of Docker, those crutches could be thrown out in favor of a reliable, community-supported solution.
With Kubernetes the story is similar. By the time it started gaining momentum (for us that was version 1.2), we already had a bunch of crutches in both shell and Chef with which we somehow tried to orchestrate Docker. We were looking seriously at Rancher and various other solutions, but then Kubernetes appeared, and in it everything was implemented exactly as we would have done it, or even better. There is nothing to complain about.
Yes, there are imperfections here and there, a lot of imperfections in fact, and 1.2 was downright horrible, but... Kubernetes is like a building under construction: you look at the design and understand that it will be cool. If a building currently has only a foundation and two floors, you understand it is better not to move in yet, but with software there are no such problems: you can already use it.
We never had a moment of deciding whether to use Kubernetes or not. We had been waiting for it long before it appeared, and had tried to build analogues ourselves.
Near Kubernetes
- Do you participate directly in the development of Kubernetes itself?
Dmitry: Only indirectly. We participate more in the development of the ecosystem. We send a fair number of pull requests: to Prometheus, to all kinds of operators, to Helm, across the ecosystem. Unfortunately, I cannot keep track of everything we do and I may be mistaken, but there is not a single pull request from us in the core.
- Do you develop a lot of your own tools around Kubernetes?
Dmitry: The strategy is this: we go and send pull requests to everything that already exists. If the pull requests are not accepted there, we simply fork the project and live on our own builds until they are accepted. Then, when the change reaches upstream, we switch back to the upstream version.
For example, we have the Prometheus operator, with which we have probably switched back and forth between upstream and our own build 5 times already. We need some feature, we send a pull request, but we need to roll it out tomorrow and do not want to wait until it is released upstream. So we build it ourselves and roll our build, with the feature we need for whatever reason, out to all our clusters. Then upstream comes back to us with, say, "Guys, let's do this for a more general case", we, or someone else, finish it, and over time it all merges back in.
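In git terms, that fork-and-return loop looks roughly like the sketch below (the repository and branch names are illustrative, not our actual setup):

    # fork the project and carry the feature on a branch of our own build
    git clone https://github.com/coreos/prometheus-operator && cd prometheus-operator
    git checkout -b our-feature          # patch, build, roll the build out to the clusters
    # open a pull request upstream; keep running the fork until it is merged, then:
    git checkout master && git pull      # switch the clusters back to the upstream version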
We try to develop everything that already exists. Many elements that do not exist yet have either not been invented, or have been invented but not yet implemented, so we make them ourselves. Not because we like the process itself or bicycle-building as an industry, but simply because we need the tool. People often ask why we made this or that thing. The answer is simple: because we had to go further, solve some practical problem, and we solved it with that tool.
The path is always the same: we look around very carefully and, if we cannot find any solution for how to make a trolleybus out of a loaf of bread, we make our own loaf and our own trolleybus.
Flant Tools
- I know Flant now has the addon-operator, the shell-operator, and the dapp/werf tools. As I understand it, these are the same tool in different incarnations. I also understand that there are many more different tools inside Flant. Is that true?
Dmitry: We have a lot more on GitHub. Off the top of my head, there is statusmap, a panel for Grafana that has caught on everywhere. It is mentioned in almost every second article about monitoring Kubernetes on Medium. It is impossible to describe briefly what statusmap is (that would take a separate article), but it is a very useful thing for monitoring status over time, since in Kubernetes we often need to show status over time. We also have LogHouse, a thing based on ClickHouse and black magic for collecting logs in Kubernetes.
Many utilities! And there will be even more, because a number of internal solutions will be released this year. Of the very large ones based on the addon-operator: there is a bunch of addons to Kubernetes, like how to install cert-manager (a certificate management tool) correctly, or how to install Prometheus correctly with all the trimmings: that is about twenty different binaries that export and collect data, on top of which Prometheus gives you awesome graphs and alerts. All of this is just a bunch of addons to Kubernetes that get installed into a cluster, which then turns from something simple into something cool, sophisticated and automatic, in which many issues have already been solved. Yes, we do a lot.
Ecosystem development
- It seems to me that this is a very big contribution to the development of this tool and its usage patterns. Can you roughly estimate who else makes a comparable contribution to the development of the ecosystem?
Dmitry: In Russia, none of the companies operating in our market even comes close. Of course that is a bold statement, because there are large players like Mail.ru and Yandex that also do things with Kubernetes, but even they have not come close to the contribution of companies worldwide that do much more than we do. It is hard to compare Flant, with a staff of 80 people, and Red Hat, which has something like 300 engineers on Kubernetes alone, if I am not mistaken. It is hard to compare. We have 6 people in the R&D department, including me, hacking away at all our tools. 6 people against 300 Red Hat engineers: it is somehow hard to compare.
- Nevertheless, when even these 6 people can make something genuinely useful and reusable, when they face a practical task and give the solution to the community, that is an interesting case. I understand that large technology companies, which have their own Kubernetes development and support teams, could in principle develop the same tools. This is an example for them of what can be developed and given to the community, to give an impetus to the entire community that uses Kubernetes.
Dmitry: Probably this is an integrator's trait, its distinctive feature. We have many projects and we see many different situations. For us, the main way to create added value is to analyze these cases, find what they have in common, and make those common parts as cheap as possible for us. We are actively doing this. It is hard for me to speak for Russia and the world, but we have about 40 DevOps engineers working on Kubernetes. I do not think there are many companies in Russia with a comparable number of specialists who understand Kubernetes, if any at all.
I understand everything about the job title "DevOps engineer": everyone understands everything and is used to calling DevOps engineers DevOps engineers, so let's not debate it. All these 40 wonderful DevOps engineers face problems every day and solve them, and we simply analyze that experience and try to generalize it. We understand that if it stays inside the company, in a year or two the tool will be useless, because somewhere in the community a ready-made tool will appear. There is no point in hoarding this experience internally: it is just a drain of time and energy into /dev/null. So we do not regret it at all. We publish everything with great pleasure, understanding that it has to be published, developed and promoted so that people use it and add their own experience; then everything grows and lives. Then after two years the tool does not end up in the trash, and it is no pity to keep pouring energy into it, because it is clear that it will go on living.
This is part of our big strategy with dapp/werf. I do not remember when we started it; it seems about 3 years ago. Initially it was all in shell. It was a super proof of concept; we solved some of our private tasks, and it worked! But shell has problems: you cannot build on it further; programming in shell is a different matter entirely. We had a habit of writing in Ruby, so we redid it in Ruby and developed it, developed it, developed it, and then ran into the fact that the community, the crowd that does not say "we want it" or "we don't want it", turns up its nose at Ruby. Not funny, right? We realized we should write all this stuff in Go, simply to match the first item on the checklist: a DevOps tool should be a static binary. Go or not Go is not that important, but a static binary is best written in Go.
We spent the effort, rewrote dapp in Go, and named it werf. dapp is no longer supported and is not developed; it still works in its last version, but there is a clean upgrade path to the new tool, and you can follow it.
Why was dapp created?
- Can you tell us briefly why dapp was created and what problems it solves?
Dmitry: The first reason is building. Initially we had serious build problems, back when Docker could not do multi-stage builds, so we did multi-stage on our own. Then we had a bunch of issues with cleaning up images. Everyone who does CI/CD runs, sooner rather than later, into the problem that there is a pile of built images and you need to somehow clean out what is not needed and keep what is needed.
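For context, a minimal sketch of the multi-stage build that Docker later got natively, plus the crude local cleanup alluded to above (the image names and paths are illustrative):

    # multi-stage build: compile in a full toolchain image, ship a slim runtime image
    cat > Dockerfile <<'EOF'
    FROM golang:1.12 AS build
    WORKDIR /src
    COPY . .
    RUN CGO_ENABLED=0 go build -o /app .

    FROM alpine:3.9
    COPY --from=build /app /usr/local/bin/app
    ENTRYPOINT ["app"]
    EOF
    docker build -t myapp:latest .
    # the cleanup problem dapp/werf tackles in a registry-aware way; locally
    # the best Docker itself offers is pruning dangling images:
    docker image prune -f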
The second reason is deployment. Yes, there is Helm, but it solves only part of the problems. Ironically, it is written that "Helm is the Package Manager for Kubernetes". Note that "the". And then the words "Package Manager": what do we usually expect from a package manager? We say "Package manager, install the package!" and expect it to tell us "The package has been installed".
The interesting part is that we say "Helm, install the package", and when it answers that it is installed, it turns out it has only started the installation: it pointed Kubernetes at the thing and said "Launch this!", and whether it actually started or not, whether it works or not, Helm does not address that question at all.
It turns out that Helm is just a text preprocessor that loads data into Kubernetes.
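You can observe that "preprocessor" behavior yourself with the Helm 2 era commands of the time (the chart path and release name below are illustrative):

    # render the chart's templates locally: this is essentially all Helm "knows"
    helm template ./mychart
    # install returns once the manifests are accepted, not once the pods are ready
    helm install ./mychart --name myrelease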
But within any deployment we want to know whether the application has rolled out to production or not. "Rolled out to production" means the application got there, the new version is deployed, and at the very least it has not crashed and responds correctly. Helm does not solve this. To solve it, you need to spend a lot of energy, because you have to tell Kubernetes to roll out and then monitor what is happening there: did it spin up, did it roll out. And then there is also a bunch of tasks around deployment, cleanup, and building.
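A minimal sketch of the kind of check you end up scripting around Helm yourself (the deployment and label names are illustrative):

    # block until the rollout finishes, or fail after a timeout
    kubectl rollout status deployment/myapp --timeout=120s
    # sanity-check that the new pods are actually running
    kubectl get pods -l app=myapp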
Plans
This year we will go for local development. We want to get to what Vagrant used to give: you typed "vagrant up" and it brought up virtual machines. We want to get to a state where the project lives in Git, we type "werf up", and it brings up a local copy of the project, deployed in a local mini-Kubernetes, with all the directories convenient for development mounted in. Depending on the development language this is done in different ways, but the point is to make local development against mounted files convenient.
The next step for us is to invest heavily in developer convenience: to quickly deploy a project locally with one tool, develop it, push to Git, and have it roll out to staging or tests, depending on the pipelines, and then go to production with the same tool. This unity, unification and reproducibility of the infrastructure, from the local environment all the way to production, is a very important point for us. But this is not in werf yet; we are only planning to do it.
But the path with dapp/werf has always been the same as with Kubernetes at the beginning. We ran into problems and solved them with workarounds: we came up with solutions of some kind for ourselves, in shell, in anything. Then we tried to straighten those workarounds out, to generalize them and consolidate them, in this case into binaries, which we simply share.
There is another view of this whole story, with analogies.
Kubernetes is a car frame with an engine. There are no doors, no windows, no radio, not even the little Christmas tree on the mirror: nothing at all. Only the frame and the engine. And there is Helm: that is the steering wheel. Cool, there is a steering wheel, but you also need a steering pin, a steering rack, a gearbox and wheels, and without them you get nowhere.
In the case of werf, this is one more component for Kubernetes. Only now, in our alpha version of werf, for example, Helm is compiled right into werf, because we got tired of doing that part ourselves. There are many reasons to do this; I will tell in detail why we compiled Helm as a whole, together with Tiller, inside werf in my talk at RIT++.
Now werf is a better integrated component. We get a ready-made steering wheel plus steering pin (I am not great with cars, but in short, a large block that already solves a fairly wide range of tasks). We do not need to go through the catalog ourselves, matching one part to another and figuring out how to bolt them together. We get a ready-made combine that solves a large bundle of tasks at once. But inside it is built from the same open source components: it still uses Docker for building, Helm for part of the functionality, and several other libraries. It is an integrated tool for getting cool CI/CD out of the box quickly and conveniently.
Is Kubernetes difficult to maintain?
- You talk about the experience of starting to use Kubernetes: for you it is a frame with an engine, onto which you can hang lots of different things: a body, a steering wheel, pedals, seats. The question is: how difficult is Kubernetes support for you? You have rich experience; how much time and how many resources does supporting Kubernetes, separately from everything else, take you?
Dmitry: This is a very difficult question, and to answer it we need to understand what support is and what we want from Kubernetes. Maybe you can elaborate?
- As far as I know and as I see it, many teams now want to try Kubernetes. Everyone is rushing into it, standing it up on their knees. I have a feeling that people do not always understand the complexity of this system.
Dmitry: Exactly.
- How difficult is it to take Kubernetes and stand it up from nothing so that it is production ready?
Dmitry: How difficult do you think a heart transplant is? I understand, it is a loaded question. Handling a scalpel without making a mistake is not that difficult. If you are told where to cut and where to stitch, the procedure itself is simple. What is difficult is guaranteeing, time after time, that everything will work out.
Installing Kubernetes and making it run is simple: click, done, there are a bunch of installation methods. But what happens when problems arise?
Questions always come up: what have we not taken into account yet? What have we not done yet? Which Linux kernel parameters did we set incorrectly? Lord, did we even set them?! Which Kubernetes components did we install and which did we not? Thousands of questions arise, and to answer them you need to have stewed in this industry for 15-20 years.
I have a fresh example on this topic that may convey the essence of the "Is Kubernetes difficult to maintain?" problem. Some time ago we seriously considered whether we should try to deploy Cilium as the network layer in Kubernetes.
Let me explain what Cilium is. Kubernetes has many different implementations of the network subsystem, and one of them is very cool: Cilium. What is its essence? Some time ago it became possible to write hooks for the kernel that reach into the network subsystem and various other subsystems and let you bypass large chunks of the kernel.
The Linux kernel historically has ip route, netfilter, bridges, and many different old components that are 15, 20, 30 years old. In general they work and everything is cool, but now we have stacked containers on containers, and it looks like a tower of 15 bricks one on top of another with you standing on it on one leg: a strange sensation. This system evolved historically, with many nuances, like an appendix in a body. In some situations there are performance problems, for example.
Then there is the wonderful eBPF, the ability to write hooks for the kernel, and the guys wrote their own kernel hooks. A packet arrives at the Linux kernel, they pull it out right at the entrance, process it themselves without bridges, without TCP, without the IP stack, in short, bypassing everything written in the Linux kernel, and spit it straight out into the container.
What do we get? Very cool performance, cool features: just great! But then we look at it and see that on every machine there is a program that connects to the Kubernetes API and, based on the data it receives from that API, generates C code and compiles binaries that it loads into the kernel so that these hooks work in kernel space.
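If you want to look at this layer on a live node, bpftool (shipped with the kernel's own tools) can show what a CNI like Cilium has loaded; this is just an inspection sketch, and the output varies per setup:

    # list the eBPF programs currently loaded into the kernel
    bpftool prog show
    # show which programs are attached to network devices (tc/XDP hooks)
    bpftool net show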
What happens if something goes wrong? We do not know. To understand it, you would need to read all that code, understand all the logic, and it is stunning how difficult that is. But on the other hand, there are those bridges, netfilter, ip route: I have not read their sources either, and neither have the 40 engineers who work in our company. Maybe a few of them understand some pieces.
So what is the difference? There is ip route and the Linux kernel, and there is the new tool; we understand neither the one nor the other. But we are afraid to use the new one. Why? Because if a tool is 30 years old, then over those 30 years all the bugs have been found, all the rakes have been stepped on, and you do not need to know everything: it works like a black box, and it always works. Everyone knows which diagnostic screwdriver to stick where, which tcpdump to run at which moment. Everyone knows the diagnostic utilities well and understands how this set of components works in the Linux kernel: not how it works inside, but how to use it.
And the awesomely cool Cilium is not 30 years old; it has not matured yet. Kubernetes has the same problem, a copy of it. Cilium installs perfectly, Kubernetes installs perfectly, but when something goes wrong in production, are you able to quickly understand, in a critical situation, what went wrong?
- Are there companies where these nuances are almost guaranteed to show up? Suppose Yandex suddenly moves all its services to Kubernetes without exception: that would be a wow of a load.
Dmitry: No, this is not a conversation about load but about the simplest things. For example, we have Kubernetes and we deployed an application into it. How do we know that it works? There is simply no ready-made tool to understand that the application is not crashing. There is no ready-made system that sends alerts; you have to configure those alerts and every dashboard yourself.
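For a taste of the kind of basic check nothing configures for you out of the box (namespaces and labels are illustrative):

    # which pods are not in the Running phase right now?
    kubectl get pods --all-namespaces | grep -v Running
    # which containers restart the most? a crash-looping app surfaces here
    kubectl get pods --all-namespaces --sort-by='.status.containerStatuses[0].restartCount'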
And then there is updating Kubernetes and the environment underneath it. Take Ubuntu 16.04. You could say it is an old version, but we are still on it because it is LTS. It has systemd, whose nuance is that it does not clean up cgroups. Kubernetes launches pods and creates cgroups, then deletes the pods, and somehow it turns out (I do not remember the details, sorry) that the systemd slices remain. Over time this makes any machine slow down considerably. This is not even a question of highload. If pods are started constantly, for example by a CronJob that keeps spawning them, then a machine running Ubuntu 16.04 will start to slow down within a week. There will be a constantly high load average because a pile of cgroups has accumulated. This is a problem that anyone who simply installs Ubuntu 16.04 with Kubernetes on top will run into.
Suppose they update systemd somehow or otherwise work around it, but in Linux kernels up to 4.16 it is even funnier: when you delete cgroups, they leak in the kernel and are not actually deleted. Therefore, after a month of work on such a machine, it becomes impossible to look at memory statistics for the pods. We pull out a file, roll it through a program, and one file takes 15 seconds to process, because the kernel takes a very long time internally to count a million cgroups that seem to be deleted but are in fact leaking.
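A rough way to see whether a node is accumulating cgroups (paths vary by distro and cgroup version; this is an illustrative check, not one of our tools):

    # per-controller cgroup counts: a num_cgroups column climbing without bound hints at a leak
    cat /proc/cgroups
    # or count the memory cgroups directly
    find /sys/fs/cgroup/memory -type d | wc -l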
There are a lot of such little things, here and there. This is not a question of what giant companies sometimes run into under very high load; no, it is a matter of everyday things. People can live like this for months: they installed Kubernetes, deployed an application, and it seems to work. For many this is normal. They will not even know that when that application crashes for some reason, no alert will come; for them that is the norm. We used to live on virtual machines without monitoring, now we have moved to Kubernetes, also without monitoring: what is the difference?
The point is that when we walk on ice, we never know its thickness unless we have measured it in advance. Many walk and do not worry, because they have walked before.
In IT, it seems to me, there are too many "I'll get lucky" approaches. Many people install software and use software libraries in the hope that they will be lucky. And in general, many do get lucky. That is probably why it works.
- My pessimistic assessment looks like this: when the risks are high and the application must work, you need support from Flant, possibly from Red Hat, or you need your own internal team dedicated specifically to Kubernetes that is ready to carry it.
Dmitry: Objectively, that is so. Getting into the Kubernetes story on your own with a small team carries a certain amount of risk.
- Can you tell me how widespread Kubernetes actually is in Russia?
Dmitry: I do not have that data, and I am not sure anyone has it at all. We say "Kubernetes, Kubernetes", but there is another way to look at this question. I do not know how widespread containers are, but I know a figure from reports on the internet: 70% of containers are orchestrated by Kubernetes. It was a credible source with a fairly large worldwide sample.
Then there is another question: do we need containers at all? My personal feeling, and Flant's overall position, is that Kubernetes is the de facto standard.
It is an absolute game changer in the field of infrastructure management. Just absolute: that is it, no more Ansible, Chef, virtual machines, Terraform. I am not even talking about the old collective-farm methods. Kubernetes is an absolute changer, and that is how it is going to be.
It is clear that some will need a couple of years, and some a couple of decades, to realize this. I have no doubt that there will be nothing but Kubernetes and this new way of looking at things: we no longer poke at the OS, we use infrastructure as code, only not with code but with YAML, a declaratively described infrastructure. I have a feeling it will always be this way.
- So the companies that have not yet switched to Kubernetes will either definitely move to it or sink into oblivion. Did I understand you correctly?
Dmitry: That is also not entirely true. For example, if our task is to run a DNS server, it can run on FreeBSD 4.10 and work fine for 20 years. Just work, and that is it. Maybe once in those 20 years something will need updating. If we are talking about software in the format of "launched it and it really works for many years" without any updates, without changes, then of course there will be no Kubernetes there. It is not needed there.
- Here I have a bit of a dissonance. To work with Kubernetes you need external or internal support; that is the first point. The second: when we are just starting development, we are a small startup, we do not have anything yet, and developing for Kubernetes, or even for a microservice architecture, can be complicated and not always economically justified. I am interested in your opinion: do startups need to start writing for Kubernetes from scratch right away, or can they still write a monolith and only then come to Kubernetes?
Dmitry: A tough question. I have a talk about microservices, "Microservices: size matters". Many times I have come across people trying to hammer nails with a microscope. The approach itself is correct; we design our internal software this way. But when you do this, you need to clearly understand what you are doing. The word I hate most about microservices is "micro". Historically that word ended up there, and for some reason people think micro means very small, less than a millimeter, like a micrometer. That is wrong.
For example, there is a monolith written by 300 people, and everyone who participated in its development understands there are problems in it and that it should be broken up into micro-pieces: 10 or so pieces, each written by 30 people at a minimum. That is important, necessary and cool. But when a startup comes to us where 3 very cool and talented guys have written 60 microservices on their knees, every time I reach for the Corvalol.
It seems this has been talked about thousands of times already: they got a distributed monolith in one form or another. It is not economically justified, and it is very hard overall, in everything. I have simply seen this so many times that it genuinely hurts, so I keep talking about it.
On the initial question: there is a conflict between the fact that, on one hand, Kubernetes is scary to use, because it is unclear what may break in it or fail to work, while on the other hand it is clear that everything is heading there and nothing but Kubernetes will exist. The answer is to weigh the amount of benefit it brings, the number of tasks you can solve. That is one side of the scales. On the other side are the risks associated with downtime or with degraded response time and availability level: with degraded performance indicators.
Here is the choice: either we move fast, and Kubernetes lets us do many things much faster and better, or we use reliable, time-tested solutions but move much more slowly. Every company has to make this choice. You can think of it as a path through the jungle: the first time you walk it you may meet a snake, a tiger or a rabid badger, but after you have walked it 10 times you have trodden the path, cleared the branches, and the going is easier. The path gets wider each time. Then it becomes an asphalt road, and later a beautiful boulevard.
And Kubernetes does not stand still. Again the question: Kubernetes is, on one hand, 4-5 binaries, and on the other, the whole ecosystem. It is the OS we have on our machines. What is that? Ubuntu or CentOS? It is the Linux kernel plus a bunch of additional components. All these things: here a poisonous snake has been thrown off the road, there a fence has been put up. Kubernetes is developing very quickly and dynamically, and the volume of risks, the volume of the unknown, decreases every month, and the scales rebalance accordingly.
Answering the question of what a startup should do, I would say: come to Flant, pay 150 thousand rubles, and get turnkey DevOps as an easy service. If you are a small startup with a few developers, this works. Instead of hiring your own DevOps engineer, who will have to learn to solve your problems while you pay a salary for that time, you get a turnkey solution to all your issues. Yes, there are downsides. As an outsourcer, we cannot be as involved or respond to changes as quickly. But we have a lot of expertise and ready-made practices. We guarantee that in any situation we will figure it out quickly and raise any Kubernetes from the dead.
- Can a hosted solution from Amazon or Google be considered outsourcing?
Dmitry: Yes, of course, that solves a number of issues. But again, there are nuances. You still need to understand how to use it. For example, there are a thousand little things in how Amazon AWS works: you may need to warm up the Load Balancer, or write a request in advance saying "guys, we are about to get a wave of traffic, warm up the Load Balancer for us!" You have to know these nuances.
When you turn to people who specialize in this, you get almost all the typical things covered. We now have 40 engineers; by the end of the year there will probably be 60. We have definitely run into all of these things. Even if we run into a problem once more on some project, we quickly ask each other and know how to solve it.
Perhaps the answer is this: of course, the hosted story makes part of it easier. The question is whether you are ready to trust those hosters, and whether they will solve your problems. Amazon and Google have proven themselves. For all of our cases, for sure. We have no other positive experiences. All the other clouds we have tried to work with create a lot of problems: Azure, everything there is in Russia, and all kinds of OpenStack in various implementations (Headster, Overage, whatever you like). They all create problems that you do not want to be solving.
So the answer is yes, but, in truth, there are not many mature hosted solutions.
- Still, who needs Kubernetes? Who should already be moving to Kubernetes? Who is the typical Flant client that comes for Kubernetes?
Dmitry: This is an interesting question, because right now, on the Kubernetes wave, a lot of people come to us: "Guys, we know you do Kubernetes, do it for us!" We answer them: "Gentlemen, we do not do Kubernetes, we do production and everything connected with it." Because making production without doing the whole CI/CD story around it is simply impossible at the moment. Everyone has moved away from the old separation where development is development, and then operations is operations.
Our clients expect different things, but everyone expects some kind of miracle: they have certain problems, and now, hop! Kubernetes will solve them. People believe in miracles. Rationally they understand there will be no miracle, but in their hearts they hope: what if this Kubernetes now solves everything for us, people talk about it so much! Suddenly, achoo! and there is a silver bullet; achoo! and we have 100% uptime, all the developers can roll out to production 50 times a day, and nothing falls over. In general, a miracle!
When such people come to us, we say: "Sorry, but there are no miracles." To be healthy, you need to eat well and exercise. To have a reliable product, it has to be built reliably. To have convenient CI/CD, you have to build it that way. It is a lot of work that has to be done.
Some people have the mistaken feeling that they need Kubernetes. What people actually have is a deep, urgent need to stop thinking about, dealing with, and taking an interest in all the problems of infrastructure and of running their applications. They want applications to just work and just deploy. For them, Kubernetes is the hope that they will no longer hear the story of "we were down", or "we cannot roll out", or anything else like that.
Usually it is a technical director who comes to us. Two things are demanded of him: on one hand, give us features; on the other, stability. We offer to take this on ourselves and get it done. The silver bullet, or rather the silver-plated one, is that you stop thinking about these problems and wasting time on them. You will have dedicated people who take care of this question.
And admins simply want Kubernetes because it is a very interesting toy to play with and dig into. Let us be honest: everyone loves toys. We are all children somewhere inside, and when we see a new one, we want to play with it. For some this has been knocked out of them, for example in administration, because they have already played enough and are tired of it to the point of simply not wanting any more. But it is never knocked out of anyone completely. For example, even though I tired long ago of the toys in system administration and DevOps, I still love toys and still buy new ones of some kind. One way or another, all people still want toys of some sort.
Just do not play with production. What I would categorically not recommend, and what I now see happening en masse: "Oh, a new toy!" They run off to buy it, buy it, and: "Let's take it to school right now and show all our friends." Do not do that. I apologize; it is just that my children are growing up, I constantly see things in children, I notice them in myself, and then I generalize to others.
The bottom line is that there are two real needs: reliability, and dynamism and flexibility of rollouts. Anyone who is doing IT projects right now, regardless of what business the software serves, and who understands this, needs these two needs met. Kubernetes, with the right approach, the right understanding, and sufficient experience, allows them to be solved.
- If you look a bit further into the future, then in trying to solve the problem of not having infrastructure headaches, of rollout speed and application change speed, new solutions appear, for example serverless. Do you feel any potential in that direction and, let us say, any danger to Kubernetes and similar solutions?
Dmitry: Here I need to note again that I am not a visionary who looks ahead and says "it will be like this!" Although I have just been doing exactly that. I look down at my feet and see a pile of problems there, for example, how transistors work in a computer. Funny, right? And yet we run into certain bugs in CPUs.
First you have to make serverless reliable enough, cheap, efficient and convenient, and resolve all the ecosystem issues around it. In that respect I agree with Elon Musk that we need a second planet to give humanity fault tolerance. Although I do not know exactly what he says, I understand that I am not ready to fly to Mars myself, and that it will not happen tomorrow.
It is clear that serverless is the ideologically correct thing, just as fault tolerance for humanity is: two planets are better than one. But how do you do it now? Sending one expedition is not a problem if you concentrate your efforts on it. Sending several expeditions and settling several thousand people there is, I think, also realistic. But actually making it fault tolerant, so that half of humanity lives there, seems to me impossible right now, not even up for consideration.
It is the same one-to-one with serverless: the thing is cool, but it is far from the problems of 2019. Closer to 2030, let us live to see it. I have no doubt we will live to see it, we will certainly live (repeat it before bedtime), but right now we need to solve other problems. It is like believing in the fairy-tale pony Rainbow. Yes, a couple percent of cases are solved, and solved perfectly, but subjectively serverless is a rainbow... For me this topic is too far away and too incomprehensible. I am not ready to talk about it. In 2019 you cannot write a single real application with serverless.
- As we move toward this potentially beautiful, distant future, how do you think Kubernetes and the ecosystem around it will develop?
Dmitry: I have thought a lot about this, and I have a clear answer. The first thing is stateful: stateless is still easier to do. Kubernetes initially invested more into stateless; it all started there. Stateless works almost perfectly in Kubernetes; there is simply nothing to complain about. With stateful there are still a lot of problems, or rather nuances. Everything already works fine for us there, but we are us. For it to work for everyone, at least a couple more years are needed. That is not a calculated metric, but my gut feeling.
In short, stateful needs to develop, and will develop, very strongly, because all our applications have state; there are no stateless applications. That is an illusion; you always need some kind of database and something else. Stateful means straightening out everything that can be straightened, fixing all the bugs, improving on all the problems currently being hit; let us call it adoption.
The level of the unknown, the level of unresolved problems, the probability of running into something, will all drop sharply. That is an important story. The second is operators: everything related to codifying administration logic and control logic in order to get an easy service: MySQL as an easy service, RabbitMQ as an easy service, Memcached as an easy service; in general, all those components that we need to be guaranteed to work out of the box. This solves exactly the pain of wanting a database but not wanting to administer it, or wanting Kubernetes but not wanting to administer it.
This story of operator development, in one form or another, will be important for the next couple of years.
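The operator pattern in one gesture looks roughly like this: you declare the service, and an operator watching that resource kind does the administration. The API group, kind and fields below are purely illustrative, not any specific product:

    # declare a database; the operator creates, runs and heals it
    cat <<'EOF' | kubectl apply -f -
    apiVersion: example.com/v1
    kind: MySQL
    metadata:
      name: main-db
    spec:
      replicas: 2
      version: "5.7"
    EOF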
Recently I listened on YouTube to an old Isaac Asimov interview from the 1980s on Saturday Night Live, a show like Urgant's, only interesting. He was asked about the future of computers. He said that the future lies in simplicity, as it did with the radio. The radio receiver was originally a complicated thing. To catch a wave you had to twist knobs and fiddle with dials for 15 minutes, and generally know how everything works and understand the physics of radio wave transmission. In the end, only one knob remained on the radio.
And now, in 2019, what is a radio? In the car, the radio finds all the stations and their names by itself. The physics of the process has not changed in 100 years; the ease of use has. Now, and not only now, already in 1980, when that Asimov interview took place, everyone used the radio and no one thought about how it was built. It always worked; that is a given.
Asimov said then that it would be similar with computers: ease of use would increase. If in 1980 you needed special training to press buttons on a computer, in the future this would not be so.
I have the feeling that with Kubernetes, and with infrastructure in general, ease of use will also increase dramatically. That, in my opinion, is obvious; it lies right on the surface.
- And what will happen then to the engineers and system administrators who support Kubernetes?
Dmitry: What happened to accountants after the appearance of 1C? About the same. Before it, they counted on paper; now they do it in a program. Labor productivity has increased by orders of magnitude, yet the labor itself has not disappeared. If it used to take 10 engineers to screw in a light bulb, now one is enough.
The amount of software and the number of tasks, it seems to me, are now growing faster than new DevOps engineers appear and efficiency rises. There is a specific shortage in the market, and it will last a long time. Later everything will settle into some kind of norm where work efficiency grows: things will become more serverless, a neural network will get bolted onto Kubernetes that picks all the resources exactly right, and in general everything will do itself as it should; humans will just step away and not interfere.
But someone will still have to make decisions. Clearly, the qualification level and specialization of that person will be higher. These days the accounting department does not need 10 employees keeping the books so that their hands do not get tired. That is simply not necessary. Many documents are scanned and recognized automatically by the document management system. One smart chief accountant is enough, with far larger skills and a good understanding.
In general, that is the path in all industries. It is the same with cars: a car used to come with a mechanic and three drivers. Now driving a car is the simplest of processes, one we all take part in every day. Nobody thinks of a car as something complicated.
- I have also heard an interesting idea that the amount of work will actually grow.
Dmitry: Of course, one hundred percent! Because the amount of software we write is constantly growing. The number of issues we solve with software is constantly growing. The amount of work is growing. Right now the DevOps market is terribly overheated. You can see it in salary expectations. In a healthy market, without going into details, there would be juniors who want X, middles who want 1.5X, and seniors who want 2X. And now, if you look at the Moscow DevOps salary market, a junior wants anywhere from X to 3X, and a senior also wants from X to 3X.
Of course, this situation will change quite soon: some saturation should come. With software development it is not like that: even though everyone needs developers, and everyone needs good developers, the market understands what they cost; the industry has settled. That is not yet the case with DevOps.
- From what I have heard, I conclude that the current system administrator should not worry too much, but that it is time to level up skills and prepare for more work tomorrow, work that will, however, be more highly qualified.
Dmitry: Absolutely. In general, we live in 2019, and the rule of life is lifelong learning: we learn all our lives. It seems to me that everyone knows and feels this by now, but knowing is not enough; you have to do it. Every day we have to change. If we do not, sooner or later we will be dropped off on the sidelines of the profession.
Be ready for sharp 180-degree turns. I do not rule out a situation where something changes dramatically and something new is invented; it happens. Hop! And now we do things differently. It is important to be ready for this and not to stress over it. It may turn out that tomorrow everything I do becomes unnecessary; fine, I have studied all my life and am ready to learn something else. It is not a problem. You should not fear for job security, but you do need to be ready to constantly learn something new.
- Do you have any parting wishes?
Dmitry: Yes, I have a few wishes.
The first, a self-serving one: subscribe on YouTube. Dear readers, go to YouTube and subscribe to our channel. In about a month we will begin an active expansion into the video service. We will have a lot of educational content about Kubernetes, open and varied: from practical things, all the way to labs, to deep fundamental theoretical things, and on how to apply Kubernetes at the level of principles and patterns.
The second self-serving wish: go to GitHub and put stars on our projects, because we feed on them. If you do not give us stars, we will have nothing to eat. It is like mana in a computer game. We do something, keep at it, keep trying; someone says these are terrible bicycles, someone says everything is generally wrong, and we carry on and act absolutely honestly. We see a problem, solve it, and share our experience. So give us a star: it will not make you poorer, and it will come to us, because we eat them.
The third wish, an important and no longer self-serving one: stop believing in fairy tales. You are professionals. DevOps is a very serious and responsible profession. Stop playing around in the workplace. "I'll just poke it and figure it out" - imagine coming to a hospital where the doctor experiments on you. I understand this may offend someone, but most likely it is not about you, it is about someone else. Tell the others to stop too. This really spoils life for all of us: many people start treating operations, admins, and DevOps as the dudes who broke something again. And it was "broken" most often because we went off to play and did not look, with a cold mind, at whether it is this way or that.
This does not mean that you should not experiment. You should experiment; we do it ourselves. To be honest, we also play sometimes; that is very bad, of course, but nothing human is alien to us. Let us declare 2019 a year of serious, thoughtful experiments rather than games on production. Probably so.
- Many thanks!
Dmitry: Thank you, Vitaly, both for your time and for the interview. Dear readers, thank you very much if you have read this far. I hope we have brought you at least a couple of ideas.
original interview as a podcast on DevOps Deflop, a Russian-language podcast about DevOps, and below is a text version.

Hereinafter, questions are asked by Vitaliy Khabarov engineer from Express42.
About Flant
- Hello Dima. You are the technical director of Flant and also its founder. Tell me, please, what is the company doing and are you in it?

As a technical director and one of the founders of the company, I work around the clock to think of ways to increase the availability of production, simplify its operation, make life easier for admins, and make life more enjoyable for developers.
About Kubernetes
- The last time from "Flanta" I see a lot of reports and articles about Kubernetes. How did you come to him?
Dmitry : I have already talked about this many times, but I am not at all sorry to repeat it. I believe that it is correct to repeat this topic, because there is confusion between cause and effect.
We really needed a tool. We faced a lot of problems, struggled, overcome them with different crutches and felt the need for an instrument. Sifted through many different options, built their bicycles, gained experience. Gradually we got to the point that we started using Docker almost as soon as it appeared - around 2013. At the time of its appearance, we already had a lot of experience with containers, we already wrote an analogue of “Docker” - some of our crutches in Python. With the advent of Docker, crutches can be thrown out and used by a reliable and community-supported solution.
With Kubernetes, the story is similar. By the time he started to gain momentum - for us this is version 1.2 - we already had a bunch of crutches on both Shell and Chef, which we somehow tried to orchestrate Docker. We seriously looked towards Rancher and various other solutions, but then Kubernetes appeared, in which everything is implemented exactly as we would or even better. There is nothing to complain about.
Yes, there is some kind of imperfection, there is some kind of imperfection - there are a lot of imperfections, and 1.2 is generally horrible, but .... Kubernetes is like a building under construction - you look at the project and understand that it will be cool. If the building now has a foundation and two floors, then you understand that it’s better not to populate yet, but there are no such problems with software - you can already use it.
We did not have the moment that we thought to use Kubernetes or not. We were waiting for him long before he appeared, and tried to make analogs ourselves.
Near Kubernetes
- Do you participate directly in the development of Kubernetes itself?
Dmitry : Mediocre. We are more likely to participate in the development of the ecosystem. We send a certain amount of pull requests: to Prometheus, to all kinds of operators, to Helm to the ecosystem. Unfortunately, I am not able to follow everything that we do and I can be mistaken, but there is not a single pool from our core.
- Do you develop a lot of your tools around Kubernetes?
Dmitry : The strategy is this: we go and pull-pull into everything that is already there. If pull requests are not accepted there, we simply fork them for ourselves and live until they are accepted with our builds. Then, when it reaches upstream, we return back to the upstream version.
For example, we have a Prometheus operator with which we switched back and forth to the upstream of our assembly 5 times already, probably. We need some feature, we sent a pull request, we need to roll it out tomorrow, and we do not want to wait until it is released in upstream. Accordingly, we collect for ourselves, roll our assembly with our feature, which for some reason we need, to all our clusters. Then, for example, they wrap us upstream with the words: “Guys, let's do it for a more general case”, we, or someone else, will finish this, and eventually merge again again.
Everything that exists, we are trying to develop. Many elements that are not yet there have not yet been invented or have been invented, but have not yet been implemented - we are doing it. And not because we like the process itself or bicycle building as an industry, but simply because we need this tool. They often ask the question, why did we do this or that thing? The answer is simple - because we had to go further, solve some practical problem, and we solved it with this tool.
The way is always this: we look very carefully and, if we do not find any solution, how to make a trolley bus out of a loaf of bread, then we make our own loaf and our own trolley bus.
Flant Tools
“I know that Flant now has addon operators, shell operators, dapp / werf tools. As I understand it, this is the same tool in different incarnations. I also understand that there are many more different tools inside Flant. This is true?
Dmitry : We still have a lot of things on GitHub. From what I’ll remember now, we have a statusmap - a panel for Grafana that has gone to everyone. It is mentioned in almost every second article about monitoring Kubernetes on the Medium. It is impossible to briefly describe what statusmap is - you need a separate article, but this is a very useful thing for monitoring status over time, since in Kubernetes we often need to show status over time. We also have LogHouse - this is a ClickHouse and black magic-based piece for collecting logs in Kubernetes.
Many utilities! And there will be even more, because a number of internal solutions will be released this year. Of the very large addon-based operators, there are a bunch of addons to Kubernetes, but how to install the sert manager correctly - a certificate management tool, how to install Prometheus with a bunch of dodges correctly - these are twenty different binaries that export data and collect something, for this Prometheus is awesome graphics and alerts. All this is just a bunch of addons to Kubernetes that are put into a cluster, and it turns from simple to cool, sophisticated, automatic, in which many issues have already been resolved. Yes, we do a lot.
Ecosystem development
- It seems to me that this is a very big contribution to the development of this tool and its methods of use. Can you roughly figure out who else would make the same contribution to the development of the ecosystem?
Dmitry : In Russia, none of those companies that operate in our market is close . Of course, this is a high-profile statement, because there are large players like Mail with Yandex - they also do something with Kubernetes, but even they have not matched the contribution of companies in the world, which are doing much more than we do. It is difficult to compare Flant with a staff of 80 people and Red Hat, in which there are only 300 Kubernetes engineers, if I am not mistaken. It’s hard to compare. We have 6 people in the RnD department, including me, who are sawing all our tools. 6 people against 300 Red Hat engineers - it’s somehow difficult to compare.
- Nevertheless, when even these 6 people can do something really useful and alienated, when they are faced with a practical task and give a decision to the community - an interesting case. I understand that in large technology companies, where they have their own development and Kubernetes support team, in principle, the same tools can be developed. This is an example for them that can be developed and given to the community, to give an impetus to the entire community that uses Kubernetes.
Dmitriy: Probably, this is an integrator chip, its feature. We have many projects and we see many different situations. For us, the main way to create added value is to analyze these cases, find the common ones and make them as cheap as possible for us. We are actively doing this. It's hard for me to talk about Russia and the world, but we have about 40 DevOps engineers at Kubernetes. I don’t think that in Russia there are many companies with a comparable number of specialists who understand Kubernetes, if any.
I understand everything about the job title DevOps engineer, everyone understands everything and are used to calling DevOps engineers DevOps engineers, we will not discuss this. All these 40 wonderful DevOps engineers face problems every day and solve them, we just analyze this experience and try to summarize. We understand that if it remains with us, then in a year or two the tool is useless, because somewhere in the community a ready-made tool will appear. It makes no sense to accumulate this experience inside - it's just a drain of time and energy in dev / null. And so we are not at all sorry. It is with great pleasure that we publish everything and understand that we need to publish, develop, promote, promote it, so that people can use and add their own experience - then everything grows and lives. Then, after two years, the tool does not go to the trash. It’s not a pity to continue to pour energy, because it’s clear
This is part of our big strategy with dapp / werf . I don’t remember when we started doing it, it seems, about 3 years ago. Initially, it was generally on the shell. It was a super proof of concept, we solved some of our private tasks - it turned out! But there are problems with the shell, it’s impossible to build it further, programming on the shell is something else. We had a habit of writing in Ruby, respectively, in Ruby we redid something, developed, developed, developed, and rested on the fact that the community, a crowd that does not say “we want or do not want”, turns up Ruby’s nose, it’s not funny. We realized that we should write all this stuff on Go, just to correspond to the first paragraph in the checklist: DevOps-tool should be a static binary . On Go or not Go is not so important, but a static binary written in Go is better.
Spent effort, rewrote dapp on Go and named it werf. Dapp is no longer supported, does not develop, it works in some latest version, but there is an absolute upgrade path to the top, and you can follow it.
Why was dapp created?
- Can you tell us briefly why dapp was created, what problems does it solve?
Dmitry : The first reason in the assembly. Initially, we had strong build problems when Docker did not know how to multi-stage, and we did multi-stage on our own. Then we had a bunch of questions with cleaning image. Everyone who makes CI / CD, sooner rather than later, faces the problem that there is a bunch of collected images, you need to somehow clean up what is not needed and leave what is needed.
The second reason is in the deploy. Yes, there is Helm, but it solves only part of the problems. Ironically, it is written that "Helm - the Package Manager for Kubernetes". Namely that "the". There are also the words “Package Manager” - what is the usual expectation from the Package Manager? We say: “Package Manager - deliver the package!” and expect him to tell us: "The package has been delivered."
It is interesting that we say: “Helm, put the package”, and when he answers that he has installed it, it turns out that he just started the installation - Kubernetes pointed out: “Run this thing!”, But it started up or not, it works or not Helm does not solve this problem at all.
It turns out that Helm is just a text preprocessor that loads data into Kubernetes.
But we want to know in the framework of any deployment - the application has rolled out to prod or not? Rolled out to the prod means that the application went there, a new version has deployed, and it has at least not crashed and responds correctly. Helm does not solve this problem. To solve it, you need to spend a lot of energy, because you need to give Kubernetes the command to roll out and monitor what happens there - whether it turned around, whether it rolled out. And there are also a bunch of tasks related to deployment, cleaning, and assembly.
Plans
This year we will go to local development. We want to come to what used to be in Vagrant - we typed “vagrant up” and we deployed virtual machines. We want to come to such a state that there is a project in Git, we write “werf up” there, and it raises a local copy of this project, deployed in a local mini-Kub, with all the directories convenient for development connected to. Depending on the development language, this is done in different ways, but, nevertheless, so that it is convenient to conduct local development under the mounted files.
The next step is to invest heavily in developer convenience: deploy a project locally with one tool, develop it, push to Git, and have it roll out to stage or to tests, depending on the pipelines, and then go to prod with the same tool. This unity, unification and reproducibility of the infrastructure, from the local environment all the way to production, is a very important point for us. But this is not in werf yet; we are only planning to do it.
But the path with dapp / werf has always been the same as with Kubernetes at the start. We ran into problems and solved them with workarounds: we came up with solutions for ourselves in shell, in whatever was at hand. Then we tried to straighten those workarounds out, generalize them and consolidate them, in this case into binaries, which we simply share.
There is another way to look at this whole story, with analogies.
Kubernetes is a car frame with an engine. There are no doors, no windows, no radio, no Christmas-tree air freshener - nothing at all. Only the frame and the engine. And there is Helm - that is the steering wheel. Cool, there is a steering wheel, but you also need a steering pin, a steering rack, a gearbox and wheels, and without them you go nowhere.
In the case of werf, it is one more component for Kubernetes. Except that in our alpha version of werf, for example, Helm is now compiled into werf itself, because we got tired of doing that part by hand. There are many reasons for this; I will tell in detail why we compiled helm together with tiller inside werf in my talk at RIT++.
Now werf is a more integrated component. We get a ready-made steering wheel plus steering pin - I am not great with cars, but say, a large assembly that already solves a fairly wide range of tasks. We do not have to dig through the parts catalog ourselves, pick one part to match another, or figure out how to fasten them together. We get a ready-made unit that solves a large bundle of tasks at once. But inside it is built from the same open source components: it uses Docker for the build, Helm for part of the functionality, and there are several other libraries. It is an integrated tool for getting cool CI / CD out of the box quickly and conveniently.
Is Kubernetes difficult to maintain?
- You talk about your experience: for you Kubernetes is a frame and an engine, and you can hang a lot of different things on it - a body, a steering wheel, pedals, seats. The question is: how hard is Kubernetes support for you? You have rich experience; how much time and resources does supporting Kubernetes, separately from everything else, take you?
Dmitry: This is a very difficult question, and to answer it we need to understand what support is and what we want from Kubernetes. Maybe you can elaborate?
- As far as I know and as I see it, many teams now want to try Kubernetes. Everyone has harnessed themselves to it, put it together on their knee. I have the feeling that people do not always understand the complexity of this system.
Dmitry : That's it.
- How hard is it to take Kubernetes, install it from nothing, and make it production ready?
Dmitry: How hard do you think a heart transplant is? I understand, it is a loaded comparison. Wielding a scalpel without making a mistake is not that hard. If you are told where to cut and where to sew, the procedure itself is simple. What is hard is guaranteeing, time after time, that everything will work out.
Installing Kubernetes and making it run is simple: poof! - installed; there are a bunch of installation methods. But what happens when problems arise?
Questions always come up: what have we not accounted for yet? What have we not done yet? Which Linux kernel parameters did we set incorrectly? Lord, did we even set them?! Which Kubernetes components did we install and which did we not? Thousands of questions arise, and to answer them you need to have stewed in this industry for 15-20 years.
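A tiny taste of the “did we even set them?!” class of questions - two kernel parameters that pod networking commonly depends on (these are standard kubeadm prerequisites; treat the snippet as a sketch and check your distribution's defaults):

```
# Both should usually be 1 on a Kubernetes node; on a fresh machine
# they may well not be.
sysctl net.ipv4.ip_forward
sysctl net.bridge.bridge-nf-call-iptables   # requires the br_netfilter module

# One common way to persist them:
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system
```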
I have a fresh example on this topic that may capture what the “is Kubernetes difficult to maintain?” problem means. Some time ago we seriously considered whether we should try to adopt Cilium as the network layer in Kubernetes.
Let me explain what Cilium is. Kubernetes has many different implementations of the network subsystem, and one of them is very cool - Cilium. What is its point? Some time ago it became possible to write hooks for the kernel that reach into the network subsystem and various other subsystems and let you bypass large chunks of the kernel.
The Linux kernel historically has ip route, netfilter, bridges and many other old components that are 15, 20, 30 years old. In general they work and everything is fine, but now we have stacked containers on containers, and it looks like a tower of 15 bricks standing one on top of another with you balancing on it on one leg - a strange feeling. This system evolved historically, with many nuances, like an appendix in the body. In some situations there are performance problems, for example.
Then there is the wonderful BPF and the ability to write hooks for the kernel - and the guys wrote their own kernel hooks. A packet arrives at the Linux kernel, they take it right at the entry, process it themselves without bridges, without TCP, without the IP stack - in short, bypassing everything written in the Linux kernel - and spit it straight out into the container.
What is the result? Very cool performance, cool features - just great! But then we look at this and see that on each machine there is a program that connects to the Kubernetes API and, based on the data received from that API, generates C code and compiles binaries that it loads into the kernel so that these hooks work in kernel space.
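You can at least inspect what such a system has loaded - a small sketch using standard kernel tooling (bpftool availability and output vary by kernel and distribution):

```
# List the eBPF programs currently loaded into the kernel, with their
# type and name; this is the state such a system manages for you.
sudo bpftool prog show

# List eBPF maps, the shared state those programs read and write.
sudo bpftool map show
```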
And what happens if something goes wrong? We do not know. To understand it, you have to read all that code, understand all the logic, and it is stunning how hard that is. But on the other hand, there are those bridges, netfilter, ip route - I have not read their sources, and neither have the 40 engineers who work in our company. Maybe a handful of people understand some pieces of it.
So what is the difference? It turns out there is ip route and the Linux kernel, and there is the new tool, and we do not really understand either one. Yet we are afraid to use the new one. Why? Because if a tool is 30 years old, then over those 30 years all the bugs have been found, all the rakes have been stepped on, and you do not need to know everything: it works like a black box, and it always works. Everyone knows which diagnostic screwdriver to stick where, which tcpdump to run at which moment. Everyone knows the diagnostic utilities well and understands how to use this set of components in the Linux kernel - not how it works inside, but how to use it.
And the awesomely cool Cilium is not 30 years old; it has not matured yet. Kubernetes has the same problem, a carbon copy. Cilium installs perfectly, Kubernetes installs perfectly, but when something goes wrong in prod, can you, in a critical situation, quickly figure out what went wrong?
So when we say whether it is difficult to maintain Kubernetes: no, it is very simple, and yes, it is incredibly difficult. Kubernetes works great on its own, but with a billion nuances.
About the “I'm lucky” approach
- Are there companies where these nuances are almost guaranteed to show up? Say, Yandex suddenly moves all of its services to Kubernetes without exception: that would be a wow of a load.
Dmitry: No, this is not a conversation about load, but about the simplest things. For example, we have Kubernetes and we have deployed an application into it. How do we know that it works? There is simply no ready-made tool to understand that the application is not crashing. There is no ready-made system that sends alerts; you have to configure the alerts and every dashboard yourself. And on top of that, we are also updating Kubernetes.
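To illustrate how much of “is it working?” you wire up by hand, here is a minimal hedged sketch (all names hypothetical). Note that a liveness probe only lets Kubernetes restart the container; nothing here alerts a human, and that alerting layer is exactly what you still have to build:

```
kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp                  # hypothetical application
spec:
  replicas: 2
  selector:
    matchLabels: {app: myapp}
  template:
    metadata:
      labels: {app: myapp}
    spec:
      containers:
      - name: myapp
        image: registry.example.com/myapp:1.0.0
        livenessProbe:               # Kubernetes restarts the container
          httpGet:                   # when this check fails, but it
            path: /healthz           # notifies no one; alerting is a
            port: 8080               # separate system you still build
          initialDelaySeconds: 10
          periodSeconds: 5
EOF
```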
Take Ubuntu 16.04. You can call it an old version, but we are still on it because it is LTS. It has systemd, whose nuance is that it does not clean up cgroups. Kubernetes launches pods and creates cgroups, then deletes the pods, and somehow it turns out - I do not remember the details, sorry - that the systemd slices remain. Over time this makes any machine noticeably sluggish. This is not even a question of highload. If pods are being started constantly, for example if there is a CronJob that constantly spawns pods, then a machine with Ubuntu 16.04 will start to slow down within a week. There will be a constantly high load average because a heap of cgroups has piled up. This is a problem that anyone who simply installs Ubuntu 16.04 with Kubernetes on top will run into.
Suppose they update systemd or something else, but in Linux kernels up to 4.16 it is even funnier: when you delete cgroups, they leak in the kernel and are not actually deleted. So after a month of running on such a machine it becomes impossible to look at per-cgroup memory statistics. We pull out a file, feed it into a program, and one file takes 15 seconds to process, because the kernel spends a very long time internally counting through a million cgroups that seem to be deleted but are in fact leaking.
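The symptom is easy to observe on an affected node (a hedged diagnostic sketch; paths assume cgroup v1, which is what Ubuntu 16.04 uses):

```
# How many cgroups the kernel is tracking, per controller.
cat /proc/cgroups

# Roughly the same from the filesystem side, for the memory controller.
find /sys/fs/cgroup/memory -type d | wc -l

# On a leaking node both numbers keep growing as pods come and go, and
# reading per-cgroup memory statistics slows down along with them.
```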
There are lots of little things like this, here and there. It is not a question of what giant companies can occasionally hit under very high loads; no, it is a matter of everyday things. People can live like this for months: they installed Kubernetes, deployed an application, and it seems to work. For many this is normal. They will not even know when that application falls over for some reason; no alert will come, and for them this is the norm. We used to live on virtual machines without monitoring; now we have moved to Kubernetes, also without monitoring. What is the difference?
The thing is, when we walk on ice, we never know its thickness unless we have measured it in advance. Many walk and do not worry, because they have walked before.
From my point of view, the nuance and difficulty of operating any system is making sure the thickness of the ice is exactly enough to solve your problems. That is what it is about.
In IT, it seems to me, there are far too many “I'm lucky” approaches. Many people install software and use software libraries in the hope that they will be lucky. In general, many are lucky. That is probably why it works.
- My pessimistic assessment looks like this: when the risks are high and the application must work, you need support from Flant, possibly from Red Hat, or you need your own internal team dedicated specifically to Kubernetes that is ready to carry it.
Dmitry: Objectively, that is so. Getting into the Kubernetes story on your own with a small team involves a certain amount of risk.
Do we need containers?
- Can you tell me how widespread Kubernetes actually is in Russia?
Dmitry: I do not have that data, and I am not sure anyone has it at all. We say “Kubernetes, Kubernetes”, but there is another way to look at the question. I do not know how widespread containers are either, but I know a figure from reports on the Internet: 70% of containers are orchestrated by Kubernetes. It was a reliable source over a fairly large worldwide sample.
Then there is another question: do we even need containers? My personal feeling, and Flant's position overall, is that Kubernetes is the de facto standard.
There will be nothing but Kubernetes.
This is an absolute game-changer in the field of infrastructure management. Just absolute: no more Ansible, Chef, virtual machines, Terraform. I am not even talking about the old collective-farm methods. Kubernetes is an absolute changer, and now it will be just that.
It is clear that for some it takes a couple of years, and for others a couple of decades, to realize this. I have no doubt that there will be nothing but Kubernetes and this new outlook: we no longer poke at the OS, but use infrastructure as code - not even with code but with yml, a declaratively described infrastructure. I have the feeling it will always be this way.
- That is, companies that have not yet switched to Kubernetes will either definitely move to it or sink into oblivion. Did I understand you correctly?
Dmitry: That is not entirely true either. For example, if our task is to run a DNS server, it can run on FreeBSD 4.10 and work fine for 20 years. Just work, and that is it. Maybe once in 20 years you will need to update something. If we are talking about software in the mode of “launched it and it really works for years”, without any updates, without changes, then of course there will be no Kubernetes there. It is not needed there.
Everything related to CI / CD - everywhere you need continuous delivery, where you need to update versions, make active changes, wherever you need to build fault tolerance - only Kubernetes.
About microservices
- Here I have a slight dissonance. To work with Kubernetes you need external or internal support - that is the first point. Second: when we are just starting development, we are a small startup, we have nothing yet, and development for Kubernetes, or even for a microservice architecture, can be complicated and not always economically justified. I am curious about your opinion: should startups start writing for Kubernetes from scratch right away, or can they still write a monolith and only then come to Kubernetes?
Dmitry: A tough question. I have a talk about microservices, “Microservices: size matters”. Many times I have run into people trying to hammer nails with a microscope. The approach itself is correct; we design our internal software this way. But when you do it, you need to clearly understand what you are doing. The word I hate most in “microservices” is “micro”. Historically that word stuck there, and for some reason people think that micro means very small, less than a millimeter, like a micrometer. That is wrong.
For example, there is a monolith written by 300 people, and everyone who took part in its development understands that there are problems and it should be broken up into micro-pieces: 10 or so pieces, each written by 30 people at the very least. That is important, necessary and cool. But when a startup comes to us where 3 very cool and talented guys have knocked out 60 microservices on their knees, every time I reach for the Corvalol.
It seems this has been said thousands of times already: one way or another, they end up with a distributed monolith. It is not economically justified, and it is very hard in general, in everything. I have simply seen it so many times that it literally hurts, which is why I keep talking about it.
As for the original question, there is a tension between, on the one hand, Kubernetes being scary to use, because it is unclear what might break in it or stop working, and, on the other hand, it being clear that everything is heading there and there will be nothing but Kubernetes. The answer is to weigh the amount of benefit you get, the number of tasks you can solve. That is one side of the scales. On the other side are the risks associated with downtime or with degraded response time and availability level - with degraded performance indicators.
Here is the choice: either we move fast, and Kubernetes lets many things be done much faster and better, or we use reliable, time-tested solutions but move much slower. Every company has to make this choice. You can think of it as a path through the jungle: the first time you walk it you can meet a snake, a tiger or a mad badger; after you have walked it 10 times, you have trodden the path, cleared the branches, and the walking is easier. Each time the path gets wider. Then it is an asphalt road, and later a beautiful boulevard.
Kubernetes does not stand still. Again, the question: Kubernetes is, on the one hand, 4-5 binaries, and on the other, the entire ecosystem. It is the OS we have on our machines. What is it? Ubuntu or CoreOS? It is the Linux kernel plus a bunch of additional components. All of these things: here a poisonous snake has been thrown off the road, there a fence has been put up. Kubernetes develops very quickly and dynamically, and the volume of risks, the volume of the unknown, decreases every month, and those scales rebalance accordingly.
Answering the question of what a startup should do, I would say: come to Flant, pay 150 thousand rubles, and get a turnkey, lightweight DevOps service. If you are a small startup with a few developers, this works. Instead of hiring your own DevOps engineer, who will have to learn to solve your problems while you pay a salary all that time, you get a turnkey solution to all these issues. Yes, there are downsides: as an outsourcer we cannot be as involved and react to changes as quickly. But we have a lot of expertise and ready-made practices. We guarantee that in any situation we will quickly figure things out and raise any Kubernetes from the dead.
I strongly recommend outsourcing for startups, and for established businesses up to the scale where you can dedicate a team of 10 people to operations, because below that it makes no sense to do it in-house. Outsourcing categorically makes sense there.
About Amazon and Google
- Can a hosted solution from Amazon or Google be considered outsourcing?
Dmitry: Yes, of course; it solves a number of issues. But again, there are nuances. You still need to understand how to use it. For example, there are a thousand little things in how Amazon AWS works: the Load Balancer needs to be warmed up, or you need to write a request in advance, “guys, we are about to receive traffic, pre-warm the Load Balancer for us!” You have to know these nuances.
When you turn to people who specialize in this, you get almost all the typical things covered. We have 40 engineers now, and by the end of the year there will probably be 60 of them; we have definitely run into all of these things. Even if on some project we hit the same problem once again, we quickly ask each other and know how to solve it.
Perhaps the answer is this: of course, the hosted story makes some part of it easier. The question is whether you are ready to trust these hosters and whether they will solve your problems. Amazon and Google have proven themselves. For all of our cases, for sure. We have no other positive experience. All the other clouds we have tried to work with create a lot of problems: Azure, everything in Russia, and all kinds of OpenStack in different implementations - Headster, Overage, whatever you like. They all create problems that you do not want to solve.
So the answer is yes, but, in fact, there are not many mature hosted solutions.
Who needs Kubernetes?
- And yet, who needs Kubernetes? Who should already be moving to it, and who is the typical Flant client that comes for Kubernetes?
Dmitry: That is an interesting question, because right now, on the Kubernetes wave, a lot of people come to us: “Guys, we know you do Kubernetes, do it for us!” We answer them: “Gentlemen, we do not do Kubernetes, we do prod and everything connected with it.” Because making a prod without doing the whole CI / CD and that whole story is currently simply impossible. Everyone is moving away from the separation of “development does development, and then operations does operations”.
Our clients expect different things, but everyone expects some kind of miracle: they have certain problems, and now, hop! - Kubernetes will solve them. People believe in miracles. Rationally they understand there will be no miracle, but in their hearts they hope: what if this Kubernetes now solves everything for us, they talk about it so much! Suddenly, achoo! - a silver bullet; achoo! - and we have 100% uptime, all the developers can roll out to prod 50 times, whatever it is, and it does not fall over. In general, a miracle!
When such people come to us, we say: “Sorry, but there is no miracle.” To be healthy, you need to eat well and exercise. To have a reliable product, it has to be built reliably. To have convenient CI / CD, you have to make it that way. It is a lot of work that has to be done.
So, when asked who needs Kubernetes: no one needs Kubernetes.
Some people have the mistaken feeling that they need Kubernetes. What people really need, what they have a deep, urgent need for, is to stop thinking about, dealing with and taking an interest in all the problems of infrastructure and of running their applications. They want applications to just work and to just deploy. For them, Kubernetes is the hope of no longer hearing the story of “we were down”, or “we cannot roll out”, or anything else like that.
Usually a technical director comes to us. Two things are demanded of him: on the one hand, give us features; on the other, stability. We offer to take this on ourselves and do it. The silver bullet - silver-plated, to be precise - is that you stop thinking about these problems and wasting time on them. You will have dedicated people who own this question.
The wording that we, or anyone, “needs Kubernetes” is wrong.
Who really needs Kubernetes is admins, because it is a very interesting toy to play with and dig into. Let us be honest: everyone loves toys. We are all children somewhere, and when we see a new one, we want to play with it. For some this urge has been beaten out, for example in administration, because they have already played enough and are tired to the point of simply not wanting any more. But no one loses it completely. For example, even though I have long been tired of toys in the field of system administration and DevOps, I still love toys and still buy new ones. One way or another, all people still want some kind of toys.
But do not play with production. Whatever I would categorically not recommend doing, and what I now see happening en masse: “Oh, a new toy!” - they run off to buy it, buy it, and: “Let's take it to school right now and show it to all our friends.” Do not do that. I apologize; it is just that my children are growing up, I constantly see things in children, notice them in myself, and then generalize to everyone else.
The final answer: you do not need Kubernetes. You need to solve your problems.
You can achieve a state where:
- prod does not fall over;
- even if it tries to fall over, we know about it in advance and can cushion it;
- we can change it at the speed the business needs, and do so conveniently, without it causing us problems.
There are two real needs: reliability and the dynamism / flexibility of rollout. Anyone who is doing an IT project right now, whether it is software for making the world easier or a business, and who understands this, needs to satisfy these two needs. Kubernetes, with the right approach, the right understanding and sufficient experience, lets you do that.
About serverless
- Looking a little further into the future and trying to solve the same problem - no headaches with infrastructure, with rollout speed and the speed of application changes - new solutions appear, for example serverless. Do you feel any potential in that direction and, let us say, any danger to Kubernetes and similar solutions?
Dmitry: Here I have to make the caveat again that I am not a visionary who looks ahead and proclaims “it will be like this!” Although I just did exactly that. I look down at my feet and see a pile of problems there, for example, how transistors work in a computer. Funny, right? And yet we run into certain bugs in the CPU.
Making serverless reliable enough, cheap, efficient and convenient, with all the ecosystem issues resolved, is a problem on the scale of Elon Musk's second planet. I agree with him that humanity needs a second planet for fault tolerance. Although I do not know exactly what he is saying, I understand that I am not ready to fly to Mars myself, and it will not happen tomorrow.
With serverless it is clear that the thing is ideologically correct, like fault tolerance for humanity: two planets are better than one. But how do you do it now? Sending one expedition is not a problem if you concentrate effort on it. Sending several expeditions and settling several thousand people there is, I think, also realistic. But making it fully fault-tolerant, so that half of humanity lives there, seems to me impossible now, not even up for consideration.
With serverless it is one to one: the thing is cool, but it is far from the problems of 2019. Closer to 2030 - let us live to see it. I have no doubt we will live to see it, we will certainly survive (repeat before bedtime), but right now we need to solve other problems. It is like believing in the fairy-tale pony Rainbow. Yes, a couple percent of cases are solved, and solved beautifully, but subjectively serverless is a rainbow... For me the topic is too far away and too incomprehensible. I am not ready to talk about it. In 2019 you will not write a single application with serverless.
How Kubernetes Will Develop
- As we move toward this potentially beautiful, distant future, how do you think Kubernetes and the ecosystem around it will develop?
Dmitry: I have thought about this a lot, and I have a clear answer. The first is stateful; stateless is, after all, easier. Kubernetes initially invested more in stateless, which is where it all started. Stateless works almost perfectly in Kubernetes; there is nothing even to complain about. With stateful there are still a lot of problems, or rather nuances. Everything there already works fine for us, but that is us. For it to work for everyone, at least another couple of years are needed. That is not a calculated figure, but my gut feeling.
In short, stateful needs to - and will - develop a great deal, because all our applications have state; there are no stateless applications. That is an illusion: you always need some kind of database and something else. Stateful means straightening out everything that can be straightened, fixing all the bugs and improving on all the problems currently being faced - let us call it adoption.
The level of the unknown, the level of unsolved problems, the level of the probability of running into something, will drop sharply. That is an important story. And operators: everything related to codifying administration logic and control logic to get an easy service - MySQL as a service, RabbitMQ as a service, Memcached as a service - in general, all the components that we need guaranteed to work out of the box. That solves exactly the pain of wanting a database but not wanting to administer it, or wanting Kubernetes but not wanting to administer it.
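The operator idea in one sketch (a hypothetical custom resource; real operators each define their own kinds and fields): you declare the database you want, and the operator encodes the administration logic that makes it so.

```
kubectl apply -f - <<EOF
apiVersion: example.com/v1      # hypothetical API group of a database operator
kind: MySQL                     # hypothetical custom resource kind
metadata:
  name: orders-db
spec:
  version: "5.7"
  replicas: 3                   # the operator handles replication and failover
  backupSchedule: "0 3 * * *"   # ...and scheduled backups, out of the box
EOF
```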
This story of operator development, in one form or another, will be important for the next couple of years.
I also think ease of use should increase greatly: the box will become blacker and blacker, more and more reliable, with fewer and simpler knobs.
I once listened on YouTube to an old interview with Isaac Asimov from the 1980s on Saturday Night Live - an Urgant-style show, just as entertaining. He was asked about the future of computers. He said the future lay in simplicity, as it had with the radio. The radio was originally a complicated thing: to catch a wave you had to spend 15 minutes turning knobs, twiddling spindles and generally knowing how everything worked - understanding the physics of radio transmission. In the end, the radio was left with a single knob.
Now, in 2019, what is a radio? In the car, the radio finds all the stations and their names by itself. The physics of the process has not changed in 100 years; the ease of use has. Now, and not only now - already in 1980, when Asimov gave that interview - everyone used the radio, and no one thought about how it was built. It always worked; it was a given.
Asimov said then that the same would happen with computers: ease of use would increase. If in 1980 you needed special training just to press the buttons on a computer, in the future it would not be so.
I have the feeling that with Kubernetes, and with infrastructure in general, ease of use will likewise increase dramatically. That, in my opinion, is obvious; it lies right on the surface.
What to do with the engineers?
- And what will then happen to the engineers and system administrators who support Kubernetes?
Dmitry: What happened to accountants after the appearance of 1C? About the same. Before that they counted on paper; now it is done in a program. Labor productivity has grown by orders of magnitude, and the labor itself has not disappeared. If earlier it took 10 engineers to screw in a light bulb, now one will be enough.
The amount of software and the number of tasks, it seems to me, are now growing faster than new DevOps engineers appear and efficiency rises. There is a specific shortage in the market, and it will last a long time. Later everything will settle into some norm: work efficiency will grow, there will be more serverless, they will bolt a neural network onto Kubernetes that picks all the resources exactly as needed and, in general, does everything by itself - the human just steps away and does not interfere.
But someone will still have to make decisions. Clearly, the level of qualification and specialization of that person will be higher. These days the accounting department does not need 10 employees keeping books so that their hands do not get tired. It is simply not necessary. Many documents are automatically scanned and recognized by the electronic document management system. One smart chief accountant is enough, with much bigger skills and a good understanding.
In general, that is the path in all industries. It is the same with cars: earlier a car came with a mechanic and three drivers. Now driving a car is the simplest process we all take part in every day. No one thinks of a car as something complicated.
DevOps or systems engineering will not go anywhere; the level of the work and its operational efficiency will increase.
- I have also heard an interesting idea that the amount of work will actually increase.
Dmitry: Of course, one hundred percent! Because the amount of software we write is constantly growing. The number of issues we solve with software is constantly growing. The amount of work is growing. The DevOps market right now is terribly overheated. You can see it in salary expectations. In a good way, without going into details, juniors should want X, middles 1.5X, and seniors 2X. But if you look at the Moscow DevOps salary market now, a junior wants anywhere from X to 3X, and so does a senior - from X to 3X.
Nobody knows what it should cost. Your salary level is measured by your confidence - a complete madhouse, to be honest, a terribly overheated market.
Of course, this situation will change very soon; some saturation has to come. Software development is not like this: even though everyone needs developers, and everyone needs good developers, the market understands what they cost - the industry has settled down. That is not the case with DevOps right now.
- From what I have heard, I conclude that the average system administrator should not worry too much, but it is time to level up their skills and prepare for the fact that tomorrow there will be more work, but it will be more highly qualified.
Dmitry: Absolutely. In general, we live in 2019, and the rule of life is lifelong learning: we learn all our lives. It seems to me that everyone already knows and feels this, but knowing is not enough - you have to do it. Every day we have to change. If we do not, sooner or later we will be dropped off on the sidelines of the profession.
Be ready for sharp 180-degree turns. I do not rule out a situation where something changes dramatically, where something new is invented - it happens. Hop! - and now we act differently. It is important to be ready for this and not to stress. It may turn out that tomorrow everything I do becomes unnecessary - fine, I have been learning all my life and am ready to learn something else. It is not a problem. You should not fear for your place in the profession, but you do need to be ready to constantly learn something new.
Wishes and a minute of advertising
- Do you have any parting wishes?
Dmitry: Yes, I have a few.
The first, self-serving one: subscribe on YouTube. Dear readers, go to YouTube and subscribe to our channel. In about a month we will begin an active expansion into video. We will have a lot of educational content about Kubernetes, open and varied: from practical things, all the way to labs, to deep fundamental theory and how to apply Kubernetes at the level of principles and patterns.
The second self-serving wish: go to GitHub and give us stars, because we feed on them. If you do not give us stars, we will have nothing to eat. It is like mana in a computer game. We keep doing, doing, trying; some say these are terrible bicycles, some say everything is generally wrong, and we keep going and act absolutely honestly. We see a problem, solve it and share our experience. So give us a star: it will not diminish you, and it will come to us, because we feed on them.
The third, important, and no longer self-serving wish: stop believing in fairy tales. You are professionals. DevOps is a very serious and responsible profession. Stop playing around in the workplace. “I'll just poke at it and figure it out” - imagine coming to a hospital where the doctor experiments on you. I understand this may offend someone, but most likely it is not about you, it is about someone else. Tell the others to stop too. This really spoils life for all of us: many people start treating operations, admins and DevOps as the dudes who broke something again. And it was “broken” most often because we went off to play instead of looking with a cold mind at how things are set up here and how they are set up there.
This does not mean that you should not experiment. You should; we do it ourselves. Honestly, we ourselves also play sometimes - that is very bad, of course, but nothing human is alien to us. Let us declare 2019 a year of serious, thoughtful experiments rather than games on prod. Probably so.
- Many thanks!
Dmitry: Thank you, Vitaly, both for your time and for the interview. Dear readers, thank you very much if you have made it this far. I hope we brought you at least a couple of useful thoughts.
In the interview, Dmitry touched on werf. Today it is a universal Swiss Army knife that solves almost all tasks. But it was not always so. At DevOpsConf, part of the RIT++ festival, Dmitry Stolyarov will talk about this tool in detail. The talk “werf is our tool for CI / CD in Kubernetes” will have everything: the problems and hidden nuances of Kubernetes, solutions to these difficulties, and the current implementation of werf in detail. Join us on May 27 and 28 - we will create the ideal tools.