Let's try out Kubernetes

Original author: Daniel Morsing
  • Translation
Hello, Habr!

For a while now we have been considering books on Kubernetes; fortunately, Manning and O'Reilly have already published some. We would agree that in our market Kubernetes is still of more introductory and engineering interest than practical interest. Nevertheless, we are posting here the cover of a Kubernetes book along with a translation of an article by Daniel Morsing, who wrote an interesting teaser about this system on his blog.

Enjoy reading!

Introduction

Making the rounds of Go conferences, I hear about three talks a year on how to deploy Kubernetes. Meanwhile, I had only a vague idea of what Kubernetes actually is, never mind any deep knowledge of it. But recently I had some free time, so I decided to port a simple application to Kubernetes, have some fun with it, and describe my first impressions.

This is not going to be an "introduction to Kubernetes" post; for that, it is better to read the official documentation, and I will not even try to compete with it. The documentation is genuinely good. In particular, I really liked the Concepts section when I was trying to build a general picture of the system. Hats off to the documentation authors!

A few words about the application. I run my own blog server, which hosts the original of this article. It happens to run on a small Linode instance that I manage by hand. Using an entire cluster management stack to deploy such a small application may look like shooting sparrows with a cannon, and frankly, it is. But I found it a convenient way to practice working with the system. At the time the original was published, my blog was running on a single-node Google Container Engine cluster.

Pod life cycles

Kubernetes is famous for its sophisticated scheduler. Every system deployed and managed via Kubernetes is packaged as a group of containers (a so-called "pod"), which is scheduled to run on one of the cluster's machines (so-called "nodes"). During rollouts and scaling, new pods are created and destroyed depending on the replica requirements. Such scheduling allows more efficient use of resources, but it seems to me that the pods themselves are not as revolutionary as the environment Kubernetes runs them in. Out of the box, Kubernetes provides image management, an internal DNS, and rollout automation. So I think you could build a system in which specific pods are pinned to specific nodes, and any real need for a scheduler would disappear.
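
As an illustration of pinning a pod to a node, here is a minimal sketch of a Deployment manifest; the names (blog, my-node-1) and the image are hypothetical, not taken from the article:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: blog                       # hypothetical name
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: blog
      template:
        metadata:
          labels:
            app: blog
        spec:
          # pin the pod to one specific node, bypassing most scheduling decisions
          nodeSelector:
            kubernetes.io/hostname: my-node-1   # hypothetical node name
          containers:
          - name: blog
            image: example/blog:latest          # hypothetical image
            ports:
            - containerPort: 8080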

One thing this life cycle apparently cannot handle is hot caches. Keeping such caches is a bad idea in principle, but in practice they do turn up in some clusters. If you have a memcache container running alongside a server, then upgrading that server inevitably means killing the memcache first. There is a mechanism for stateful pods, but it requires the entire state to be dumped to disk and then read back when the pod is rescheduled. If you ever actually find yourself needing that, the prospect of waiting while all of it is restored from disk is unlikely to thrill you.
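
Kubernetes' mechanism for stateful pods is the StatefulSet (PetSet in early versions). A minimal sketch with hypothetical names follows; note it only helps if the application actually writes its state to the mounted volume, which a purely in-memory cache like memcache does not:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: cache                      # hypothetical name
    spec:
      serviceName: cache
      replicas: 1
      selector:
        matchLabels:
          app: cache
      template:
        metadata:
          labels:
            app: cache
        spec:
          containers:
          - name: cache
            image: example/cache:latest       # hypothetical image
            volumeMounts:
            - name: data
              mountPath: /var/lib/cache       # state must be written here to survive rescheduling
      volumeClaimTemplates:
      # one persistent volume per pod, reattached when the pod is rescheduled
      - metadata:
          name: data
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 1Gi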

Networks

The network that carries traffic between pods is arranged quite nicely. On Google Cloud Platform, each host receives a /24 subnet within the private 10.0.0.0/8 range, and each pod then receives its own IP from that subnet. Each pod can then use whatever ports it likes for its services. This isolation eliminates the situation where several applications compete for the same port. If you want to run an HTTP server on port 80, you can do so without worrying about any other HTTP servers. Most applications know to avoid port 80, but plenty of things open debug servers on 8080, and I have seen entire systems brought down because of it.
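
Because every pod gets its own IP, two unrelated pods can both bind port 80 without conflict. A minimal sketch (pod names are hypothetical):

    apiVersion: v1
    kind: Pod
    metadata:
      name: web-a                      # hypothetical name
    spec:
      containers:
      - name: web
        image: nginx                   # listens on port 80 in its own network namespace
        ports:
        - containerPort: 80
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: web-b                      # a second pod can also use port 80: it has its own IP
    spec:
      containers:
      - name: web
        image: nginx
        ports:
        - containerPort: 80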

Another advantage of this use of network namespaces is that it accommodates programs that cannot be configured to use non-standard ports. For example, it is damn near impossible to run DNS on any port other than 53, because nobody would be able to send queries to it.

This isolation rests on the subnet-per-node scheme. If your provider gives you only one IP per node, you will have to set up some kind of overlay network, and such networks (in my experience) do not work particularly well.

So, pod-to-pod communication works well, but you still need a way to discover pods' IP addresses in order to talk to them. For this Kubernetes has so-called "services." By default, each service gets a single cluster-internal IP, which can be resolved through the internal DNS and connected to. If several pods match a service, traffic is automatically load-balanced between them, with the balancing handled by the node that initiates the connection.
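
A minimal Service sketch (names hypothetical): all ready pods matching the selector sit behind one cluster IP, reachable inside the cluster by DNS name — here blog.default.svc.cluster.local, assuming the default namespace:

    apiVersion: v1
    kind: Service
    metadata:
      name: blog                       # internal DNS name: blog.default.svc.cluster.local
    spec:
      selector:
        app: blog                      # every ready pod with this label backs the service
      ports:
      - port: 80                       # the cluster IP listens here...
        targetPort: 8080               # ...and forwards to this port on the pods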

I still cannot decide whether I like the single IP per service. Most applications handle multiple DNS results rather poorly and will just use whichever result comes first, which skews the distribution of load. A single service IP removes that problem. However, here we fall into the classic trap of conflating service discovery with liveness. As clusters grow, asymmetric network partitions become more common. It can happen that the node hosting a pod passes the health check and reports it to the Kubernetes master, while the pod is unreachable from other nodes. When that happens, the load balancer will keep trying to reach the broken pod, and since there is only one service IP, you have no fallback. You can close the connection and retry, hoping that the load balancing will route the new TCP connection through another, working pod, but that solution is not optimal.

If you are deploying on Kubernetes, I suggest building a health check along these lines: count how many requests a given pod has received since the previous check, and if that number falls below some threshold, mark the pod as bad. That way the load balancers should quickly learn to avoid the pod, even if it is still reachable from the Kubernetes master. You can also configure services to return pod IPs directly, but unless you need a stable network identity, I do not see much point in that.
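
Kubernetes' built-in hook for this is the readiness probe: a pod that fails it is removed from the service's endpoints. The request-counting logic described above would have to live behind the probe endpoint. A sketch of the relevant pod-spec fragment, with a hypothetical /healthz handler:

    containers:
    - name: blog
      image: example/blog:latest       # hypothetical image
      readinessProbe:                  # failing this removes the pod from service load balancing
        httpGet:
          path: /healthz               # hypothetical endpoint; it could compare request counts between checks
          port: 8080
        periodSeconds: 10              # how often the check runs
        failureThreshold: 2            # mark unready after two consecutive failures
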
Since pods have private IPs that are not routable over the public Internet, some mechanism is needed to translate internal addresses and ports into routable ones when traffic has to reach the cluster from outside. I am usually not enthusiastic about NAT in such cases, so the logical move would be IPv6: just allocate a publicly routable subnet to each node, and the problem disappears by itself. Unfortunately, Kubernetes does not support IPv6.

In practice, NAT is not that big a problem. For traffic coming from outside the cluster, Kubernetes encourages the use of services backed by cloud load balancers that provide a single external IP. Given how many Kubernetes developers did their orchestration at Google, it is not surprising that this mechanism closely follows the Maglev model.
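
In a cloud environment, requesting such a balancer is a single field on the Service; a sketch (name hypothetical), where the provider provisions the external IP:

    apiVersion: v1
    kind: Service
    metadata:
      name: blog-public                # hypothetical name
    spec:
      type: LoadBalancer               # asks the cloud provider for an external load balancer
      selector:
        app: blog
      ports:
      - port: 80
        targetPort: 8080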

Unfortunately, I have not yet figured out how to expose a service to the outside world while keeping it highly available in an environment without such a load balancer. You can tell Kubernetes to forward all traffic arriving at a given external IP into the cluster, but those IPs are not taken into account when scheduling pods onto nodes. If a pod ends up away from the node its IP is routed to, that node has to perform NAT, which loads it with extra work. Another problem is port conflicts at the external-IP level. If you have a controller that updates external IPs (and the records in whatever DNS service you use) depending on which nodes the pods landed on, it can happen that two pods both want traffic on port 80 and collide.
For now, I am getting by with a cloud load balancer. That is the easier path, but it also means I am not going to run Kubernetes anywhere outside the cloud in the foreseeable future.

A little more about cloud magic

Another area where cloud magic is heavily used is persistent storage. The existing options for getting a disk that a pod can access are clearly geared toward cloud providers. Yes, you can create a volume that is essentially just a raw disk on the host, but that feature is still in alpha and not meant for production. It would be interesting to see whether you could bootstrap an NFS service running on Kubernetes and then have Kubernetes manage persistent storage through it, but I do not plan to take that on.
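
On a cloud provider, the usual route is a PersistentVolumeClaim that the platform satisfies by provisioning and attaching a disk. A sketch with hypothetical names:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: blog-data                  # hypothetical name
    spec:
      accessModes: ["ReadWriteOnce"]   # a single node may mount the disk read-write
      resources:
        requests:
          storage: 10Gi                # the provider provisions a disk of this size

    # referenced from a pod spec fragment:
    volumes:
    - name: data
      persistentVolumeClaim:
        claimName: blog-data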

For some, all this cloud magic is reason enough to walk away, but cluster computing today depends so heavily on cloud services that doing without them is genuinely hard. I notice that people avoid this magic to escape vendor lock-in, while underestimating both the cost of building it themselves and the many implicit assumptions baked into running on a cloud platform. Kubernetes at least provides a consistent interface to this cloud magic, so you can rely on it, and it is at minimum nominally standardized.

Conclusion

So ends our little excursion into Kubernetes; I hope you enjoyed it. Naturally, this article deals with toy problems, so I may simply not run into the difficulties that come with using Kubernetes at scale. I also run on a cloud platform, so I have no idea what the bare-metal experience looks like. Still, as a user of a novel system, I can guess why so many people are looking at alternative ways to deploy it.


Relevance of the topic

  • Yes, I'm interested in the O'Reilly book
  • The book is not needed: the technology is too far removed from everyday practice
  • The book is not needed: the technology becomes outdated too quickly
  • Better to publish a book on Docker
  • Cool, might Piter resume the tradition of Friday translated articles?!
