diafour June 21, 2019 at 12:06

GitOps: comparing Pull and Push methods

Translation

Translator's note: In the Kubernetes community, a trend called GitOps is gaining popularity, as we saw firsthand at KubeCon Europe 2019. The term was coined relatively recently by Alexis Richardson, the head of Weaveworks, and means using tools familiar to developers (primarily Git, hence the name) to solve operational problems. In particular, it refers to operating Kubernetes by storing its configuration in Git and automatically rolling out changes to the cluster. In this article, Matthias Jg describes two approaches to this rollout.

Last year (formally, it happened in August 2017; translator's note), a new approach to deploying applications in Kubernetes appeared. It is called GitOps, and its core idea is that deployment versions are tracked in the secure environment of a Git repository.

The main advantages of this approach are as follows:

Deployment versioning and change history. The state of the entire cluster is stored in the Git repository, and deployments are updated only through commits. In addition, every change can be traced through the commit history.
Rollbacks using familiar Git commands. A simple git reset lets you discard changes to deployments; past states are always available.
Ready-made access control. A Git system typically holds a lot of sensitive data, so most companies take particular care to protect it. That protection therefore extends to deployment operations as well.
Policies for deployments. Most Git systems natively support per-branch policies: for example, master can be updated only through pull requests, and another team member must review and approve the changes. As with access control, the same policies apply to deployment updates.
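The versioning and rollback points above can be illustrated with a toy model. This is only a sketch: the DeploymentRepo class and its methods are hypothetical stand-ins for a Git repository, not a real Git API, and each "commit" is simply a full snapshot of the desired state.

```python
from dataclasses import dataclass, field


@dataclass
class DeploymentRepo:
    """Toy stand-in for a Git repository holding the desired cluster state."""
    commits: list = field(default_factory=list)  # each commit is a full snapshot

    def commit(self, state: dict) -> int:
        """Record a new desired state and return its position in the history."""
        self.commits.append(dict(state))
        return len(self.commits) - 1

    def head(self) -> dict:
        """The state the cluster is expected to match right now."""
        return dict(self.commits[-1])

    def reset_to(self, index: int) -> dict:
        """Discard every commit after `index`, loosely like `git reset --hard`."""
        del self.commits[index + 1:]
        return self.head()


repo = DeploymentRepo()
v1 = repo.commit({"web": "shop:1.0"})
repo.commit({"web": "shop:1.1"})      # a deployment update is just a commit
rolled_back = repo.reset_to(v1)       # a rollback restores an earlier snapshot
print(rolled_back)
```

Because every past state is kept as a snapshot, rolling back is just moving the head of the history; nothing has to be reconstructed by hand.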

As you can see, the GitOps method has many advantages. Over the past year, two approaches have gained particular popularity. One is based on push, the other on pull. Before looking at them, let's first see what typical Kubernetes deployments look like.

Deployment Methods

In recent years, various deployment methods and tools have become established for Kubernetes:

Native Kubernetes templates / Kustomize. This is the simplest way to deploy applications to Kubernetes. The developer writes basic YAML files and applies them. Kustomize was developed to avoid rewriting the same templates over and over (it turns Kubernetes templates into modules). Translator's note: Kustomize has been integrated into kubectl since the release of Kubernetes 1.14.
Helm charts. Helm charts let you create sets of templates, init containers, sidecars, and so on, which are used to deploy applications with more flexible configuration options than the template-based approach offers. This method is based on templated YAML files. Helm fills them in with various parameters and then sends them to Tiller, an in-cluster component that deploys them to the cluster and supports updates and rollbacks. The important point is that Helm essentially just inserts the required values into the templates and then applies them the same way as in the traditional approach (for more on how this works and how to use it, read our article on Helm; translator's note). There is a wide variety of ready-made Helm charts covering a broad range of tasks.
Alternative tools. There are many alternatives. What they have in common is that they turn some kind of template files into Kubernetes-compatible YAML files and then apply them.
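The common idea behind these tools, filling values into a template to produce YAML, can be sketched as follows. This is not how Helm or Kustomize are implemented; it uses Python's standard string.Template purely as an analogy, and the manifest is abbreviated (a real Deployment needs more fields, such as a selector). The names shop and registry.example.com are made up for illustration.

```python
from string import Template

# Abbreviated manifest template; $name, $replicas and $image are placeholders.
MANIFEST_TEMPLATE = Template("""\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $name
spec:
  replicas: $replicas
  template:
    spec:
      containers:
        - name: $name
          image: $image
""")


def render(values: dict) -> str:
    """Insert values into the template; a missing value raises KeyError."""
    return MANIFEST_TEMPLATE.substitute(values)


manifest = render({
    "name": "shop",                               # hypothetical application name
    "replicas": 2,
    "image": "registry.example.com/shop:1.1",     # hypothetical image reference
})
print(manifest)
```

The rendered text is what would then be applied to the cluster, exactly as a hand-written YAML file would be.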

In our work, we routinely use Helm charts for important tools (many are already available, which makes life much easier) and plain Kubernetes YAML files for deploying our own applications.

Pull & push

In one of my recent blog posts, I introduced Weave Flux, a tool that lets you commit templates to a Git repository and updates the deployment after each commit or container image push. In my experience, it is one of the main tools promoting the pull approach, so I will refer to it often. If you want to know more about how to use it, here is a link to the article.

NB! All the benefits of using GitOps are retained for both approaches.

Pull Based Approach

The pull approach is based on the fact that all changes are applied from within the cluster. Inside the cluster, there is an operator that regularly checks the associated Git and Docker Registry repositories. If anything changes in them, the state of the cluster is updated from the inside. This process is generally considered very safe, since no external client has cluster administrator rights.
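The core of such an operator is a reconciliation step that compares the state described in Git with the state of the cluster. The sketch below is a simplified assumption of how that comparison might look, with both states modeled as plain dictionaries mapping workload names to images; it is not Weave Flux's actual implementation, and whether a real operator deletes resources missing from Git depends on its configuration.

```python
def reconcile(desired: dict, live: dict) -> list:
    """Bring `live` in line with `desired` and return the actions taken."""
    actions = []
    for name, image in desired.items():
        if name not in live:
            actions.append(f"create {name} -> {image}")
        elif live[name] != image:
            actions.append(f"update {name}: {live[name]} -> {image}")
        live[name] = image
    # Assumption: anything not described in Git is removed from the cluster.
    for name in list(live):
        if name not in desired:
            actions.append(f"delete {name}")
            del live[name]
    return actions


git_state = {"web": "shop:1.1", "worker": "jobs:2.0"}      # what Git describes
cluster_state = {"web": "shop:1.0", "legacy": "old:0.9"}   # what is running

first_pass = reconcile(git_state, cluster_state)
second_pass = reconcile(git_state, cluster_state)  # nothing left to change
print(first_pass)
print(second_pass)
```

An operator would run this step on a timer from inside the cluster, which is why no outside system needs cluster credentials.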

Pros:

No external client has the right to make changes to the cluster; all updates are rolled out from the inside.
Some tools also let you synchronize Helm chart updates and bind them to a cluster.
The Docker Registry can be scanned for new versions. If a new image appears, the Git repository and the deployment are updated to the new version.
Pull tools can be distributed across different namespaces with different Git repositories and permissions. This makes a multitenant model possible. For example, team A can use namespace A, team B can use namespace B, and an infrastructure team can use the global scope.
As a rule, the tools are very lightweight.
Combined with tools like the Bitnami Sealed Secrets operator, secrets can be stored encrypted in the Git repository and decrypted within the cluster.
There is no coupling to CD pipelines, since deployments happen within the cluster.
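The registry-scanning point from the list above can be sketched as follows. This is a minimal illustration under several assumptions: tags are purely numeric dotted versions (a tag such as latest would fail to parse), the registry's tag list is passed in as a plain list rather than fetched over the network, and the function names are hypothetical.

```python
def parse_version(tag: str) -> tuple:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in tag.split("."))


def newest_tag(tags: list) -> str:
    """Pick the highest version among the tags found in the registry."""
    return max(tags, key=parse_version)


def bump_image(desired: dict, name: str, repository: str, registry_tags: list) -> bool:
    """Update the desired state (the Git side) if a newer image exists.

    Returns True when a change was made and would need to be committed.
    """
    target = f"{repository}:{newest_tag(registry_tags)}"
    if desired.get(name) == target:
        return False
    desired[name] = target
    return True


desired_state = {"web": "shop:1.9.0"}
tags_in_registry = ["1.9.0", "1.10.0", "1.2.3"]

changed = bump_image(desired_state, "web", "shop", tags_in_registry)
changed_again = bump_image(desired_state, "web", "shop", tags_in_registry)
print(desired_state, changed, changed_again)
```

Comparing versions as integer tuples matters here: a plain string comparison would rank 1.9.0 above 1.10.0.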

Cons:

Managing secrets for Helm chart deployments is harder than usual: you first have to generate them as, say, sealed secrets, then have an in-cluster operator decrypt them, and only then do they become available to the pull tool. After that you can launch the Helm release with values taken from the already deployed secrets. The easiest way is to create a secret containing all the Helm values used for the deployment, encrypt it, and commit it to Git.
With the pull approach, you are tied to tools that work via pull. This limits how much you can customize the deployment process in the cluster. For example, using Kustomize is complicated by the fact that it has to run before the final templates reach Git. I am not saying you cannot use standalone tools, but they are harder to integrate into the deployment process.

Push Based Approach

In the push approach, an external system (mostly CD pipelines) starts a deployment to the cluster after a commit to the Git repository or after the preceding CI pipeline succeeds. In this approach, the external system has access to the cluster.
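A push-style flow can be sketched in the same toy terms. Everything here is hypothetical: the Cluster class, the CLUSTER_TOKEN secret name, and the shape of the commit dictionary are assumptions made for illustration and do not correspond to any particular CI/CD system.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Toy cluster that accepts changes only with the right credentials."""
    token: str
    workloads: dict

    def apply(self, token: str, manifests: dict) -> None:
        if token != self.token:
            raise PermissionError("invalid cluster credentials")
        self.workloads.update(manifests)


def run_pipeline(commit: dict, cluster: Cluster, pipeline_secrets: dict) -> str:
    """External pipeline: deploy only after the CI stage has succeeded."""
    if not commit["tests_pass"]:
        return "skipped: CI failed"
    # The credentials live in the pipeline, outside the cluster.
    cluster.apply(pipeline_secrets["CLUSTER_TOKEN"], commit["manifests"])
    return "deployed"


cluster = Cluster(token="s3cret", workloads={"web": "shop:1.0"})
secrets = {"CLUSTER_TOKEN": "s3cret"}

bad = run_pipeline({"tests_pass": False, "manifests": {"web": "shop:1.1"}}, cluster, secrets)
good = run_pipeline({"tests_pass": True, "manifests": {"web": "shop:1.1"}}, cluster, secrets)
print(bad, good, cluster.workloads)
```

The contrast with the pull model is visible in where the token is held: here the pipeline must possess it, whereas an in-cluster operator never hands it to an outside system.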

Pros:

Security is determined by the Git repository and build pipeline.
Deploying Helm charts is easier; there is support for Helm plugins.
Secrets are easier to manage, because secrets can be used in pipelines, as well as stored in Git in an encrypted form (depending on the user's preferences).
No lock-in to a specific tool, since any kind of tool can be used.
Container version updates can be triggered by the build pipeline.

Cons:

The cluster access credentials are stored inside the build system.
Updating deployment containers is still easier with the pull process.
There is a strong dependency on the CD system: the pipelines you need may originally have been written for GitLab Runners, and then the team decides to switch to Azure DevOps or Jenkins ... and a large number of build pipelines has to be migrated.

Bottom line: Push or Pull?

As usual, each approach has its pros and cons. Some tasks are easier with one and harder with the other. At first I ran deployments manually, but after coming across several articles about Weave Flux, I decided to adopt GitOps processes for all projects. For basic templates this turned out to be easy, but then I started running into difficulties with Helm charts. At that time, Weave Flux offered only a rudimentary version of the Helm Chart Operator, and even now some tasks are harder because secrets have to be created and applied manually. You could argue that the pull approach is much more secure, since the cluster credentials are not available outside the cluster, and that this gain in security is worth the extra effort.

After some thought, I came to the unexpected conclusion that this is not the case. The components that need the most protection include secret stores, CI/CD systems, and Git repositories. The information inside them is highly sensitive and needs maximum protection. Moreover, if someone gets into your Git repository and can push code there, they can deploy whatever they want (regardless of whether the approach is pull or push) and infiltrate the cluster's systems. So the most important components to protect are the Git repository and the CI/CD systems, not the cluster credentials. If you have well-configured policies and security measures for systems of this kind, and the cluster credentials are exposed to pipelines only as secrets, then the extra security of the pull approach may be less valuable than it first appears.

So, if the pull approach takes more effort and brings no security gain, isn't it logical to use only the push approach? Someone might object, though, that the push approach ties you too closely to the CD system, and that it may be better to avoid this in order to make future migrations easier.

In my opinion (as always), you should use whatever suits the particular case best, or combine the two. Personally, I use both approaches: Weave Flux for pull-based deployments, which mostly cover our own services, and a push approach with Helm and plugins, which makes it easier to apply Helm charts to the cluster and to create secrets. I think there will never be a single solution that fits all cases, because there are always many nuances and they depend on the specific application. At the same time, I highly recommend GitOps: it makes life much easier and improves security.

I hope my experience with this topic helps you determine which method suits your type of deployment better, and I would be glad to hear your opinion.

PS Note from the translator

Among the drawbacks of the pull model, the article mentions that it is hard to put rendered manifests into Git, but it does not mention that the CD pipeline in the pull model lives separately from the rollout and in effect becomes a pipeline of the Continuous Apply category. As a result, even more effort is needed to collect status from all deployments and somehow provide access to logs and status, preferably tied back to the CD system.

In this sense, the push model gives at least some guarantee of the rollout, because the pipeline's lifetime can be made equal to the rollout's lifetime.
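Making the pipeline live as long as the rollout amounts to having the pipeline wait, with a deadline, until the rollout reports success. The sketch below illustrates only that idea; the get_ready_replicas callback is a hypothetical stand-in for whatever status source a real tool would query, and this is not how werf is implemented.

```python
import time


def wait_for_rollout(get_ready_replicas, wanted: int,
                     timeout_s: float, poll_s: float = 0.01) -> bool:
    """Poll the rollout status until it is ready or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_ready_replicas() >= wanted:
            return True          # the pipeline job can finish as successful
        time.sleep(poll_s)
    return False                 # the pipeline job should fail


# Simulated status source: 0, then 1, then 3 ready replicas.
readiness = iter([0, 1, 3])
succeeded = wait_for_rollout(lambda: next(readiness), wanted=3, timeout_s=1.0)

# A rollout that never becomes ready makes the wait time out.
timed_out = wait_for_rollout(lambda: 0, wanted=3, timeout_s=0.05)
print(succeeded, timed_out)
```

With this shape, the pipeline's final status reflects the rollout's outcome, which is the guarantee the pull model does not provide on its own.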

We tested both models and came to the same conclusions as the author of the article:

The pull model suits us for organizing updates of system components across a large number of clusters (see the article about addon-operator).
The GitLab CI-based push model is well suited for rolling out applications using Helm charts. The rollout of deployments within the pipelines is monitored using the werf tool. Incidentally, in connection with this project, we kept hearing “GitOps” when discussing the pressing problems of DevOps engineers at our booth at KubeCon Europe'19.

PPS from the translator

Do you use GitOps?

5.5% Yes, pull approach 2
27.7% Yes, push 10
11.1% Yes, pull + push 4
0% Yes, something else 0
55.5% No 20
