What do peeling eggs and DevOps have in common?
Below is a translation of Patrick Lee Scott's article posted on hackernoon.com. The author walks through several important principles that will help you level up in DevOps.
A couple of days ago I was peeling an egg the hard way, and my girlfriend Angeli asked me why I was doing it like that.
She took another hard-boiled egg, cracked it on a cutting board, then pressed down and rolled it across the table. It took about 3 seconds to peel. Meanwhile, I had been standing there picking the shell off piece by piece.
The point is this: we overestimate the value of effort.
The best solutions are simple and effective. Is it hard to clean a rusty bolt? Maybe, but only if you don't know it can be done with Coca-Cola. Still sounds complicated? No. You just drop the bolt in a glass and wait.
If you don't know the simple techniques, you can't apply them. Instead of implementing, you experiment; instead of replicating, you research.
If you use some approach in programming again and again because it simplifies the problem at hand, that approach becomes a pattern, a so-called best practice.
Despite complex and intimidating names like Command Query Responsibility Segregation (CQRS) or Event Sourcing (ES), these practices exist to solve problems, especially the ones that arise when building distributed systems.
Looking at development as a whole, there are even more universal principles, such as "Keep It Simple, Stupid" (KISS) and "Don't Repeat Yourself" (DRY). I would like to talk about similar patterns and principles as they apply to DevOps.
DevOps is often presented as the promised land, where birds sing and the sun shines. But without the right methods, DevOps turns into hell, and you will cut your fingers on the shell (like me).
Building DevOps systems, I found several solutions for myself beyond universal principles like KISS and DRY. They can't quite be called patterns yet, but they will help you peel the egg quickly. Here they are (in no particular order):
- Do not do what others have done before you.
- Let developers be as productive as possible.
- Production is a myth.
- Move everything to the cluster and back it all up.
- VPN is too complicated; there are simpler solutions.
- Organize, automate and conquer!
Lesson 1. Do not do what others have done before you
If you can buy a finished product, or if the tool you need is freely available and convenient, use it.
Don't reinvent the wheel: buy it.
Did you know you can run the same mail server Craigslist uses? And that it's free? If you need a mail server, don't build a new one; use an existing one.
I like to look for tools in awesome-self-hosted lists, or I use the Helm Hub.
Although Helm is a fairly new way of finding software, it already has its own site: https://v3.helm.sh/
With Helm, you can easily search for ready-made wheels instead of reinventing them.
Let's go back to when I was just starting to program. In 9th grade, I learned my first "real" language: QBasic. Before that, I had spent a couple of years building HTML and CSS sites. The Internet was new back then, and although people knew how to create packages, they didn't share them the way we do now. I usually stored things on floppy disks; it would be great to dig those up, along with the Arkanoid-like game I wrote in Java Swing in 11th grade...
All I used back then were the standard libraries, baby! At least at 13 those were all I knew about, though I'm sure some Java pros (hello, Vlad and Nick!) were already doing cool things. ;)
Things are different now. Today you can find entire UI libraries, or a library for dealing with dates elegantly, all installed with a single command.
Wouldn't it be great if there were a tool for installing fully functioning subsystems just as easily? Need a database? install database.
Good news: such a tool exists! With Helm, you can install fully functioning subsystems that are packaged and ready to use.
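To make that concrete, installing a database looks roughly like this (the repository and chart names are illustrative; I'm using Bitnami's MongoDB chart as an example):

```
$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm install my-mongo bitnami/mongodb
```

Two commands, and a fully configured database is running in your cluster.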
The principle is the same as with NPM or Gradle, but the packages Helm manages are called charts. Charts describe containers and the many ways they should run on Kubernetes.
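For orientation, a chart is just a directory with a conventional layout (a minimal sketch; the name `mychart` is a placeholder):

```
mychart/
  Chart.yaml        # chart name, version, description
  values.yaml       # default, overridable configuration
  templates/        # Kubernetes manifests as Go templates
    deployment.yaml
    service.yaml
  charts/           # bundled dependency charts
```

Everything under templates/ is rendered with the values from values.yaml, which is what makes a chart configurable from the outside.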
The result is a turnkey solution: you buy the bicycle instead of building it. The chart is the blueprint, and the containers it describes, running inside Kubernetes, are the wheels.
What's great about charts is that you can package services, run them in any Kubernetes cluster, and even share them with other people if you want.
This means that you can describe the whole environment using code:
- name: backup
  repository: http://jenkins-x-chartmuseum:8080
  version: 0.0.2
- name: monitor
  repository: http://jenkins-x-chartmuseum:8080
  version: 0.0.3
- name: marketing-site
  repository: http://jenkins-x-chartmuseum:8080
  version: 1.1.10
- name: denormalizer-service
  repository: http://jenkins-x-chartmuseum:8080
  version: 1.0.0
- name: mongo
  repository: https://kubernetes-charts.storage.googleapis.com/
  version: 1.0.0
Need to upgrade your marketing site to 1.2.0? Just change the version number and commit.
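Against the dependency list above, that upgrade is a one-line change:

```diff
 - name: marketing-site
   repository: http://jenkins-x-chartmuseum:8080
-  version: 1.1.10
+  version: 1.2.0
```

Commit it, and the pipeline takes care of the rest.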
Lesson 2. Let developers be as productive as possible
I was once sitting at my desk, staring at my code, trying to track down a bug. Users had been complaining about it for weeks, and I finally had some free time to dig in.
Ta-da! Found it! Fixed it!
I sat in my cubicle in a sunlit room and shouted over the partition: "I fixed the bug!"
Next Tuesday, when users see the release, they will definitely appreciate my efforts! But first we have to package everything up and try to get the release out the door... Maybe it will go smoothly, if nothing breaks...
Well, okay, if the release does ship next Tuesday, users will definitely appreciate it!
That's how I did deployments at my first job out of college, as a junior programmer.
Since then, much has changed.
Now I use trunk-based development and deploy modules many times a day. When I open a Pull Request, a bot posts a comment with a link to a review environment once the tests have passed and the builds have finished.
Today you don’t have to scream through the partition in the office.
The more freedom you give programmers, the freedom to control the parts of the infrastructure they need, the easier your job as a DevOps engineer becomes.
In the first lesson we saw that updating production takes nothing more than changing one number and committing. In an ideal world where applications are packaged as charts, every programmer on the team can influence how the tool behaves in production. Since charts describe containers to Kubernetes, they also describe resource requirements.
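For example, resource requirements typically live in the chart's values, where any developer on the team can adjust them (the numbers here are illustrative):

```yaml
# values.yaml of a service's chart
resources:
  requests:        # what the scheduler reserves for the container
    cpu: 100m
    memory: 128Mi
  limits:          # the hard cap before throttling or OOM-kill
    cpu: 500m
    memory: 256Mi
```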
The logic goes something like this: I can't guess how much memory or CPU a new service will need (and I suspect I'm not the only one). So along with the service, I deploy monitoring and alerting rules that will tell my team and me in Slack when those settings need tweaking. In other words, the system itself reports the right settings once it's deployed. I used to sit for hours running queries in Prometheus and tuning settings by hand, just like I used to pick at the eggshell. Now I've learned to do it the smart way.
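One way to sketch such an alert is a Prometheus rule like the following; the exact metric names depend on your cAdvisor and kube-state-metrics versions, so treat this as an assumption rather than a recipe:

```yaml
groups:
  - name: resource-tuning
    rules:
      - alert: ContainerNearMemoryLimit
        expr: |
          container_memory_working_set_bytes{container!=""}
            / on (namespace, pod, container)
          kube_pod_container_resource_limits{resource="memory"} > 0.9
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.container }} uses over 90% of its memory limit; raise the limit or find the leak"
```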
I can already hear you saying: "That sounds too complicated!" It's not. Just configure the chart.
If you can automate or simplify something, go ahead. For example, what if you could get a DNS entry simply by deploying a service with the label "expose: true"? This is where Kubernetes operators come in. They are a more advanced tool and worth getting to know, but let's not go too deep into the details here.
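As a sketch, the service side of that idea is nothing more than a label; an operator (for example, the exposecontroller used by Jenkins X) watches for it and creates the Ingress and DNS entry. The names below are placeholders:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: marketing-site
  labels:
    expose: "true"   # the operator picks this up and wires up DNS
spec:
  selector:
    app: marketing-site
  ports:
    - port: 80
      targetPort: 8080
```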
Lesson 3. Production is a myth
This one was a real revelation for me. It forced me to look at things from a different angle, so listen carefully.
For more than ten years, I thought there were only a few kinds of environments in the world. In the simplest scenario, there is a staging environment and a production environment. You deploy to staging, test, promote to the next stage, deploy, test, and so on. Once everything has come together and been integrated, you can go to production.
I followed this pattern year after year and never once questioned it. That's just how it was done.
But this scheme is just another unproductive way of peeling the shell.
In essence, a chart is a dependency graph. A graph can represent an environment, and then deploying to production comes down to deploying a single chart.
If each team, project, or group bound by a common context has its own chart, you get several environments, each of which lets you group all its services and update them in a single transaction.
Don't think of production as one huge, all-encompassing environment; in reality it is many small productions.
Big changes are frightening in their scale. With several small environments, you can isolate changes and make the system more flexible.
Moreover, if everything is arranged as charts, the variables are factored out and can be changed from the outside in the same uniform way. For example, to give preproduction a less powerful Kafka cluster, since it doesn't need a full-sized one, you just change the configuration.
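A sketch of what that looks like: a per-environment values file overriding the environment chart's defaults (the keys depend on the Kafka chart you use, so these are assumptions):

```yaml
# preprod-values.yaml: a smaller Kafka for preproduction
kafka:
  replicas: 1
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
```

Applied with something like `helm upgrade preprod ./env-chart -f preprod-values.yaml`, the same chart yields a cheaper environment.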
Lesson 4. Move everything to the cluster and back it all up
So, we've sorted out production. But what about databases? Kafka? Security?
If you've been reading carefully, you'll remember I said that databases can be included in a chart package.
Kubernetes has a dedicated API for running databases and other stateful applications cleanly: StatefulSets.
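A minimal StatefulSet sketch for a database, with a persistent disk per replica (image and sizes are illustrative):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongo
          image: mongo:4.2
          volumeMounts:
            - name: data
              mountPath: /data/db
  volumeClaimTemplates:        # each replica gets its own disk
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```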
The whole point of Kubernetes is to make running containers more reliable. On top of it, using the Velero tool (installable via a Helm chart), you can back up an entire Kubernetes cluster, including its attached disks, such as those created by a StatefulSet, and restore everything with a single command. Scheduled automatic backups are just as easy to set up.
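With the Velero CLI, the whole backup-and-restore cycle looks roughly like this (the backup name and schedule are placeholders):

```
$ velero backup create full-backup              # snapshot the cluster and its volumes
$ velero restore create --from-backup full-backup
$ velero schedule create daily --schedule="0 3 * * *"   # nightly backups
```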
With backups, one-command restores, and a Kubernetes cluster manager, you can stand up a brand-new cluster and restore a backup into it with just two commands. Three at most, if you want to take a fresh backup first.
Instead of thinking in terms of servers, you can operate on whole clusters.
Is preproduction getting in the way? Move it out of sight: it's only a backup, a restore, and a DNS change away.
Lesson 5. VPN is too complicated; there are simpler solutions
Have you ever actually enjoyed using a VPN?
No, seriously. Is that even possible?
Google tried. A couple of years ago, the company announced it was moving away from VPNs. I suspect they didn't enjoy it either.
Google switched to zero-trust networks. There is no master key that opens the whole network; instead, every service has its own lock in the form of an SSO login screen. Want to get into the monitoring service? Log in with your company username and password.
Only a small shift is needed: instead of a VPN for authorization, use a proxy.
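One common way to implement the proxy approach is to put oauth2-proxy in front of each service via ingress-nginx annotations. The hostnames below are placeholders, and the setup assumes you already run ingress-nginx and an oauth2-proxy deployment:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana
  annotations:
    nginx.ingress.kubernetes.io/auth-url: "https://auth.example.com/oauth2/auth"
    nginx.ingress.kubernetes.io/auth-signin: "https://auth.example.com/oauth2/start?rd=$escaped_request_uri"
spec:
  rules:
    - host: grafana.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana
                port:
                  number: 3000
```

Every request to grafana.example.com must first pass the SSO login; no VPN involved.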
A VPN does also keep the Kubernetes management endpoint on a private network, which SSO does not. But cloud providers have their own authorization mechanisms for that: in Google Cloud and AWS, it's Identity and Access Management (IAM) plus IP allowlists.
If you can get by with a less bulky architecture without losing much, why not? Be like Google: move from VPNs to zero-trust systems.
Lesson 6. Organize, Automate, and Conquer
You think systematizing takes too long? Nonsense! It will save you hours, days, even weeks down the road. Get to work!
Systematization is the only realistic way to recreate your infrastructure more than once.
If every configuration item can be changed by editing something and committing it to git, your entire technical organization is essentially declarative. Any developer with access to git can keep any system running.
To get there, use several small repositories. Monorepos push people to cut corners and to depend on structure that has only artificial importance. Many small repositories that you can link together later work better.
My friend Matt and I are building a tool called meta, which helps with this at development time. Helm does it at production time: to Helm, everything is a dependency graph!
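For reference, meta is driven by a small `.meta` file mapping directories to repositories; the org and repo names here are made up:

```json
{
  "projects": {
    "marketing-site": "git@github.com:your-org/marketing-site.git",
    "denormalizer-service": "git@github.com:your-org/denormalizer-service.git"
  }
}
```

`meta git clone` then materializes all of them, so the many-small-repos setup stays one command away from a fresh checkout.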
Conclusion
Don't pick the shell off the egg piece by piece. Tap it and roll.