The evolution of the supply chain, or reflections on Docker, deb, jar and more
One day, I decided to write an article about the delivery of docker and deb packages in the form of containers, but when I started, for some reason I suffered in the distant days of the first personal computers and even calculators. In general, instead of dry comparisons of docker and deb, these are the thoughts on evolution, which I present to your court.
Any product, whatever it is, must somehow get to the product servers, must be configured and launched. This article will be about this.
I will reflect in the historical context, “what I see - that I sing”, what I saw when I started to write code and what I’m observing now, what we ourselves are using at the moment and why. The article does not pretend to be a full-fledged study, some points are missing, this is my personal view of what was and what is now.
So, in the good old days ... the earliest delivery method that I found was with tape cassettes. I had a computer BK-0010.01 ...
The Age of Calculators
No, there was an even earlier point, there was also a MK-61 and MK-52 calculator .
So, when I had MK-61 , the way to transfer the program was to use a regular piece of paper in a box on which the program was recorded, which, if necessary, was manually written to the calculator. If you want to play (yes, even there were games on this antediluvian calculator), you sit down and enter the program in the calculator. Naturally, when the calculator was turned off, the program went into oblivion. In addition to calculator codes written out on paper personally, the programs were published in the magazines Radio and Technique of Youth, as well as in books of that time.
The next modification was the MK-52 calculator, he already had some semblance of non-volatile data storage. Now the game or program didn’t have to be manually driven in, but, having done some magic passes with buttons, it loaded itself.
The volume of the largest program in the calculator was 105 steps, and the size of the permanent memory in the MK-52 was 512 steps.
By the way, if there are fans of these calculators who are reading this article - in the process of writing the article, I found both a calculator emulator for android and a program for it. Forward to the past!
A small digression about MK-52 (from Wikipedia)
MK-52 flew into space on the Soyuz TM-7 spacecraft. It was supposed to be used to calculate the landing trajectory in case the on-board computer fails.
MK-52 with the memory expansion unit "Electronics-Astro" since 1988 was supplied to the ships of the Navy as part of a navigational computing kit.
The first personal computers
Let's get back to the times of BK-0010 . It is clear that there was more memory there, and it was no longer an option to drive code from a piece of paper (although at first I did just that, because there was simply no other medium). The main means of storage and delivery of software are audio cassettes for tape recorders.
Storage on a cassette was usually in the form of one or two binary files, everything else was contained inside. Reliability was very low, I had to keep 2-3 copies of the program. The load time was not happy either, enthusiasts experimented with different frequency coding in order to overcome these shortcomings. At that time, I myself had not yet been engaged in professional software development (apart from simple basic programs), therefore, unfortunately, I won’t tell you in detail how everything was arranged inside. The fact of having only RAM on the computer for the most part determined the simplicity of the data storage scheme.
The emergence of reliable and large storage media
Later, floppy disks appear, the copying process is simplified, and reliability is growing.
But the situation changes dramatically only when sufficiently large local storages appear in the form of HDD.
The type of delivery fundamentally changes: installers appear that control the system configuration process, as well as cleanup after deletion, since the programs are not just read into memory, but are already copied to the local storage, from which you need to be able to clear and unnecessary if necessary.
At the same time, the complexity of the supplied software increases.
The number of files in the delivery increases from units to hundreds and thousands, conflicts of library versions and other joys begin when different programs use the same data.
In those days, the existence of Linux was not yet open for me, I lived in the world of MS DOS and, later, Windows, and wrote in Borland Pascal and Delphi, sometimes glancing towards C ++. To supply products in those days, many used InstallShield ru.wikipedia.org/wiki/InstallShield , which quite successfully solved all the tasks of deploying and configuring software.
Age of Internet
Gradually, the complexity of software systems becomes even more complicated, from a monolith and desktop applications there is a transition to distributed systems, thin clients and microservices. Now you need to configure not just one program, but their set, and so that they are friends all together.
The concept has completely changed, the Internet has come, the era of cloud services has come. So far, it’s only interpreted in the initial stage, in the form of sites, nobody dreamed especially of services. but this was a turning point in the industry of both the development and the actual delivery of applications.
For myself, I noted that at this moment there was a change in the generations of developers (or it was only in my environment), and I had the feeling that all the good old delivery methods were forgotten at one moment and everything started from the very beginning: they began to make the entire delivery with knee-high scripts and proudly called it “Continuous delivery”. In fact, a period of chaos began, when the old is forgotten and not used, but there is simply no new.
I remember the times when in our company, where I worked then (I won’t call), instead of building through ant (maven wasn’t popular yet or not at all), people just collected jar in IDE and commited him in svn. Accordingly, the deployment was to get the file from SVN and copy it via SSH to the desired machine. So simple and clumsy.
At the same time, the delivery of simple sites to PHP was made quite primitive by simply copying the corrected file via FTP to the target machine. Sometimes there wasn’t such a thing - the code was edited live on the product server, and it was special chic if there were backups somewhere.
RPM and DEB packages
On the other hand, with the development of the Internet, UNIX-like systems began to gain more and more popularity, in particular, it was at that time that I discovered RedHat Linux 6 for myself, around 2000. Naturally, there were certain tools for software delivery there, according to Wikipedia, RPM as the main package manager appeared already in 1995, in the version of RedHat Linux 2.0. And from then until now, the system has been delivered in the form of RPM packages and quite successfully exists and develops.
Debian family distributions went a similar way and implemented the delivery in the form of deb packages, which is also unchanged to this day.
Package managers allow you to deliver the software products themselves, configure them during the installation process, manage dependencies between different packages, remove products and clean up the excess during uninstallation. Those. for the most part, this is all that is needed, which is why they lasted several decades with little or no change.
Clouds added to the package managers installation not only from physical media, but also from cloud repositories, but basically little has changed.
It is worth noting that at present there are some creeps towards avoiding deb and switching to snap packages, but more on that later.
So, this new generation of cloud developers, which did not know either DEB or RPM, was also growing slowly, gaining experience, products became more complicated, and some more reasonable delivery methods were needed than FTP, bash scripts and similar student crafts.
And here Docker enters the scene, a kind of mixture of virtualization, resource allocation and delivery methods. It is now fashionable, youth, but is it needed for everything? Is it a panacea?
According to my observations, very often Docker is offered not as a reasonable choice, but simply because it is, on the one hand, talked about in the community, and those who offer it only know it. On the other hand, for the most part they are silent about the good old packaging systems - they are and are, they are doing their work quietly and imperceptibly. In such a situation, there is no other choice — the choice is obvious - Docker.
I’ll try to share my experience of how we implemented Docker, and what happened as a result.
Initially, there were bash scripts that deployed jar archives to the necessary machines. Managed this process by Jenkins. This worked successfully, since the jar archive itself is already an assembly containing classes, resources, and even configuration. If you add everything to it to the maximum - then expand it with a script - this is not the most difficult thing you need.
But the scripts have several drawbacks:
- scripts are usually written in haste and therefore are so primitive that they contain only one of the most successful scripts. This is facilitated by the fact that the developer is interested in a speedy delivery, and a normal script requires a decent amount of resources.
- as a consequence of the previous paragraph, the scripts do not contain the uninstall procedure
- no established upgrade procedure
- when a new product appears, you need to write a new script
- no dependency support
Of course, you can write a fancy script, but, as I wrote above, this is development time, and not the smallest, but, as you know, there is always not enough time.
This all obviously limits the scope of this deployment method to the simplest systems. The time has come to change this.
At some point, freshly baked middles began to come to us, seething with ideas and raving with a docker. Well, the flag in hand - do it! There were two attempts. Both unsuccessful - let's say so, because of big ambitions, but the lack of real experience. Was it necessary to force and finish by any means? It is unlikely - the team must evolutionarily grow to the desired level before it can use the appropriate tools. In addition, using ready-made images of the docker, we often came across the fact that the network worked incorrectly there (which, perhaps, was also connected with the dampness of the docker itself) or it was difficult to expand other people's containers.
What inconvenience have we encountered?
- Network problems in bridge mode
- It is inconvenient to look at the logs in the container (if they are not taken anywhere separately to the file system of the host machine)
- Periodically strange ElasticSearch hangs inside the container, the reason has not been established, the container is official
- It’s awkward to use the shell inside the container - everything is greatly trimmed, there are no familiar tools
- Large containers to collect - expensive to store
- Due to the large size of the containers, it is difficult to support multiple versions
- Longer build, unlike other methods (scripts or deb packages)
On the other hand, is it worse to deploy a Spring service in the form of a jar archive through the same deb? Is resource isolation really needed? Is it worth losing convenient tools of the operating system, stuffing the service into a heavily trimmed container?
As practice has shown, in reality this is not necessary, a deb package is enough in 90% of cases.
When does good old deb still fail and when did we really need a docker?
For us, this was a deployment of services in python. A lot of libraries needed for machine learning and not available in the standard delivery of the operating system (and what was there were of the wrong versions), hacks with settings, the need for different versions for different services living on the same host system led to that the only reasonable way to supply this nuclear mixture was docker. The complexity of assembling the docker container turned out to be lower than the idea of packing it all in separate deb packages with dependencies, and no one in their right mind would have taken it.
The second point where you plan to use docker is for deploying services using the blue-green deploy scheme. But here I want to get a gradual increase in complexity: first, deb packages are collected, and then a docker container is assembled from them.
Back to the snap packages. They first officially appeared in Ubuntu 16.04. Unlike the usual deb packages and rpm packages, snap carry all the dependencies. On the one hand, this avoids the conflict of libraries, on the other hand, it means more significant sizes of the resulting package. In addition, this can affect the security of the system: in the case of a snap, all the changes to the included libraries should be monitored by the developer who creates the package. In general, not everything is so simple and the general happiness of their use does not come. But, nevertheless, this is quite a reasonable alternative, if the same Docker is used only as a means of packaging, and not virtualization.
As a result, we now use both deb packages and docker containers in a reasonable combination, which, perhaps, in some cases we will replace with snap packages.
Only registered users can participate in the survey. Please come in.
What do you use for delivery?
- 32.9% Self-written scripts 52
- 13.9% Copy pens to FTP 22
- 20.8% deb packages 33
- 13.2% rpm packages 21
- 1.8% snap packs 3
- 53.7% Docker-images 85
- 12.6% Virtual Machine Images 20
- 5% Clone the entire HDD 8
- 8.2% puppet 13
- 25.3% ansible 40
- 12.6% Other 20