Centralized continuous deployment in one year

In one of our previous posts about DevOps, we promised to talk about the technical side of our CI/CD pipeline.

To paint the whole picture in full color and share our emotions properly (pain and bloody tears included), I'll tell you where we started a couple of years ago and where we have arrived today.

About myself

I have been working at Raiffeisen for over five years. At first I was a freelance specialist, and now I head the department that supports the CI/CD pipeline and develops automation. Over this considerable period, my team and I have worked with probably 90% of all the software used in the bank and gained a lot of useful experience, which I would like to share.

What technologies and approaches we used two years ago, and what we came to

As in any large company without an approved set of middleware and software, over time we ended up with a huge technology stack.

Development teams used completely different programming languages, frameworks and tools. Support teams also used whatever they could, with varying degrees of success. In other words, Dev had its own tools and Ops had its own. Automation, where it existed at all, was at the level of "cmd calls a batch file, which runs a VBS script that creates an object in COM+, which ..." NO, THAT'S ENOUGH, I DO NOT WANT TO REMEMBER IT!

So, about technology. Let's start with the CD, if I may call it that.

There were several installations of Subversion (SVN to its friends) and a couple of underground Git installations, all running either on the wonderful Windows 2003 or on the "new" RHEL 5. And of course none of it was integrated, not in the slightest. Many teams preferred not to use version control at all. The knowledge and documentation base lived either in that same SVN (yes, really) in the form of Word (.doc) instructions, or on some team-internal wiki.

It should be noted that 3-4 teams were actively building their own automated processes, which was good news. But their experience was hard to scale because of the number and variety of technologies in use: Jenkins, TeamCity, Bamboo, Artifactory, Nexus. In short, everyone did what they wanted, however they could.

“We will build our own pipeline, with blackjack and courtesans”

All these tools were supported by the developers themselves, who spent part of their time on this instead of building new features.

Remedy was used as the ticket management system, and this is my personal, terrible, fierce pain. I think those who have worked with it will understand, and if you were lucky enough never to deal with Remedy, I can only envy you.

The test management system was no less beautiful: HP ALM. As a bug tracking system it gives me no cause for complaint; the whole trouble is integration. But it is possible that this is just my personal “love” for HP software products.

The first important change in our zoo, from my point of view, was the arrival of Jira. After Remedy, it was a breath of fresh air: light, fast, comfortable! At first, several development teams decided to try Jira, and eventually it became a bank-wide system. Initially, all work on non-production environments was moved into it, and now we are introducing Jira Service Desk as the system for managing incidents and changes. It is also important that Dev and Ops began to work on tasks together in Jira. Confluence was not long in coming.

Then we took on centralizing all the source code storage systems. We set up a centralized SVN installation and moved the repositories of all teams into it. As an alternative to SVN we also set up Bitbucket, and many teams switched to it immediately. As a result, all in-house code was stored in two products, and each team decided for itself which one to use.

By this time, the idea of creating a full-fledged pipeline for CI/CD automation was already ripe. The most heated debate was over the choice of a tool for automating software builds and deployments, as well as a tool for storing binary artifacts. We spent a long time looking at what was available on the market: we compared, read, studied, and tried and tested several options, including an in-house tool developed at our head office in Vienna.

As a result, for several reasons, we settled on Bamboo:

Out-of-the-box integration with other Atlassian products, with all the goodies that brings.
About 90% of the missing features are covered by available plugins.
There is a mature, ready-made SDK for writing your own plugins.
Support from a major vendor.

I will say a little about the architecture of Bamboo. It consists of an application server, a database server, and the so-called Bamboo agents. The agents are a set of servers (how many depends on the license purchased) with the software needed for your builds and deployments installed on them. When creating a build plan, you list the components it needs as requirements. When a build starts, Bamboo looks for a free agent that meets those requirements and builds the application on it. Deployment works on the same principle. As a result, both build and deployment are performed identically in all environments, production and non-production!
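
To make the agent selection described above more concrete, here is a minimal sketch in Python. It is not Bamboo's actual implementation or API: the Agent class, the find_free_agent function, and the agent and component names are all hypothetical and only illustrate the idea of matching a plan's requirements against what is installed on a free agent.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical model of a build agent and the software installed on it."""
    name: str
    capabilities: set[str] = field(default_factory=set)
    busy: bool = False


def find_free_agent(agents, requirements):
    """Return the first idle agent whose capabilities cover all requirements."""
    required = set(requirements)
    for agent in agents:
        # An agent qualifies only if it is idle and has every required component.
        if not agent.busy and required <= agent.capabilities:
            return agent
    return None  # no suitable agent is free right now


# Illustrative data only; these names do not come from the article.
agents = [
    Agent("agent-01", {"jdk8", "maven"}, busy=True),
    Agent("agent-02", {"jdk8", "maven", "nodejs"}),
    Agent("agent-03", {"dotnet", "nuget"}),
]

chosen = find_free_agent(agents, ["jdk8", "maven"])
print(chosen.name if chosen else "queued")  # agent-02
```

In this sketch agent-01 is skipped because it is busy, and agent-03 because it lacks the required components. This also shows why every new team meant more software on the agents: a plan can only run where all of its requirements are installed.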

Artifactory was chosen as the tool for storing binary artifacts. It took us about a month to compare Nexus and Artifactory and choose between them. Various sources on the Internet argued for the superiority of one tool or the other. Both suited us, but in the end JFrog's licensing policy won. We have never regretted our choice: the tool is convenient and actively developed.

Within a couple of months, we had set up and integrated the non-production and production environments of the pipeline. RHEL 7 was chosen as the OS for all products without much deliberation, and PostgreSQL was chosen as the DBMS.

The fascinating process of switching teams to the centralized CI/CD pipeline began. The more teams switched, the more software had to be installed on the Bamboo agents. To give you an idea: installing and debugging about 45 components on 10 agents took about 5 months.

This whole story prompted us to review most of our internal team processes and some of the bank-wide ones. But optimization alone did not get us to the desired speed. At some point we realized that it had become impossible to manually install and maintain the software the teams needed. We talked it over with the developers, suspended installing new software and updating existing software, and went off to study automation.

The automation support team does not know how to automate! WTF?!

Expecting that the number of Bamboo agents would eventually grow well beyond 25, we decided to use a combination of Bamboo and Ansible to create and update agents. Why Ansible turned out to be more interesting for us than other configuration management tools:

ease of development,
no server component,
and all the other advantages of this kind of software.

It took about two months to prepare the first beta script for RHEL agents. With Windows, as always, there were a lot of problems; 3-4 months have already been spent there, and we are still at war with NuGet and Chocolatey. On the other hand, creating and configuring an agent now takes only two hours instead of two weeks. By the way, today we have ~50 agents, and the number of installed components is approaching 120. You would think this is happiness: two hours, and the new agent is ready! But no. Creating a new server is a no less beautiful story, given the number of teams involved in the process. With all the approvals and change management, creating a new RHEL server took us about a week. And then a couple more days to hand over root access.

With great power comes great responsibility.

Something had to change.

We started thinking. In the meantime, we added SonarQube, a tool for static code analysis and technical debt management, to the pipeline. Cool software!

But back to the matter at hand: in the IaaS era, it was very painful to realize that creating a virtual machine took us a week. Why not use AWS / Google Cloud / Azure or any other cloud? The answer is simple: the legislation of the Russian Federation prohibits it, and our security service was not happy (to put it mildly) with the idea of using an external cloud.

Then we will do our own!

At that time, the bank was already experimenting with vRealize, so we seized the opportunity and joined the experiment. As a result, we got a pool of resources that we could manage ourselves and in which we could create servers with the required OS from templates.

Instead of a week, it took an hour to create a dozen servers, and our joy knew no bounds.

But we decided not to stop there. If you are going to automate, you need to learn how to create servers in vRealize from Bamboo. A week of research and another week to implement our own plugin, and the first server was created directly from a Bamboo deploy project. By that time, people had decided that we absolutely did not need SVN, and with it Fisheye + Crucible: Bitbucket has many more features and is more modern and convenient. We helped the teams move to Bitbucket and turned off the old SVN.

What else to automate?

Dev / Ops / DevOps / BizDevOps - what prevents all these people from having automated builds and deployments? Working with requests in the change management system! But how can we do without change management? It is an important thing! If we look at the situation more carefully, it becomes clear that it is not the process itself that gets in the way, but the need to plan, coordinate and record the result of the change MANUALLY!

And again, the ability to write your own Bamboo plugins comes to the rescue! A week of research, two weeks to implement, and from the same deploy project we create the first pre-agreed CRQ in the production environment.
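
As a rough illustration of what removing the manual steps can mean, the Python sketch below derives a change request from deployment metadata and then records the outcome. It is an assumption-laden sketch, not our plugin and not the API of any change management system: the Deployment class, the field names, the status values and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Deployment:
    """Hypothetical description of a deployment known to the deploy project."""
    application: str
    version: str
    environment: str
    jira_issue: str


def build_change_request(deployment, start, duration_minutes=30):
    """Plan the change: derive every field from data the pipeline already has."""
    end = start + timedelta(minutes=duration_minutes)
    return {
        "summary": f"Deploy {deployment.application} {deployment.version} "
                   f"to {deployment.environment}",
        "related_issue": deployment.jira_issue,
        "planned_start": start.isoformat(),
        "planned_end": end.isoformat(),
        "status": "pre-approved",  # assumed status name for a pre-agreed change
    }


def close_change_request(crq, success):
    """Record the result of the change without anyone filling in a form."""
    return {**crq, "status": "completed" if success else "failed"}


# Illustrative values only.
crq = build_change_request(
    Deployment("payments", "1.4.2", "production", "PAY-123"),
    start=datetime(2018, 1, 15, 22, 0),
)
closed = close_change_request(crq, success=True)
print(crq["summary"])
print(closed["status"])
```

The point of the sketch is that planning, coordination data and the final result can all be filled in from what the deploy project already knows, so the process stays in place while the manual work disappears.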

What are we doing now?

We have launched a pilot of several tools for ChatOps. I am rooting for HipChat because, despite all its shortcomings, I have not yet found a better on-premise solution. Again, by using a large stack of Atlassian products, we get a bunch of goodies. For example, integrating all the tools with the Jira task lets you track all the commits, builds, and deployments performed while implementing a given feature. Bringing HipChat into the process will make it possible, for example, to enter a single command in the chat window that installs a specific release in a specific environment, runs the auto-tests, returns the result to the chat, and notifies users that UAT can begin ...
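
The one-command scenario above could be sketched roughly as follows. This is a minimal Python sketch and not a HipChat or Bamboo integration: the command syntax, the handle_message function and the three callables passed into it are hypothetical stand-ins for whatever the real tools would provide.

```python
import re

# Assumed command syntax: "deploy <release> to <environment>"
COMMAND = re.compile(
    r"^deploy\s+(?P<release>\S+)\s+to\s+(?P<env>\S+)$", re.IGNORECASE
)


def handle_message(text, deploy, run_tests, notify):
    """Parse a chat command, run the steps in order, and return the chat reply."""
    match = COMMAND.match(text.strip())
    if match is None:
        return "Usage: deploy <release> to <environment>"
    release, env = match["release"], match["env"]
    deploy(release, env)        # install the release in the environment
    passed = run_tests(env)     # run the auto-tests there
    if passed:
        notify(f"{release} is installed on {env}, UAT can start")
        return f"{release} deployed to {env}: auto-tests passed"
    return f"{release} deployed to {env}: auto-tests FAILED"


# Stub callables standing in for the real pipeline; they only record calls.
log = []
reply = handle_message(
    "deploy 2.3.0 to uat",
    deploy=lambda release, env: log.append(("deploy", release, env)),
    run_tests=lambda env: True,
    notify=log.append,
)
print(reply)  # 2.3.0 deployed to uat: auto-tests passed
```

Passing the deploy, test and notification steps in as functions keeps the chat handler independent of the tools behind it, which matters while the choice of chat tool is still being piloted.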

Our pipeline now looks like this:

The mutual integration of all these applications allows us to automate almost all the routine activities associated with software production. We have more time left to improve the quality of our products or for some research.

We promise to come back and share more of our experience.
