Building the development process and the CI pipeline, or how to become a DevOps engineer for QA

Given:

  1. a large Java project with an Angular front end,
  2. developed by a small team (~15 people),
  3. using heaps of feature branches (about 40 in parallel),
  4. in a git repository;
  5. several virtual servers in a private Amazon cloud that can be used for development tasks;
  6. a developer who is a little tired of Java and wants to do something really useful for setting up processes.

Required:

  1. give the QA engineering team the ability to test each feature branch, manually or automatically, on a dedicated test stand that does not interfere with the others.

Spaceship Management Console QA booth

So you come to work at a small startup with American roots...

It is still a small startup, but with a promising product and big plans to conquer the market.

At first, while the development team is tiny (up to 10 people), the code base is developed in a shared repository on GitHub Enterprise: small features are carved out quickly, branched from master, and released in fast cycles by merging the feature branches straight back into master. The team lead can still track who committed what, and not only read every commit but also judge whether it is correct. So pull requests are opened and quickly merged by the developer himself with the lead's verbal approval (or rejected by the lead).

To protect the integrity of the code base, the team relies on unit and integration tests: a couple of thousand have already been written, giving about 60% coverage (at least for the back end). The development lead also runs the full test cycle on master before each release.

The process looks something like this:


A couple of months pass. The startup proves viable, and investment lets us grow the development team to 15 people. Most of the newcomers are front-end developers, who quickly start expanding the facade that end users see and use. The front-end developers test the facade right on their work Macs and write some Selenium cases, but the development lead no longer has time to run them before a release, because Selenium is known for its leisurely pace.

And then two screw-ups happen, literally one after the other.

First, one of the back-end developers accidentally force-pushes to master (the poor fellow had caught a cold, stayed late, and was not thinking straight), after which two weeks of the whole team's work have to be restored commit by commit from miraculously surviving local copies, since everyone has long been in the habit of pulling first thing.

Then a major feature that the front-end developers had been building for a couple of months in a separate branch, green on all UI tests, turns master sharply red after the merge and nearly brings down the whole product. Breaking changes in our own API had gone unnoticed, and the tests did not help to catch them. It happens. But it is a mess.

So the startup now faces, in full, the question of setting up a QA team, and indeed of rules for working with feature branches and a general development methodology, coupled with discipline. It also becomes obvious that before a pull request is merged, the code must be reviewed not only by the development lead (who already has plenty to do) but by other colleagues as well. A normal growing pain, all in all.

So we have arrived at the item "Given:".

No, I never planned to become a build engineer. But after the development lead successfully demoed building the project and running unit tests on a TeamCity installed on a local developer server in the corner, someone had to set this up for production use. And I happened to have some free time between features.

Well, let's get started.

First, we set up the head TeamCity instance in the Amazon cloud (plus two free agents) and point it at commits in the GitHub repository (GitHub creates a virtual HEAD for every PR, so listening for changes is very easy), and automatic builds with unit test runs just happen by themselves. Someone commits, and five minutes later a build is in the queue. Convenient.


But not enough.

At that time GitHub still had a very unpleasant interface for viewing pull requests, and leaving comments there was no picnic either: you had to scroll through a painfully long sheet of screens. So we could take the right to merge away from team members, but we could not provide a decent code review without third-party services. On top of that, I wanted integration with Jira at the same time, so that features were linked to tasks and tasks to features.

Fortunately, Atlassian has just such a solution. It is called Bitbucket Server, and at that time it was still called Stash. It lets you do all this integration magic with other Atlassian products and is very convenient for code review. We decided to migrate.

But this wonderful product, unlike GitHub, does not create a virtual HEAD for every PR, so after the migration there was nothing to listen to. Post-commit hooks did not work out either, because nobody had the time to figure them out properly.

So at first the integration of Stash with TeamCity was done through an ugly hack. Having worn himself out on hooks, an overseas colleague, instead of using the built-in REST API to look at pull requests, in desperation whipped up a bash script that parses the log: tail -f runs on it constantly, grep looks for changes of the right kind, and then the script calls the TeamCity REST API. Not the most logical approach, and some builds started to run twice, but what can you do when there is no time.

Looking ahead: when the time came, I managed to rewrite it properly, picking up changes through the regular REST API, parsing the JSON response with the great and powerful jq utility (a must-have for any DevOps engineer who integrates tools!) and poking TeamCity only when it is really necessary. While I was at it, I registered the script as a system daemon so that it starts by itself on reboot. Amazon instances occasionally need to be restarted.
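The "trigger only when really necessary" logic can be sketched roughly as follows. This is an illustration in Python rather than the bash and jq of the real script, and it assumes that the pull-request listing is a JSON object with a "values" array whose items carry fromRef.displayId (the branch name) and fromRef.latestCommit. Treat that shape, and the names branches_to_build and last_built, as assumptions made for the example.

```python
import json


def branches_to_build(pr_listing_json, last_built):
    """Return {branch: commit} for open PRs whose head commit was not built yet.

    pr_listing_json: raw JSON text of a pull-request listing (assumed shape).
    last_built: dict mapping branch name -> commit of the last triggered build.
    """
    listing = json.loads(pr_listing_json)
    pending = {}
    for pull_request in listing.get("values", []):
        source = pull_request["fromRef"]
        branch = source["displayId"]
        commit = source["latestCommit"]
        # Skip branches whose head has not moved since the last trigger;
        # this is what prevents duplicate builds.
        if last_built.get(branch) != commit:
            pending[branch] = commit
    return pending
```

A polling loop would call this function on every response, trigger a build for each returned branch, and then record the returned commits in last_built, so an unchanged PR never queues a second build.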

That makes two pieces of the puzzle in place.


Meanwhile, QA engineers joined the team.

Poor things! Switching locally between five feature branches a day, building and running them by hand?! You would not wish that on your enemy!

I will be honest: I sincerely love QA engineers (especially the girls). And, in general, I am not alone. Even our colleagues from NY, who at first put their faith in unit tests, turned out to love them too. They just did not know it yet when they vaguely formulated the problem: "We need to somehow research the question of automatically launching, somewhere in our cloud, an instance of the application for each branch, so that, like, the business can see with its own eyes what exactly is happening with the feature being developed right now. Could you?"

"Okay," I said (well, who else? Whoever once got into DevOps is stuck with it). And so the item "Required:" arrived.

An interesting task. After all, if we manage to set up automatic deployment from the build, we can meet the needs of both the business and our poor QA in one fell swoop. Instead of suffering through local builds, they will go to the cloud for a ready-made instance.

Here I should add that the application consists of several WAR files that run under Apache Tomcat. A WAR, as you know, is a regular ZIP archive with a special directory structure and a manifest inside. When the application is built, its configuration (the path to the database, the paths to the REST endpoints of the other WARs, and so on) is baked in somewhere inside the resources. And to feed a WAR to Tomcat, you have to specify in its configs where to get the WAR from, at which URL, and on which port to deploy it.

And what if we want to start many instances of the same WAR at once? Configure Tomcat on the fly to spread them across different ports or URLs? And what do we do with the configs baked into the WAR resources?

That is the wrong way to frame the problem.

So we will go the other way. For example, when running a WAR in the debugger, IDEA uses the -Dcatalina.base command-line option to pass Tomcat the path to a copy of the $TOMCAT_PATH/conf directory, and launches the WAR not as a single file but in exploded form, that is, unzipped, so that files with bytecode can be replaced on the fly.

Having looked at what IDEA does, we try to repeat and improve on this algorithm. To begin with, we create a huge virtual instance in the Amazon cloud with hundreds of gigabytes of disk space (in exploded form our application is quite fat) and many gigabytes of RAM.

We bring up nginx there, because in nginx it is quite simple to set up a rule that proxies HTTP requests from an address like startup.com/path/to/REST/endpoint to localhost:#####/path/to/REST/endpoint and back. Here ##### is a specific port number, which is set in the Tomcat configs. And there is no point in even trying to run all the feature branches under a single Tomcat. Instead, for each of them we create a separate copy of the $TOMCAT_PATH/conf directory and run its own Tomcat. That is many times simpler and more reliable, and there are no problems with parallelism.

We think about where to get the port numbers so that they do not collide between instances. The build number? No, then QA will get confused about which instance belongs to which feature. The revision number is out for the same reason. Well, nothing for it: we make all developers name their branches so that they always include the Jira task number, following the pattern feature-#####-something or bugfix-#####-something. The last three digits of that number go into the port number. It looks neat, too.
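The naming rule can be illustrated with a short sketch, written in Python purely for illustration. The base port 8000 and the names instance_id and port_for_branch are assumptions made up for this example; the text only says that the last three digits of the Jira task number end up in the port number.

```python
import re

BASE_PORT = 8000  # hypothetical base; the real port layout is not given in the text
BRANCH_PATTERN = re.compile(r"(?:feature|bugfix)-(\d+)-.+")


def instance_id(branch):
    """Return the last three digits of the Jira task number, zero-padded."""
    match = BRANCH_PATTERN.fullmatch(branch)
    if match is None:
        raise ValueError(f"branch name does not follow the convention: {branch!r}")
    return match.group(1)[-3:].zfill(3)


def port_for_branch(branch):
    """Derive the instance's port from the task number embedded in the branch name."""
    return BASE_PORT + int(instance_id(branch))
```

With these assumptions, port_for_branch("feature-12345-login-form") gives 8345. Note that two tasks whose numbers share the same last three digits would map to the same port, so such a scheme relies on parallel branches not being a thousand task numbers apart.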

We add an extra build step to the TeamCity builds that produce the WARs. It copies them to that fat Amazon instance over SSH and then, also over SSH, invokes a bash script that does the following:

  1. unpacks the WARs into the /deployments/d### directory,
  2. copies the conf directory for Tomcat from /deployments/skel,
  3. rolls out a separate database instance from the dump (the database dump lives in the source tree, so it is also at hand),
  4. using awk, sed, grep, find, and a good deal of swearing, fixes up the Tomcat configs in the copy of conf, as well as the configs in the resources of the unpacked WARs, so that they have the correct ports, database paths, REST endpoints, and everything else.

After that, all that remains is to start Tomcat with the option -Dcatalina.base=/deployments/d###, and you are done.
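Steps 1, 2, and 4 above might look roughly like this. It is a sketch in Python, not the original bash script, and it makes several assumptions: the exploded WAR goes into a webapps subdirectory of the instance directory, the skeleton server.xml contains a hypothetical @HTTP_PORT@ placeholder, and the database step is left out entirely. The function name deploy_instance is also made up for the example.

```python
import shutil
import zipfile
from pathlib import Path


def deploy_instance(war_path, deployments_root, instance_id, port):
    """Lay out one instance directory under deployments_root and return its path."""
    root = Path(deployments_root)
    target = root / f"d{instance_id}"
    if target.exists():
        shutil.rmtree(target)  # a new build replaces the previous deployment

    # Step 1: unpack the WAR in exploded form (a WAR is a plain ZIP archive).
    webapp_dir = target / "webapps" / Path(war_path).stem
    webapp_dir.mkdir(parents=True)
    with zipfile.ZipFile(war_path) as war:
        war.extractall(webapp_dir)

    # Step 2: copy the skeleton Tomcat conf directory.
    shutil.copytree(root / "skel" / "conf", target / "conf")

    # Step 4 (partially): write the instance's port into its own server.xml.
    # The @HTTP_PORT@ placeholder is an assumption of this sketch.
    server_xml = target / "conf" / "server.xml"
    server_xml.write_text(server_xml.read_text().replace("@HTTP_PORT@", str(port)))
    return target
```

The returned directory is what would then be handed to Tomcat through -Dcatalina.base. Patching the configs inside the unpacked WAR resources would follow the same read, replace, write pattern.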


But wait a minute: will our beloved QA engineers really go into the cloud over SSH and start Tomcat by hand from the command line? Not great. We could start every instance automatically, but that is impractical: there are already close to 60 feature branches, and memory, even on the fat instance, is not infinite. Everything would slow to a crawl.

Think, head, think. Aha! We can also write a console for managing the instances, since everything lives under /deployments/d###. Walk the subdirectories and emit a start/stop link for each one, for example.

nginx is already up, and configuring classic CGI in it is trivial. What is classic CGI? It is when an HTTP request is handed to an executable: the request metadata and headers are passed in environment variables, the request body arrives on standard input, and the HTTP response, headers included, is read from standard output. It is as easy as pie, and all of it can be done literally by hand.
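To make the CGI convention concrete, here is a minimal handler sketched in Python (the console described in this article was written in bash). Under CGI, the web server puts the request method, path, and query string into environment variables such as REQUEST_METHOD, PATH_INFO, and QUERY_STRING, and expects the program to print the response headers, a blank line, and then the body. The function name build_response is just a name chosen for this example.

```python
import os
import sys


def build_response(environ):
    """Build a complete CGI response (headers, blank line, body) as one string."""
    method = environ.get("REQUEST_METHOD", "GET")
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")
    body = f"method={method}\npath={path}\nquery={query}\n"
    headers = "Status: 200 OK\r\nContent-Type: text/plain; charset=utf-8\r\n"
    # An empty line separates the headers from the body.
    return headers + "\r\n" + body


if __name__ == "__main__":
    # The web server sets the environment; the response goes to standard output.
    sys.stdout.write(build_response(os.environ))
```

Keeping the response construction in a pure function makes it possible to check the handler without a web server at all.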

By hand? So why not write the /deployments directory handler in bash? Because I probably can. I will write it and then put it up on startup.com (reachable only from the startup's internal network, like all the instances)... Sometimes you want something that is not only useful but also slightly abnormal. Such as a minimal HTTP request handler in bash.

So I wrote it. It really is a bash script that, with the help of awk, sed, grep, find, and a good deal of swearing, walks the /deployments subdirectories and works out what is deployed where. The build number, the revision number, the feature branch name: all of that was already being passed from TeamCity along with the WARs, just in case.

It worked almost on the first try. One drawback: input commands of the form startup.com/refresh?start=d### are not very convenient to parse with bash regexes and *nix utilities. But that is my own fault, since I was the one who came up with a slash for global commands and a question mark for instance actions. Also, the external utilities were invoked many hundreds of times for 60 subdirectories, which is why the console was not fast.
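For comparison, here is how the same command format could be parsed with a standard URL library, sketched in Python. The set of allowed actions and the function name parse_command are assumptions for this example; the text only shows the /refresh?start=d### form and mentions start and stop links and instance deletion.

```python
import re
from urllib.parse import parse_qsl, urlsplit

INSTANCE_ID = re.compile(r"d\d{3}")
INSTANCE_ACTIONS = {"start", "stop", "delete"}  # hypothetical action names


def parse_command(request_uri):
    """Split '/refresh?start=d345' into ('refresh', [('start', 'd345')])."""
    parts = urlsplit(request_uri)
    global_command = parts.path.strip("/")
    instance_actions = []
    for action, instance in parse_qsl(parts.query):
        if action not in INSTANCE_ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
        # Only accept well-formed ids, since they end up in filesystem paths.
        if not INSTANCE_ID.fullmatch(instance):
            raise ValueError(f"bad instance id: {instance!r}")
        instance_actions.append((action, instance))
    return global_command, instance_actions
```

Validating the instance id before using it matters in any implementation, bash included, because the id is later glued into a path under /deployments.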

On the other hand, you can tell whether a particular instance is running from the output of the standard ps (grep to the rescue again), and you can also call, say, netstat or mysql -e "SHOW DATABASES" right on the spot and send the result to standard output, lightly edited with sed or awk for readability. Very handy for diagnostics.
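The ps trick works because every Tomcat is started with its own -Dcatalina.base=/deployments/d### option, which shows up in the process command line. A sketch of that check in Python, operating on captured ps output (the function name running_instances is an assumption for this example):

```python
import re

CATALINA_BASE = re.compile(r"-Dcatalina\.base=/deployments/(d\d+)\b")


def running_instances(ps_output):
    """Return the sorted ids of instances whose Tomcat appears in ps output."""
    return sorted({match.group(1) for match in CATALINA_BASE.finditer(ps_output)})
```

The input would be the text of a full-format process listing that includes command-line arguments; an instance id missing from the result means its Tomcat is not running.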

Appetite comes with eating, so the console soon gained commands for killall -9 java (sometimes you want to start the week with a clean slate), uptime, and several other useful things. The most important one is the ability to delete an application instance together with its database. Cron, of course, cleans up the /deployments directory after two weeks (that was planned from the start), but sometimes you want to get a stale instance of a build for a PR rejected by the lead out of sight, so that it is not an eyesore.
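The two-week cleanup rule could be expressed like this. It is a Python sketch that only lists candidates and deletes nothing, and it assumes that an instance's age is judged by the modification time of its d### directory, which the text does not specify. The name stale_instances is made up for the example.

```python
import time
from pathlib import Path

MAX_AGE_SECONDS = 14 * 24 * 3600  # two weeks


def stale_instances(deployments_root, now=None):
    """Return names of d* instance directories not modified for over two weeks."""
    now = time.time() if now is None else now
    stale = []
    for entry in sorted(Path(deployments_root).glob("d*")):
        if entry.is_dir() and now - entry.stat().st_mtime > MAX_AGE_SECONDS:
            stale.append(entry.name)
    return stale
```

A scheduled job would then remove each listed directory together with the instance's database, which is the same operation the console exposes for manual deletion.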

A little more time passes, and the set of test cases grows so much that QA engineers have to create quite a lot of entities in the instance database to get through a full regression cycle for a large feature. That takes more than one day. And if in the meantime the developer pushes something to the branch as a result of code review, the instance database is redeployed after the build and the entities are lost. Oops.

We add the ability to take a snapshot of a deployed instance. This time we key it to the git revision number (its digits, as experiments showed, are unique enough) and put it in /deployments/s### (a different prefix letter, so that instances and snapshots live in different namespaces). We deploy it with roughly the same script as for builds from TeamCity, except that we copy the existing database instead of restoring it from the dump.

So QA engineers can test a specific revision until they are blue in the face, while the developer commits as many review fixes as needed to the main branch. Then, before the release, only those targeted changes are checked on the main instance.


Wow! In just six months we went from a chaotic process, where developers committed features however they pleased, to a logical, orderly continuous integration pipeline in which every step is regulated and every tool is automated as far as possible.

As soon as a developer creates a PR, the deployment of the test instance is, for all practical purposes, already under way, and literally an hour later (if you are lucky: as the team grew, the number of parallel feature branches soon reached the hundreds, and I had to bring up as many as seven instances for TeamCity) QA can start testing the feature. They can drive it manually or with scripts through the REST API, and if necessary diagnose it and deal with bugs using the test instance management console.

What follows is just lyrics. After a while everyone got tired of the console's slowness, and I had to recall my youth and rewrite it from bash (alas, all the abnormality of this little project was lost at once) in plain, boring PHP (then again, Java is hardly the tool for such tasks), while one of the front-end developers converted the UI from old-school plain HTML into a thoroughly modern Angular application. I did insist on keeping the nineties-style look, just for fun. We added the ability to view Tomcat's stdout and stderr, made a special CLI for calling the REST API right on the spot, and added a few other useful little things.

It turned out to be terribly convenient. *

Just look at the happy faces of the QA engineering team!

* Want one like this?
Write to me. I will gladly consider job offers from places that need experienced specialists (over 10 years of experience) with Primary Skill == Java and the ability to do this kind of abnormal programming now and then. Or to steer processes. Or both at once.

The one thing is, I cannot move to Moscow. But I would happily work remotely.
