Reksoft June 25, 2019 at 17:39

How we broke the old hut and built a skyscraper in its place

Zurab Bely, team lead in the Java practice, tells the story of his work on a project for a large company and shares his experience.

How I settled ...

I joined the project at the end of summer 2017 as an ordinary developer. I can’t say I liked it much at the time: the technologies were old, communication within the team was minimal, and communication with the customer was difficult and unproductive. That is how the project greeted me. At that point I had only one wish: to get out of it as quickly as possible.

I’ll tell you a little about the project as a whole. It is the official portal of a large company, with general information, news, promotions and other content. All marketing newsletters contain links to particular pages of the site, so the load is usually moderate but can spike at certain moments. The stability and availability of the web application require special attention: every minute of downtime means large losses for the customer.

A shanty that leaned in the wind

At first, I mostly studied the technical state of the project, fixed small bugs, and made minor improvements. From a technical point of view, the application looked terrible: a monolithic architecture built around an outdated commercial version of dotCMS; code written in Java 6 when Java 9 had already been released; server-side rendering of the client part on the Velocity framework, which by then had not been supported for several years. Each instance ran in JBoss AS and was routed through Nginx. Memory leaks led to constant node restarts, and the lack of proper caching increased the server load. But the biggest splinter was the changes made to the CMS code itself. They ruled out a painless upgrade to a newer version. A good illustration was the transition from version 3.2 to 3.7, which was just finishing at the time: moving to the next minor version took more than a year. Popular solutions such as Spring Framework, React.js, microservice architecture, and Docker were out of the question.

The deeper I went into the project, the more visible the consequences of this technical state became. The most acute was the inability to run the application locally for development and debugging. The whole team of 8 people worked on a single development stand, where a copy of the production version of the application was deployed. As a result, only one developer could debug code at a time, and deploying updated code blocked the entire team.

The low point was a failed sale, during which millions of emails, SMS messages, and push notifications were sent to users through dozens of channels, and tens of thousands of sessions were opened at the same time. The servers could not cope, and the portal was unavailable most of the time. The application scaled poorly. There was only one way to scale it: deploy another copy alongside and balance the load between the copies with Nginx. And each delivery to production involved a lot of manual work and took several hours.

Six months after I joined the project, when things were already getting out of control, a decision was made to change the situation radically. The transition began. The changes affected every area: team composition, work processes, architecture, and the technical side of the application.

We built, built ...

First came personnel changes. Several developers were replaced, and I was made team lead. The move to modern solutions began with bringing in people who had experience with them.

The process changes were more far-reaching. By then, development already followed the Agile/Scrum methodology, with two-week sprints and a delivery at the end of each iteration. In practice, though, this did not speed the work up; on the contrary, it slowed it down. Daily stand-ups dragged on for one and a half to two hours and produced no results. Planning and grooming turned into arguments, quarrels, or idle chatter. Something had to be done about this. At first it was very hard to change anything here: the team had almost lost the customer's trust, especially after the failed sale. Every change had to be justified, discussed, and argued for at length. Oddly enough, the initiative came from the customer. They brought in a Scrum master to oversee the correct use of approaches and methodologies, to tune the processes, and to get the team into working shape. Although he was brought in for only a few sprints, this helped us lay the foundation properly. The approach to communicating with the customer changed a lot. We began to discuss process problems more often, retrospectives became more productive, developers were more willing to give feedback, and the customer, for their part, met us halfway and supported the transition in every way.

To be honest, though, there were quite a few moments when changes within the team were made “blind”, and the customer was told only after positive results appeared. Within six months the relationship had turned into comfortable working communication. That was followed by several team-building events, one-day and two-day meetings of the entire development team with the customer’s team (marketer, analyst, designer, product owner, content managers, and others), and joint trips to the bar. A year on, and to this day, the communication can be called friendly. The atmosphere has become warm, relaxed and comfortable. Of course, conflicts still happen, but even the happiest family sometimes quarrels.

No less interesting changes took place during this period in the application code, its architecture, and the solutions it used. If you are not technically minded, you can safely skip ahead to the conclusion. And if you are as lucky as I am, welcome! The whole transition can be divided into several stages. Here is each one in more detail.

Stage 1. Identification of critical problem areas.

Everything here was as simple and clear as possible. First of all, we had to get rid of the dependence on a third-party commercial product, split the monolith, and make local debugging possible. I wanted to separate the client and server code, both architecturally and physically. Another problem area was testing. The project had no automated testing at all. This made the transition somewhat harder, since everything had to be checked manually. Given that there had never been written specifications for the functionality (a peculiarity of this project), there was a high chance of missing something. Having written down the problem areas, we looked at the list once more. It looked like a plan. Time to build a skyscraper!

Stage 2. Updating the code base.

This was the longest stage. It all started with the transition to a service architecture (not to be confused with microservices). The idea was to break the application into several separate services, each handling its own specific task. The services were not meant to be “micro,” but I did not want to throw everything into one pot either. Each service was to be a Spring Boot application written in Java SE 8 and running on Tomcat.

The first was the so-called content service, which became a layer between the future application and the CMS: an abstraction on the path to the content. The idea was that all the requests we had previously made directly to the CMS would go through this service, but over HTTP. This reduced coupling and made it possible to later replace dotCMS with a more modern alternative, or even to drop commercial products altogether and write our own solution tailored to our tasks (looking ahead, I will say that this is the path we took).
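As a rough illustration of such a layer, here is a minimal Java sketch. The names ContentSource, LegacyCmsSource and ContentService are hypothetical, and an in-memory map merely stands in for the real queries to dotCMS; the actual service was a Spring Boot application reached over HTTP, which this sketch does not reproduce.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical contract: callers ask for a page by path and never see the CMS itself.
interface ContentSource {
    Optional<String> findPage(String path);
}

// Stand-in for an adapter around dotCMS; a map replaces the real CMS queries.
final class LegacyCmsSource implements ContentSource {
    private final Map<String, String> pages = new HashMap<>();

    void addPage(String path, String body) {
        pages.put(path, body);
    }

    @Override
    public Optional<String> findPage(String path) {
        return Optional.ofNullable(pages.get(path));
    }
}

// The layer the rest of the application would call (in the real project, over HTTP).
final class ContentService {
    private final ContentSource source;

    ContentService(ContentSource source) {
        this.source = source;
    }

    // Returns the page body, or a placeholder when the source has no such page.
    String pageOrNotFound(String path) {
        return source.findPage(path).orElse("not found: " + path);
    }
}
```

Because callers depend only on ContentSource, replacing the CMS in a design like this would mean writing another implementation of that interface rather than touching the calling code.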

At the same time we laid the groundwork for separating the frontend and backend code. We created a front-end service responsible for serving the code written in React. We hooked up npm, configured Node, and set up the build: everything as current front-end practice expects.

In general, functionality was moved into a service according to the following algorithm:

we created a new Spring Boot application with all the necessary dependencies and settings;
we ported all the basic logic (often writing it from scratch and consulting the old code only to make sure we had not missed any nuance); for the caching service, for example, this meant the operations to add to the cache, read from it, and invalidate it;
we wrote all new functionality against the new service;
we gradually rewrote old pieces of the application into the new service in order of importance.

At the start, we had only a few services:

Content service. Acted as a layer between the application and the CMS.
Cache service. A simple store built on Spring Cache.
AA service. At first it only served information about the authorized user. The rest remained inside dotCMS.
Front service. Responsible for serving client code.
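
To make the cache service's three basic operations (add, read, invalidate) concrete, here is a minimal sketch in plain Java. The real service was built on Spring Cache; the class SimpleCacheService and its method names are hypothetical stand-ins that avoid any framework dependency.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory cache; a ConcurrentHashMap keeps it safe for concurrent requests.
final class SimpleCacheService {
    private final Map<String, String> entries = new ConcurrentHashMap<>();

    // Add (or overwrite) a cached value.
    void put(String key, String value) {
        entries.put(key, value);
    }

    // Read a cached value; empty when the key is absent.
    Optional<String> get(String key) {
        return Optional.ofNullable(entries.get(key));
    }

    // Invalidate a single entry.
    void evict(String key) {
        entries.remove(key);
    }

    // Invalidate everything, e.g. when a full cache flush is requested.
    void evictAll() {
        entries.clear();
    }
}
```

This sketch has no expiry or size limit; a production cache would need both, which is one reason to rely on an existing caching library.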

Stage 3. Autotests.

Given all our experience on the project, we decided that having functional tests would greatly simplify our lives and any further updates to the application. It was time to introduce them. Unit tests, sad to say, stalled almost immediately. They took a lot of time to write and maintain, and we had very little of it: besides rewriting the code, we still had current tasks for new functionality, and bugs surfaced often. We decided to focus only on testing the application's interface with Selenium. On the one hand, this made regression testing before a production delivery easier; on the other, it let us refactor the server side while monitoring the state of the client side. The team had no test automation engineer, and writing autotests and keeping them up to date required additional effort.

Stage 4. Deployment automation.

Now that we had separate services, the frontend was separated from the backend, and the main functionality had begun to be covered by autotests, it was time to open a can of beer and automate all the manual work of deploying and supporting the application locally and on the demo and prod servers. Cutting the monolith into pieces and using Spring Boot opened new horizons for us.

Developers could now debug code locally, running only the part of the functionality they needed. Test stands finally began to be used for their intended purpose: the code that reached them was already more or less debugged and ready for initial and qualification testing. But there was still a lot of manual work that wasted our precious time and energy. After studying the problem and going through the options, we settled on the combination of Gradle + TeamCity. Since the project was already built with Gradle, adding another tool made no sense, and the scripts we wrote turned out to be platform independent: they can be run on any OS, remotely or locally. This made it possible not only to use any CI/CD solution but also to switch painlessly to any other platform. TeamCity was chosen for, among other things, its rich built-in functionality.

At the moment, there are more than 100 optimization scripts and more than 300 tasks in the CI system that run them with different parameters. They cover not only deployment to test stands and delivery to production, but also work with logs, server management, routing, and solutions for repetitive routine tasks. Some tasks were taken off our shoulders entirely. Content managers can flush the cache themselves. Technical support can restart individual services on their own and perform first-aid recovery. The developers' sleep has become deeper and calmer.

Stage 5. Own CMS.

Once we had abstracted away from the commercial CMS, it became possible to get data from another source without rewriting the application code. The content service decided where each piece of data came from. After searching for ready-made solutions and analyzing them, we decided to write our own content management system, since none of them fully met our needs. Writing your own CMS is an endless process: new needs and wishes keep appearing. We picked a few basic features and set off into the wonderful world of development. We had one and a half man-months to launch the first version in production. As functionality becomes ready in the new CMS, we move content to it from the old one. New pages no longer have anything to do with dotCMS.
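
One way a content service can decide where data comes from during such a migration is to ask the new CMS first and fall back to the old one. The sketch below only illustrates that idea; the source does not describe the actual routing rules, and the class MigratingContentRouter and the lookup functions passed to it are hypothetical.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical router: tries the new CMS first, then falls back to the legacy one.
final class MigratingContentRouter {
    private final Function<String, Optional<String>> newCms;
    private final Function<String, Optional<String>> legacyCms;

    MigratingContentRouter(Function<String, Optional<String>> newCms,
                           Function<String, Optional<String>> legacyCms) {
        this.newCms = newCms;
        this.legacyCms = legacyCms;
    }

    // Content already migrated is served from the new CMS; everything else from the old one.
    Optional<String> find(String path) {
        Optional<String> migrated = newCms.apply(path);
        return migrated.isPresent() ? migrated : legacyCms.apply(path);
    }
}
```

With this shape, moving a page to the new CMS requires no change in the callers: once the new source returns it, the legacy lookup is simply never reached for that path.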

Stage 6. Final polish.

Having rolled up our trousers, we set off into the world of hipster programming. For my project this stage is the final one in the restructuring, and it continues to this day. The main reason this stage exists at all is scaling. The Docker + Kubernetes + Consul bundle not only makes it easier to deploy to different servers and manage each service separately, but also lets us scale the whole application flexibly: only where it is needed, and only for as long as it is needed. I will be able to describe it in more detail only once we have fully switched to this solution in production.

... and finally built. Hurrah!

A year has passed since the application update began. It now consists of 26 REST services written with Spring Boot. Each has detailed API documentation in Swagger UI. The client side is written in React.js and separated from the server side. All the main pages of the portal are covered by tests. We have run several major sales. The move to modern technologies, getting rid of old code, and stable servers are strong motivation for the developers. From “we do as we are told”, the project has moved to a state where everyone is invested in its success and offers their own ideas for improvement and optimization. The customer's attitude toward the team has changed, and a friendly atmosphere has formed.

This is the result of the work of the whole team: every developer and tester, the managers, and the customer. It was very hard and nerve-racking, and at times right on the edge. But team cohesion, competent planning, and awareness of the results made it possible to overcome all the difficulties.
