Developing a new product branch: how to discard the impractical and keep the useful



    Hello, Habr! My name is Dmitry, and I am a developer at ISPsystem. We recently released a beta of the new version of our virtual machine control panel. Today I will tell you how we decided what to carry over from the old product and what to leave behind. I will go through the issues that mattered most to us: the library for working with libvirt, support for multiple operating systems during product installation, the move from a monolith to microservices, and virtual machine deployment.

    The article is about VMmanager, a system for managing, deploying, and monitoring virtual machines based on KVM and OVZ virtualization. The fifth generation came out in 2012. Since then the interface has grown badly outdated, and the centralized architecture has held back the product's development. It was time to build a new version.

    The first story. We put the house elves to work

    Work with libvirt: consider options, select libraries

    Our product uses libvirt as its tool for managing KVM virtualization. In 2012 a C library was chosen to work with it, because that was more convenient for the development team at the time. The result: a large amount of C++ code calling a C library that implements the direct work with libvirt.

    Now, on the threshold of a new project, we look back and examine our product, weighing whether each solution or technology is worth taking along: what has proven itself, and what should be remembered and never repeated.

    We sit down and hold a retrospective of the many years of work on the previous version of the product. We arm ourselves with patience and stickers and write out three kinds of notes:
    1. What succeeded in the product? What did users praise? What never drew complaints? What did we like ourselves?
    2. What failed? What caused constant problems? What hindered the work, and why did we start a new branch?
    3. What can be changed? What do users ask for? What do team members want to change?

    The group of people enthusiastically covering paper in ink should include both those who have worked closely with the product for ages and those who can look at it with fresh eyes. Do not forget the feature requests and the product manager. The finished stickers go up on the board; they will definitely come in handy.

    Back to the story. We examine a piece of code where the C++98 standard peacefully coexists with C library calls. We remind ourselves that it is 2018 and decide to leave it alone. But how do we reproduce the functionality for working with virtual machines (VMs) while making the code more compact and convenient to work with?

    We study the question and realize that no matter which solution we choose, and in which language, it will be a wrapper over the C library. One interesting option worth noting is the Go library from DigitalOcean, which uses the RPC protocol to talk to libvirt directly, but it has its drawbacks. We settled on the official Python bindings for libvirt.

    As a result, we gained speed of writing code along with ease of use and reading. These fine words deserve an explanation.
    • Speed. We can now quickly prototype some part of the work with a domain directly from the console on a debug server, without rebuilding the main application.
    • Simplicity. Instead of many C++ method calls inside a handler, we have a single Python script invocation with parameters.
    • Debugging is likewise as quick and painless as possible. In the long run, I think this can enable an interesting user experience. Imagine a system administrator, unhappy that his virtual machines wait for a graceful shutdown before destroy, going in and redefining the script for the host_stop method. Shall I write a panel for you as well?
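The shutdown-before-destroy behavior mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not our production code: safe_stop and its parameters are made-up names, and the function only assumes an object exposing the libvirt virDomain methods shutdown(), isActive(), and destroy():

```python
import time

def safe_stop(domain, timeout=30, poll_interval=1):
    """Ask the guest to shut down gracefully; destroy it if it does not comply.

    `domain` is expected to behave like a libvirt virDomain: shutdown()
    sends an ACPI request, isActive() reports running state, destroy()
    is a hard power-off.
    """
    domain.shutdown()  # polite ACPI shutdown request to the guest
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not domain.isActive():
            return "shutdown"  # the guest powered off on its own
        time.sleep(poll_interval)
    domain.destroy()  # the guest ignored us: hard power-off
    return "destroyed"
```

With the real bindings, the domain object would come from something like libvirt.open('qemu:///system').lookupByName(name); an administrator could swap in his own policy here without touching the main application.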

    As a result, we got a simple and convenient tool for working with virtual machines at the server level.

    The second story. A well-packaged product needs no extra pampering

    Product distribution: we refuse from many packages and we pass to Docker


    VMmanager 5 is distributed as a set of Linux packages. CentOS 6/7 and, until recently, Debian 7 were supported. What does this mean? It means more build servers for CI/CD, more testing, more attention to the code. We have to keep in mind that the official CentOS 7 repository carries qemu version 1.5.3, while CentOS 6 has 0.12.1; at the same time, a user may enable repositories where the version of this package is much newer. That means supporting different versions of the API when working with VMs, in particular during migration. We have to remember the differences between init systems (init, systemd) and account for differing names of packages and utilities. Utilities that work on CentOS will not work on Debian, or their versions in the official repositories differ widely. For every push you need to build packages for all the versions, and preferably not forget to test them, too.

    None of this suits us in the new product. To avoid supporting divergent logic, we drop several systems and keep only CentOS 7. Problem solved? Not quite.

    We also do not want to check the operating system version before installation, whether the necessary utilities are present, or which SELinux rules are in effect, and we do not want to reconfigure the firewall and repository lists. We want to set things up once and then deploy the whole environment and the product itself with a single click, destroying and recreating it at will. No sooner said than done: the project is wrapped in a Docker container.

    Now it is enough to run:
    # docker pull vmmanager
    # docker run -d vmmanager:latest

    The panel is up and running.

    Of course, I am exaggerating: the user still has to install Docker, we have more than one container, and at the moment VMmanager runs in swarm mode as a SaaS service. What we ran into when choosing Docker, and how we dealt with it, could fill a separate article.

    The point stands: it is important to simplify the development and, above all, the deployment of your product; our install.sh once ran to 2097 lines.

    As a result:
    1. A homogeneous installation environment simplifies the program code and reduces build and testing costs.
    2. Distributing the application as a Docker container makes deployment simple and predictable.

    The third story. First relationship with microservices

    Architecture: abandoning the monolith in favor of microservices, or not quite


    The fifth version of the product is a large monolithic system written to an outdated C++ standard. The consequences: introducing new technologies is problematic, refactoring the legacy code is difficult, and horizontal scaling is poor. In the new branch we decided to use the microservice approach as one way to avoid these problems.

    Microservices are a modern trend with both pros and cons. I will try to present my view of the strengths of this architecture and talk about solving the problems it brings to a project. Note that this is a first practical look at microservice architecture from the perspective of an ordinary developer. Aspects I do not touch on are covered in a good review article.

    Positive sides


    A small service opens up many possibilities.
    Besides being convenient to write, test, and debug, microservices brought a new programming language into the project. When your project is a monolith, it is hard to imagine that one day you will rewrite part of it in another language that interests you. With a microservice architecture, go right ahead. Beyond the programming language, you can also try new technologies, with the sole caveat that all of it must be justified for the business. For example, we wrote some of the microservices in Go and saved a fair amount of time doing so.

    Team scaling
    The many people who used to commit to a single repository and tried to keep the structure of the monolith in their heads can be split into several teams, each responsible for its own service. Bringing a new person up to speed also becomes much simpler and faster, thanks to the limited context they will work in. On the other hand, there are fewer keepers of global knowledge who can answer questions about any aspect of a huge system. Perhaps in time I will reconsider this point.

    Independent degradation
    I would count independent degradation as both a positive and a negative of microservices, because who needs your application if, say, the authorization service is down? Still, on balance it is a positive. Previously, collecting statistics from several hundred virtual machines made our monolith work hard, and at peak load the wait on a user request grew significantly. A separate statistics collection service can gather them without affecting the other services, and it can be scaled further by adding new hardware or by increasing the number of collectors of those same statistics. We can even dedicate a separate server to Graphite, where this service writes its statistics. With a monolith and its single database, this is impossible.

    Negative sides


    Request Context
    All my debugging in the monolith came down to two commands in the console:
    # tail -n 460 var/vmmgr.log | grep ERR
    # tail -n 460 var/vmmgr.log | grep thread_id_with_err

    Done! I could track the entire request, from the moment it entered the system until the error occurred.

    But what now, when a request travels from microservice to microservice, accompanied by additional calls to neighboring services and writes to various databases? For this we implemented request info, which carries the request identifier and information about the user or service that issued it. It becomes easier to trace the whole chain of events, but the urge arises to write a log aggregation service, since we have a microservice architecture after all. We are also looking toward Elasticsearch; the question is open and will be resolved soon.
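As a sketch of the idea (the function names, header names, and log format here are illustrative, not VMmanager's actual implementation), request info can be as simple as an identifier generated at the system boundary and forwarded with every downstream call and log line:

```python
import uuid

def new_request_info(initiator):
    """Create the request context when a request first enters the system."""
    return {"request_id": uuid.uuid4().hex, "initiator": initiator}

def outgoing_headers(info):
    """Headers forwarded with every call to a neighboring microservice."""
    return {"X-Request-Id": info["request_id"],
            "X-Initiator": info["initiator"]}

def log_line(info, service, message):
    """A log line that lets a grep by request_id reassemble the whole chain."""
    return f"{info['request_id']} [{service}] ({info['initiator']}) {message}"
```

Every service logs with the same identifier, so the old two-grep workflow keeps working: grep by request_id instead of thread_id.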

    Data inconsistency
    Data in microservices is decentralized; there is no single database holding all the information. While thinking this article over, I went through the main interactions between our microservices in my mind, looking for where duplicates can appear and where we use cross-service transactions, and realized that we had solved the inconsistency problem with a monolith.

    We really did build a monolith with one main database, wrapping most of the transactional actions inside it, and gathered around it the microservices that do not affect the consistency of the core data. The exception is the authorization service plus monolith pair. The problem here is that the main application database does not contain the users as such, their roles, or their additional parameters; all of that lives in the authorization service.

    A system user may be working with virtual machines in the monolith while, in the authorization service, his rights are being changed or he is being blocked outright. The system must react to this in a timely manner. In this situation, data consistency is achieved by checking the user's parameters before executing any request.
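A minimal sketch of such a pre-request check, under the assumption of a hypothetical authorization-service client (check_access, get_user, and the field names are all illustrative, not our real API):

```python
class AccessDenied(Exception):
    """Raised when the authorization service vetoes a request."""

def check_access(auth_client, user_id, action):
    """Verify the user's current rights immediately before a request runs.

    `auth_client` stands in for a client of the authorization service;
    get_user() is assumed to return None for unknown users, or a dict
    like {"blocked": False, "permissions": {"vm.start", "vm.stop"}}.
    """
    user = auth_client.get_user(user_id)
    if user is None or user.get("blocked"):
        raise AccessDenied(f"user {user_id} is unknown or blocked")
    if action not in user.get("permissions", ()):
        raise AccessDenied(f"user {user_id} may not perform {action}")
```

Because the check happens on every request rather than once per session, a block or a role change in the authorization service takes effect immediately.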

    As for the remaining microservices: a failed write to the statistics service does not affect the operation of a virtual machine, and the action can always be retried, so statistics collection becomes a microservice. But a define-domain service (creating a virtual machine via libvirt) will never see the light of day, since nobody needs a stub of a machine that does not actually exist.

    The fourth story. Fresh is the enemy of the good

    VM Deployment: Installing from Images instead of Installing over a Network

    In the fifth version of the product, deploying a virtual machine takes quite a long time by today's standards. The reason is that the operating system is installed over the network.

    For CentOS, Fedora, and Red Hat, this is the kickstart method:
    1. Create a kickstart file.
    2. Pass a link to the answer file in the kernel parameters: linux inst.ks=<link to the kickstart file>.
    3. Run the kickstart installation.

    A kickstart file is quite flexible: it can describe every installation step, from the installation method and the time zone all the way to disk partitioning and network settings. The url parameter in our templates indicates installation from a remote server.
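For illustration, a minimal kickstart file for a network install might look like this (the mirror URL and the password are placeholders, and this is deliberately far simpler than a real template):

```
# Minimal kickstart sketch; not our production template
install
url --url=http://mirror.example.com/centos/7/os/x86_64/
lang en_US.UTF-8
timezone UTC
rootpw --plaintext changeme
network --bootproto=dhcp
clearpart --all --initlabel
autopart
reboot
```

The url directive is what makes this a network installation: the installer fetches every package from the remote server, which is exactly where the deployment time goes.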

    For Debian and Ubuntu, there is the preseed method:
    It is similar to the previous one and is likewise built around a configuration file and its contents. In it, too, we configured installation over the network.

    The installation for FreeBSD is similar, but instead of a kickstart file there is a shell script of our own making.

    Positive aspects of the approach


    This installation option allows the same template to be used in two of our products: VMmanager and DCImanager (dedicated server management).

    Virtual machine deployment is quite flexible: the panel administrator can simply copy an operating system template and change the configuration file as he sees fit.

    All users always have up-to-date versions of the operating systems, as long as they are updated in time on the remote server.



    Negative sides


    As practice showed, VMmanager users did not need that installation flexibility: compared with dedicated servers, few people cared about specific kickstart settings for virtual machines. But waiting for the OS to install really was an impermissible luxury. The flip side of keeping operating systems current is that part of the installer lives on the network while part is local, in the initrd, and their versions must match.

    These problems are solvable. You can maintain a pool of pre-installed machines and run your own repository of operating systems, but that entails additional costs.

    How do we solve these problems without creating repositories and pools? We chose operating system image files. Now the installation process looks like this:
    1. Copy the OS image onto the virtual machine's disk.
    2. Grow the image's main partition by the amount of free space left after copying.
    3. Perform basic setup (set the password, time zone, and so on).
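These three steps can be sketched as shell commands built around qemu-img and the libguestfs tools. This is an illustration under assumptions, not our actual pipeline: build_deploy_commands is a made-up helper, and /dev/sda1 as the main partition and the qcow2 format are examples:

```python
def build_deploy_commands(image_path, disk_path, size_gb, root_password, hostname):
    """Return shell commands for the three image-based deployment steps.

    Assumes qemu-img and the libguestfs tools (virt-resize, virt-customize)
    are installed on the node.
    """
    return [
        # 1+2. Create the target disk at its final size, then copy the image
        #      onto it, expanding the main partition into the free space.
        f"qemu-img create -f qcow2 {disk_path} {size_gb}G",
        f"virt-resize --expand /dev/sda1 {image_path} {disk_path}",
        # 3. Basic setup: root password, hostname, and so on.
        f"virt-customize -a {disk_path} "
        f"--root-password password:{root_password} --hostname {hostname}",
    ]
```

Nothing here touches the network, which is the whole point: deployment time is bounded by disk copying rather than by a remote package mirror.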

    Everything new is well-forgotten old: we had already used OS images in VDSmanager-Linux, the progenitor of VMmanager.

    But what about installation flexibility? Practice showed that most users are not interested in specific network settings or the disk layout of their virtual machines.
    And the freshness of the data? It is achieved by keeping images with the latest OS versions in the repository, while minor updates can be installed by the initial configuration script. The virtual machine is thus already created and running, and logging into it you may well find the proverbial yum update in progress.

    In return we get a ready-made virtual machine whose deployment time depends only on copying the disk, growing the disk partition, and booting the operating system. This approach also gives us the ability to create our own images and share them. A user can install a LAMP stack or some other complex environment on a virtual machine and then make an image of it; other people then no longer have to waste time installing the necessary utilities.

    We implemented configuration and partition modification using utilities from the libguestfs suite. For example, changing a password on a Linux machine turned from 40 lines of code built on mount, chroot, and usermod into a single line:

    command = "/usr/bin/virt-customize --root-password password:{password} --domain '{domain_name}'".format(password=args.password, domain_name=args.domain_name)

    As a result, we made obtaining a finished virtual machine as fast as possible. It is worth noting that with network configuration and the installation of internal scripts, deployment time grew slightly; we addressed this by displaying the installation steps in the frontend, filling the pause between the machine's creation and its full readiness.

    We also got a more flexible approach to deploying virtual machines, one that makes it convenient to build your own images with the required environment.

    What we managed to do


    In the sixth version of the product, we tried to address the main drawback of the fifth: the complexity of user interaction with the product. We reduced the time key actions take; together with a non-blocking interface, this makes it possible to work with the panel without forced waits. Containerization has made the installation process simpler and more convenient. The use of modern technologies and a variety of programming languages has simplified support and maintenance for both programmers and technical support specialists. The switch to microservices lets us add new features quickly and with few restrictions.

    In conclusion, I want to say that a new product is a good opportunity to try other development approaches and new technologies. Just remember why you are doing it and what new things it will bring to you and your users. Go for it!

    We invite the Habr community to try the beta version of VMmanager 6 and leave feedback. To do so, go to my.saasvm.com, log in, and connect a dedicated server (CentOS 7 x64, Internet access, public IP address).

    If you do not have a server, write to us at help@ispsystem.com or in the chat on the site, and we will provide test hardware from our partner, Selectel.

    Read more in the news on the ISPsystem website .
