
When reboot time matters, or why IBM uses CRIU on mainframes
Now that a bright future is widely predicted for microservices, working on technologies that update code without a reboot may seem strange. After all, microservices and containers are much easier to “kill” and recreate. Nevertheless, we continue to work on CRIU, our checkpoint/restore and live migration tool, and developers from IBM are actively helping us with it. Why? Let's try to explain.

With virtualization everywhere and container architectures converging and succeeding, patching is starting to look like a relic. Why install updates and reboot when you can simply create the container again? That holds for user applications and services, and for development and testing. But practice shows that the infrastructure all of this runs on requires a completely different approach. It is the stability and constant availability of heavy services, such as databases, that allows microservices to start at any moment and use any data.
Everyone understands that systems which take a long time to start and warm up should not reboot often, and ideally should never restart at all. The more powerful the system and the more microservices depend on it, the more costly it is to shut it down for a reboot. One solution to this problem is ReadyKernel, which installs updates to a Linux host OS running many virtual machines and containers without rebooting it. Another way to reduce service downtime is offered by our CRIU project.
CRIU becomes the standard
Despite the doubts CRIU met while this open source tool was taking shape (then again, Gates was also laughed at when he first presented a tablet), today CRIU is integrated with OpenVZ, Docker, LXC and CoreOS containers, is included in Ubuntu, Debian, OpenSUSE, Altlinux and several other Linux distributions, and is supported by developers from various companies, including IBM. Interestingly, it was Big Blue that made one of the largest contributions to CRIU: today the tool works on several platforms at once, namely x86_64, ARM, aarch64, PPC64 and s390. Two of them, PowerPC64 (PPC64) and s390, are IBM's own creations. Support for the latter was announced as recently as the summer of 2017.

To explain why one of the largest hardware and software companies needs such tools, a little insight into the project itself is required. CRIU lets you “freeze” an application so that it can later be resumed on another host or in a different container. When the tool is used correctly, the application should not even notice that it was moved, and it keeps working as if nothing had happened. As already mentioned, microservices do not need this at all, but it turns out to be very useful for other workloads, including those that run on mainframes.
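As a rough illustration of the “freeze and resume” cycle described above, here is a minimal sketch that only builds the `criu` command lines for a dump and a restore; it does not run them. The helper names, the PID and the image directory are made up for the example, and real workloads usually need further options depending on what the process uses.

```python
import shlex


def criu_dump_cmd(pid, images_dir, leave_running=False):
    """Build the argv that checkpoints the process tree rooted at `pid`."""
    cmd = [
        "criu", "dump",
        "--tree", str(pid),          # root of the process tree to freeze
        "--images-dir", images_dir,  # where the image files are written
        "--shell-job",               # the process was started from a shell
    ]
    if leave_running:
        # Keep the original process alive after the dump.
        cmd.append("--leave-running")
    return cmd


def criu_restore_cmd(images_dir):
    """Build the argv that resumes the application from its images."""
    return ["criu", "restore", "--images-dir", images_dir, "--shell-job"]


# Hypothetical PID and directory, printed as copy-pasteable shell commands.
print(shlex.join(criu_dump_cmd(1234, "/tmp/ckpt")))
print(shlex.join(criu_restore_cmd("/tmp/ckpt")))
```

To move the application to another host, the image directory would have to be copied there between the two steps; CRIU itself needs root privileges to run either command.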

The microprocessor architecture of the high-performance s390 servers is unique; IBM develops it for its mainframe lineup. These multiprocessor, multi-threaded systems work with huge amounts of data, which leaves its mark on the architecture of the OS and applications. In the summer of 2017, IBM developers sent CRIU patches that make it possible to use CRIU on s390. CRIU is a low-level tool whose code sits close to the kernel, so it has to be adapted to each new architecture. For CRIU to work, support for platform-specific functions had to be implemented. Going from simple to complex, IBM developers added support for system calls, native data types and descriptors of a process's virtual address space, the necessary compiler settings, image formats for the registers, the trampolines for the parasite code that we inject into a process to “freeze” it, and architectural specifics such as TLS / GOT.

Users of IBM platforms deal with a fairly large amount of software that is too difficult to handle in the “kill and recreate” manner. For massive applications it is much more convenient to preserve the state of services, for example to resume work faster after a power loss. The ability to migrate containers in a “live” state makes it possible to free up servers for maintenance, load balancing and so on.

A sample of the work done, the trampoline for injecting parasite code on s390:
#include "common/asm/linkage.h"

	.section .head.text, "ax"

/*
 * Entry point for parasite_service()
 *
 * Addresses of symbols are exported in auto-generated criu/pie/parasite-blob.h
 *
 * Function is called via parasite_run(). The command for parasite_service()
 * is stored in global variable __export_parasite_cmd.
 *
 * Load parameters for parasite_service(unsigned int cmd, void *args):
 *
 *  - Parameter 1 (cmd) : %r2 = *(uint32 *)(__export_parasite_cmd + pc)
 *  - Parameter 2 (args): %r3 = __export_parasite_args + pc
 */
ENTRY(__export_parasite_head_start)
	larl	%r14,__export_parasite_cmd
	llgf	%r2,0(%r14)
	larl	%r3,__export_parasite_args
	brasl	%r14,parasite_service
	.long 0x00010001	/* S390_BREAKPOINT_U16: Generates SIGTRAP */
__export_parasite_cmd:
	.long 0
END(__export_parasite_head_start)
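For readers who do not speak s390 assembly, the data flow of this trampoline can be imitated in a few lines of Python. This is only a toy model under stated assumptions: the blob layout (a 4-byte command word followed directly by the argument area), the offsets and the stand-in `parasite_service` are invented for the illustration and are not taken from CRIU.

```python
import struct

# Hypothetical layout, for illustration only: a 4-byte command word
# followed by the argument area. s390 is big-endian, hence the ">" prefix.
CMD_OFFSET = 0
ARGS_OFFSET = 4


def parasite_service(cmd, args):
    """Stand-in for the real C function; it only reports what it received."""
    return f"cmd={cmd} args={args!r}"


def run_trampoline(blob):
    # llgf %r2,0(%r14): load a 32-bit word and zero-extend it to 64 bits.
    (cmd,) = struct.unpack_from(">I", blob, CMD_OFFSET)
    # larl %r3,__export_parasite_args: pass the location of the argument
    # area (modelled here as the bytes that start at that offset).
    args = bytes(blob[ARGS_OFFSET:])
    # brasl %r14,parasite_service: call the service, then continue below.
    result = parasite_service(cmd, args)
    # .long 0x00010001: breakpoint opcodes; the process receives SIGTRAP
    # and the tool that injected the code can take over again.
    return result, "SIGTRAP"


blob = struct.pack(">I", 7) + b"hello"
print(run_trampoline(blob))
```

The point of the design is that everything is addressed relative to the program counter, so the blob works wherever it happens to be placed in the target process.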
IBM's involvement in the CRIU project is not limited to mainframes and the s390 architecture. A few years ago we received patches from IBM that added PPC64 support. Those IBM machines are workstations and simpler servers, not mainframes. But the most interesting contribution IBM developers have made to CRIU is lazy migration.

Lazy migration works as follows: the container moves from one host to another without the contents of its memory. This reduces the size of the images by an order of magnitude or more and is very effective for applications that hold huge amounts of data in memory. Take the JVM: its full image can occupy tens of megabytes (not counting the memory that the program running in it allocates for itself), while without the memory contents it takes only a few tens of kilobytes. As a result, migration is many times faster and the pause in the application's work is shorter. In essence, IBM's additions provide remote access to the memory left behind and migrate it asynchronously as needed.
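The idea of fetching memory only when it is touched can be shown with a small toy model. Everything in it (the class names, the 4096-byte page size, the in-process “source host”) is an assumption made for the illustration; in CRIU itself missing pages are detected with the kernel's userfaultfd mechanism and travel over the network, which this sketch does not attempt to reproduce.

```python
PAGE_SIZE = 4096  # bytes per page; an assumption for this toy model


class SourceHost:
    """Stands in for the source host, which keeps the memory pages."""

    def __init__(self, pages):
        self._pages = dict(pages)
        self.requests = 0  # how many pages were sent over the "network"

    def page_numbers(self):
        return sorted(self._pages)

    def fetch(self, page_no):
        self.requests += 1
        return self._pages[page_no]


class LazyMemory:
    """Destination-side memory that starts out empty."""

    def __init__(self, source):
        self._source = source
        self._local = {}

    def _ensure(self, page_no):
        if page_no not in self._local:  # the "page fault"
            self._local[page_no] = self._source.fetch(page_no)

    def read(self, addr):
        page_no, offset = divmod(addr, PAGE_SIZE)
        self._ensure(page_no)
        return self._local[page_no][offset]

    def pull_remaining(self):
        """Model the background transfer of pages nobody has touched yet."""
        for page_no in self._source.page_numbers():
            self._ensure(page_no)

    @property
    def resident_pages(self):
        return len(self._local)


source = SourceHost({n: bytes([n]) * PAGE_SIZE for n in range(64)})
memory = LazyMemory(source)
memory.read(5 * PAGE_SIZE + 10)  # first touch: one page is fetched
memory.read(5 * PAGE_SIZE)       # same page again: served locally
print(source.requests, memory.resident_pages)
```

The application can start running on the destination after only the pages it actually touches have arrived, while `pull_remaining` plays the role of the asynchronous transfer that brings over the rest.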
Nevertheless, there are many situations in which the system still has to be rebooted, and here the ability to stop an application is useful too. CRIU lets you stop a container, reboot the system, and start the container on it again. This solves the patching problem in the difficult cases where the system cannot be updated without a reboot.
Conclusion
Thanks to broad support for the CRIU project, today any developer can “freeze” and live-migrate applications on five different architectures. IBM's contribution made it possible not only to use CRIU on mainframes and PPC64 servers, but also to use the “lazy migration” mechanism on other platforms.
Moreover, these changes prompted us to create a separate library, Compel, which lets you inject parasite code into processes and make them execute specific instructions. Today Compel is used in CRIU itself as well as in a new live application patching system. We will cover that system and the Compel library in the next post.