“One of our daily processes was sped up from 3 hours to 15 minutes”: Andrey Bogoslovskikh on in-memory computing at SberTech
The words "in-memory computing" sound tempting and futuristic. Who would not want to eliminate the hard disk speed “bottleneck” by storing and processing data in memory? In practice there are nuances: for example, because RAM is volatile, the data still has to be duplicated to persistent storage, so the gain comes when reading but not when writing. What does this look like in the real world?
Sberbank Technologies (SberTech) has extensive relevant experience: the company is now actively working with Apache Ignite and has even invested in GridGain, the company that created it. So we decided to ask a few questions about this experience. Of course, it cannot be blindly transferred to any other company, but it is still valuable. The questions were answered by Andrey Bogoslovskikh, Director of the Competence Center of the business development support platform.
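The trade-off mentioned in the introduction (reads are served from memory, while every write still has to reach persistent storage) can be illustrated with a minimal write-through cache. This is a sketch, not Apache Ignite code: the `WriteThroughCache` class and its counters are hypothetical, and a plain dictionary stands in for the disk.

```python
class WriteThroughCache:
    """Hypothetical in-memory cache in front of a slower persistent store."""

    def __init__(self, store):
        self.store = store       # stands in for a disk or database
        self.memory = {}         # volatile copy held in RAM
        self.store_reads = 0     # how many reads had to go to the store
        self.store_writes = 0    # how many writes had to go to the store

    def put(self, key, value):
        # Every write goes to the persistent store as well as to memory,
        # so writing is no faster than without the cache.
        self.store[key] = value
        self.store_writes += 1
        self.memory[key] = value

    def get(self, key):
        # A repeated read is served from memory and never touches the store.
        if key in self.memory:
            return self.memory[key]
        self.store_reads += 1
        value = self.store[key]
        self.memory[key] = value
        return value


disk = {"acct-1": 100}            # made-up data for illustration
cache = WriteThroughCache(disk)
cache.put("acct-2", 250)          # written to both memory and "disk"
first = cache.get("acct-1")       # first read falls through to "disk"
second = cache.get("acct-1")      # second read is served from memory
print(cache.store_reads, cache.store_writes)
```

If the process holding `cache.memory` is lost, the data is still in `disk`, which is the reason for duplicating writes in the first place.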
- Last year, at our JPoint conference, you said that cooperation with GridGain is at the beginning of a long journey - what is the situation now?
- A year ago, about 200 people worked on distributed data processing at Sberbank Technologies, and now there are more than 1,000. These are not only the specialists who, together with GridGain employees, are developing the new Sberbank technology platform, but also the application teams that are rewriting existing Sberbank application software for the new IT platform. The core of the platform should be ready by the end of 2017, and in 2018 active implementation and rollout will begin. There is also a lot of work underway on monitoring the new IT system and on increasing its reliability and manageability.
- Why did you initially need Apache Ignite? Can you share specific numbers showing how its use affects your tasks?
- Because access to accounts has become simpler (mobile banking, Internet banking), the average transaction amount is decreasing, while the number of transactions is rising sharply. Just 10 years ago, mobile service was mostly paid for at the operator's offices, 1-2 months in advance. Now customers pay for it through mobile banking several times a month, but in smaller amounts. This is just one example of the ever-increasing number of transactions.
High system performance is therefore extremely important for us, and Apache Ignite can help with that. Tests show that one of Sberbank's daily batch processes takes about three hours on the current system and 15 minutes on Apache Ignite. The new system is also expected to handle more than 10,000 client bank card operations per second, compared to the current 3,000 operations per second at peak (and usually around 500).
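To put the figures quoted above in proportion, here is a short calculation. It uses only the numbers from the answer; the variable names are illustrative, and since the target is stated as "more than 10,000", the throughput ratios are lower bounds.

```python
# Batch process duration, from the interview: about 3 hours vs 15 minutes.
old_batch_minutes = 3 * 60
new_batch_minutes = 15
batch_speedup = old_batch_minutes / new_batch_minutes

# Card operations per second, from the interview.
target_ops = 10_000   # expected on the new system ("more than")
peak_ops = 3_000      # current peak
typical_ops = 500     # current typical load

headroom_over_peak = target_ops / peak_ops
headroom_over_typical = target_ops / typical_ops

print(f"batch speedup: {batch_speedup:.0f}x")
print(f"throughput vs current peak: at least {headroom_over_peak:.1f}x")
print(f"throughput vs typical load: at least {headroom_over_typical:.0f}x")
```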
- Do you think Apache Ignite is suitable only for those who work at such a gigantic scale, or can it be just as useful for small companies?
- The product is extremely interesting. Apache Ignite is applicable not only in large businesses, but is also ready for use “out of the box” in medium and small companies, which is certainly a great advantage. And by participating in it as an open source project, we are trying to make the level of maturity, reliability and effectiveness of Sberbank IT solutions accessible to everyone.
- So SberTech commits to Apache Ignite itself, even though you work closely with GridGain, the company behind it, which probably takes your needs into account in its own development?
- At the end of last year, the company's management decided to create a separate unit for developing the open source projects that are key to the company's IT landscape.
For the company, the benefit of developing open source is that the products become available to a wider audience. SberTech has extensive and truly unique experience in developing projects that meet very stringent requirements for reliability, availability, performance and scalability. Such experience is difficult to get elsewhere (tasks of a similar level are set by companies on the scale of Amazon, Google, etc.). Few in the developer community have come across our use cases.
We decided to focus on several key streams in open source: reliability, performance, elimination of technical debt, and expanding the ecosystem and the integration options of the projects.
Apache Ignite was the first such open source project, and we work closely with GridGain. There are many requirements for the implementation of projects, so SberTech has separate teams in this area.
- And what is the nature of the SberTech commits in Apache Ignite - something related to banking specifics?
- We do not aim to develop open source projects around the business specifics of Sberbank. We only contribute functionality that will be useful to all users of the project, including potential customers. As for our current participation, the team worked on the Apache Ignite 1.9 and 2.0 releases. The commits fall into two categories, bug fixes and new functionality, split roughly equally. Within the new functionality, special attention is paid to complex transactional scenarios and to increasing the reliability and manageability of the cluster.
- Based on your experience with Apache Ignite, what would you recommend to people who are not yet using the project but are interested in it? In which cases is it a good fit, and in which cases may problems arise?
- My advice is not to be afraid of change, but to start small. Apache Ignite has extensive functionality for reducing latency and improving the performance of your applications. For large companies, the key point may be the ability to support business growth and growing volumes of computation and processed data while reducing TCO. You can use the product both for simple tasks (for example, speeding up service responses and reducing data access time) and for more complex ones (distributed in-memory computing, a distributed SQL database). DML support is already reasonably complete, and according to the community's plans, full DDL support should appear in the near future. As for completely new features, there is ML support: a beta version of the machine learning engine, including support for distributed math, is included in the Ignite 2.0 release.
As for problems, in my experience they can arise from a lack of expertise, since the project is still quite complex. So I recommend joining our developer community.
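The "distributed computing in memory" mentioned in this answer generally rests on one idea: data is partitioned across nodes by key, each node computes over its own partition, and the partial results are merged. The sketch below illustrates that idea in plain Python. It does not use the Apache Ignite API; the node count, function names and payment data are all hypothetical, and threads stand in for cluster nodes.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

NODES = 4  # hypothetical cluster size


def node_for(key: str) -> int:
    # Stable hash, so the same key always maps to the same node.
    return zlib.crc32(key.encode("utf-8")) % NODES


def partition(records):
    # Place each record on the node that owns its account key.
    nodes = defaultdict(list)
    for account, amount in records:
        nodes[node_for(account)].append((account, amount))
    return nodes


def local_totals(rows):
    # Work done on one node, using only the data it holds in memory.
    totals = defaultdict(int)
    for account, amount in rows:
        totals[account] += amount
    return dict(totals)


def distributed_totals(records):
    nodes = partition(records)
    merged = {}
    with ThreadPoolExecutor(max_workers=NODES) as pool:
        # All rows for an account live on one node, so the partial
        # results have disjoint keys and can simply be merged.
        for partial in pool.map(local_totals, nodes.values()):
            merged.update(partial)
    return merged


# Made-up payments for illustration.
payments = [("acct-1", 50), ("acct-2", 20), ("acct-1", 30), ("acct-3", 70)]
totals = distributed_totals(payments)
print(totals)
```

Sending the computation to the node that already holds the data, rather than pulling the data to one place, is what lets this approach scale horizontally.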
- You have been working with Apache Ignite for more than a year and can see how the in-memory computing landscape is changing over time. How is it all developing?
- My impression is that in-memory computing is moving from the exotic to the standard development toolkit. For example, NSPK has built a unified payment system for the MIR card on grid technologies. Most major payment system developers have considered using grid technologies in the next versions of their systems.
- In-memory computing can deliver significant performance gains, but large amounts of RAM are expensive. Can you give any numbers that show how this affects your hardware costs in practice, compared with traditional approaches?
- A cluster with the same amount of RAM as the current core banking systems costs 1/5 of the price of high-end solutions. At the same time, an x86-based cluster has significantly more processing power, is better suited to massively parallel operations, and scales horizontally.
The cluster is expected to consume more energy, from 30% to 200% more depending on the nature of the load. It also takes up about 5 times more space in the data center than RISC solutions.
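The ratios in this answer can be laid out side by side. The sketch below normalizes the high-end solution to 1.0 on every axis, so it contains no real prices. It assumes that "from 30 to 200%" means additional consumption on top of the baseline; that reading is an interpretation, not something stated in the interview.

```python
# High-end / RISC solution normalized to 1.0 on every axis (no real prices).
baseline = {"price": 1.0, "energy": 1.0, "floor_space": 1.0}

cluster_price = baseline["price"] / 5           # 1/5 of the price
cluster_energy_low = baseline["energy"] * 1.30  # 30% more (assumed reading)
cluster_energy_high = baseline["energy"] * 3.0  # 200% more (assumed reading)
cluster_space = baseline["floor_space"] * 5     # about 5 times the space

print(f"price:  {cluster_price:.2f}x")
print(f"energy: {cluster_energy_low:.2f}x to {cluster_energy_high:.2f}x")
print(f"space:  {cluster_space:.0f}x")
```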
- If you look at how the cost of such solutions has changed over time and try to look into the future, does it seem that in-memory computing is set to grow as costs fall?
- I’m not ready to give financial forecasts, but RAM gets cheaper every year, and memory power consumption keeps dropping. Who would have thought, some time ago, that SSDs would push HDDs out of the top segment? I think something similar will happen with RAM and NVRAM solutions, which will give a new impetus to the development of in-memory computing.