Interview with Sergey Lukyanov, OpenStack Savanna Technical Project Leader

Interviewed by Rafael Knuth

This is the 10th interview in a series of conversations with the technical project leaders of the OpenStack initiative, published on the Mirantis blog. Our goal is to educate as many members of the technical community as possible and to help you understand how you can contribute to OpenStack and benefit from it. As always, what follows is the viewpoint of the interviewee, not of Mirantis.

Here is our interview with Sergey Lukyanov, Technical Project Leader of OpenStack Savanna.

Mirantis: Please tell us about yourself.

Sergey Lukyanov: I am a senior developer and technical lead at Mirantis, where I have worked for more than three years. I am mainly responsible for architecture design and for work with the OpenStack community. I have experience with big data projects and the related technologies (Hadoop, HDFS, Cassandra, Twitter Storm, and others), as well as with developing large-scale production projects. At the moment I am involved in several open-source projects, including Twitter Storm and OpenStack.

Question: How did you come to OpenStack? Why are you participating in the project?

Answer: I have been actively working on OpenStack for about a year, and before that I had followed its development since the Diablo release. My active involvement began with writing the part of the Swift object storage code that lets outside callers find out which physical machines hold a given piece of data (this later helped us implement data locality in Savanna). After that I started working on Savanna itself and, in parallel, contributed to other OpenStack projects: Oslo, Swift, the Nova client, Hacking, pbr, Jeepyb, and others. My main goal within OpenStack is to increase the number of services and features it provides, so that the platform is more convenient for application developers and becomes as widely adopted as possible.

Question: What are you responsible for as the technical manager of the Savanna project?

Answer: I mainly oversee the project. This includes tracking and triaging bugs and blueprints on Launchpad, coordinating code review in Gerrit, and running our team's weekly IRC meetings and our sessions at the OpenStack Design Summit. It seems to me that a project's technical leader is, first and foremost, the person who coordinates the work of all the teams within the project and makes sure that its overall direction matches the goals that have been set. In addition, I am among the top contributors to the project, both by number of changes submitted and by number of code reviews.

Question: What role does the Savanna project play in OpenStack? What is its significance?

Answer: As I see it, OpenStack is not so much a technical infrastructure as a large community of developers working on a huge ecosystem of closely related, actively developing projects. Together these projects make up the cloud platform. I see a great opportunity to grow this ecosystem by integrating it with other open-source initiatives and the communities behind them, and the integration of OpenStack with Apache Hadoop is a good example. From a user's point of view, big data processing can ultimately be useful to most OpenStack projects.

Question: What is truly unique and new in the Savanna project?

Answer: Savanna applied for official OpenStack incubation at the end of the Havana cycle, as part of the Data Processing program. Today Savanna implements basic infrastructure operations in the following two areas:

- provisioning and managing Hadoop clusters, including through Hadoop vendor tools such as Apache Ambari for the Hortonworks Data Platform;

- scheduling and running Hadoop jobs, including their creation, execution, etc.

I would also like to clarify that Savanna does not offer a Data API, because of the very long list of potential problems in big data processing. In the future, we plan to support not only Hadoop but also other big data processing tools.

Question: Tell us about the Savanna community - who is involved in this project?

Answer: The project began with a small team at Mirantis. Today about 30 people contribute to it in the Havana cycle. The core of the team consists of employees of Mirantis, Red Hat, and Hortonworks; the other contributors work at HP, IBM, UnitedStack, and Rackspace.

Question: What has the Savanna community achieved so far?

Answer: Today we have a service that provisions and manages clusters, with support for scaling (both growing and shrinking a cluster, including adding new types of nodes), anti-affinity (among other things, to make data nodes more reliable), and data locality for computation (to run Hadoop jobs more efficiently). We use node group templates and cluster templates to store cluster configurations. As for our second major area of functionality, Elastic Data Processing (EDP), Savanna supports simple execution of jar, Pig, and Hive jobs through the Oozie job scheduler, including the ability to read data from and write data to Swift. Extensibility is provided by a plugin mechanism, which currently includes two plugins for provisioning Hadoop clusters: the Vanilla plugin, which simply installs all the necessary services, and the Hortonworks Data Platform plugin, which installs Apache Ambari to launch and configure the Hadoop cluster. And, of course, there is a plugin for the OpenStack Dashboard that exposes the full functionality of our project.

Question: What features will Savanna provide as part of the OpenStack Icehouse release?

Answer: The main goal is tighter integration with other OpenStack projects and infrastructure. The main change planned for Icehouse is to use Heat for resource management, replacing direct management through other OpenStack services. We are also working on integrating Savanna with the DevStack gate to verify new changes to the project (DevStack itself already supports Savanna), and on moving API tests and integration tests into Tempest. In addition, I hope Icehouse will bring so-called guest agents to Savanna; they would solve all the current access problems between the cloud controller and the guest operating systems, with no need to call SSH or an API directly. Within EDP, we would like to improve job workflow execution in general and add support for new features, job types, data sources, and so on. I also expect at least one more vendor-supported plugin to be implemented; for example, one for IDH (Intel Distribution for Apache Hadoop) is already under review.

Question: What do you want people to know about the project?

Answer: The goal of Savanna is to provide the OpenStack community with data processing tools. At the moment our focus is on the Hadoop ecosystem, but discussions are already under way, and concepts are being developed, to support other tools such as Apache Spark and Twitter Storm. In other words, we are currently collecting requirements for EDP and adding new features and data processing tools.

Question: Are there any general misconceptions regarding the Savanna project?

Answer: The belief that our project has a Data API. Savanna does not have a Data API; instead it has two levels of management API: one provisions and manages clusters, and the other manages the execution of jobs and job workflows. Another misconception concerns the purpose of the project. We want to offer comprehensive data processing solutions and tools, not a one-off solution for a single infrastructure. Our domain is data processing.

Question: In what cases can Savanna be used?

Answer: We keep several use cases in mind while building Savanna. The first is managing data processing clusters (today, Hadoop clusters). Another, for cloud platforms, is using idle computing capacity to handle peak loads. A third is the ability to manage data processing workloads (today, various Hadoop jobs) in a few clicks, without specialized knowledge of data processing tools.

Question: What is your vision for the Savanna project?

Answer: I see Savanna as a service that provides data processing and cluster provisioning tools, and whose main function is elastic data processing, for example running specific jobs.

Question: Who would you like to see among the participants of the Savanna project?

Answer: I would like to see two kinds of participants. We need people interested in integrating various Hadoop distributions and, especially, other data processing frameworks. We also really need operators: people who will start using Savanna to manage their data processing workloads and help us by sending comments and suggestions for improving the project.

Question: Which functionality needs improvement and testing right now?

Answer: The Heat integration needs testing; it will replace a very large part of the resource management code. We are working on porting our integration tests to Tempest, and here we need help both with porting the old tests and with writing new ones. We also need to keep testing Savanna on various host operating systems in combination with various guest operating systems.

Question: How can people start working with Savanna?

Answer: I hope it is not very difficult now. Savanna can be installed using DevStack; you only need to upload to Glance a disk image built with diskimage-builder, which is available on the CDN. Detailed guides for developers, administrators, and users are available at docs.openstack.org/developer/savanna. And, of course, our team is working to simplify this process, especially for developers of new plugins and, as a result, new contributors. If you have questions, you can find our team in the #savanna IRC channel on freenode.net or on the openstack-dev@lists.openstack.org mailing list (use the [savanna] prefix in the subject line).

Question: Thank you for your time, Sergey.

Answer: Thank you.
