Inspired by production and basketball: how Yandex prepares a programming championship

    Our programming championship will begin at the end of May. It will take place online and will let you test yourself in one of four tracks: backend development, frontend development, machine learning, or data analytics. The tasks for the tracks were developed in the Machine Intelligence and Research, Search, and Geoservices departments.


    All participants first have to pass the qualification round. After applying, you choose when to take it. The qualification lasts 4 hours and includes 4 to 6 tasks. We will invite the best participants to the final, which will be held on June 1, also online. The results will be announced on June 5. In each track, the winner will receive 300 thousand rubles, second place 150 thousand rubles, and third place 100 thousand rubles. Registration is open and will last until the final day of the qualification round, May 26, but it is better to apply early.

    In this post we share our experience of running such competitions: who they are for and how we prepare complex algorithmic problems.

    * * *

    The championship develops the idea that we implemented in 2017–2018 in the Yandex.Blitz series. The difference is that Blitz was a series of separate contests in different tracks. They shared only a format and took place at completely different times. Be sure to read the Habr articles analyzing the tasks from each competition: machine learning, backend, frontend, and mobile development.

    While preparing the championship and this post, we talked a lot with people who placed highly in Blitz and then joined Yandex. It was important for us to take into account their real experience and the participant's point of view in order to make the competition even more transparent and interesting.

    Why is it worth participating?

    Like the earlier Blitz contests, the championship is a shortcut into the company: top-ranked participants will be able to join us through a simplified interview process. But we are waiting not only for those who are looking for a job and considering Yandex. We expect two more categories of developers to join the competition. The first is people who are interested in algorithms, do competitive programming, and take part (or used to take part) in olympiads and other contests. We will offer them tasks worthy of their level and an interesting experience to add to their collection.

    The second category is experienced programmers and analysts. They will have the opportunity to demonstrate their experience and background, because we have put together very diverse tasks. This distinguishes the championship from Kaggle competitions, neither for better nor for worse: Kaggle simply offers somewhat different opportunities. There, the organizers usually provide statements and data that let you test yourself in a specific area (and participants have time to study it if they wish). The rounds of our championship last a matter of hours and capture your current knowledge. You may know little about, say, voice technologies or computer vision, yet demonstrate the kind of thinking that will let you quickly dive into any topic later. Of course, the comparison with Kaggle is relevant only for the ML track of the championship.


    So the main idea remains unchanged: to offer participants tasks that are close to production ones, the kind that Yandex developers and analysts really encounter. This lets you understand the level and specifics of these tasks and see what problems you would face at work if you joined the company. In addition, the tasks we prepare for the contest help participants assess how strong their skills are in specific areas and whether they have ideas that could really be turned into improvements to services and applications.

    Those who took part in Blitz in 2017 and 2018 saw that the tasks were partly shaped by their origin in production projects. But the production side of development in a large company often comes down to the need to understand algorithms, even in areas that at first glance are far from them, such as frontend and mobile development. So participants often judged the contests dedicated to these two topics to be close to production. The two other contests, in algorithmic programming and machine learning, would have required an understanding of algorithms even without any production subtext. They had such a subtext too, but it could not always be discerned from the problem statements. Still, this did not prevent the participants from competing, or us from implementing the main idea of Blitz.

    Where the ideas come from

    When competitive programming tasks are not invented from scratch but are based on problems that actually arise in the services, the process of preparing them is completely different. The reason is that in a service, a manager or colleague brings the task to the developer in a different wording and a different context than when a statement goes from the organizers of a competition to its participants. A full-time programmer, or even an intern who has worked at the company for some time, is immersed much more deeply in the processes of their department than an external developer, however talented. The problem cannot be formulated for a contestant in the same way, especially since the contestant has to come up with a solution in much less time. The contestant's development environment is also different: there is only an input file and an output file, while the employee works in the repository, in the internal interface,

    "Cleaning up" the problem statements

    So we took tasks from the production environment, but each time we asked ourselves: will participants understand them? Sometimes it turned out that to make a statement understandable to a wide audience of developers, we would have to write a long preamble, introduce terminology that a specialist inside the company has long been familiar with, and so on. Such an approach would not always work: in a competition, the statement has to be concise so that you can read it quickly and move on to the most important part, developing a solution. Therefore, when the preamble would have made the statement too cumbersome, we tried to reformulate it so that no preamble was needed. A different wording was also often required because the original task contained internal Yandex information that must not be disclosed outside the company. As a result, the task could become more abstract,

    Interestingly, the opposite situation, when we managed to formulate the statement concisely right away without losing closeness to production, often resulted in a difficult task. For example, this was the case in the final of the machine learning Blitz, in the tasks related to image recognition. This year's championship is no exception. Among other things, participants can expect machine translation tasks: concisely formulated, difficult to implement, and really taken from a production project (Yandex.Translate).

    What we test

    A question arises: by making a task more abstract than its production original, are we not simplifying it? In a way, yes: solving it no longer requires experience with Yandex's internal infrastructure or preliminary communication with colleagues. You don't need to be familiar with the code review process, you don't need to make the code beautiful, and so on. But we keep the most meaningful part of every task, the part that requires algorithmic thinking. And if you solve it, even in a somewhat simplified form, it still means that you are an excellent programmer. An excellent programmer will quickly get to know the internal infrastructure, get the hang of code review, and switch from a competitive style of writing code to an industrial one. It's like in basketball: what matters most in a player is physique and a good understanding of the game, and the shot can be taught.

    By algorithmic thinking we mean the ability to implement the required algorithm in your chosen language, without additional libraries. Most likely, in real work (both before and after the competition) you will use various libraries that simply call the necessary algorithms and greatly reduce the amount of code. The ability to plug them in is exactly the kind of thing that “can be taught”. We are more interested in making sure that when you call a library, you understand what it does and how. Knowing the algorithms from the inside, you will apply them more effectively, without having to implement them yourself.
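
    To illustrate the difference between calling a library and knowing what it does, here is a minimal sketch that assumes binary search as the example algorithm. The choice of algorithm and the function name `lower_bound` are illustrative assumptions, not taken from a championship task. The hand-written version is checked against Python's standard `bisect.bisect_left`, which performs the same search.

```python
from bisect import bisect_left


def lower_bound(items, target):
    """Return the first index i with items[i] >= target (items must be sorted).

    Hypothetical helper written for illustration only.
    """
    lo, hi = 0, len(items)
    while lo < hi:
        mid = (lo + hi) // 2
        if items[mid] < target:
            lo = mid + 1  # the answer lies strictly to the right of mid
        else:
            hi = mid      # mid itself may still be the answer
    return lo


data = [1, 3, 3, 7, 10]  # made-up sorted input
for x in (0, 3, 8, 11):
    # The hand-written search and the library call must agree.
    assert lower_bound(data, x) == bisect_left(data, x)
```

    In day-to-day code the library call is the natural choice; writing the loop once by hand mainly makes the invariant (the answer always stays within `[lo, hi]`) and the edge cases visible.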

    Analytics competition

    When talking about the goals of the championship in this post, we often refer to Yandex.Blitz. But now participants can also choose data analytics, a track in which we never held a Blitz. This is a new track with its own specifics. If you choose it, knowledge of algorithms will still be a plus, but to a lesser extent than in the machine learning or backend development tracks.

    The general idea here is the same as in the other tracks: to test the skills that specialists at Yandex actually use. So the question is, which skills can come in handy?

    The key skills of a good analyst at Yandex are the ability to generate hypotheses and to extract a useful signal from vague task statements and from ambiguous or noisy data. Our analysts usually write in Python and work with large data streams, such as Yandex.Metrica logs, user sessions, and technical server logs.

    For solving the analytical problems in the championship, as well as for further work at Yandex, it is very useful to know the basics of mathematical statistics and probability theory. This is the fundamental knowledge that helps you draw correct, data-based conclusions about processes.
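
    As one possible example of such a data-based conclusion, the sketch below runs a two-proportion z-test using only the Python standard library. The scenario (comparing click rates of two page variants), the numbers, and the function name `two_proportion_z_test` are made up for illustration and do not come from the championship tasks.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Return (z, two-sided p-value) for H0: both groups share one success rate.

    Hypothetical helper; uses the normal approximation, so it assumes
    reasonably large samples.
    """
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled rate under the null hypothesis that the two rates are equal.
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("standard error is zero: all outcomes are identical")
    z = (p_b - p_a) / se
    # Two-sided tail of the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up data: 100 clicks out of 1000 views versus 130 clicks out of 1000 views.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

    A small p-value suggests the observed difference is unlikely to be pure noise, but the threshold and the interpretation still depend on the problem at hand.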
