Data Science in Russia: languages, technology and more

    In December 2017, we surveyed various groups of Russian users who work with data analysis in one way or another. We wanted to find out which programming languages, technologies and tools specialists in this field use. This also matters for the development of PyCharm, which is already quite popular among analysts: a better understanding of data analysts' needs will help us make the product more convenient.

    Later we conducted a similar study in other countries, which gave us the opportunity to compare the situation in Russia with the global one. Here we share the most interesting observations; more complete data on Russia and infographics are published on our website. The raw data is also available (all answers to open-ended questions have been removed for privacy reasons). We will soon publish the results of the worldwide study as well.

    Profile of a Data Science Specialist

    We analyzed the responses of 373 respondents from Russia and 1965 respondents from around the world. In terms of age, Russian Data Science specialists hardly differ from their foreign colleagues, but foreign specialists have a higher level of formal education. Among the Russian respondents, 59% have a bachelor's degree and only 20% have a master's degree, while worldwide 45% of respondents have a bachelor's degree and 36% have a master's degree.

    The Data Science field is relatively young: about half of the respondents (46%) have worked in it for 1 to 3 years, and only 18% have 3 to 6 years of experience. Notably, for the vast majority of respondents (those with 0 to 6 years of experience, who make up more than 90%), average age is not related to experience. This is probably because the field is young and people from adjacent areas are actively moving into it.

    Many people handle data analysis tasks alongside programming and other job responsibilities. Only 50% of respondents in Russia (36% worldwide, according to our survey) named data analysis as their main professional activity, while 33% combine data analysis with their main professional duties.

    Programming languages

    Python is the dominant data analysis language both in Russia and worldwide. Abroad, Python and R are used for data analysis by 73% and 40% of respondents, respectively. In Russia, Python is much more popular than R: 84% versus 25%.

    Technologies and tools

    More than 60% of respondents use deep learning tools in one way or another. TensorFlow is the most popular framework with 49%, and Keras is in second place with 39%.

    Apache Spark is used by 40% of respondents, including 92% of those who program in Scala. Everyone for whom Scala is the main language uses Apache Spark. The share of those who program only in Python and use Spark is about 14% (if the possibility of using Spark from Lua and Julia is not taken into account, this share rises to 20%).

    Salaries

    Knowledge of big data technologies is the key to a high salary. The average salary of a specialist outside the big data technology stack is 127 thousand rubles. Salaries vary significantly with qualifications and work experience, but on average they are much higher in big data analysis. Interestingly, despite the popularity of Apache Spark, respondents who know it earn less than specialists who know Apache Pig and Apache Hive: 157 thousand rubles versus 177 and 166 thousand, respectively. Knowledge of Apache Hadoop / MapReduce corresponds to an average salary of 150 thousand rubles.
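
    To put these figures side by side, here is a minimal sketch that computes how far each big data technology sits above the 127 thousand ruble baseline. The salary values are the ones quoted above; the names BASELINE, BIG_DATA_SALARIES and premium_over_baseline are ours and purely illustrative, and the calculation uses the rounded averages from the text rather than the raw survey data.

```python
# Average salaries in thousand rubles, as quoted in the text above.
BASELINE = 127  # specialists outside the big data technology stack
BIG_DATA_SALARIES = {
    "Apache Pig": 177,
    "Apache Hive": 166,
    "Apache Spark": 157,
    "Apache Hadoop / MapReduce": 150,
}


def premium_over_baseline(salary, baseline=BASELINE):
    """Return the relative premium over the baseline, in percent."""
    return (salary - baseline) / baseline * 100


# Premium for each technology, rounded to one decimal place.
premiums = {
    tool: round(premium_over_baseline(salary), 1)
    for tool, salary in BIG_DATA_SALARIES.items()
}

# Print technologies from the largest premium to the smallest.
for tool, pct in sorted(premiums.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{tool}: +{pct}%")
```

    On these rounded numbers the premium ranges from roughly 18% for Hadoop / MapReduce to roughly 39% for Pig.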

    As for how salary depends on programming language, Russia is no different from the rest of the world: Scala specialists earn more than others, 173 thousand rubles on average. They are followed by respondents who know Java (158 thousand) and Python (143 thousand). At the same time, the salary of specialists using Python is 4-5% higher than that of specialists using R (136 thousand), which is quite consistent with the situation in the world.
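
    The comparison between languages can be reproduced from the quoted averages. The sketch below is only an illustration: SALARY_BY_LANGUAGE and relative_gap are hypothetical names, and the inputs are the rounded figures from the text. With these rounded values the Python-over-R gap comes out at a little over 5%, close to the range quoted above; any difference presumably comes from rounding, which is our assumption rather than something the survey states.

```python
# Average salaries in thousand rubles by language, as quoted in the text.
SALARY_BY_LANGUAGE = {"Scala": 173, "Java": 158, "Python": 143, "R": 136}


def relative_gap(a, b, salaries=SALARY_BY_LANGUAGE):
    """Return by how many percent the salary for language a exceeds that for b."""
    return (salaries[a] - salaries[b]) / salaries[b] * 100


# Languages ordered from the highest average salary to the lowest.
ranking = sorted(SALARY_BY_LANGUAGE, key=SALARY_BY_LANGUAGE.get, reverse=True)

python_vs_r = relative_gap("Python", "R")
scala_vs_python = relative_gap("Scala", "Python")

print(ranking)
print(f"Python over R: {python_vs_r:.1f}%")
print(f"Scala over Python: {scala_vs_python:.1f}%")
```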

    More details about the state of Data Science in Russia can be found in the full report with infographics. Our study does not claim to be fully representative, since we distributed the link to the survey through channels where a fairly active part of the Data Science community is represented:

    • in the Open Data Science (ODS) Slack community,
    • directly to companies that have data analysis units,
    • to participants of the SmartData conference,
    • in thematic user groups, etc.

    Nevertheless, our review gives a general picture of the industry in Russia.

    For those who want to conduct an independent analysis and draw their own conclusions, the source data is available. All answers to open-ended questions have been removed for privacy reasons.

    We plan to keep monitoring trends in Data Science and to conduct similar surveys. If you want to take part in our future research, subscribe on the last page of our report. We will be glad to see you among our respondents.
