Big Data Specialist: Curriculum from the Laboratory of New Professions

Today I am pleased to present on Habr the "Big Data Specialist" educational program, an intensive three-month course from the Laboratory of New Professions for developers and IT infrastructure experts who want to enter the rapidly growing Big Data industry.



The hottest profession in IT
A few words about us: the Laboratory of New Professions is an educational project of Digital October. We research the labor market, find promising fast-growing niches in IT and digital, and develop training programs for established professionals who want to make a quick career leap together with a growing industry.

The Big Data industry offers room for such a leap. After a surge of interest in big data and a wave of initial disappointment, the technology is beginning to find application in a wide range of real projects. For example, in December Yandex announced the launch of Yandex Data Factory, and last week another Big Data startup received $56 million in investment.

International sources are also optimistic: IDC calls big data one of the most important technology trends of 2015, and the authoritative magazine Inc. includes Big Data in this year's top 6 critical technology competencies. The recruiting service Glassdoor named Data Scientist the hottest IT profession of 2015.

Most importantly, in Russia, despite the crisis, there is also huge demand for Big Data specialists - we learned this first-hand. While the program was still in its early stages of preparation, leading technology companies began approaching us in search of employees specializing in big data. We now have six specific offers from companies (banks, retailers and mobile operators) that are ready to take our graduates on board. All in all, this is a very favorable time to enter the industry.


On the maturity curve (hype cycle), which Habr wrote about recently, Big Data is now at the “slope of enlightenment” stage

Learning through practice
So, the goal of our program is to teach developers and technical specialists, through practice, how to solve the most important tasks that Big Data specialists face. This approach is reflected in the structure of the course, which consists of three specific cases, each taking one month. These are:
  • Social Graph Analysis
  • Creation of multiclass classifiers based on analysis of web logs
  • Development of recommendation systems

In our view, these are the most important tasks in data analysis, and we will give students a complete immersion in each topic.

Only industry practitioners teach
Our teachers are experienced practitioners who build large-scale data-driven applications themselves. At different times, classes will be taught by Valery Topinsky (ex-Yandex, ShAD), Konstantin Kruglov (founder of the Data-Centric Alliance), Kinshuk Mishra (Spotify) and other people from leading Russian and international companies that use Big Data in their work. The industry is changing very rapidly, and we teach skills that are relevant right now.

Each student will also be assigned a personal tutor who will regularly monitor progress on the laboratory work, review the code, help with finding solutions and give feedback.


Dmitry Repin (right), CEO of the Laboratory of New Professions, and Alexander Turilin, director of educational programs, open the lesson

Kaggle, real data sets and master classes
Each of the cases in our program is taught in three stages. First, students learn to see common patterns by working through well-designed problems from Kaggle. Then they carry out independent projects with real data under the guidance of tutors. For example, in the second case, students will try to optimize the ad-serving algorithms in the Data-Centric Alliance system - and those who achieve good results will not only earn the approval of the teachers, but will also be able to fully recoup the cost of training.

In the first two stages, students gain practical skills across the full cycle of working with big data. These include:
  • Hadoop / HDFS / HBase Deployment
  • Data preprocessing and cleaning
  • Building a prediction model
  • Choosing the optimal machine learning algorithm
  • Model calibration
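
To make the cycle above a little more concrete, here is a minimal sketch in plain Python. It is not taken from the course materials: the toy data set, the two candidate models and all function names (`clean`, `split`, `fit_majority`, `fit_threshold`, `select_model`) are assumptions made up for illustration. It covers only cleaning, model building and choosing between candidates by held-out accuracy; deployment on Hadoop and model calibration are left out.

```python
# Illustrative sketch only: toy data and hypothetical helper names,
# not the tooling used in the course.

def clean(rows):
    """Preprocessing: drop records with a missing feature or label."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def split(rows, every=3):
    """Deterministic hold-out: every third record goes to the test set."""
    train = [r for i, r in enumerate(rows) if i % every != 0]
    test = [r for i, r in enumerate(rows) if i % every == 0]
    return train, test


def accuracy(model, rows):
    """Share of records whose label the model predicts correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)


def fit_majority(train):
    """Baseline model: always predict the most common training label."""
    ones = sum(y for _, y in train)
    label = 1 if 2 * ones >= len(train) else 0
    return lambda x: label


def fit_threshold(train):
    """One-feature model: pick the cut-off with the best training accuracy."""
    best_t, best_acc = None, -1.0
    for t in sorted({x for x, _ in train}):
        acc = accuracy(lambda x, t=t: int(x >= t), train)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return lambda x: int(x >= best_t)


def select_model(candidates, train, test):
    """Fit every candidate on train and return (test accuracy, name) of the best."""
    scored = [(accuracy(fit(train), test), name)
              for name, fit in candidates.items()]
    return max(scored)


# Toy data: label is 1 when the feature is at least 50; two broken records.
RAW = [(x, int(x >= 50)) for x in range(100)] + [(None, 1), (7, None)]
CANDIDATES = {"majority": fit_majority, "threshold": fit_threshold}

rows = clean(RAW)
train, test = split(rows)
print(select_model(CANDIDATES, train, test))
```

In a real project the candidates would be proper machine learning algorithms and the comparison would normally use cross-validation rather than a single fixed split; the point here is only the order of the steps.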

The third stage of each case introduces ready-made tools and best practices. Speakers from Yandex, Sbertech, Spotify, MTS, IBM and Cloudera will conduct master classes and talk about how they collect, store and use big data in their companies. In this part, students will be able to see how the specifics of data in different industries affect the choice of analysis tools and algorithms.

Post-course interview
As mentioned above, while we were preparing the course, several companies contacted us at once in the hope of finding big data specialists. Each student who successfully completes the final qualification tasks (difficult ones, by the way) will have the opportunity to be interviewed. The strategic partners of the program are Sberbank Technologies and the Data-Centric Alliance; they are ready to recruit a large number of qualified people. We will closely follow the successes and preferences of each student to help them find a job that matches their interests and competencies.



Online, just like offline
Another nice feature is the opportunity to study from anywhere in the world without any loss of quality. Classes are held at the Digital October Center three evenings a week, but they can also be attended remotely. Thanks to professional multi-camera filming, students who study online feel like full participants in the classes: in video conferencing mode, they can ask the teacher questions and take an active part in the discussion. Recordings of all classes are also available to students in their personal accounts.

Requirements for our students
Russia now has several educational projects for those who want to build a career in big data, such as the Yandex School of Data Analysis and the Big Data Systems master's program at HSE, but they last several years and are aimed at students and recent graduates.

We, by contrast, focus on established professionals who want to combine study with work. The course has turned out to be demanding and intensive, and it is not suitable for beginners. We warn you honestly that the "Big Data Specialist" program requires very intensive work, both in the classroom and outside it, and we expect well-prepared students. The minimum requirements are:
  • Good working knowledge of the basics of probability theory and mathematical statistics
  • At least 2 years of experience in application development
  • Knowledge of the basics of machine learning theory (highly desirable)

P.S. We also thought it was worth giving five talented young developers a chance to join the program for free. That is how the Big Data Young Champion competition was born. See the terms of participation on our Facebook page.
