Sberbank Data Science Day live stream, November 10


    On November 10 (tomorrow!), the big Sberbank Data Science Day conference will be held in Moscow at the October Cinema Center. The SDSJ 2018 winners will receive their awards there, and many international and Russian data science experts will speak. There will also be an ML section and sessions on the use of artificial intelligence in science and business. And many more interesting things!

    The program is under the cut and on the website. We also explain how the winners of the Sberbank Data Science Journey were scored.


    The conference is divided into several thematic blocks. Here is the schedule:

    Main hall

    11:00 - 11:30. Opening of the conference
    11:30 - 12:30. Panel discussion "Data analysis and artificial intelligence technologies in the digital economy"
    12:30 - 13:15. "Biologically-based methods and architectures in deep learning." Sergey Bartunov, DeepMind
    13:15 - 14:00. "Conversational Agents as an Intelligent Digital Companion to Understand Human Emotion and Express." Soo-Young Lee, KAIST
    15:00 - 15:45. "Scalable automatic machine learning." Andrei Spiridonov, H2O
    15:45 - 16:30. Panel discussion "Trend for Innovation: Using DS / AI and Improving Customer Experience"
    17:15 - 18:00. Award ceremony for the winners of the Sberbank Data Science Journey and Classic AI (an artificial intelligence competition)

    Hall of Science

    12:30 - 13:45. DS / AI technology: AutoML
    13:45 - 14:45. DS / AI technology: Computer Vision
    14:45 - 15:45. DS / AI technology: Natural Language Processing (NLP)
    15:45 - 16:30. DS / AI technology: Reinforcement Learning
    16:30 - 17:15. DS / AI technology: Speech Analytics

    Business Hall (Hall 1)

    12:30 - 13:45. DS / AI applications in banking and finance
    13:45 - 15:00. DS / AI applications in medicine and bioinformatics
    15:00 - 16:15. DS / AI applications in the banking and financial sectors
    16:15 - 17:15. Brainwriting: creating a platform for AI research

    Business Hall (Hall 2)

    12:30 - 14:45. DS / AI applications in retail
    14:45 - 16:30. DS / AI applications in industry
    16:30 - 17:15. DS / AI applications in media and telecom

    Community Hall

    12:30 - 13:15. Poster session: lightning-talk poster presentations
    13:15 - 15:00. "AI Open Projects": presentation of open projects in the field of DS / AI
    15:00 - 15:45. Review of solutions from the AIC competition
    15:45 - 17:15. Review of the Sberbank Data Science Journey competition

    Winners of the Sberbank Data Science Journey

    This year we asked participants to solve problems using AutoML technology. Participants could upload their solutions until the end of November 3, and over the following 12 hours they picked the best of their own submissions. Now the choice is up to the jury. At the conference, we will award the winners of the Sberbank Data Science Journey.

    Participants were given ready-made datasets from Sberbank. All 24 datasets used in the competition were collected by different departments: the retail unit, the risk unit, and the technology unit. All of them were specially prepared and anonymized. They were based on information such as:

    • Approved limit share
    • Card delivery time
    • Different types of scoring
    • Response to the card offer
    • Response to other product offerings
    • ATM breakdowns
    • Information about cash withdrawals at ATMs
    • Account balances and other information

    To evaluate solutions, the datasets were split into three groups: check (open to participants), public (hidden from participants, but the result on it was visible during the competition), and private (the set on which the final results of the competition are computed).

    Each group contains three regression problems and five binary classification problems. The solutions ran on datasets of various sizes: from 1 MB and 300 rows to 1 GB and 1 million rows. The jury prepared the datasets before the competition started, the testing system has already checked the solutions automatically, and the results can now be seen on the site (within the limits needed to preserve the suspense).

    Solutions were submitted as archives containing code. Participants had to build an algorithm that automatically performs the complete cycle of solving a machine learning problem: it receives data as input and returns a ready answer as output.
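
    The overall shape of such a solution can be sketched as a train step and a predict step. This is only a minimal illustration, not the competition's actual interface: the function names, the `target` column name, and the task-type heuristic are all assumptions, and a constant prediction stands in for real feature engineering and model selection.

```python
import csv
from statistics import mean


def load_rows(csv_file):
    """Read a CSV file object into a list of dicts (one per row)."""
    return list(csv.DictReader(csv_file))


def train(train_file, target="target"):
    """Fit a trivial model. The column name `target` is hypothetical."""
    rows = load_rows(train_file)
    y = [float(row[target]) for row in rows]
    # Assumed heuristic: a target with only 0/1 values means binary classification.
    task = "classification" if set(y) <= {0.0, 1.0} else "regression"
    # Constant baseline: the mean target (for 0/1 labels, the positive rate).
    return {"task": task, "constant": mean(y)}


def predict(model, test_file):
    """Return one prediction per test row, with no human in the loop."""
    rows = load_rows(test_file)
    return [model["constant"] for _ in rows]
```

    A real AutoML solution would replace the constant with automated preprocessing, model search, and time budgeting, but the input-to-output contract stays the same.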

    Participants' solutions had to fit within the specified restrictions:

    • resources available to the solution
    • the solution has no access to Internet resources
    • the maximum size of the archive with the solution, both packed and unpacked, is 1 GB
    • the archive is unpacked into an in-memory file system (ramfs) that the solution can write to
    • the rest of the container is read-only
    • the CSV file with the dataset does not exceed 3 GB

    These restrictions are needed for a fair comparison: they put all participants under equal technical conditions.

    Here is how solutions are scored in this competition:

    1. For each task (dataset), a task-specific metric is computed on the test part of the sample (RMSE for regression, ROC-AUC for binary classification).
    2. For each task (dataset), the participants' metric values are converted to a common scale as follows. The solution with the best metric (among all submitted and successfully tested solutions) gets 1 point, and the baseline solution gets 0 points. Solutions whose metric falls between the baseline and the best receive a proportional score between 0 and 1. Solutions that perform worse than the baseline get 0 points. If the best solution and the baseline coincide, all participants get 0 points. If a participant's solution fails on a task or exceeds the time limit, it gets 0 points for that task.
    3. Each participant's final result is the sum of their per-task scores after conversion to the common scale. Participants are ranked on the overall leaderboard by this final result.
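
    The conversion to a common scale described above can be sketched roughly as follows. The function and parameter names are illustrative, not taken from the organizers' code; a failed or timed-out run is represented here as `None`, and treating a "best" that is worse than the baseline the same as a tie is an assumption.

```python
from typing import Iterable, Optional


def scaled_score(metric: Optional[float], best: float, baseline: float,
                 higher_is_better: bool) -> float:
    """Map a raw metric to [0, 1]: baseline -> 0, best -> 1."""
    if metric is None:  # the solution failed or exceeded the time limit
        return 0.0
    if not higher_is_better:
        # For RMSE lower is better; negate so that larger always means better.
        metric, best, baseline = -metric, -best, -baseline
    if best <= baseline:
        # Best and baseline coincide: everyone gets 0.
        # (best < baseline is handled the same way by assumption.)
        return 0.0
    score = (metric - baseline) / (best - baseline)
    # Solutions below the baseline get 0; nothing can exceed the best.
    return min(1.0, max(0.0, score))


def final_result(task_scores: Iterable[float]) -> float:
    """Sum the per-task scaled scores into the leaderboard value."""
    return sum(task_scores)
```

    For example, with a ROC-AUC baseline of 0.5 and a best result of 1.0, a solution scoring 0.75 lands halfway and gets 0.5 points; with an RMSE baseline of 4.0 and a best of 2.0, an RMSE of 3.0 also gets 0.5 points.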

    The results of the competition are available here.

    In addition to the main standings, participants competed for a prize in the "Best Public Solution" category. Throughout the competition, they published their approaches to the AutoML problem on GitHub, and the winners were determined by the number of GitHub stars.

    The conference will have a separate section dedicated to SDSJ'18, where the winners will talk about their solutions and answer questions.

    Once again, here is the link to the online broadcast of the conference, so that everyone interested can watch Sberbank Data Science Day.

