Open broadcast from the main hall of SmartData 2017: it's not about solutions - it's about evolution



    As we have reported several times before, this year JUG.ru Group decided to look into the future and figure out what it takes for two gray boxes to interact with each other so that a dose of sacred knowledge about Big Data and machine learning enters our world. The result is the SmartData 2017 conference, which will be held in St. Petersburg on October 21.

    Why are we hosting a Big Data and Machine Learning conference? Because we cannot help but bring people together. And to draw as many developers as possible into our fraternity, we are, as usual, opening a free online broadcast from the main conference hall.

    So, the free online broadcast from the main hall of SmartData 2017 will begin on October 21, 2017 at 9:30 am Moscow time. Just you, us, and the future. This time the broadcast will be available in 2K, so get your 4K monitors ready!



    A link to the online broadcast of the first track of the SmartData 2017 conference and brief descriptions of the talks are under the cut.


    Watch the online broadcast



    The first track of the conference, which takes place in the main hall, features the following talks:

    • Vitaliy Khudobakhshov - The name is a feature
    • Mikhail Kamalov - Recommender systems: from matrix factorization to deep learning in streaming mode
    • Sergey Nikolenko - Deep convolutional networks for object detection and image segmentation
    • Dmitry Bugaychenko - From click to forecast and vice versa: Data Science pipelines in Odnoklassniki
    • Artem Marinov - We segment 600 million users in real time every day
    • Alexander Krasheninnikov - Hadoop high availability: Badoo experience
    • Ivan Yamshchikov - Neurona: why did we teach a neural network to write poems in the style of Kurt Cobain?

    Between talks, while the speakers and on-site participants disappear through the looking glass into the discussion zones, we show online viewers what is happening at the conference outside the sessions and conduct interviews with speakers and interesting guests. If you have a question of your own during an interview, post it in the conference Telegram chat. Here's what it looked like at JPoint:



    First track program


    9:30-10:30 Opening, interview with the JUG.ru Group team, opening remarks from the organizers and partners of the conference.

    10:30-11:20 Vitaliy Khudobakhshov - The name is a feature

    Strange as it may seem to an educated person, the probability of being single "depends" on your name. We will talk about love and relationships, or rather, about what social network data can tell us about them. It is about the same as saying: “The probability of being hit by a car is higher if your name is Seryozha than if your name is Kostya!” Sounds pretty wild, doesn't it? Or at least unscientific. So we will talk about the most unexpected and counterintuitive observations that can be made by analyzing social network data. Of course, we will not ignore the questions of the statistical significance of such observations, the influence of bots, and spurious correlations.



    11:40-12:30 Mikhail Kamalov - Recommender systems: from matrix factorization to deep learning in streaming mode

    Recommender systems are now widely used both in entertainment (YouTube, Netflix) and in Internet marketing (Amazon, AliExpress). With that in mind, the talk will cover the practical aspects of using deep learning, collaborative filtering, content-based filtering, and time-based filtering as approaches in recommender systems. We will also look at building hybrid recommender systems and at adapting these approaches for online learning on Spark.



    12:50-13:40 Sergey Nikolenko - Deep convolutional networks for object detection and image segmentation

    Convolutional neural networks have long been the main class of models for image processing. In this talk, we will discuss how networks that recognize individual objects turn into networks that pick out objects from among many others. We will talk about the famous YOLO, single-shot detectors, and the line of models from R-CNN to the recently introduced Mask R-CNN.



    14:25-15:15 Dmitry Bugaychenko - From click to forecast and vice versa: Data Science pipelines in Odnoklassniki

    Machine learning is fun, but making it work in industry takes a lot of boring work. In this talk, we will look at all the technologies, algorithms, and methods needed for your machine learning to shine like a diamond in a gold setting.

    As an example, we will consider one difficult task: personalizing a news feed. Without going into the details of machine learning, we will talk about data collection (batch and real-time), ETL, and the processing needed to obtain a model.

    But just getting a model is not enough, so we will also talk about how to obtain predictions from the model in a complex, highly loaded distributed environment and how to use them for decision making.

    We will talk about the data processing and storage technologies of the Hadoop ecosystem, and much more. The talk will be useful to those who do machine learning not just for fun, but also for profit.



    15:35-16:25 Artem Marinov - We segment 600 million users in real time every day

    Every day, users perform millions of actions on the Internet. The FACETz DMP project needs to structure this data and segment it to identify user preferences. We will explain how, using Kafka and HBase, we:

    • segment 600 million users after moving from MapReduce to real-time processing, and how we made that move;
    • process 5 billion events every day;
    • store statistics on the number of unique users in a segment during streaming processing;
    • monitor the impact of changes in segmentation parameters.



    16:45-17:35 Alexander Krasheninnikov - Hadoop high availability: Badoo experience

    Hadoop infrastructure is a popular solution for tasks such as distributed data storage and processing. Its good scalability and mature ecosystem are appealing and secure Hadoop a solid place in the infrastructure of many information systems. But the more responsibility is placed on this component, the more important it is to ensure its fault tolerance and high availability. In this talk, we will discuss how to make the components of a Hadoop cluster highly available. We will also cover:

    • the “zoo” we are dealing with;
    • why high availability is needed: points of failure in the system and the consequences of failures;
    • the tools and solutions that exist for this;
    • our practical implementation experience: preparation, deployment, verification.

    The talk will be most useful to those who already use Hadoop and want to deepen their knowledge. The rest of the audience may find it interesting as an overview of the architectural solutions used in this software stack.



    17:50-18:40 Ivan Yamshchikov - Neurona: why did we teach a neural network to write poems in the style of Kurt Cobain?

    In 2017, “artificial intelligence” is a phrase you hear everywhere you turn. There are many examples of machine learning and artificial neural networks being applied in business, but in this talk we will discuss the creative capabilities of AI. We will tell you how we made Neurona, Neural Defense, and Pianola. We will discuss current challenges in building creative AI and talk about why this is important and interesting.



    To sum up our announcement, let us recall a quote from a popular movie: “Life on Earth is a mystery. But its components are a technical problem.”

    Join now!

    Limitations


    • The broadcast is provided on an as-is basis: we are confident that everything will be fine, but if something goes wrong, don't blame us!
    • Video recordings. They will be available almost immediately, but only to conference participants who have left feedback. For everyone else, we will, as usual, post them on the conference YouTube channel in 3-4 months.
    • You will not be able to watch what happens in the other halls, and there will be plenty of interesting things there. Next time, buy a ticket and see everything without restrictions.
