Meetup in St. Petersburg: Data Engineering and More

    Data engineers are the people without whom analysts would fall asleep waiting for a database query to finish, and data scientists would drown in data. It's time to tell others, and ourselves, why and how we work.

    Unfortunately, almost the only specialized conference for data analysts and data engineers in St. Petersburg was canceled this year, but we at the Wrike Tech Club decided not to stay sad for long and are holding a cozy, informal meetup with great speakers on November 15th.

    Do you work with data that doesn't fit into RAM? Do you have to use distributed computing? Congratulations, you're a data engineer. For many in IT, the term sounds like just another buzzword, somewhere between Lean Analytics and Artificial Intelligence. We want to talk about data engineering as a specialty in its own right, not as small talk at the next Big Data meetup.

    Program and speakers:

    Alexander Eliseev, Wrike - Data Engineering: how to go from Data to Engineering

    We will talk about approaches to processing clickstream data and how our thinking shifted from analytics to data engineering, which engineering principles we violated, and how to stop violating them in data engineering. I will cover the problems we ran into, using as examples our mistakes in designing data sources (from ETL with data marts to a more complex scheme), our pipelines in Airflow, and the limitations of our technologies (ORC, Tableau, lack of resources, Jenkins pipelines). You will learn how we changed our approach to pipeline design and data processing.

    Vitaly Khudobakhshov, JetBrains - Testing applications in Apache Spark

    The cost of errors in data analysis applications is often very high. At the same time, data plays a much larger role in failures, relative to code, than it does in ordinary applications. How do you minimize errors in applications that are difficult to test and debug? How do you write code and tests so that several hours of expensive machine time are not wasted? That is what I want to talk about.

    Sergey Isaev, DataFabric - How to manage data and preserve knowledge using semantic technologies

    I'll tell you about:

    • collecting, converting and managing data;
    • knowledge graphs;
    • ontological modeling of the domain;
    • linked data;
    • application of semantic technologies for building intelligent information systems.

