Technopolis lectures. Designing high-load systems (fall 2017)

    We are starting to publish lectures from Technopolis, an educational project of the Odnoklassniki team at Peter the Great St. Petersburg Polytechnic University. Building high-load applications involves not only designing and writing code but also many other aspects throughout the product life cycle. We will walk through the entire process of building and operating a high-load system. Particular attention is paid to operations, networking, load balancing, the memory hierarchy, and everyday tools. We will also cover monitoring, auditing, and much more. The course is taught by a team of experts led by Vadim Tsesko, a lead developer at Odnoklassniki.

    Lecture list:

    1. Introduction (Vadim Tsesko, @incubos)
    2. Typical architectures (Alexander Khristoforov)
    3. Operation (Ilya Shchanikov)
    4. Network stack (Dmitry Samsonov, @dmitrysamsonov)
    5. Balancing (Andrey Domas)
    6. Processors and memory (Alexey Gorbov)
    7. Data storage (Sergey Egorichev)
    8. JVM (Andrey Pangin, @apangin)
    9. Monitoring (Sergey Sharapov, @Sharapoff)
    10. Clouds (Leonid Talalaev)

    Lecture 1. Introduction (Vadim Tsesko)

    Video on the Technostream channel

    The lecture discusses the stages in the development of the Web and information systems, from the advent of the Internet to cloud computing. It covers the objectives and content of the course, as well as sources for further self-study on the topic. It closes with the individual course project, in which each student develops their own distributed key-value store in Java with an HTTP API.

    Development of the store is divided into three stages: a local version, a distributed version, and load testing with optimization. An informal specification of the store is given, and a set of functional tests is provided. The choice of technologies is unrestricted, apart from the Java language. Solutions are submitted as pull requests on GitHub.

    At the end of the introductory lecture, the first stage of the course project is completed in a live demo. The third stage uses popular open-source load-testing tools such as wrk and Yandex.Tank, along with async-profiler for profiling.
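
    To give a rough idea of what the first (local) stage might look like, here is a minimal sketch of an in-memory key-value store behind an HTTP API. It is not the course's reference solution: the class name KeyValueSketch, the /entity?id=<key> route, the port, and the status codes are all assumptions made for illustration, and the real specification is the one given in the course project. The sketch uses only the JDK's built-in com.sun.net.httpserver package and needs Java 9 or later.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class name; an illustration only, not the course's reference solution.
public class KeyValueSketch {
    // In-memory storage only; a real first stage would likely persist data to disk.
    private final Map<String, byte[]> data = new ConcurrentHashMap<>();

    public void put(String key, byte[] value) {
        data.put(key, value);
    }

    public byte[] get(String key) {
        return data.get(key); // null when the key is absent
    }

    public boolean delete(String key) {
        return data.remove(key) != null;
    }

    // Assumed route: /entity?id=<key>. The real API is defined by the course specification.
    public HttpServer createServer(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/entity", this::handle);
        return server;
    }

    private void handle(HttpExchange exchange) throws IOException {
        String query = exchange.getRequestURI().getQuery();
        if (query == null || !query.startsWith("id=") || query.length() == 3) {
            respond(exchange, 400, new byte[0]); // missing or empty key
            return;
        }
        String key = query.substring(3);
        switch (exchange.getRequestMethod()) {
            case "GET": {
                byte[] value = get(key);
                if (value == null) {
                    respond(exchange, 404, new byte[0]);
                } else {
                    respond(exchange, 200, value);
                }
                break;
            }
            case "PUT":
                put(key, exchange.getRequestBody().readAllBytes());
                respond(exchange, 201, new byte[0]);
                break;
            case "DELETE":
                delete(key);
                respond(exchange, 202, new byte[0]);
                break;
            default:
                respond(exchange, 405, new byte[0]);
        }
    }

    private static void respond(HttpExchange exchange, int status, byte[] body) throws IOException {
        // A length of -1 tells the server that no response body will be sent.
        exchange.sendResponseHeaders(status, body.length == 0 ? -1 : body.length);
        if (body.length > 0) {
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        }
        exchange.close();
    }

    public static void main(String[] args) throws IOException {
        // Port 8080 is an arbitrary choice for the sketch.
        new KeyValueSketch().createServer(8080).start();
    }
}
```

    With the server running, a value could be stored with an HTTP PUT to /entity?id=foo and read back with a GET to the same URL. A distributed second stage would then have to decide how keys are spread across nodes and how replicas stay consistent, which this sketch does not attempt.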

    Lecture 2. Typical architectures (Alexander Khristoforov)

    Video on the Technostream channel

    Using a messenger as an example, the lecture considers various architecture options along with their advantages, disadvantages, and pitfalls. It touches on performance and optimization and discusses popular ways to scale systems for both higher performance and fault tolerance. Replication options and data consistency issues are covered in detail. The lecture closes with caching, queues, and microservices.

    Lecture 3. Operation (Ilya Shchanikov)

    Video on the Technostream channel

    The lecture addresses the problem of ensuring high availability of distributed services. It discusses what operations involves: the various tasks, the roles, and the teams that handle them. It touches on the most important aspects of ITIL, based on practices developed at Odnoklassniki: configuration, change, availability, incident, and problem management. It also covers the differences in the tools used to administer large systems.

    Lecture 4. Network stack (Dmitry Samsonov)

    Video on the Technostream channel

    The entire network stack is examined in detail, from network cards and TCP/IP to HTTPS and QUIC, from both the theoretical and the practical point of view. Particular attention is paid to the practical details of the network implementation in Linux, the most important aspects of TCP, and tools for configuring and optimizing the network. In conclusion, the newest protocols and the motivation behind their development are described.

    Lecture 5. Balancing (Andrey Domas)

    Video on the Technostream channel

    We talk about what load balancing is, which tasks it solves, and which problems it creates. Common solutions are presented at different levels of the network model: master-slave, L4, GSLB DNS. CDN mechanisms are discussed: anycast and unicast. Additionally, some types of attacks that can be countered with balancing are considered.

    Lecture 6. Processors and memory (Alexey Gorbov)

    Video on the Technostream channel

    A brief history of the industry and of the evolution of processor architecture and its mechanisms: from the von Neumann architecture to pipelining, the specifics of multi-core architecture, multi-level caches, and extended instruction sets. The second part of the lecture is devoted to the main aspects of RAM management, using Linux as an example: virtual memory, process memory, NUMA, pages, caching. In conclusion, typical problems are examined, along with their diagnosis and resolution.

    Lecture 7. Data storage (Sergey Egorichev)

    Video on the Technostream channel

    We talked about the variety of data storage systems, their basic characteristics, and how they differ from each other. We examined the basic principles of operation of HDDs and SSDs, their particular features, and their strengths and weaknesses. We looked at common optimization methods, the most popular I/O schedulers, and file systems. We discussed why storage needs to be monitored and what deserves special attention. We also defined the basic requirements that high-load systems place on storage and tried to answer the questions about choosing a storage type that often arise during application design.

    Lecture 8. JVM (Andrey Pangin)

    Video on the Technostream channel

    The lecture is devoted to the design of modern Java virtual machines and the specifics of developing high-load systems in Java. The main components of the JVM are analyzed: class loader, interpreter, JIT compiler, and garbage collector. HotSpot JVM compiler optimizations and techniques for measuring the performance of Java programs are discussed. Memory management mechanisms and various GC algorithms are considered, from Mark-Sweep to Shenandoah. Recommendations are given on optimizing server applications, reducing pauses, and making effective use of the Linux network stack from Java.

    Lecture 9. Monitoring (Sergey Sharapov)

    Video on the Technostream channel

    The lecture is devoted to the monitoring systems at Odnoklassniki, which are responsible for detecting anomalies in both the hardware and the business logic of the portal. It covers which open-source solutions are used, why they were chosen, and what problems we encountered while operating them. The first part outlines our approach to selecting monitoring tools when the question arises of adopting a new solution or improving the current one. The second part is devoted to how our own system for monitoring business logic was developed and deployed, and how this system, which consists of many closely related components, works today.

    Lecture 10. Clouds (Leonid Talalaev)

    Video on the Technostream channel

    This lecture is about cloud computing and the advantages and disadvantages of using clouds for various tasks. The IaaS, PaaS, and SaaS models are compared with the classic IT model. Virtualization and its support in operating systems and modern CPUs are discussed. The lecture shows how virtual machines differ from containers and demonstrates live how containerization works, using Docker as an example. It explains what cloud providers offer, using AWS, Google Cloud, and Azure as examples, why you sometimes need your own cloud for internal needs, and, in the most general terms, how one is built.

    The playlist of all lectures is available via the link, and also on YouTube via the link.

    The course project is available here.
    Various solutions can be found among the closed pull requests.

    Broadcasts and video recordings of other Technopolis courses can be found in the project's official group on OK.

    As a reminder, the Technostream channel carries up-to-date lectures and master classes on programming and data analysis by IT specialists from all of Mail.Ru Group's educational projects: Technoatom, Technopark, Technopolis, Technosphere, and Technotrek.
