Kafka at Wargaming: a blitz interview

    Why Kafka? What are the general impressions? What is the composition of the clusters? Below are a dozen short questions for Levon Avakian, who is responsible for reliability, application architecture, infrastructure and production at Wargaming.



    - How did you choose Kafka? What was used before? What alternatives were considered?

    That is not quite the right question for tank development. Apache Kafka was already in use at the company for our Data Warehouse. Initially we only had an integration task, and then we saw that Kafka could be used for other tasks as well.

    - How many events are generated by your game cluster?

    A tank cluster is a cluster of clusters: the system is distributed and sends events to different Kafka clusters. Together, all clusters generate about 12 thousand messages per second on average, peaking at about 30 thousand messages per second.

    - And how many clusters do you have and what is their composition?

    The largest, central cluster consists of five bare-metal nodes. The smaller clusters, which serve only the tank peripheries, have about three nodes each plus virtual machines. We have four local clusters for the CIS region.

    - How many producers and consumers do you have? What are the read / write rates?

    Good question. For a local peripheral Kafka there is one producer, the tank cluster, and dozens of consumers. As for rates: the central cluster takes up to 75 thousand messages per second at peak and 12 thousand on average; a local cluster takes up to seven thousand at peak and three thousand on average.

    - How large are the events you write to Kafka? Is there a time limit for delivery?

    The limit is 1 MB, and no one has asked for more. Some consumers have delivery-time requirements and some do not. Some read once a week.

    - Have you encountered any interesting features and bugs during sharding or replication?

    We ran into data loss during leader re-election because of topic settings: unclean ("dirty") leader election was allowed, and an incorrect ISR was chosen.
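
    The settings involved in this kind of loss can be checked mechanically. The sketch below is only an illustration: the helper name `durability_warnings` and the rule that `min.insync.replicas` should be at least 2 on a replicated topic are assumptions, not Wargaming's actual tooling. The keys `unclean.leader.election.enable` and `min.insync.replicas` are standard Kafka topic configs. Missing keys are assumed to have the defaults of recent Kafka releases.

```python
def durability_warnings(topic_config, replication_factor):
    """Return warnings for topic settings that can lose acknowledged data.

    topic_config: dict of Kafka topic config keys to values (Kafka reports
    them as strings). Absent keys are ASSUMED to be "false" and "1".
    """
    warnings = []

    unclean = str(topic_config.get("unclean.leader.election.enable", "false"))
    if unclean.strip().lower() == "true":
        warnings.append(
            "unclean.leader.election.enable=true: "
            "an out-of-sync replica may be elected leader"
        )

    min_isr = int(topic_config.get("min.insync.replicas", 1))
    if replication_factor >= 2 and min_isr < 2:
        warnings.append(
            "min.insync.replicas<2: a write may be acknowledged "
            "while it exists on a single replica only"
        )

    return warnings


if __name__ == "__main__":
    risky = {"unclean.leader.election.enable": "true", "min.insync.replicas": "1"}
    for line in durability_warnings(risky, replication_factor=3):
        print(line)
```

    The `min.insync.replicas` check matters only for producers that send with `acks=all`; with weaker acknowledgement settings the broker does not enforce it.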

    - Did you ever hit disk or network limits?

    We never hit the network limit; we have 10 Gbps network interfaces. We did not hit the disk limit either. What we did hit was running out of file descriptors. Stability came after upgrading from java-1.7.0-openjdk-1.7.0.55-2.4.7.1.el6_5.x86_64 to jdk1.8.0_66-1.8.0_66-fcs.x86_64.
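
    Descriptor exhaustion is easy to watch for before it becomes an outage. The following is a minimal sketch for Linux, not something taken from the interview: the function names are hypothetical, and reading `/proc/<pid>/fd` assumes a Linux host and permission to inspect the process.

```python
import os
import resource  # Unix-only standard library module


def fd_usage_ratio(open_fds, soft_limit):
    """Fraction of the soft file-descriptor limit currently in use."""
    if soft_limit <= 0:
        raise ValueError("soft_limit must be a positive number")
    return open_fds / soft_limit


def count_open_fds(pid="self"):
    """Count open descriptors of a process via /proc (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def own_soft_limit():
    """Soft RLIMIT_NOFILE of the current process, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return None if soft == resource.RLIM_INFINITY else soft
```

    To check a broker rather than the script itself, one would pass the broker's PID to `count_open_fds` and take the limit from `/proc/<pid>/limits`, since `own_soft_limit` only describes the Python process that calls it.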

    - What kind of overhead does the JVM bring in the case of Kafka? Does it need special GC settings? How much memory does one instance use in your case?

    12 GB of memory is allocated, everything else is standard.

    - Did you have to use any special features of Kafka? Log Compaction?

    We used Log Compaction for some topics, but not for the World of Tanks project. It was enabled on specific topics, but the result is unclear: no one gave feedback. We also increased offsets.retention.minutes to seven days, so that consumers that read once a week could continue reading from where they left off.
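
    The value for `offsets.retention.minutes` is given in minutes, so seven days has to be converted. The sketch below does that arithmetic and adds a deliberately simplified check; the helper names are hypothetical, and the model (committed offsets survive if the gap between reads is shorter than the retention period) ignores how the exact expiry rules differ between Kafka versions.

```python
from datetime import timedelta


def retention_minutes(period):
    """Convert a retention period into whole minutes for the broker config."""
    return int(period.total_seconds() // 60)


def offsets_outlive_gap(read_gap, retention):
    """Simplified model: offsets survive if the gap is shorter than retention."""
    return read_gap < retention


SEVEN_DAYS = timedelta(days=7)

if __name__ == "__main__":
    print(f"offsets.retention.minutes={retention_minutes(SEVEN_DAYS)}")
    # prints: offsets.retention.minutes=10080
```

    In this simplified model a reader whose gap is exactly seven days sits on the boundary, so some margin above the longest expected pause may be worth considering.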

    - Which Python libraries did you use to work with Kafka? What did you like?

    One of my talks at Moscow Python Conf++ will be about exactly this: our experience of using various Python libraries for Kafka in WoT. We have used kafka-python, confluent-kafka-python and aiokafka. Each of these libraries has its pros and cons.

    - What would you say about the advantages and disadvantages of file-based storage in comparison with in-memory? For what types of tasks would you recommend one or the other?

    The principle here is simple. The file system is more reliable but slower. In-memory storage is faster but less reliable. There is also an important limit on volume: you can store terabytes in the file system, while in memory we still work with gigabytes. The rest depends on the specific implementation.

    Based on the above: if you need speed, the volume is small and durability is not important, choose in-memory; otherwise look at file-based storage.
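
    The rule of thumb above can be written down as a single condition. This is only a restatement of that rule; the function and parameter names are made up for illustration.

```python
def choose_storage(needs_speed, small_volume, durability_required):
    """Apply the rule of thumb: in-memory only when all three conditions hold."""
    if needs_speed and small_volume and not durability_required:
        return "in-memory"
    return "file-based"
```

    For example, `choose_storage(True, True, False)` returns `"in-memory"`, while any case that requires durability falls back to `"file-based"`.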

    - General impressions of Kafka? If you were solving the same task now, would you stay with Kafka or look at other solutions?

    Kafka is a good and simple tool for giving outside consumers access to large volumes of data, which different teams in different places can then process at their own pace and for their own purposes. In WoT we have many different tools for solving our problems, so where Kafka is appropriate we choose Kafka, and where it is not we look at other tools.

    Again, if you are interested in the details of our experience with Kafka, come to my talk at Moscow Python Conf++. I hope many will find it interesting and useful.
