An Ad Exchange Server Unlike the Others

    An Ad Exchange for Real-Time Bidding (RTB) is one of the AdTech solutions reshaping the online advertising market. Its main function is to connect a large number of SSPs and DSPs that have no direct integration with each other, and to resell advertising traffic between them.

    Thanks to an order for the US market, we dove into the specifics of building an Ad Exchange platform. In this article we present some ideas and results.

    image

    Problem statement


    Real-Time Bidding (RTB) sells advertising space on sites in real time so that relevant ads are shown to the target audience.

    In short, the process diagram is as follows:

    image

    • the end user requests a web page or opens a mobile application where space is reserved for a banner (the embedded code that sells this advertising inventory belongs to the SSP, Supply Side Platform);
    • to sell the SSP's inventory at the highest possible price, the Ad Exchange runs an auction among various DSPs (Demand Side Platforms), whose goal is to buy inventory as cheaply as possible;
    • after the auction winner is announced, the winning DSP sends the banner code to the SSP, and the banner is shown to the user;
    • another participant in the process is the DMP (Data Management Platform), a third-party system that gives the DSP detailed information about the end user (beyond what arrives with the SSP request, such as cookies) to justify the purchase and the proposed price.

    There are quite a few Ad Exchanges today. In addition, many SSPs run their own auctions (effectively covering the Ad Exchange functionality themselves). But our customer was sure that a few new ideas would let him enter the market quickly and withstand the competition.

    Exchanges work on different principles: some take a higher margin, some a lower one, some trade unique inventory, and others focus on what is essentially commodity inventory. The market is quite young and actively developing, so there are no business models proven over the years: everything is built on bold hypotheses and experiments. Most players work in a simple way: they receive a request from one of the few SSPs they have managed to sign up and send it to all integrated DSPs, hoping for the best bid. Ad Exchange revenue is the difference between the price at which inventory is sold to the DSP and the price at which it is bought from the SSP, minus operating costs.

    Our client proposed optimizing this scheme by distributing SSP requests to DSPs selectively: not sending out obviously “losing” requests, and thereby reducing operating costs. This makes it possible to lower the exchange's commission without losing income, and to make the offer more attractive than that of competing Ad Exchanges in the fight for SSPs and DSPs. Connecting more partners, in turn, brings both income and stability in the market.

    To implement this strategy in the US market, we were asked to build an Ad Exchange with smart request distribution that would deliver a good fill rate (the share of requests that are actually bought). In theory, such distribution can draw on a lot of information that accompanies the request, even data from the third-party systems (DMPs) mentioned above. However, complex analytics requires resources, so the real task is to find a balance between the cost of smart distribution and the gain it brings compared to other market players. In a relatively new, immature market, building very complex solutions to squeeze out tenths of a percent of optimization simply does not make sense.

    An important feature of the project, in addition to the expected high load, was a non-functional requirement on the speed of the auction announced by the SSP. In this market segment an SSP typically waits 300 ms for a response, and we had to meet that timeout while still calling external systems (DSPs).

    The project started in the fall of 2016. Thanks to the team's experience in this area, we built the first prototype in three months, and after another three months the MVP (Minimum Viable Product) was ready. It let us collect the first analytics needed to launch smart request distribution inside the Ad Exchange.

    The launch of the MVP confirmed the hypothesis about the project's commercial success: the Ad Exchange began to earn money for the client. The original plan for developing the Ad Exchange involved deeper data analysis, feeding end-user information from external systems into the analytics. But at the MVP stage it was decided to use only the data that the SSP provides. That was enough to get the expected profit.

    Solution architecture


    The solution is built on the Chain of Responsibility pattern, which avoids hard-coding the request route within the system and makes it easy to add handlers and services, from the auction itself to filtering tools.
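
    A minimal sketch of the pattern in Java (16+), not the actual Ad Exchange code. The names BidRequest, Handler, and HandlerChain are hypothetical and exist only for illustration. Each handler either passes a (possibly modified) request on or stops processing by returning an empty result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical request type: only the fields this sketch needs.
record BidRequest(String sspId, String ip, String userAgent) {}

// One link in the chain. Returning Optional.empty() stops processing.
interface Handler {
    Optional<BidRequest> handle(BidRequest request);
}

// Runs handlers in order; the route is data, not hard-coded control flow.
final class HandlerChain {
    private final List<Handler> handlers = new ArrayList<>();

    HandlerChain add(Handler handler) {
        handlers.add(handler);
        return this;
    }

    Optional<BidRequest> process(BidRequest request) {
        Optional<BidRequest> current = Optional.of(request);
        for (Handler handler : handlers) {
            if (current.isEmpty()) {
                break; // an earlier handler rejected the request
            }
            current = handler.handle(current.get());
        }
        return current;
    }
}
```

    Adding a new filter or service then means one more add(...) call when the chain is assembled, without touching the existing handlers.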

    image

    The customer did not restrict our choice of technology stack. So, with the future development and support of the project in mind, we built a horizontally scalable solution using Postgres and Hadoop.

    The Ad Exchange itself is written in Java. We did not use any frameworks, so as not to lose performance under load (we worked at a low level).
    To meet the SSP timeout mentioned above, we tuned the garbage collector parameters (we use G1) and handled a large number of requests asynchronously: we used an HTTP client that does not block the thread, together with HTTP keep-alive, which allows several requests to be sent over a single TCP connection.
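
    The idea of querying many DSPs without blocking and under a hard deadline can be sketched as follows. This is an illustration under assumptions, not the production code: BidFanOut is a hypothetical name, the futures stand in for in-flight HTTP calls (the article does not name the HTTP client used; in modern Java they could come, for example, from the JDK's HttpClient.sendAsync), and each bid is reduced to a single price.

```java
import java.util.List;
import java.util.Objects;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

final class BidFanOut {
    // Collects the prices of all calls that finish within timeoutMs.
    // A call that misses the deadline or fails is replaced by null and dropped.
    static List<Double> collect(List<CompletableFuture<Double>> calls, long timeoutMs) {
        List<CompletableFuture<Double>> bounded = calls.stream()
                .map(call -> call
                        .completeOnTimeout(null, timeoutMs, TimeUnit.MILLISECONDS)
                        .exceptionally(error -> null))
                .collect(Collectors.toList());
        // All timers were started together, so the total wait is about timeoutMs.
        return bounded.stream()
                .map(CompletableFuture::join)
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }
}
```

    Because the deadline applies to every call at once, one slow DSP cannot push the overall response past the timeout.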

    Software components are deployed on hardware leased from a hosting provider. The task did not allow us to use the cloud, because virtual cloud machines share overcommitted resources: allocating the necessary resources may take time, and we do not have it. At the moment, the Ad Exchange uses four physical servers, one of which is redundant (for seamless updates, etc.).

    The open-source Apache Kafka is used as the message broker. It fits our “one subscriber, many publishers” model perfectly, although it needed a little tuning so that duplicate messages did not arrive.
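
    The article does not say how duplicates were suppressed. One common approach, shown here purely as an assumption, is to make the consumer idempotent by remembering recently seen message IDs. RecentIdFilter is a hypothetical helper and does not use the Kafka API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Remembers the most recent message IDs and reports repeats.
final class RecentIdFilter {
    private final Map<String, Boolean> seen;

    RecentIdFilter(int capacity) {
        // Access-ordered map that evicts the least recently used ID once full.
        this.seen = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns true the first time an ID is seen, false for a repeat.
    synchronized boolean firstTime(String messageId) {
        return seen.put(messageId, Boolean.TRUE) == null;
    }
}
```

    The bounded size keeps memory constant; the trade-off is that a duplicate arriving after its ID has been evicted is processed again.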

    In normal mode each server handles on the order of 10 thousand requests per second (this figure was set as a design target). The average load is now 15-20 thousand requests per second, and at peak the request flow reached 40 thousand per second for several hours; the Ad Exchange coped with it perfectly.

    Requests are distributed between servers by the nginx software load balancer, configured for our task. In our experience, nginx can handle up to 60-70 thousand requests per second without a dedicated hardware balancer. If the Ad Exchange load rises above this threshold in the future, we plan to purchase a hardware balancer that will distribute requests among several identical nginx instances.

    As the load keeps growing, a monitoring system that is part of the Ad Exchange keeps track of what is happening.

    Storage


    Because request distribution relies on analytics, the database is an integral part of our Ad Exchange. The system stores information about bids, bidders, and completed deals.

    It makes no sense to keep this volume of data for the entire lifetime of the Ad Exchange, so the storage has a multi-layered architecture. All raw auction data is stored for a week. Higher-level intermediate aggregates are built from it and stored for several months. Final aggregates are then built from the intermediate ones; they are used in long-term analytics and for reconciliation with SSPs and DSPs. Among other things, these aggregates record how many bids were made and how much money the exchange owes the SSP or expects to receive from the DSP.
    Final aggregates are stored for the entire lifetime of the Ad Exchange.
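
    The roll-up from raw records into a higher-level aggregate can be sketched as below. The record layout (Deal, DailyTotal), the per-day and per-DSP grouping, and the micro-dollar units are assumptions made for illustration; the article does not describe the real schema, and the real aggregates live in the database rather than in memory.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical raw record: one completed deal, amounts in micro-dollars.
record Deal(LocalDate day, String dspId, long dspPaysMicros, long sspGetsMicros) {}

// Hypothetical aggregate row: totals for one day and one DSP.
record DailyTotal(long deals, long dspPaysMicros, long sspGetsMicros) {
    DailyTotal plus(Deal d) {
        return new DailyTotal(deals + 1,
                dspPaysMicros + d.dspPaysMicros(),
                sspGetsMicros + d.sspGetsMicros());
    }
}

final class Rollup {
    // Groups raw deals by "day|dspId" and sums the amounts for reconciliation.
    static Map<String, DailyTotal> byDayAndDsp(List<Deal> deals) {
        Map<String, DailyTotal> totals = new TreeMap<>();
        for (Deal d : deals) {
            String key = d.day() + "|" + d.dspId();
            totals.merge(key, new DailyTotal(0, 0, 0).plus(d), (old, fresh) -> old.plus(d));
        }
        return totals;
    }
}
```

    Once such totals exist, the raw rows they were built from can be dropped after their retention period without losing the figures needed for reconciliation.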

    Separate services collect the analytics and build the aggregates.

    For the storage to keep up with the speed of the system itself, it also needed work. In particular, for some time we struggled with the hosting provider because transaction data simply could not be written to the database fast enough. The culprit turned out to be a hardware problem with the RAID array. After it was replaced, we were able to squeeze 90 thousand queries per second out of Postgres (inserting data into the database).

    The rest of the Ad Exchange is stateless, which makes future horizontal scaling easy. It stores no data about requests; at most, it keeps the derived information about which DSPs to choose. So we can add new servers to process requests as needed.

    Traffic filtering


    The key element of the system, which reduces the load and makes it possible to meet the timeouts specified by the customer, is traffic filtering.

    By its purpose, any Ad Exchange:
    • accepts requests from the SSP;
    • holds an auction (sends requests to several DSPs, compares the offered prices, identifies the winner);
    • agrees the win with the SSP (reports the winner's price minus its own commission, waits for a response with the final price of the impression);
    • completes the transaction (notifies the winning DSP of its victory, handles the user traffic).

    Smart request distribution in our Ad Exchange kicks in at the start of the auction.

    On receiving a request from an SSP with certain information (IP, user agent), we enrich it with information accumulated in the system: known data about the user, the list of DSPs to which similar requests were sent, their responses, and so on. This is needed to select the most advantageous combination of DSPs for each request. By selecting such a combination, the system avoids sending requests to DSPs that do not bid, or that bid too low. To do this, a separate service builds, in real time, a map of how DSPs respond to requests (these maps are stored in Redis).
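
    A simplified sketch of such a response map follows. It is illustrative only: DspResponseMap is a hypothetical name, an in-memory map stands in for Redis, the "segment" key (for example a country and device class) is an assumed way of grouping similar requests, and only the "does not bid" case is covered, not bids that are too low.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.Collectors;

final class DspResponseMap {
    private static final class Stats {
        final LongAdder sent = new LongAdder();
        final LongAdder bids = new LongAdder();
    }

    // Key: "dspId|segment"; the segment is a hypothetical class of similar requests.
    private final Map<String, Stats> stats = new ConcurrentHashMap<>();

    // Records one request sent to a DSP and whether it answered with a bid.
    void record(String dspId, String segment, boolean bidReceived) {
        Stats s = stats.computeIfAbsent(dspId + "|" + segment, k -> new Stats());
        s.sent.increment();
        if (bidReceived) {
            s.bids.increment();
        }
    }

    // Keeps DSPs with no history yet (so they can be tried) or with a
    // bid rate at or above minBidRate for this segment.
    List<String> select(List<String> dspIds, String segment, double minBidRate) {
        return dspIds.stream().filter(id -> {
            Stats s = stats.get(id + "|" + segment);
            if (s == null || s.sent.sum() == 0) {
                return true;
            }
            return (double) s.bids.sum() / s.sent.sum() >= minBidRate;
        }).collect(Collectors.toList());
    }
}
```

    A production version would also age out old statistics so that a DSP that starts bidding again gets another chance.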

    In parallel, we monitor the state of each DSP: if the share of responses arriving within the timeout drops, the system automatically reduces the number of requests to this DSP. As soon as the load on the DSP decreases (and the share of valid responses within a reasonable time increases), the number of requests gradually returns to the previous level.
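
    This back-off and gradual recovery can be sketched as a per-DSP throttle. The class name DspThrottle and all constants (halving on degradation, recovering in steps of 0.1, a floor of 5% of traffic) are assumptions chosen for illustration, not values from the real system.

```java
final class DspThrottle {
    private double share = 1.0; // fraction of eligible requests sent to this DSP

    // onTimeRate: share of responses that arrived within the timeout in the last window.
    synchronized void update(double onTimeRate, double minOnTimeRate) {
        if (onTimeRate < minOnTimeRate) {
            share = Math.max(0.05, share * 0.5); // back off quickly, keep a trickle for probing
        } else {
            share = Math.min(1.0, share + 0.1);  // recover gradually
        }
    }

    // random01 is a uniform random number in [0, 1) supplied by the caller.
    synchronized boolean allow(double random01) {
        return random01 < share;
    }

    synchronized double share() {
        return share;
    }
}
```

    Keeping a small floor of traffic is what lets the system notice that an overloaded DSP has recovered.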

    Among the DSPs that responded in time, we hold an internal auction: we pick the best offer and send it to the SSP. No more than 300 ms pass between the SSP's request and our response, in line with industry requirements.
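
    A minimal sketch of this step, assuming the simplest possible rule: the highest bid wins, and the price offered to the SSP is that bid minus the exchange's commission. The names DspBid, SspOffer, and InternalAuction, the micro-dollar units, and the commission expressed in basis points are all illustrative assumptions.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical types: a bid from a DSP and the offer forwarded to the SSP.
record DspBid(String dspId, long priceMicros) {}
record SspOffer(String dspId, long priceToSspMicros) {}

final class InternalAuction {
    // commissionBps: exchange commission in basis points (100 bps = 1%).
    static Optional<SspOffer> run(List<DspBid> bids, int commissionBps) {
        return bids.stream()
                .max(Comparator.comparingLong(DspBid::priceMicros))
                .map(best -> new SspOffer(best.dspId(),
                        best.priceMicros() * (10_000 - commissionBps) / 10_000));
    }
}
```

    An empty result means that no DSP bid in time, so there is nothing to offer the SSP for this request.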

    Since we pass our offer to the SSP, which holds its own auction, we also need to account for whether our bid won there. The auction server handles this at the next stage, when processing user traffic. This enriches the DSP response map with new data (along with the information collected about the end user).

    Comparing the data obtained at the auction stage with the parameters known from user traffic makes it possible to filter out bots (ad clickers, search bots, etc.). DSPs do not pay for such traffic, and without its own filtering system the exchange takes it as a loss for the customer, which has to be covered out of the margin.
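
    The article does not list the actual filtering rules. As an illustration of comparing the two stages, the sketch below flags user traffic whose IP or user agent differs from what was seen at auction time, or which arrives implausibly late. The types AuctionView and UserHit, the class BotCheck, and both rules are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical snapshots of the same impression at the two stages.
record AuctionView(String ip, String userAgent, Instant wonAt) {}
record UserHit(String ip, String userAgent, Instant arrivedAt) {}

final class BotCheck {
    // Suspicious if the client changed between stages or the timing is implausible.
    static boolean looksSuspicious(AuctionView auction, UserHit hit, Duration maxDelay) {
        boolean sameClient = auction.ip().equals(hit.ip())
                && auction.userAgent().equals(hit.userAgent());
        Duration delay = Duration.between(auction.wonAt(), hit.arrivedAt());
        boolean plausibleTiming = !delay.isNegative() && delay.compareTo(maxDelay) <= 0;
        return !sameClient || !plausibleTiming;
    }
}
```

    Rules this simple will miss sophisticated bots, but they are cheap enough to run on every request.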

    It should be noted that bot traffic filtering was not launched right away. But once simple blocking rules were enabled, the margin grew by about 50%.

    By the way, in addition to the automatic traffic filtering tools, our system lets the customer manually change the threshold values of a number of parameters, thus adjusting the margin.

    The user traffic itself is critical for us, but processing it no longer has to fit into 300 ms. A separate processing system is used for it, which may make the user wait a little but will not lose the request.

    To keep the solution stable, we introduced a subsystem that, knowing the current Ad Exchange load, “cuts off” auction requests that it physically cannot process. This protects the system from uncontrolled load growth on the SSP side.
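
    One simple way to implement such a cut-off, shown here as an assumption rather than the actual mechanism, is to cap the number of auctions in flight and reject the excess immediately instead of queueing it. LoadShedder is a hypothetical name.

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

final class LoadShedder {
    private final Semaphore slots;

    LoadShedder(int maxInFlight) {
        this.slots = new Semaphore(maxInFlight);
    }

    // Runs the auction if a slot is free; otherwise rejects at once without waiting.
    <T> Optional<T> tryRun(Supplier<T> auction) {
        if (!slots.tryAcquire()) {
            return Optional.empty();
        }
        try {
            return Optional.ofNullable(auction.get());
        } finally {
            slots.release();
        }
    }
}
```

    Rejecting early costs almost nothing, while a queued request would most likely miss the 300 ms window anyway.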

    Outlook


    Today the Ad Exchange we built is running and bringing in a good profit. We support it and integrate new partners (DSPs and SSPs) as needed. Several dozen systems have been integrated so far. Each integration involves not only a software connection but also comprehensive testing of the service, because under heavy load problems in a connected service may affect other partners.

    In general, the market is moving toward SSPs and DSPs connecting directly, which would make exchanges unnecessary. But integration is limited by the capabilities of SSPs and DSPs. Although an open, documented API exists (the OpenRTB protocol), it is not yet universally adopted in the market. For example, a player as large as AppNexus added OpenRTB support only quite recently.

    Essentially, an Ad Exchange is a liquidity provider. So the solution is unlikely to lose its relevance in the near future. Moreover, in the rest of the advertising market the exchange model is only gaining popularity.



    Article author: Nikolay Eremin

    P.S. We publish our articles on several Russian-language sites. Subscribe to our pages on VK, FB, or our Telegram channel to learn about all our publications and other news from Maxilect.
