Proxygen: Facebook's C++ HTTP Framework

Original authors: Daniel Sommermann, Alan Frindell
Proxygen is a collection of C++ libraries for working with the HTTP protocol, including, among other things, a very easy-to-use HTTP server. In addition to classic HTTP/1.1, the Proxygen framework supports SPDY/3 and SPDY/3.1. Full HTTP/2 support is coming soon.

Proxygen was not conceived as a replacement for Apache or nginx. Those projects focus on building flexible, highly configurable web servers that can be fine-tuned for maximum performance. Proxygen's goal is to perform well with its default settings while giving the programmer an easy-to-use web server and web client that integrate readily into existing projects. We want to help people build web services in C++ at a lower cost, and we believe Proxygen is a great framework for that. You can read the documentation and join the development on GitHub.


Background


Proxygen began about four years ago as a project to build a custom high-performance reverse proxy with load balancing. We originally planned for Proxygen to become a library for generating proxy servers (you have probably guessed as much from the name). It has evolved considerably since then. We are aware that a decent number of programs already solve similar problems (Apache, nginx, HAProxy, Varnish, and others); nevertheless, we decided to go our own way.

Why did we create our own HTTP stack?

Integration

The ability to integrate quickly and easily with Facebook's existing infrastructure was critical. For example, being able to administer our HTTP infrastructure with tools like Thrift simplifies integration with existing systems. Being able to monitor and measure Proxygen's performance with systems such as ODS (our internal monitoring tool) lets us respond quickly to new data and modify the product. Building our own HTTP stack let us interact more closely with the systems and components we need.

Code reuse

We wanted to create a foundation for building network components across all of our projects. Currently, more than a dozen of our internal systems are built with Proxygen, including parts of systems such as Haystack, HHVM, our HTTP traffic load balancers, and parts of our mobile infrastructure. Proxygen gives us a platform where we can, for example, work on HTTP/2 support and, once it is complete, get that support in all of our products at once.

Scalability

We honestly tried to take existing products and scale them to our entire infrastructure. With some it even worked, and some of those setups lasted a long time. At some point, though, the product in use could no longer keep up with the growth of our capacity.

Features

When Proxygen was being written, a number of features were missing from other similar projects (and some are missing even now). Several of these are very useful to us: SPDY, WebSockets, HTTP/1.1 (keep-alive), TLS False Start, and certain load-balancing features. Building our own HTTP stack freed our hands to implement this functionality.

Proxygen was started in 2011 by several of our engineers who wanted to make the use of the HTTP protocol more efficient. It has been developed by a core team of three or four developers and a dozen internal contributors. Project milestones:

  • 2011 - Development of Proxygen begins. In the same year, the project starts handling part of Facebook's real traffic.
  • 2012 - SPDY/2 support is added.
  • 2013 - SPDY/3 is released to production; work on SPDY/3.1 begins.
  • 2014 - SPDY/3.1 is released to production; work on HTTP/2 begins.


There were other important moments in the project's development, but we think the code will tell that story better than we can.

We now have several years of experience running Proxygen. The library has already processed trillions of HTTP(S) and SPDY requests. We believe the project has reached a stage where we are not ashamed to share it with the community.

Architecture


The core of the HTTP layer is divided into four abstractions: session, codec, transaction, and handler. A session is created for each connection. Each session has a codec responsible for serializing and deserializing frames to and from HTTP messages. The session routes each message from the codec to a specific transaction. The library user writes the handlers for messages arriving on a transaction. This design allows us to support new multiplexed protocols such as SPDY and HTTP/2.
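
To make the division of labor concrete, here is a minimal sketch of the four abstractions. All type names (ToyFrame, ToyHandler, ToyTransaction, ToyCodec, ToySession) and the "id:payload" wire format are invented for this illustration; they are assumptions, not Proxygen's real classes or any real protocol.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified stand-ins. None of these are Proxygen's real classes.

// One unit of data from the connection, tagged with the stream it belongs to.
struct ToyFrame {
  std::uint32_t streamId = 0;
  std::string payload;
  bool endOfMessage = false;
};

// Handler: user-written code that reacts to events on a single transaction.
struct ToyHandler {
  std::string body;
  bool complete = false;
  void onBody(const std::string& chunk) { body += chunk; }
  void onEOM() { complete = true; }
};

// Transaction: one request/response exchange, identified by its stream id.
struct ToyTransaction {
  std::uint32_t id = 0;
  ToyHandler handler;
};

// Codec: turns wire data into frames. The made-up wire format here is
// "<streamId>:<payload>", with a trailing '$' marking the end of a message.
struct ToyCodec {
  static ToyFrame parse(const std::string& line) {
    const std::size_t colon = line.find(':');
    assert(colon != std::string::npos && "malformed toy frame");
    ToyFrame frame;
    frame.streamId =
        static_cast<std::uint32_t>(std::stoul(line.substr(0, colon)));
    frame.payload = line.substr(colon + 1);
    if (!frame.payload.empty() && frame.payload.back() == '$') {
      frame.payload.pop_back();
      frame.endOfMessage = true;
    }
    return frame;
  }
};

// Session: one per connection. It runs the codec and routes every parsed
// frame to the transaction with the matching stream id.
class ToySession {
 public:
  void onIngress(const std::string& line) {
    const ToyFrame frame = ToyCodec::parse(line);
    ToyTransaction& txn = transactions_[frame.streamId];  // created on first use
    txn.id = frame.streamId;
    txn.handler.onBody(frame.payload);
    if (frame.endOfMessage) {
      txn.handler.onEOM();
    }
  }

  const ToyTransaction& transaction(std::uint32_t id) const {
    return transactions_.at(id);
  }

 private:
  std::map<std::uint32_t, ToyTransaction> transactions_;
};
```

Feeding one session the interleaved lines "1:hel", "3:abc$", and "1:lo$" leaves transaction 1 with the body "hello" and transaction 3 with the body "abc". Because the session keys transactions by stream id, messages can be multiplexed over a single connection without the handlers knowing about each other.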


Proxygen makes heavy use of the latest C++ standard and depends on Thrift and Folly. We use move semantics to avoid the cost of copying large objects such as body buffers and request and response headers. In addition, by using non-blocking I/O and Linux epoll under the hood, we have built a server that is efficient in both memory usage and CPU time.
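
The benefit of move semantics for large buffers can be sketched in a few lines of standard C++. ToyMessage, ToyEgressQueue, and makeMessage are hypothetical names made up for this example; Proxygen's own message and buffer types are different, so treat this as an illustration of the general technique only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message type; Proxygen's own message and buffer classes differ.
struct ToyMessage {
  std::vector<std::pair<std::string, std::string>> headers;
  std::vector<char> body;
};

// A queue that takes ownership of the messages handed to it.
class ToyEgressQueue {
 public:
  // Pass by value, then move into storage. A caller that passes an rvalue
  // (for example via std::move) transfers the buffers instead of copying them.
  void enqueue(ToyMessage msg) { queue_.push_back(std::move(msg)); }

  const ToyMessage& back() const {
    assert(!queue_.empty());
    return queue_.back();
  }

  std::size_t size() const { return queue_.size(); }

 private:
  std::vector<ToyMessage> queue_;
};

// Builds a message whose body has the requested number of bytes.
ToyMessage makeMessage(std::size_t bodyBytes) {
  ToyMessage msg;
  msg.headers.emplace_back("Content-Type", "application/octet-stream");
  msg.body.assign(bodyBytes, 'x');
  return msg;
}
```

Calling enqueue(std::move(msg)) on a message with a 1 MiB body hands the existing heap buffer to the queue; with the default allocator the body's data pointer is the same before and after, so no bytes are copied. Calling enqueue(msg) without std::move would still compile, but it would duplicate the whole body.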

HTTP server


The example server included in the release is a good starting point if you want a simple, out-of-the-box, asynchronous server skeleton.

EchoServer.cpp
/*
 *  Copyright (c) 2014, Facebook, Inc.
 *  All rights reserved.
 *
 *  This source code is licensed under the BSD-style license found in the
 *  LICENSE file in the root directory of this source tree. An additional grant
 *  of patent rights can be found in the PATENTS file in the same directory.
 *
 */
#include "EchoHandler.h"
#include "EchoStats.h"
#include "proxygen/httpserver/HTTPServer.h"
#include "proxygen/httpserver/RequestHandlerFactory.h"
#include <folly/Memory.h>
#include <folly/Portability.h>
#include <folly/io/async/EventBaseManager.h>
#include <unistd.h>
using namespace EchoService;
using namespace proxygen;
using folly::EventBase;
using folly::EventBaseManager;
using folly::SocketAddress;
using Protocol = HTTPServer::Protocol;
DEFINE_int32(http_port, 11000, "Port to listen on with HTTP protocol");
DEFINE_int32(spdy_port, 11001, "Port to listen on with SPDY protocol");
DEFINE_int32(thrift_port, 10000, "Port to listen on for thrift");
DEFINE_string(ip, "localhost", "IP/Hostname to bind to");
DEFINE_int32(threads, 0, "Number of threads to listen on. Numbers <= 0 "
             "will use the number of cores on this machine.");
class EchoHandlerFactory : public RequestHandlerFactory {
 public:
  void onServerStart() noexcept override {
    stats_.reset(new EchoStats);
  }
  void onServerStop() noexcept override {
    stats_.reset();
  }
  RequestHandler* onRequest(RequestHandler*, HTTPMessage*) noexcept override {
    return new EchoHandler(stats_.get());
  }
 private:
  folly::ThreadLocalPtr<EchoStats> stats_;
};
int main(int argc, char* argv[]) {
  gflags::ParseCommandLineFlags(&argc, &argv, true);
  google::InitGoogleLogging(argv[0]);
  google::InstallFailureSignalHandler();
  std::vector<HTTPServer::IPConfig> IPs = {
    {SocketAddress(FLAGS_ip, FLAGS_http_port, true), Protocol::HTTP},
    {SocketAddress(FLAGS_ip, FLAGS_spdy_port, true), Protocol::SPDY},
  };
  if (FLAGS_threads <= 0) {
    FLAGS_threads = sysconf(_SC_NPROCESSORS_ONLN);
    CHECK(FLAGS_threads > 0);
  }
  HTTPServerOptions options;
  options.threads = static_cast<size_t>(FLAGS_threads);
  options.idleTimeout = std::chrono::milliseconds(60000);
  options.shutdownOn = {SIGINT, SIGTERM};
  options.handlerFactories = RequestHandlerChain()
      .addThen<EchoHandlerFactory>()
      .build();
  HTTPServer server(std::move(options));
  server.bind(IPs);
  // Start HTTPServer mainloop in a separate thread
  std::thread t([&] () {
    server.start();
  });
  t.join();
  return 0;
}


EchoHandler.cpp
/*
 *  Copyright (c) 2014, Facebook, Inc.
 *  All rights reserved.
 *
 *  This source code is licensed under the BSD-style license found in the
 *  LICENSE file in the root directory of this source tree. An additional grant
 *  of patent rights can be found in the PATENTS file in the same directory.
 *
 */
#include "EchoHandler.h"
#include "EchoStats.h"
#include "proxygen/httpserver/RequestHandler.h"
#include "proxygen/httpserver/ResponseBuilder.h"
using namespace proxygen;
namespace EchoService {
EchoHandler::EchoHandler(EchoStats* stats): stats_(stats) {
}
void EchoHandler::onRequest(std::unique_ptr<HTTPMessage> headers) noexcept {
  stats_->recordRequest();
}
void EchoHandler::onBody(std::unique_ptr<folly::IOBuf> body) noexcept {
  if (body_) {
    body_->prependChain(std::move(body));
  } else {
    body_ = std::move(body);
  }
}
void EchoHandler::onEOM() noexcept {
  ResponseBuilder(downstream_)
    .status(200, "OK")
    .header("Request-Number",
            folly::to<std::string>(stats_->getRequestCount()))
    .body(std::move(body_))
    .sendWithEOM();
}
void EchoHandler::onUpgrade(UpgradeProtocol protocol) noexcept {
  // handler doesn't support upgrades
}
void EchoHandler::requestComplete() noexcept {
  delete this;
}
void EchoHandler::onError(ProxygenError err) noexcept {
  delete this;
}
}


We benchmarked our echo server on a machine with 32 Intel® Xeon® E5-2670 CPU cores @ 2.60GHz and 16 GiB of memory, varying the number of worker threads from one to eight. We ran the client on the same machine to avoid network latency, and this is what we got:

# Client settings:
# 2 clients per server worker thread
# 400 simultaneously open connections
# 100 requests per connection
# 60-second test
# Results are the average of 10 test runs
# simple GET, 245 bytes of request headers, 600-byte response (no disk access)
# SPDY/3.1 allows up to 10 concurrent requests per connection

[Figure: benchmark results chart; image not available]

Although the echo server itself is quite primitive compared to a real web server, this benchmark still shows how efficiently Proxygen works with SPDY and HTTP/2. The HTTP server that ships with Proxygen is easy to use and performs quite well right away, although we focused more on ease of use than on the highest possible speed. For example, the server's filter model lets you compose common behaviors out of small pieces of code, each of which is easy to unit test. On the other hand, the memory allocations that this filter model requires are not ideal for high-performance applications.
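
The trade-off in the filter model can be illustrated with a small sketch. The classes below (ToyStage, ToyFilter, LowercasePathFilter, TagFilter, TerminalHandler) and the buildChain helper are hypothetical and are not Proxygen's filter API; they only show the general idea of chaining small, separately testable stages, each of which costs a heap allocation.

```cpp
#include <cassert>
#include <cctype>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical request type, used only for this illustration.
struct ToyRequest {
  std::string path;
  std::vector<std::string> notes;
};

// Common interface for every stage in the chain.
class ToyStage {
 public:
  virtual ~ToyStage() = default;
  virtual void onRequest(ToyRequest& req) = 0;
};

// A filter does one small job and then passes the request downstream.
class ToyFilter : public ToyStage {
 public:
  explicit ToyFilter(std::unique_ptr<ToyStage> next) : next_(std::move(next)) {
    assert(next_ != nullptr);  // a filter must wrap something
  }

 protected:
  void passOn(ToyRequest& req) { next_->onRequest(req); }

 private:
  std::unique_ptr<ToyStage> next_;
};

// Normalizes the request path to lowercase.
class LowercasePathFilter : public ToyFilter {
 public:
  using ToyFilter::ToyFilter;
  void onRequest(ToyRequest& req) override {
    for (char& c : req.path) {
      c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    passOn(req);
  }
};

// Records that the request passed through this stage.
class TagFilter : public ToyFilter {
 public:
  using ToyFilter::ToyFilter;
  void onRequest(ToyRequest& req) override {
    req.notes.push_back("tagged");
    passOn(req);
  }
};

// End of the chain: the stage that actually handles the request.
class TerminalHandler : public ToyStage {
 public:
  void onRequest(ToyRequest& req) override {
    req.notes.push_back("handled " + req.path);
  }
};

// Builds LowercasePathFilter -> TagFilter -> TerminalHandler.
// Note one heap allocation per stage, which is the cost the text mentions.
std::unique_ptr<ToyStage> buildChain() {
  std::unique_ptr<ToyStage> chain = std::make_unique<TerminalHandler>();
  chain = std::make_unique<TagFilter>(std::move(chain));
  chain = std::make_unique<LowercasePathFilter>(std::move(chain));
  return chain;
}
```

Sending a request for "/Index.HTML" through the chain produces the path "/index.html" and the notes "tagged" and "handled /index.html". Each filter can also be tested in isolation by wrapping a TerminalHandler directly, which is what makes this style convenient despite the extra allocations.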

Impact


Proxygen lets us implement the functionality we want quickly, release it to production, and see results right away. For example, we were interested in evaluating the HPACK request header compression format, but we had neither HTTP/2 clients nor servers, and the HTTP/2 specification itself was still under development. Proxygen allowed us to implement HPACK, try it on top of SPDY, and roll the release out simultaneously to our servers and mobile clients. Being able to experiment quickly with real traffic in the HPACK format let us understand its real-world performance and weigh the benefits of using it.

Open source


The Proxygen code base is under active development and will continue to evolve. If you like the HTTP protocol, high-performance network code, and modern C++, we will be happy to work with you! Please feel free to submit pull requests on GitHub.

We are actively involved in open source development and are always looking for opportunities to share our code with the community. The network infrastructure team has already open-sourced Thrift and Proxygen, two of Facebook's important networking components. We hope they will find use in other projects.

Thanks to all the engineers who have contributed to this project: Ajit Banerjee, David Gadling, Claudiu Gheorghe, Rajat Goel, Peter Griess, Martin Lau, Adam Lazur, Noam Lerner, Xu Ning, Brian Pane, Praveen Kumar Ramakrishnan, Adam Simpkins, Viswanath Sivakumar, and Woo Xie.
