Hasura: Architecture of a High-Performance GraphQL-to-SQL Engine

Hi, Habr! I present to your attention a translation of the article “Architecture of high performance GraphQL to SQL engine”.

It describes, from the inside, the architecture and the optimization decisions behind Hasura, a high-performance, lightweight GraphQL server that acts as a layer between your web application and a PostgreSQL database.

Hasura can generate a GraphQL schema from an existing database or create a new one. Out of the box it supports GraphQL Subscriptions based on Postgres triggers, dynamic access control, automatic generation of joins, solves the N+1 query problem via batching, and much more.


You can use foreign key constraints in PostgreSQL to fetch hierarchical data in a single query. For example, you can execute this query to get albums and their corresponding tracks (given a foreign key in the “track” table pointing to the “album” table):

{
  album (where: {year: {_eq: 2018}}) {
    title
    tracks {
      id
      title
    }
  }
}

As you may have guessed, you can request data of any depth. This API, combined with access control, allows web applications to query data from PostgreSQL without writing their own backend. It is designed to execute queries quickly and sustain high throughput while keeping CPU time and memory consumption on the server low. We will talk about the architectural decisions that allowed us to achieve this.

Query life cycle


The request sent to Hasura goes through the following stages:

  1. Session handling : The request reaches the gateway, which checks the authorization key (if any) and adds various headers, such as the user ID and role.
  2. Query parsing : Hasura receives the request, parses the headers to get information about the user, and builds a GraphQL AST from the request body.
  3. Query validation : The request is checked for semantic correctness, then the access rules corresponding to the user's role are applied.
  4. Query execution : The query is converted to SQL and sent to Postgres.
  5. Response generation : The result of the SQL query is processed and sent to the client (the gateway can apply gzip compression if necessary).
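The stages above can be sketched as a simple pipeline. This is an illustrative Python sketch, not Hasura's actual code (the engine itself is written in Haskell), and all function and field names here are hypothetical:

```python
# Schematic sketch of the five stages of the request life cycle.

def resolve_session(request):
    # 1. Gateway: check the key (if any), attach user id and role headers.
    return {"user_id": request.get("key", "anonymous"), "role": "user"}

def parse_request(body):
    # 2. Build a (toy) GraphQL AST from the request body.
    return {"op": "query", "body": body}

def validate(ast, role):
    # 3. Semantic checks, then role-based access rules would be applied.
    if ast["op"] != "query":
        raise ValueError("unsupported operation")
    return ast

def execute(ast):
    # 4. Convert to SQL and (here, only pretend to) send it to Postgres.
    return {"sql": "SELECT ...", "rows": []}

def handle(request):
    session = resolve_session(request)
    ast = validate(parse_request(request["body"]), session["role"])
    result = execute(ast)
    # 5. Response generation (the gateway may gzip the payload).
    return {"role": session["role"], "data": result["rows"]}
```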

Goals


The requirements are approximately as follows:

  1. The HTTP stack should add minimal overhead and allow many simultaneous requests to be processed for high throughput.
  2. SQL generation from a GraphQL query should be fast.
  3. The generated SQL query must be efficient for Postgres.
  4. The result of the SQL query must be transferred back from Postgres efficiently.

Processing a GraphQL Request


There are several approaches to obtaining the data required for a GraphQL query:

Conventional resolvers


Executing a GraphQL query usually involves calling a resolver for each field.
In the example query, we fetch the albums released in 2018, and then for each of them we request the corresponding tracks: the classic N+1 query problem. The number of queries grows exponentially with the depth of the query.

The requests made to Postgres will be:

SELECT id, title FROM album WHERE year = 2018;

This query returns all matching albums. Let N be the number of albums returned. Then for each album we execute the following query:

SELECT id, title FROM tracks WHERE album_id = <album-id>;

In total, N+1 queries are needed to fetch all the necessary data.
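A toy simulation of this naive resolver strategy makes the count explicit: with N albums in the first result set, 1 + N SQL queries are issued in total (the album ids below are made up for illustration):

```python
# Simulate the naive per-field resolver strategy and count the queries issued.
albums_2018 = [1, 2, 3]           # ids returned by the first query, so N = 3
query_log = []

query_log.append("SELECT id, title FROM album WHERE year = 2018")
for album_id in albums_2018:      # one extra query per album
    query_log.append(f"SELECT id, title FROM tracks WHERE album_id = {album_id}")

# query_log now holds 1 + N = 4 queries.
```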

Batching requests


Tools like dataloader are designed to solve the N+1 query problem with batching. The number of SQL queries for nested data no longer depends on the size of the initial result set; it depends only on the number of nodes in the GraphQL query. In this case, two queries to Postgres are enough to obtain the required data:

Get the albums:

SELECT id, title FROM album WHERE year = 2018

Then we get the tracks for the albums received in the previous query:

SELECT id, title FROM tracks WHERE album_id IN (<list of album ids>)

Two queries in total. We avoided issuing a separate SQL query for the tracks of each individual album; instead, the WHERE clause fetches all the necessary tracks in one query at once.
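The same fetch, batched, can be sketched like this: collect the album ids from the first query, then fetch all their tracks with a single IN (...) query (the ids are again made up for illustration):

```python
# Simulate the batched strategy: the query count is fixed at 2,
# regardless of how many albums the first query returns.
album_ids = [1, 2, 3]
query_log = []

query_log.append("SELECT id, title FROM album WHERE year = 2018")
id_list = ", ".join(str(i) for i in album_ids)
query_log.append(f"SELECT id, title FROM tracks WHERE album_id IN ({id_list})")
```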

Joins


Dataloader is designed to work with different data sources and therefore cannot exploit the capabilities of any particular one. In our case, the only data source is Postgres, and, like all relational databases, it can collect data from several tables in a single query using the JOIN operator. We can determine all the tables needed for a GraphQL query and generate one SQL query with JOINs that fetches all the data. It turns out that the data required for any GraphQL query can be obtained with a single SQL query. This data is transformed before being sent to the client.

A query like this:

SELECT
  album.id as album_id,
  album.title as album_title,
  track.id as track_id,
  track.title as track_title
FROM
  album
LEFT OUTER JOIN
  track
ON
  (album.id = track.album_id)
WHERE
  album.year = 2018

returns the following data:

album_id, album_title, track_id, track_title
1, Album1, 1, track1
1, Album1, 2, track2
2, Album2, NULL, NULL

which will then be converted to JSON and sent to the client:

[
  {
    "title" : "Album1",
    "tracks": [
      {"id" : 1, "title": "track1"},
      {"id" : 2, "title": "track2"}
    ]
  },
  {
    "title" : "Album2",
    "tracks" : []
  }
]
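The conversion step can be sketched as follows: group the flat LEFT JOIN rows by album and nest the track columns. This is illustrative Python, not the engine's actual (Haskell) code:

```python
# Turn flat LEFT JOIN rows into the nested structure shown above.
def rows_to_json(rows):
    albums, order = {}, []
    for album_id, album_title, track_id, track_title in rows:
        if album_id not in albums:
            albums[album_id] = {"title": album_title, "tracks": []}
            order.append(album_id)
        if track_id is not None:  # NULL track columns mean the album has no tracks
            albums[album_id]["tracks"].append({"id": track_id, "title": track_title})
    return [albums[a] for a in order]

rows = [
    (1, "Album1", 1, "track1"),
    (1, "Album1", 2, "track2"),
    (2, "Album2", None, None),
]
```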

Response generation optimization


We found that most of the request processing time was spent in the function that converts the result of the SQL query to JSON.

After several attempts to optimize this function in various ways, we decided to move this work into Postgres itself. Postgres 9.4 (released around the time of Hasura's first release) added a JSON aggregation function, which allowed us to carry out this plan. After this optimization, the SQL queries began to look like this:

SELECT json_agg(r.*) FROM (
  SELECT
    album.title as title,
    json_agg(track.*) as tracks
  FROM
    album
  LEFT OUTER JOIN
    track
  ON
    (album.id = track.album_id)
  WHERE
    album.year = 2018
  GROUP BY
    album.id
) r

The result of this query will have one column and one row, and this value will be sent to the client without any further conversions. According to our tests, this approach is about 3–6 times faster than the Haskell conversion function.

Prepared statements


The generated SQL queries can be quite large and complex, depending on the nesting depth of the query and the conditions used. At the same time, web applications usually have a set of queries that are re-executed with different parameters; for example, the previous query should be executed for 2017 instead of 2018. Prepared statements are the best fit for such cases: a recurring complex SQL query in which only the parameters change.

Suppose such a query is executed for the first time:

{
  album (where: {year: {_eq: 2018}}) {
    title
    tracks {
      id
      title
    }
  }
}

We create a prepared statement for the SQL query instead of executing it:

PREPARE prep_1 AS SELECT json_agg(r.*) FROM (
  SELECT
    album.title as title,
    json_agg(track.*) as tracks
  FROM
    album
  LEFT OUTER JOIN
    track
  ON
    (album.id = track.album_id)
  WHERE
    album.year = $1
  GROUP BY
    album.id
) r

After that, we immediately execute it:

EXECUTE prep_1('2018');

When you need to execute a GraphQL query for 2017, we simply call the same prepared statement with a different argument:

EXECUTE prep_1('2017');
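Conceptually, all the engine needs is a cache mapping a parameterized SQL shape to a prepared-statement name, so that PREPARE runs once and subsequent requests only EXECUTE. A minimal sketch of such a cache (the names are hypothetical; a real implementation would run PREPARE/EXECUTE against Postgres instead of just recording names):

```python
# Prepared-statement cache keyed by the parameterized SQL shape.
prepared = {}  # SQL shape -> prepared statement name

def get_statement(shape_sql):
    if shape_sql not in prepared:
        name = f"prep_{len(prepared) + 1}"
        # Real code would run: PREPARE prep_N AS <shape_sql>
        prepared[shape_sql] = name
    return prepared[shape_sql]

shape = "SELECT ... WHERE album.year = $1 ..."
first = get_statement(shape)   # first request (2018): PREPARE happens here
second = get_statement(shape)  # request for 2017: cache hit, only EXECUTE is needed
```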

This gives roughly a 10–20% speedup, depending on the complexity of the GraphQL query.

Haskell


Haskell works well for several reasons:


In the end


All the above-mentioned optimizations add up to quite serious performance gains.



In fact, thanks to the low memory consumption and the negligible latency overhead compared to a direct call to PostgreSQL, in most cases you can replace the ORM in your backend with GraphQL API calls.

Benchmarks:

Test setup:

  1. A laptop with 8 GB RAM and an i7 CPU
  2. Postgres running on the same machine
  3. wrk was used as the benchmarking tool; for each type of query we tried to maximize requests per second
  4. A single instance of Hasura GraphQL Engine
  5. Connection pool size: 50
  6. Dataset: chinook


Request 1: tracks_media_some

query tracks_media_some {
  tracks (where: {composer: {_eq: "Kurt Cobain"}}){
    id
    name
    album {
      id
      title
    }
    media_type {
      name
    }
  }}

  • Requests per second: 1375 req/s
  • Latency: 17.5 ms
  • CPU: ~30%
  • RAM: ~30 MB (Hasura) + 90 MB (Postgres)

Request 2: tracks_media_all

query tracks_media_all {
  tracks {
    id
    name
    media_type {
      name
    }
  }}

  • Requests per second: 410 req/s
  • Latency: 59 ms
  • CPU: ~100%
  • RAM: ~30 MB (Hasura) + 130 MB (Postgres)

Request 3: album_tracks_genre_some

query albums_tracks_genre_some {  
  albums (where: {artist_id: {_eq: 127}}) {
    id
    title
    tracks {
      id
      name
      genre {
        name
      }
    }
  }}

  • Requests per second: 1029 req/s
  • Latency: 24 ms
  • CPU: ~30%
  • RAM: ~30 MB (Hasura) + 90 MB (Postgres)

Request 4: album_tracks_genre_all

query albums_tracks_genre_all {
  albums {
    id
    title
    tracks {
      id
      name
      genre {
        name
      }
    }
  }
}

  • Requests per second: 328 req/s
  • Latency: 73 ms
  • CPU: 100%
  • RAM: ~30 MB (Hasura) + 130 MB (Postgres)
