Introducing Graph APIs

Original author: Brian Cooksey
Hello, Habr! We have kept following the topic of API design ever since we came across this book in Manning's catalog. Today we are publishing an overview of the relatively new graph APIs, and we invite you to think once more about what APIs will look like after REST's long, undisputed popularity.

Enjoy reading!

If you have consumed an API at any point in the past 10 years, I bet it was a REST API. The data was probably structured around resources, the responses included ids pointing to related objects, and HTTP verbs told the server what to do with the information: read, write or update (yes, I admit this is a loose definition, not Roy Fielding's canonical REST). For some time now, REST-style APIs have been the dominant standard in our industry.

However, REST has its problems. A client may end up over-fetching, requesting a whole resource when it needs only one or two pieces of information. Or a client may regularly need several objects at once but be unable to retrieve them all in one request, which is known as under-fetching. On the maintenance side, changes to a REST API may force clients to update their entire integration to match the new API structure or response formats.

To address these problems, a fundamentally different kind of API, the graph API, has been gaining ground in recent years.

What is a graph API?


A simplified definition: a graph API models data in terms of nodes and edges (objects and relationships) and lets the client interact with many nodes in a single request. Say the server holds information about authors, blog posts, and comments on them. With a REST API, fetching a particular post together with its author and comments may take the client three HTTP requests, for example: /posts/123, /authors/455, /posts/123/comments.

With a graph API, the client formulates a single call that pulls the data from all three resources. The client can also specify exactly which fields it needs, which gives it fuller control over the shape of the response.
To see how this works in detail, consider two case studies of real, production APIs.
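
Before turning to real APIs, the idea of nodes, edges and client-chosen fields can be sketched with a toy in-memory graph. Everything here is hypothetical and for illustration only: the `NODES` data, the key format and the `resolve` helper do not belong to any actual API; the ids 123 and 455 simply echo the example URLs above.

```python
# Hypothetical in-memory graph: each key is a node, and fields that hold
# other keys (or lists of keys) act as edges to related nodes.
NODES = {
    "post:123": {
        "id": 123,
        "title": "Hello graphs",
        "author": "author:455",
        "comments": ["comment:1", "comment:2"],
    },
    "author:455": {"id": 455, "name": "Ada"},
    "comment:1": {"id": 1, "message": "Nice post"},
    "comment:2": {"id": 2, "message": "Thanks"},
}


def resolve(node_key, selection):
    """Return only the selected fields of a node, following edges recursively.

    `selection` maps a field name to None (a plain field) or to a nested
    selection (an edge leading to one node or to a list of nodes).
    """
    node = NODES[node_key]
    result = {}
    for field, sub_selection in selection.items():
        value = node[field]
        if sub_selection is None:
            result[field] = value
        elif isinstance(value, list):
            result[field] = [resolve(key, sub_selection) for key in value]
        else:
            result[field] = resolve(value, sub_selection)
    return result


# One "request" that touches the post, its author and its comments.
response = resolve("post:123", {
    "title": None,
    "author": {"name": None},
    "comments": {"message": None},
})
print(response)
```

The single call returns the post title, the author's name and the comment messages, and nothing else, which is the behavior the REST version needed three requests to approximate.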

Case 1: Facebook Graph API

Facebook launched version 1.0 of its API in 2010 and has been designing new versions ever since, taking inspiration from graph databases. There are nodes for things such as posts and comments, and edges that connect them, indicating for instance that a given comment belongs to a given post. This approach makes the design just as discoverable as a typical REST API while still letting the client optimize data retrieval. Let's take a single post as an example and look at the simple operations that can be performed on it.

First, the client, using a GET request, selects a post from the root of the API based on the post ID.

GET /<post-id>

By default, this returns most of the post's top-level fields. If the client needs only some of them, say the caption and the creation time, it can request just those fields by listing them in a request parameter:

GET /<post-id>?fields=caption,created_time

To fetch related data, the client requests an edge, for example the comments on the post:

GET /<post-id>/comments

So far, all of this resembles a REST API. The ability to specify a subset of fields may be new, but on the whole the data is still treated as resources. Things get more interesting when the client builds a nested query. Here is how the client can fetch the post together with its comments:

GET /<post-id>?fields=caption,created_time,comments{id,message}

The above request returns a response containing the post's caption, its creation time and a list of comments (with only the id and message of each comment). In REST you could not do this: the client would first have to fetch the post and then its comments.

But what if the client needs deeper nesting?

GET /<post-id>?fields=caption,created_time,comments{id,message,from{id,name}}

This request fetches the post's comments along with the id and name of each comment's author. Consider how this would be done in REST: the client would request the post, request the comments, and then, in a series of separate requests, fetch the author of each comment. The HTTP calls add up quickly! With a graph design, all of this is condensed into one call, and that call returns only the information the client needs.
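
Nested `fields` values like the one above are easy to get wrong by hand, so a client might generate them from a data structure. The following is a minimal sketch; `build_fields` and its input format are hypothetical helpers invented for this illustration and are not part of any Facebook SDK.

```python
def build_fields(spec):
    """Render a nested field selection as a `fields` parameter value.

    Each item in `spec` is either a field name (str) or an
    (edge_name, sub_spec) tuple describing a nested selection.
    """
    parts = []
    for item in spec:
        if isinstance(item, str):
            parts.append(item)
        else:
            edge, sub_spec = item
            # Nested selections are wrapped in braces after the edge name.
            parts.append(f"{edge}{{{build_fields(sub_spec)}}}")
    return ",".join(parts)


fields = build_fields([
    "caption",
    "created_time",
    ("comments", ["id", "message", ("from", ["id", "name"])]),
])
print(fields)  # caption,created_time,comments{id,message,from{id,name}}
```

The recursion mirrors the shape of the graph: each level of nesting in the input corresponds to one more edge followed in the same request.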

One last point about graph design: any object reached through an edge is itself a node and can therefore be requested directly. Here, for example, is how additional information about a particular comment is fetched:

GET /<comment-id>

Note that the client does not need to construct a URL of the form /posts/<post-id>/comments/<comment-id>, as a REST API might require. This is handy when the client does not have direct access to the id of the parent object. The same holds when changing data. To update or delete an object (say, a comment), a PUT or DELETE request, respectively, is sent directly to the /<comment-id> endpoint. To create an object, the client sends a POST to the corresponding edge of a node. So, to add a comment to a post, the client makes a POST request to the post's comments edge:

POST /<post-id>/comments
message=This+is+a+comment
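
As a rough sketch of how a client might assemble such a request, the body can be produced with the standard library's form encoder. The `comment_request` helper and the `POST_ID` placeholder are hypothetical; authentication and the actual network call are left out on purpose.

```python
from urllib.parse import urlencode


def comment_request(post_id, message):
    """Build the method, path and form-encoded body for adding a comment."""
    path = f"/{post_id}/comments"           # the comments edge of the post node
    body = urlencode({"message": message})  # spaces become "+" in form encoding
    return "POST", path, body


# "POST_ID" stands in for a real post id.
method, path, body = comment_request("POST_ID", "This is a comment")
print(method, path)
print(body)  # message=This+is+a+comment
```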

Case 2: GitHub V4 GraphQL API

Another contender in the graph API space is a specification called GraphQL. It departs significantly from REST: there is a single endpoint, which accepts GET and POST requests, and every interaction with the API is a request written in GraphQL syntax.

In May 2017, GitHub released version 4 of its API, which follows this specification. To get a feel for GraphQL, let's look at a few operations that can be performed on a repository.

To select a repository, the client defines a GraphQL query:

POST /graphql
{
  "query": "{
    repository(owner:\"zapier\", name:\"transformer\") {
      id
      description
    }
  }"
}

This request selects the ID and description of the "transformer" repository belonging to the Zapier organization. A few things are worth noting. First, we read data from the API using POST, because we send a message body with the request. Second, the request payload is written in JSON, the conventional format for GraphQL over HTTP. Third, the structure of the response mirrors the structure of the query: {"data": {"repository": {"id": "MDEwOlJlcG9zaXRvcnk1MDEzODA0MQ==", "description": "..."}}} (the root key data is another element that GraphQL responses must contain).
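
A short sketch may make the JSON envelope more concrete. It only builds the request body and reads the sample response quoted above; no request is sent. The `graphql_payload` helper is a hypothetical name, and the query is written with the outer braces of GraphQL's shorthand query form.

```python
import json

# The GraphQL document, kept readable as a multi-line Python string.
QUERY = """
{
  repository(owner: "zapier", name: "transformer") {
    id
    description
  }
}
"""


def graphql_payload(query):
    """Wrap a GraphQL document in the JSON envelope used as the POST body."""
    return json.dumps({"query": query})


payload = graphql_payload(QUERY)
# json.dumps escapes the inner quotes and newlines, so the whole document
# travels as one valid JSON string under the "query" key.
print(payload)

# The sample response from the text, with the description elided as "...".
sample_response = json.loads(
    '{"data": {"repository": {"id": "MDEwOlJlcG9zaXRvcnk1MDEzODA0MQ==", '
    '"description": "..."}}}'
)
# The response nests exactly like the query: data -> repository -> fields.
repository = sample_response["data"]["repository"]
print(sorted(repository))  # ['description', 'id']
```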

To fetch data related to the repository, for example its issues and their authors, the client uses a nested query:

POST /graphql
{
  "query": "{
    repository(owner: \"zapier\", name: \"transformer\") {
      id
      description
      issues(last: 20, orderBy: {field: CREATED_AT, direction: DESC}) {
        nodes {
          title
          body
          author {
            login
          }
        }
      }
    }
  }"
}

This request fetches the repository's ID and description, the title and body of the last 20 issues created in the repository, and the login (username) of each issue's author. That is a lot of information packed into a single request. Imagine what the REST equivalent would look like, and it becomes clear how much power and flexibility GraphQL gives clients.

To update data, GraphQL uses a concept called a "mutation". Unlike REST, where an update is performed by sending a modified copy of a resource via PUT or POST to the same endpoint the client retrieved it from, a GraphQL mutation is an explicit operation defined by the API. A client that needs to change data has to know which mutations the server supports. Conveniently, GraphQL lets the client discover them through a process called "schema introspection".

Before discussing introspection, the term "schema" needs clarifying. In GraphQL, every API defines a set of types that are used to validate queries. So far with GitHub we have worked with the types repository, issue and author. Each type describes the data it contains and its relationships to other types. Together, all of these types make up the API's schema.

GraphQL requires that this detailed schema itself be queryable using GraphQL syntax. That is how a client can learn an API's capabilities through introspection.

If the client needs to know which mutations GitHub supports, it can simply ask:

POST /graphql
{
  "query": "{
    __type(name: \"Mutation\") {
      name
      kind
      description
      fields {
        name
        description
      }
    }
  }"
}

Among the mutations listed in the response we find, for example, addStar, which lets the client star a repository (or any other starrable object). A mutation is performed with a similar request:

POST /graphql
{
  "query": "mutation {
    addStar(input:{starrableId:\"MDEwOlJlcG9zaXRvcnk1MDEzODA0MQ==\"}) {
      starrable {
        viewerHasStarred
      }
    }
  }"
}

This request states that the client wants to perform the addStar mutation and supplies the arguments the operation needs; in this case, just the repository ID. Note that the request is prefixed with the mutation keyword. That is how GraphQL knows the client intends to perform a mutation. All the previous requests could likewise have been prefixed with the query keyword, but it is assumed when no operation type is specified. Finally, note that the client has full control over the data returned in the response. Here the client asks for the repository's viewerHasStarred field, which is not very interesting in this scenario: the mutation adds a star, so we know it will return true. However, if the client performed a different mutation, say creating an issue, it could get back generated values such as the issue's ID or number, as well as nested data such as the total number of open issues in the repository.
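
The keyword rule can be illustrated with a deliberately naive sketch. The `operation_type` function below is hypothetical and only inspects the start of the document; a real client would rely on a proper GraphQL parser, and this version ignores comments, fragments and any other operation types.

```python
def operation_type(document):
    """Guess the operation type from the beginning of a GraphQL document."""
    text = document.lstrip()
    if text.startswith("mutation"):
        return "mutation"
    # A bare selection set "{ ... }" is shorthand for a query.
    if text.startswith("{") or text.startswith("query"):
        return "query"
    raise ValueError("unrecognized operation")


shorthand = '{ repository(owner: "zapier", name: "transformer") { id } }'
explicit = 'query { repository(owner: "zapier", name: "transformer") { id } }'
star = (
    'mutation { addStar(input: {starrableId: "MDEwOlJlcG9zaXRvcnk1MDEzODA0MQ=="}) '
    "{ starrable { viewerHasStarred } } }"
)

print(operation_type(shorthand))  # query
print(operation_type(explicit))   # query
print(operation_type(star))       # mutation
```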

API of the future

I hope these case studies show how API design is evolving in the SaaS industry. I am not saying that graph APIs are the future and REST is dead. Architectures like GraphQL have problems of their own. But it is good that the range of options is widening: the next time you need to build an API, you can weigh the trade-offs of each design and choose the best fit.
