The story of how not to design an API

Original author: Rob Konarski
I once helped a friend who needed to integrate vacancy data from a property management system into his client's website. To my delight, the system had an API. Unfortunately, it was very badly designed.


I decided to write this article not to criticize the system in question, but to describe the mistakes I found in its API design and suggest ways to correct them.

Situation overview


The organization in question used the Beds24 system to manage its accommodation. Availability information was synchronized with various booking systems (such as Booking, AirBnB and others). The organization was building a website and wanted its search to show only rooms that were free for the specified period and could accommodate the requested number of guests. The task looked simple, since Beds24 provides an API for integration with other systems. In practice, it turned out that the developers of this API had made many mistakes in designing it. I propose to go through these mistakes, identify the specific problems, and discuss how the API could have been designed in each case.

Problem number 1: request body format


Since the client only cares whether, say, a hotel room is free or occupied, the only endpoint of interest is /getAvailabilities. Although a call to such an endpoint merely retrieves availability data, it has to be made as a POST request, because the author of the API chose to accept filters as a JSON request body. Here is the list of possible request parameters with example values:

{
    "checkIn": "20151001",
    "lastNight": "20151002",
    "checkOut": "20151003",
    "roomId": "12345",
    "propId": "1234",
    "ownerId": "123",
    "numAdult": "2",
    "numChild": "0",
    "offerId": "1",
    "voucherCode": "",
    "referer": "",
    "agent": "",
    "ignoreAvail": false,
    "propIds": [
        1235,
        1236
    ],
    "roomIds": [
        12347,
        12348,
        12349
    ]
}

Let's go over this JSON object and talk about what's wrong here.

  1. The dates (checkIn, lastNight and checkOut) are in YYYYMMDD format. There is no reason not to use the ISO 8601 format (YYYY-MM-DD) when serializing dates as strings: it is a widely used standard, familiar to most developers, and it is what many JSON libraries expect when mapping strings to dates. In addition, the lastNight field appears redundant, since checkOut is always exactly one day after lastNight. When designing similar APIs, use standard date representations and do not burden API users with redundant data.
  2. All identifier fields, as well as numAdult and numChild, hold numeric values but are represented as strings. There is no apparent reason for this.
  3. The fields come in pairs: roomId and roomIds, propId and propIds. The singular fields roomId and propId are redundant, since a single identifier can just as well be passed in the corresponding array. There is also a type problem: roomId is a string, while the roomIds array takes numeric identifiers. This invites confusion and parsing problems, and it suggests that the server handles the same data sometimes as strings and sometimes as numbers.

My advice to API developers is not to complicate the lives of API users with design mistakes like these. Use standard data formats, avoid redundancy, and do not use different types to represent the same kind of entity. And do not represent everything indiscriminately as strings.
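To make these recommendations concrete, here is a minimal Python sketch of what a cleaned-up request could look like on the client side. The class name AvailabilityRequest and the JSON field names (numAdults, numChildren, propertyIds) are my own assumptions for illustration, not part of the Beds24 API.

```python
import json
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AvailabilityRequest:
    """Hypothetical request: ISO 8601 dates, integer IDs, no redundant fields."""
    check_in: date
    check_out: date
    owner_id: int
    num_adults: int
    num_children: int = 0
    # One list per entity type replaces the roomId/roomIds and propId/propIds pairs.
    property_ids: list[int] = field(default_factory=list)
    room_ids: list[int] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize with dates in YYYY-MM-DD form and numbers as numbers."""
        return json.dumps({
            "checkIn": self.check_in.isoformat(),
            "checkOut": self.check_out.isoformat(),
            "ownerId": self.owner_id,
            "numAdults": self.num_adults,
            "numChildren": self.num_children,
            "propertyIds": self.property_ids,
            "roomIds": self.room_ids,
        })


request = AvailabilityRequest(
    check_in=date(2019, 5, 1),
    check_out=date(2019, 5, 3),
    owner_id=25748,
    num_adults=2,
)
payload = request.to_json()
```

Here lastNight is dropped entirely, because it can always be derived from checkOut.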

Problem number 2: response body format


As already mentioned, we are only interested in the /getAvailabilities endpoint. Let's look at what its response looks like and what is wrong with it. Remember that we want a list of identifiers of the rooms that are free in a given period and can accommodate a given number of people. Below are an example request body and the response the API returns for it.

Here is the request:

{
    "checkIn": "20190501",
    "checkOut": "20190503",
    "ownerId": "25748",
    "numAdult": "2",
    "numChild": "0"
}

Here is the response:

{
    "10328": {
        "roomId": "10328",
        "propId": "4478",
        "roomsavail": "0"
    },
    "13219": {
        "roomId": "13219",
        "propId": "5729",
        "roomsavail": "0"
    },
    "14900": {
        "roomId": "14900",
        "propId": "6779",
        "roomsavail": 1
    },
    "checkIn": "20190501",
    "lastNight": "20190502",
    "checkOut": "20190503",
    "ownerId": 25748,
    "numAdult": 2
}

Let's go over the problems with this response.

  1. In the response body, ownerId and numAdult have suddenly become numbers, although the request required them to be strings.
  2. The list of rooms is returned as properties of the top-level object, keyed by room identifier (roomId). It would be logical to expect an array here. As it stands, to get the list of available rooms we have to iterate over the whole object, checking that each nested value is an object with the expected properties, such as roomsavail, and skipping unrelated properties such as checkIn and lastNight. Then we have to check roomsavail: if it is greater than 0, the room is available for booking.
  3. Now look at roomsavail itself. It appears in the response body as both "roomsavail": "0" and "roomsavail": 1. See the pattern? If the room is occupied, the value is a string; if it is free, the value is a number. This causes problems in statically typed languages, where the same property is not supposed to take values of different types.

My suggestion is to represent collections as arrays of JSON objects rather than as awkward key-value constructions like the one above, and to make sure the fields of homogeneous objects always have the same type. A properly formatted server response might look like the one below. Note that in this form the room entries contain no duplicated data.

{
    "properties": [
        {
            "id": 4478,
            "rooms": [
                {
                    "id": 10328,
                    "available": false
                }
                }
            ]
        },
        {
            "id": 5729,
            "rooms": [
                {
                    "id": 13219,
                    "available": false
                }
            ]
        },
        {
            "id": 6779,
            "rooms": [
                {
                    "id": 14900,
                    "available": true
                }
            ]
        }
    ],
    "checkIn": "2019-05-01",
    "lastNight": "2019-05-02",
    "checkOut": "2019-05-03",
    "ownerId": 25748,
    "numAdult": 2
}
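To show the difference for client code, here is a rough Python sketch that extracts the available room IDs from both shapes: the response the API actually returns and the array-based format proposed above. The function names are mine, and the proposed response is trimmed to two properties.

```python
from typing import Any

# Response in the shape the API actually returns (from the example above).
raw_response: dict[str, Any] = {
    "10328": {"roomId": "10328", "propId": "4478", "roomsavail": "0"},
    "13219": {"roomId": "13219", "propId": "5729", "roomsavail": "0"},
    "14900": {"roomId": "14900", "propId": "6779", "roomsavail": 1},
    "checkIn": "20190501",
    "lastNight": "20190502",
    "checkOut": "20190503",
    "ownerId": 25748,
    "numAdult": 2,
}

# Trimmed version of the proposed array-based response.
proposed_response: dict[str, Any] = {
    "properties": [
        {"id": 4478, "rooms": [{"id": 10328, "available": False}]},
        {"id": 6779, "rooms": [{"id": 14900, "available": True}]},
    ],
    "checkIn": "2019-05-01",
    "checkOut": "2019-05-03",
}


def available_rooms_original(response: dict[str, Any]) -> list[int]:
    """Pick available room IDs out of the mixed key-value response."""
    rooms = []
    for value in response.values():
        # Room entries are nested objects; metadata such as checkIn is not.
        if not isinstance(value, dict) or "roomsavail" not in value:
            continue
        # roomsavail may be the string "0" or the number 1, so normalize it.
        if int(value["roomsavail"]) > 0:
            rooms.append(int(value["roomId"]))
    return rooms


def available_rooms_proposed(response: dict[str, Any]) -> list[int]:
    """The same lookup against the proposed format needs no guessing."""
    return [
        room["id"]
        for prop in response["properties"]
        for room in prop["rooms"]
        if room["available"]
    ]
```

The first function has to guess which keys are rooms and coerce types; the second one only walks the structure.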

Problem number 3: error handling


Here is how error handling works in this API: the system responds to every request with status code 200, even if an error has occurred. This means the only way to tell a normal response from an error response is to parse the response body and check whether it contains an error or errorCode field. The API defines only the following 6 error codes.


Beds24 API Error Codes

I suggest that everyone reading this avoid returning code 200 (request processed successfully) when something went wrong while processing the request. The only excuse for doing so is that the framework you are building the API on requires it. Returning appropriate status codes tells API clients up front whether they need to parse the response body and how to parse it (as a regular response or as an error object).

In our case, the API can be improved in two ways: either return a specific HTTP code in the 400-499 range for each of the 6 possible errors (the better option), or at least return code 500, which tells the client before it parses the response body that the body describes an error.
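A minimal sketch of the first option in Python might look like this. The error identifiers below are hypothetical placeholders, since the actual Beds24 error codes are not reproduced in this text; only the idea of mapping each error to a 4xx status, with 500 as a fallback, comes from the paragraph above.

```python
from http import HTTPStatus

# Hypothetical error identifiers, NOT the real Beds24 error codes.
ERROR_STATUS = {
    "invalid_request": HTTPStatus.BAD_REQUEST,           # 400
    "unauthorized": HTTPStatus.UNAUTHORIZED,             # 401
    "forbidden": HTTPStatus.FORBIDDEN,                   # 403
    "not_found": HTTPStatus.NOT_FOUND,                   # 404
    "too_many_requests": HTTPStatus.TOO_MANY_REQUESTS,   # 429
}


def build_error_response(error: str, message: str) -> tuple[int, dict]:
    """Return (status code, JSON body); unknown errors fall back to 500."""
    status = ERROR_STATUS.get(error, HTTPStatus.INTERNAL_SERVER_ERROR)
    return int(status), {"error": error, "message": message}


def is_error(status_code: int) -> bool:
    """A client can choose how to parse the body from the status alone."""
    return status_code >= 400
```

With this in place, a client checks the status code first and only then decides whether to parse the body as a result or as an error object.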

Problem number 4: "instructions"


Below are the “instructions” for using the API from the project documentation:

Please read the following instructions when using the API.

  1. API calls should be designed to send and receive the minimum amount of data.
  2. API calls are executed one at a time. You must wait for the current call to complete before making the next one.
  3. If you need to make several calls to the API, leave a pause of several seconds between them.
  4. API calls should be made sparingly, keeping their number to the minimum needed to solve the client's tasks.
  5. Excessive use of the API within a 5-minute period will result in your account being locked out without further notice.
  6. We reserve the right to block access to the system for customers who, in our opinion, overuse the API. This is done at our discretion and without additional notice.

While points 1 and 4 look quite reasonable, I cannot agree with the other points. Let's consider them.

  1. Item 2. If you are developing a REST API, it is expected to be stateless. Independence of each call from previous calls is one of the reasons REST is so widely used in cloud applications. A module that keeps no state can easily be redeployed after a failure, and systems built from such modules scale easily as load changes. When designing a RESTful API, make sure it is stateless and that its users do not have to worry about things like sending only one request at a time.
  2. Item 3. This item is strange and ambiguous. I do not know why it was written, but it gives the impression that while processing a request the system performs some work that can be disrupted if another request arrives at the wrong moment. Moreover, "several seconds" does not tell us how long the pause between consecutive requests actually has to be.
  3. Items 5 and 6. They mention "excessive use of the API" but give no criteria for it. Is it 10 requests per second? Or maybe 1? Besides, some web projects receive huge amounts of traffic. If their access to an API they depend on is cut off without a clear reason or notice, their administrators will most likely stop using that API. If you ever have to write such instructions, word them precisely and put yourself in the place of the users who will have to follow them.
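For comparison, here is what an explicit, documentable limit could look like: a small sliding-window limiter in Python. The numbers used in the usage note (60 calls per 300 seconds) are purely my assumption; the Beds24 documentation gives no figures. The point is only that a rule like "at most N calls per window" is something a client can actually follow.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for one client."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable so the limiter can be tested
        self.calls = deque()    # timestamps of calls still inside the window

    def allow(self) -> bool:
        """Record the call and return True if it fits within the limit."""
        now = self.clock()
        # Forget calls that have already left the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            self.calls.append(now)
            return True
        return False
```

A server could create one SlidingWindowLimiter(limit=60, window=300) per API key and answer with status 429 whenever allow() returns False, instead of silently locking the account.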

Problem number 5: documentation


This is what the API documentation looks like.


Beds24 API documentation

The only problem with this documentation is its appearance. It would look much better if it were well formatted. To show what such documentation could look like, I spent less than two minutes in Dillinger and made the following version. In my opinion, it looks much better than the original.


Improved documentation option

It is worth using dedicated tools to create such materials. For simple documents like the one above, a regular markdown file is quite enough. For more complex documentation, tools like Swagger or Apiary are a better fit.

By the way, if you want to look at the Beds24 API documentation yourself, look here.

Problem number 6: security


The documentation for all API endpoints states the following:

To use these functions, access to the API must be enabled. This is done in the menu SETTINGS → ACCOUNT → ACCOUNT ACCESS.

In reality, however, anyone can access this API and, with some calls, get information from it without providing any credentials. This applies, for example, to requests for the availability of particular accommodations. This is mentioned in another part of the documentation:

Most JSON methods require an API key to access the account. The API access key can be set using the SETTINGS → ACCOUNT → ACCOUNT ACCESS menu.

Besides the confusing explanation of authentication, it turns out that users must create the API access key themselves (by manually filling in the corresponding field; no means of generating a key automatically is provided). The key must be between 16 and 64 characters long. Letting users create their own API keys can result in very insecure keys that are easy to guess. It can also cause problems related to the key contents, since anything can be typed into the key field. In the worst case, this could open the service to SQL injection or a similar attack. When designing an API, do not let users create API keys themselves. Generate the keys for them automatically. Users should not be able to change the contents of a key, but they should be able to generate a new key when needed, invalidating the old one.
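Generating keys on the server side takes only a few lines. The following Python sketch uses the standard secrets module; the function names and the in-memory set of active keys are assumptions made for illustration (a real service would persist keys, preferably hashed).

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a random URL-safe key; 32 random bytes yield 43 characters."""
    return secrets.token_urlsafe(num_bytes)


def rotate_api_key(active_keys: set[str], old_key: str) -> str:
    """Invalidate old_key and issue a fresh, server-generated replacement."""
    active_keys.discard(old_key)
    new_key = generate_api_key()
    active_keys.add(new_key)
    return new_key
```

Because the key alphabet is fixed by the generator, there is no user-supplied content to validate, and the user's only available action is rotation.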

Requests that require authentication reveal another problem: the authentication token must be sent as part of the request body. Here is how the documentation describes it.


An example of authentication in the Beds24 API.

If the authentication token is sent in the request body, the server has to parse the body before it can reach the key. It then extracts the key, performs authentication, and only after that decides whether to fulfill the request. If authentication succeeds, no extra work has been done, since the body had to be parsed anyway. But if authentication fails, valuable processor time has been wasted parsing the body. It would be better to send the authentication token in a request header, using something like the Bearer authentication scheme. With this approach, the server needs to parse the request body only if authentication succeeds. Another reason to use a standard scheme like Bearer is that most developers are already familiar with it.
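The order of operations can be sketched in Python as follows. This is a simplified illustration, not a real server: headers are passed as a plain dict (real HTTP header names are case-insensitive), and handle_request, valid_token and the response bodies are hypothetical.

```python
import hmac
import json
from typing import Optional


def extract_bearer_token(headers: dict[str, str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return token.strip()


def handle_request(headers: dict[str, str], body: bytes, valid_token: str):
    """Authenticate from the header first; parse the body only on success."""
    token = extract_bearer_token(headers)
    # compare_digest avoids leaking information through comparison timing.
    if token is None or not hmac.compare_digest(
        token.encode(), valid_token.encode()
    ):
        return 401, {"error": "unauthorized"}  # the body was never parsed
    payload = json.loads(body)
    return 200, {"received": payload}
```

Note that an unauthenticated request is rejected even if its body is not valid JSON, because the body is never touched.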

Problem number 7: performance


This problem is the last on my list, but not the least important. A request to the API in question takes a little over a second to complete. In modern applications such delays may be unacceptable. My advice to everyone who develops APIs is not to forget about performance.

Results


Despite all the problems discussed here, the API did allow us to solve the project's tasks. But it took the developers a lot of time to figure out the API and implement everything they needed, and they had to write rather complicated code to solve simple problems. Had this API been designed properly, the work would have been done faster and the finished solution would have been simpler.

So I would like to ask everyone who designs APIs to think about how the users of their services will work with them. Make sure the API documentation fully describes the API's capabilities and is clear and well formatted. Name entities consistently, and make sure the data your API returns or accepts is clearly structured and easy to work with. Do not forget about security and proper error handling. If you take all of this into account when designing an API, you will not need to write anything like the strange “instructions” we discussed above.

As already mentioned, this article is not meant to discourage readers from using Beds24 or any other system with a poorly designed API. My goal was to show examples of mistakes and ways to fix them, and to offer recommendations that anyone can follow to improve the quality of their work. I hope it draws programmers' attention to the quality of the solutions they develop. And that means there will be more good APIs in the world.

Dear readers! Have you encountered poorly designed APIs?

