Google App Engine - Scalable Applications

    Google App Engine makes it easy to create applications that work reliably even under heavy load and with large amounts of data. It is even easier, though, to create a software monster that runs very slowly or not at all and constantly returns HTTP 500 errors.

    This article discusses how to write fast, scalable applications.

    Everything below applies primarily to applications written in Java, but most of it should also hold for applications written in Python.


    One second


    Google App Engine allows each application to serve up to 500 requests per second, but only if each request takes no more than one second on average. Otherwise the application is treated as inefficient and throttled, and even under a relatively light load some requests will start to fail with an HTTP 500 error.

    Unfortunately, Google does not disclose how the average request time is calculated. To make sure the application does not “fall out of favor”, every request, without exception, should complete in no more than one second.

    If “long” requests cannot be avoided entirely, keep them as infrequent as possible, spread thinly among the fast ones, and then verify experimentally that the application is not being throttled. Keep in mind that the system imposes and lifts restrictions gradually.
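
    To see why rare long requests can hide among fast ones, consider a plain arithmetic mean over a traffic mix. This is only an illustration: Google does not disclose the actual calculation, and the class name, method name, and sample numbers below are made up for the example.

```java
// Illustrative only: the real App Engine averaging method is not public.
class LatencyMix {
    /** Arithmetic mean request time, in milliseconds, for a mix of fast and slow requests. */
    static double meanMillis(long fastCount, double fastMillis, long slowCount, double slowMillis) {
        long total = fastCount + slowCount;
        if (total <= 0) {
            throw new IllegalArgumentException("at least one request is required");
        }
        return (fastCount * fastMillis + slowCount * slowMillis) / total;
    }

    public static void main(String[] args) {
        // 95 requests of 200 ms mixed with 5 requests of 5000 ms.
        System.out.println(meanMillis(95, 200, 5, 5000)); // prints 440.0
    }
}
```

    Under this assumed model, a 5% share of five-second requests keeps the mean well under one second, while a 50/50 mix of the same requests would give 2600 ms.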

    Thirty seconds


    The App Engine datastore lets applications work efficiently with huge amounts of data thanks to its distributed architecture, but for the same reason, on average, 1 out of 3,000 datastore operations ends in a timeout. Such operations are retried automatically, but if an operation still cannot be completed after several retries, a DatastoreTimeoutException is thrown. If the application does not handle it, the request fails with an HTTP 500 error.

    For fast requests that take less than a second to process, the automatic retries keep DatastoreTimeoutException rare. For “long” requests that work intensively with the datastore, the probability of an exception is much higher. Such requests often end with a DeadlineExceededException instead: if a failed “heavy” datastore operation is retried many times, the application can exhaust the 30-second limit allotted for processing a request.

    An application can catch and handle both exceptions, but the better solution is to get rid of “heavy” requests altogether, for example by breaking each one into several lighter requests. This will not eliminate the exceptions completely, but it will make them very rare.
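
    The effect of request size on the odds of a timeout can be estimated with simple probability. The sketch below assumes, as a simplification, that every datastore operation times out independently with the 1-in-3,000 probability quoted above, and it ignores automatic retries. The class and method names are hypothetical.

```java
// A rough model, not a measurement: operations are assumed independent and retries are ignored.
class TimeoutOdds {
    // From the text: on average 1 in 3,000 datastore operations times out.
    static final double PER_OPERATION = 1.0 / 3000.0;

    /** Probability that at least one of the given number of operations times out. */
    static double atLeastOneTimeout(int operations) {
        if (operations < 0) {
            throw new IllegalArgumentException("operations must not be negative");
        }
        return 1.0 - Math.pow(1.0 - PER_OPERATION, operations);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 10, 100, 1000}) {
            System.out.printf("%d operations: %.4f%n", n, atLeastOneTimeout(n));
        }
    }
}
```

    For small operation counts the result is close to n/3000, so under these assumptions a request that performs ten times as many datastore operations is roughly ten times as likely to hit a timeout.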
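
    One way to break a heavy request into lighter ones is to cut the work into fixed-size batches and process each batch in its own request. The sketch below shows only the splitting step in plain Java. `BatchSplitter`, `split`, and the batch size are hypothetical, and the way follow-up requests are triggered is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a list of work items into batches small enough for one fast request.
class BatchSplitter {
    static <T> List<List<T>> split(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        System.out.println(split(ids, 3)); // prints [[1, 2, 3], [4, 5, 6], [7]]
    }
}
```

    Each batch can then be handled by a separate short request, so a single datastore timeout costs one small batch rather than the whole job.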

    Ten times per second


    Every object in the datastore belongs to an entity group. A group contains objects that are linked by a dependency relationship. An object that does not depend on any other object belongs to a group consisting of itself alone. Groups help App Engine store related objects together in the distributed datastore.

    The distributed architecture of the datastore allows operations on objects to run in parallel, but only if the objects belong to different groups. For objects in the same group, no more than 10 write operations per second are supported.

    If two requests simultaneously try to modify the same object, or objects belonging to the same group, at least one of the write operations will fail because of a collision. The failed operation is retried automatically, but if the collision persists after several retries, the request is interrupted by a DatastoreTimeoutException.

    When designing an application, you need a clear picture of how objects will be combined into groups and how often each group will be modified, taking into account growth in the number of users. For the application to scale well, keep object groups small and use sharding for frequently updated data.
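
    The idea behind sharding a frequently updated value, such as a counter, can be sketched in memory. In the simulation below each array slot stands in for a separate object group. No datastore calls are made, and `ShardedCounter`, `shardsFor`, and the sample write rate are hypothetical names and numbers chosen for the example.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLongArray;

// In-memory simulation only: each slot plays the role of one independently writable group.
class ShardedCounter {
    // From the text: about 10 writes per second are supported per object group.
    static final double WRITES_PER_GROUP_PER_SECOND = 10.0;

    private final AtomicLongArray shards;

    ShardedCounter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        this.shards = new AtomicLongArray(shardCount);
    }

    /** Shard count that keeps the average write rate per shard within the per-group limit. */
    static int shardsFor(double expectedWritesPerSecond) {
        return Math.max(1, (int) Math.ceil(expectedWritesPerSecond / WRITES_PER_GROUP_PER_SECOND));
    }

    /** A write touches one randomly chosen shard, so concurrent writes rarely collide. */
    void increment() {
        int shard = ThreadLocalRandom.current().nextInt(shards.length());
        shards.incrementAndGet(shard);
    }

    /** A read sums all shards. */
    long total() {
        long sum = 0;
        for (int i = 0; i < shards.length(); i++) {
            sum += shards.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        ShardedCounter counter = new ShardedCounter(shardsFor(150)); // 15 shards
        for (int i = 0; i < 1000; i++) {
            counter.increment();
        }
        System.out.println(counter.total()); // prints 1000
    }
}
```

    The trade-off is that reads become more expensive, since every shard has to be fetched and summed, in exchange for writes that are spread across many groups.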

    Summary


    Google App Engine makes it easy to create applications that work reliably even under heavy load and with large amounts of data, but only if requests are executed quickly, data is exchanged with the datastore in small portions, and the data itself is organized into small groups of objects.



    This article was written by Alexander, who does not yet have an account on Habr.
    Alexander on Habr: windicted
