How to write a good solution for the HighLoad Cup, but not good enough to reach the top

Last week the HighLoad Cup competition ended; the idea was to implement an HTTP server for a travelers' website. Read under the cut about how to write a solution in Go in 5 days that earns 52nd place out of 295 in the overall standings.


Task description


The user Afinogen has already described the conditions of the competition in his article, but for convenience I will briefly repeat them. The server must implement an API for a travelers' website and support entities of three types: traveler (User), attraction (Location) and visit (Visit). The API must make it possible to add new entities to the database, fetch them and update them, and also perform 2 operations on them: getting the average rating of an attraction (avg) and getting the places visited by a traveler (visits). Each of these operations also has a set of filters that must be taken into account when forming the response. For example, the API allows you to get the average rating an attraction received from men aged 18 to 25, counting visits from April 1, 2010 onwards. This adds extra difficulty to the implementation.


Just in case, I’ll give a brief formal description of the API:


  • GET /&lt;entity&gt;/&lt;id&gt; to get an entity's data
  • POST /&lt;entity&gt;/&lt;id&gt; to update an entity
  • POST /&lt;entity&gt;/new to create a new one
  • GET /users/&lt;id&gt;/visits to get the list of a user's visits
  • GET /locations/&lt;id&gt;/avg to get the average rating of a location
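
As an illustration, the example filter from above would look something like this as a request (the parameter names follow the task as I remember it, and dates are Unix timestamps; treat the exact spelling as an assumption):

GET /locations/42/avg?gender=m&fromAge=18&toAge=25&fromDate=1270080000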

How to store data


The very first question for everyone who has read the task is how to store the data. Many (like me) at first tried to use some kind of database, expecting a volume of data that would not fit in memory, so as not to build crutches later. That approach, however, did not lead to a high place in the ranking. The organizers were quite merciful about the data size, so even without any optimization of the storage structures all the data fit easily into memory. In total, the rating run contained about 1 million travelers, 100 thousand attractions and 10 million visits; the identifiers of each entity were sequential starting from 1. At first glance the volume may seem large, but if you look at the sizes of the structures that can store the data, as well as the sizes of the strings inside them, you can see that the average size is not that big. The structures and the sizes of their fields are shown below:


type Visit struct {  // overall 40 bytes
    Id        int    // 8 bytes
    Location  int    // 8 bytes
    User      int    // 8 bytes
    VisitedAt int    // 8 bytes
    Mark      int    // 8 bytes
}
type User struct {    // overall 133 bytes
    Id        int       // 8 bytes
    Email     string    // 22 bytes + 16 bytes
    FirstName string    // 12 bytes + 16 bytes
    LastName  string    // 18 bytes + 16 bytes
    Gender    string    // 1 byte   + 16 bytes
    Birthdate int       // 8 bytes
}
type Location struct { // overall 105 bytes
    Id       int       // 8 bytes
    Place    string    // 11 bytes + 16 bytes
    Country  string    // 14 bytes + 16 bytes
    City     string    // 16 bytes + 16 bytes
    Distance int       // 8 bytes
}

I specified the string sizes in the structures in the format "average string length" + "size of the string header". Multiplying the average size of each structure by the number of objects, we get that pure data storage needs only about 518 MB. That is not much at all, given that we have a whole 4 GB at our disposal.
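
For the record, here is the arithmetic behind that figure (my own back-of-the-envelope calculation from the sizes above): 40 B × 10,000,000 visits + 133 B × 1,000,000 users + 105 B × 100,000 locations ≈ 543.5 million bytes, which is roughly 518 MB in binary megabytes.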


The largest memory consumption, strange as it may seem at first glance, occurs not during the load testing itself but at the data loading stage. Initially all the data is packed into a .zip archive, and in the first 10 minutes the server has to load this data from the archive to work with it later. The packed archive weighs 200 MB, and the unpacked files weigh another 1.5 GB. Without careful handling of the data and a more aggressive garbage collector setting, loading all of it simply failed.
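
What the loading stage might look like with a temporarily more aggressive collector is sketched below. This is a minimal illustration, not the contest code: the parseFile helper and the chosen GC percentage are my assumptions.

package main

import (
    "archive/zip"
    "io"
    "runtime/debug"
)

// loadData unpacks the archive and feeds each file to the parser.
// SetGCPercent(20) makes the GC run more often while we allocate
// heavily; afterwards the GC is switched off for the main run,
// as described later in the article.
func loadData(path string) error {
    debug.SetGCPercent(20)
    defer debug.SetGCPercent(-1)

    r, err := zip.OpenReader(path)
    if err != nil {
        return err
    }
    defer r.Close()

    for _, f := range r.File {
        rc, err := f.Open()
        if err != nil {
            return err
        }
        data, err := io.ReadAll(rc)
        rc.Close()
        if err != nil {
            return err
        }
        parseFile(f.Name, data) // hypothetical: decode JSON into the in-memory store
    }
    return nil
}

func parseFile(name string, data []byte) { /* omitted */ }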


The second point, which was very important but not immediately noticed by everyone, is that the load testing was set up so that a race condition could not occur in principle. The server was tested in 3 stages: in the first stage it received GET requests that fetched objects and called the avg (average rating) and visits (a user's visits) methods; in the second stage the data was updated (this stage consisted exclusively of POST requests creating and updating data); and in the last stage GET requests were fired again, only against the new data and with a higher load. Since GET and POST requests were strictly separated, there was no need for any thread synchronization primitives.


Thus, taking these two points into account, and remembering that the ids of each entity's objects were sequential starting from 1, all the data can be stored like this:


type Database struct {
    usersArray     []*User
    locationsArray []*Location
    visitsArray    []*Visit
    usersMap       map[int]*User
    locationsMap   map[int]*Location
    visitsMap      map[int]*Visit
}

If an object's id fits within the array of its type, the object is placed in that array; otherwise it goes into the corresponding map. Now, knowing that in the final the data did not change at all, I understand that the maps were superfluous, but at the time there was no such guarantee.
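
A sketch of what that array-first lookup might look like (the method names are mine, and the arrays are assumed to be preallocated to the expected id range; ids start at 1):

func (db *Database) GetUser(id int) *User {
    if id > 0 && id < len(db.usersArray) {
        return db.usersArray[id] // fast path: direct indexing
    }
    return db.usersMap[id] // fallback for ids beyond the array
}

func (db *Database) SetUser(u *User) {
    if u.Id > 0 && u.Id < len(db.usersArray) {
        db.usersArray[u.Id] = u
        return
    }
    db.usersMap[u.Id] = u
}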


Indices


Pretty soon it became clear that a fast implementation of the avg and visits methods requires storing, alongside the User and Location structures themselves, the visits of that user or attraction. So I added a Visits field, an ordinary slice, to both of these structures, and was thus able to quickly find all the Visit structures associated with a given user or attraction.
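
For instance, adding a visit might then link it into both owners at once (a hypothetical sketch: whether ids or pointers are kept in Visits is an implementation choice, and the helper names are mine):

// AddVisit stores the visit and registers it with its user and its
// location, so avg and visits queries never scan all 10 million visits.
func (db *Database) AddVisit(v *Visit) {
    db.SetVisit(v) // array/map storage, analogous to SetUser above
    if u := db.GetUser(v.User); u != nil {
        u.Visits = append(u.Visits, v)
    }
    if l := db.GetLocation(v.Location); l != nil {
        l.Visits = append(l.Visits, v)
    }
}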


While testing, I also considered container/list from the standard library, but knowing how this container is built suggested that it would always lose to a slice both in element access speed and in memory. Its only advantage, fast insertion and deletion at any position, is not very important for this task: since the ratio of visits to users is roughly 10 to 1, we can assume the Visits containers in the Location and User structures hold about 10 elements each. Removing an element from the beginning of a 10-element slice is not that expensive in the grand scheme of things, and it is not a frequent operation. As for memory, its consumption can be illustrated by the following code:


package main

import (
    "container/list"
    "fmt"
    "runtime"
    "runtime/debug"
)

func main() {
    debug.SetGCPercent(-1) // disable the GC so nothing is collected mid-measurement
    runtime.GC()
    m := &runtime.MemStats{}
    runtime.ReadMemStats(m)
    before := m.Alloc

    // 1000 slices, 10 elements appended to each.
    for i := 0; i < 1000; i++ {
        s := make([]int, 0)
        for j := 0; j < 10; j++ {
            s = append(s, 0)
        }
    }
    runtime.ReadMemStats(m)
    fmt.Printf("Alloced for slices %0.2f KB\n", float64(m.Alloc-before)/1024.0)

    runtime.GC()
    runtime.ReadMemStats(m)
    before = m.Alloc

    // 1000 lists, 10 elements pushed onto each.
    for i := 0; i < 1000; i++ {
        s := list.New()
        for j := 0; j < 10; j++ {
            s.PushBack(1)
        }
    }
    runtime.ReadMemStats(m)
    fmt.Printf("Alloced for lists %0.2f KB\n", float64(m.Alloc-before)/1024.0)
}

This code gives the following output:


Alloced for slices 117.19 KB
Alloced for lists 343.75 KB

This code creates 1000 slices and 1000 lists and fills each with 10 elements, since that is the average number of visits. The number 10 is unlucky for the slice: while appending, its capacity grows from 8 to 16 on the 9th element, so more memory is spent than necessary. The results still show that the slice solution consumed about 3 times less memory, which agrees with theory, since every list element stores pointers to the next element, the previous element and the list itself.
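
As an aside of mine (not an optimization the article claims to use): if the typical length of about 10 is known up front, that over-allocation can be avoided by creating the slice with an explicit capacity:

s := make([]int, 0, 10) // capacity 10 from the start: no growth from 8 to 16 on the 9th append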


Other participants who reached the finals built several more indexes, for example a country index for the visits method. These indexes would most likely have sped up the solution, but not as much as storing each user's and each attraction's visits together with the object itself.


Libraries


The standard library's HTTP server is quite convenient and widely used, but when it comes to speed everyone recommends fasthttp. So the first thing I did after implementing the logic was throw out the standard http server and replace it with fasthttp. The replacement was completely painless, even though the two libraries have different APIs.
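
A minimal sketch of what the swap amounts to (route is a placeholder for the real request handler, not the author's code):

package main

import "github.com/valyala/fasthttp"

// route receives every request; fasthttp hands over a reusable
// RequestCtx instead of separate Request/ResponseWriter values.
func route(ctx *fasthttp.RequestCtx) {
    // dispatch on ctx.Method() and ctx.Path() here
}

func main() {
    if err := fasthttp.ListenAndServe(":80", route); err != nil {
        panic(err)
    }
}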


Next, the standard json encoding/decoding library was replaced. I chose easyjson, as it showed excellent speed and memory figures, plus it has an API similar to encoding/json. easyjson generates its own parser for each data structure, which is what lets it show such impressive results. Its only downside is the small number of settings. For example, the task contained requests in which one of the fields was missing, and this had to result in an error, but easyjson silently skipped such fields, so I had to dig into the generated parser code.
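
Wiring easyjson in typically looks like this (a sketch: the file name, the go:generate line and the JSON tags are my illustration, not taken from the contest code):

package model

//go:generate easyjson -all model.go

//easyjson:json
type User struct {
    Id        int    `json:"id"`
    Email     string `json:"email"`
    FirstName string `json:"first_name"`
    LastName  string `json:"last_name"`
    Gender    string `json:"gender"`
    Birthdate int    `json:"birth_date"`
}

// After `go generate`, User gains generated MarshalJSON/UnmarshalJSON
// methods, used like:
//
//     var u User
//     if err := u.UnmarshalJSON(body); err != nil { /* handle */ }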


Other optimizations


Since all API methods except the POST methods were implemented without additional allocations, I decided to disable the garbage collector: if there is enough memory anyway, why run it at all?
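
That is the same call already seen in the measurement snippet above; turning the collector off for the whole run is one line (where exactly to call it is a matter of taste, this placement is just a sketch):

package main

import "runtime/debug"

func main() {
    debug.SetGCPercent(-1) // turn garbage collection off entirely
    // ... start the server here ...
}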


Request routing was also rewritten: each request type was identified by a single character, which saved on string comparisons. A schoolboy optimization.
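
In practice that can be as simple as switching on one byte of the path (my sketch of the idea, with hypothetical handler names; the offset assumes paths like /users/..., /locations/... and /visits/...):

func route(ctx *fasthttp.RequestCtx) {
    path := ctx.Path()
    if len(path) < 2 {
        ctx.SetStatusCode(fasthttp.StatusNotFound)
        return
    }
    switch path[1] { // the first letter after the leading '/'
    case 'u':
        handleUsers(ctx)
    case 'l':
        handleLocations(ctx)
    case 'v':
        handleVisits(ctx)
    default:
        ctx.SetStatusCode(fasthttp.StatusNotFound)
    }
}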


Result


In the last 5 days of the competition, without any prior experience in such contests, I wrote a solution in Go that earned me 52nd place out of 295 participants. Unfortunately, I fell just short of reaching the finals; on the other hand, worthy rivals gathered there, and since the data and the conditions of the task did not change, it is unlikely I could have climbed much higher.


GitHub with the solution

