Geo index for finding new friends or revolutionary discovery of scientists from Austria

    As you know, Badoo is a service for finding new people. Among other things, we allow you to search for people around and in the "game" also show people who are close to you. This part "around" is very, very important. After all, a young man from Novosibirsk is much more interesting to meet a girl who lives a five-minute walk from him, and not in Vladivostok.
    We have not yet publicly talked about how we do this. But the new discovery of Austrian scientists made us so happy that we decided to do it. Let's get down to business.
    Badoo works all over the world and our search works in exactly the same way, no matter where in the world you are. How to effectively search for people around among all 200+ million users?
    The solution is nontrivial, to be honest. We need some kind of index, some way to narrow our search immediately. In the case of the globe, the simplest partition is a grid of geographical latitudes and longitudes. However, the problem with this grid is its unevenness. The cell at the north pole and the cell at the equator have very different shapes. Such asymmetric partitioning introduces big problems if we want to evenly distribute the load across the search cluster.

    How to break the surface of the globe so that the cells are equal to each other? At least approximately. We at Badoo remembered one of the regular polygons, namely the icosahedron. Its faces are equal regular triangles. The icosahedron has only 20 faces, which is too small for the effective load sharing on the search cluster. But its faces can also be divided into triangles.

    We project the vertices and edges of the icosahedron onto the sphere of the Earth. We divide recursively the triangles defined by the icosahedron into 4 smaller triangles. And so twice. We get a total of 320 faces. Unfortunately, they are no longer equal. Next - with a python program, we generate ~ 100,000 lines of C code that can do the transformation lon / lat -> triangle.



    Thus, we can compare any coordinate on the globe with a triangle or, in other words, with a cluster shard, which contains all the users living in this area of ​​the globe.
    Such a “large” triangle, depending on the desired accuracy of the search, we can continue to divide into triangles, more and more “small”. When the triangles become small enough in comparison with the radius of the earth - you can already consider them “flat” and build distances in the “heights of the triangles”, and not meters.



    Each of these “small” triangles is uniquely defined by a triple (a, b, c), where a, b, c are the numbers of intersecting lines.

    How to effectively search for people in a small triangle using our coordinate system, we will certainly tell in the future if you are interested, but we would like to return today and talk in more detail about the "big" triangles.

    The solution we proposed for dividing the globe has drawbacks related to both unevenness and accuracy. Despite the 320 shard, we often encounter distortions in the load.

    And so, a couple of weeks ago, our friends contacted us - employees of the Austrian Institute of Science and Technology and told about their discovery.

    Using the technique they developed for searching for symmetric sections of multidimensional lattices and resource-intensive computer calculations, they managed to construct a new symmetric partition of the sphere into hexagons with a sufficiently large number of vertices. The resulting design they called "Hexasphere." An interactive model of this partition has been posted on the site with the announcement of the article. We couldn’t insert it here, so follow the link to play with it.



    Over the past week, we worked closely with the Austrians, finalized our algorithm, and achieved an absolutely uniform load distribution across shards.

    Badoo has become more accurate for its users, for us the system has become more predictable and resistant to stress, and scientists from Austria, we hope, saw an interesting application for their ideas.
    In the next article, we will definitely tell in more detail how this algorithm works and show on the graphs the improvements that we managed to achieve.

    Marco Kevats, Research Developer

    Also popular now: