A database of all settlements and regions of Russia

    For one project I needed to build a database of the geographical names of Russia. Of all the sources of such information, two seemed most authoritative to me: the postal code database and OKATO.

    OKATO seemed to me simpler, more complete and less redundant, although it contains four times as many settlement names. I chose it because I found at least some description of the database on Wikipedia, whereas the postal code database contained information I could not make sense of. In the OKATO database, I had to separate the geographical entries I needed from the unnecessary administrative units.

    The filtering took place in several stages. First, I selected the regions, territories and republics, i.e. the top level of the hierarchy. Then I took up the cities and towns. All the filtering was done empirically: as I found patterns, I eliminated the unnecessary husk, such as municipalities and the districts of large cities. I see no point in describing the patterns here. Each classification level has its own rules for dropping administrative units, which can be viewed in the source code, in the file below. Note that in the resulting files I added an identifier as the first field of each region and a reference to it as the last field of each settlement, so that the region each settlement belongs to can be imported into a database. The output was converted to CSV; otherwise the data format remained the same. There are most likely errors in the data. If you find any, write in the comments and I will fix them.
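    To give an idea of the output layout, here is a minimal sketch in Python. It is not the code from the downloadable file. The sample rows are invented, and the two rules (`is_region`, `is_husk`) are hypothetical stand-ins for the real, empirically found rules. It assumes that a top-level entry is one whose code has zeros everywhere after the first two digits, and that a settlement belongs to the region sharing its two-digit prefix.

```python
import csv
import io

# Invented sample rows in (code, name) form; the real OKATO dump is laid out differently.
SAMPLE = [
    ("01000000000", "Region A"),
    ("01401000000", "Town A1"),
    ("01401360000", "Districts of Town A1"),  # administrative husk to drop
    ("02000000000", "Region B"),
    ("02401000000", "Town B1"),
]


def is_region(code):
    """Assumed rule: top-level entries have only zeros after the first two digits."""
    return code[2:].strip("0") == ""


def is_husk(name):
    """Hypothetical stand-in for the empirical rules that drop administrative units."""
    return name.lower().startswith("districts of")


def split_rows(rows):
    """Return (regions, settlements) in the layout described above.

    Regions carry a generated identifier in the first field;
    settlements carry the identifier of their region in the last field.
    """
    regions = {}  # two-digit prefix -> (region_id, name)
    for code, name in rows:
        if is_region(code):
            regions[code[:2]] = (len(regions) + 1, name)

    settlements = []
    for code, name in rows:
        if is_region(code) or is_husk(name):
            continue
        region = regions.get(code[:2])
        if region is None:
            continue  # no matching region, skip the row
        settlements.append((code, name, region[0]))

    region_rows = [(rid, prefix, name) for prefix, (rid, name) in regions.items()]
    return region_rows, settlements


def to_csv(rows):
    """Serialize rows to CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    region_rows, settlement_rows = split_rows(SAMPLE)
    print(to_csv(region_rows))
    print(to_csv(settlement_rows))
```

    With this layout, the last field of a settlement row can be imported directly as a foreign key to the region table.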
    So, the Python code file used for the filtering and analysis, and the 2 final files with regions and settlements, can be downloaded here. I hope my work turns out to be useful to someone else.
