How to catch proxies lying: verifying the physical locations of network proxies with active geolocation
People around the world use commercial proxies to hide their true location or identity. They do so for various reasons, such as accessing blocked information or protecting their privacy.
But how truthful are proxy providers when they claim that their servers are located in a particular country? This question matters a great deal: the answer determines whether customers who care about protecting their personal information can rely on a given service.
A group of researchers from the University of Massachusetts, Carnegie Mellon University, and Stony Brook University published a study in which they checked the real locations of the servers of seven popular proxy providers. We have prepared a brief summary of the main results.
Proxy operators often provide no information that would confirm their claims about server locations. IP-to-location databases usually agree with these companies' advertising claims, but there is ample evidence of errors in those databases.
In the study, the researchers estimated the locations of 2269 proxy servers operated by seven proxy companies and advertised as being in a total of 222 countries and territories. The analysis showed that at least a third of all servers are not in the countries the companies name in their marketing materials. Instead, they are in countries with cheap and reliable hosting: the Czech Republic, Germany, the Netherlands, the UK, and the USA.
Server location analysis
Commercial VPN and proxy providers can influence the accuracy of IP-to-location databases: for example, they can manipulate the location codes embedded in router names. As a result, marketing materials can advertise a large number of locations available to users, while in reality, to save money and improve reliability, the servers are physically hosted in a small number of countries, even though IP-to-location databases indicate otherwise.
To verify the real locations of the servers, the researchers used active geolocation. This approach estimates where a server is from the round-trip times of packets exchanged between it and other hosts on the Internet whose locations are known.
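To make the idea concrete, here is a minimal Python sketch of the distance bound that underlies delay-based geolocation. It is not the researchers' code: the function names, the landmark coordinates, and the constant of 200 km per millisecond (roughly two thirds of the speed of light, a common rule of thumb for fiber) are assumptions for illustration. Real algorithms such as CBG calibrate the delay-to-distance relation for each landmark rather than using a single constant.

```python
import math

# Assumption: signals in fiber cover about 200 km per millisecond one way.
FIBER_KM_PER_MS = 200.0
EARTH_RADIUS_KM = 6371.0


def max_distance_km(rtt_ms):
    """Upper bound on the distance to a host, given a round-trip time."""
    return (rtt_ms / 2.0) * FIBER_KM_PER_MS


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_feasible(candidate, measurements):
    """True if the candidate (lat, lon) lies inside every landmark's disk.

    measurements is a list of ((lat, lon), rtt_ms) pairs, one per landmark.
    """
    lat, lon = candidate
    return all(
        haversine_km(lat, lon, llat, llon) <= max_distance_km(rtt)
        for (llat, llon), rtt in measurements
    )


# Hypothetical landmarks (Amsterdam, Frankfurt), each 10 ms away from the target.
landmarks = [((52.37, 4.90), 10.0), ((50.11, 8.68), 10.0)]
print(is_feasible((50.08, 14.44), landmarks))   # Prague
print(is_feasible((40.71, -74.01), landmarks))  # New York
```

Each measurement rules out every point farther from the landmark than the packet could have travelled, so the target must lie in the intersection of all the disks. A claimed location outside that intersection cannot be correct, however the IP address is labeled.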
However, fewer than 10% of the tested proxies respond to ping, and the researchers could not run any measurement software on the servers themselves. They could only send packets through a proxy, so the round-trip time to any destination is the sum of two parts: the time between the test host and the proxy, and the time between the proxy and the destination.
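The sum described above suggests a simple correction: if the leg between the test host and the proxy can be estimated separately, subtracting it leaves the leg between the proxy and the destination. The sketch below is a minimal illustration under that assumption. The function name and sample values are hypothetical, and taking the minimum of several samples is a common way to reduce queuing noise, not necessarily the procedure used in the study.

```python
def proxy_to_landmark_rtt(through_proxy_ms, client_to_proxy_ms):
    """Estimate the proxy-to-landmark round-trip time in milliseconds.

    through_proxy_ms: samples of the full path client -> proxy -> landmark.
    client_to_proxy_ms: samples of the client -> proxy leg alone
        (assumed here to be measured separately).
    """
    total = min(through_proxy_ms)     # least-delayed sample of the full path
    first_leg = min(client_to_proxy_ms)
    # Measurement noise can make the difference negative; clamp it at zero.
    return max(0.0, total - first_leg)


# Hypothetical samples in milliseconds.
estimate = proxy_to_landmark_rtt([182.4, 175.1, 190.3], [150.2, 148.9, 153.0])
print(f"proxy-to-landmark RTT is roughly {estimate:.1f} ms")
```

With these made-up samples the estimate comes to about 26.2 ms, which is the figure a delay-based geolocation algorithm would then treat as the proxy's distance from that landmark.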
During the study, specialized software was developed based on four active geolocation algorithms: CBG, Octant, Spotter, and a hybrid of Octant and Spotter. The code is available on GitHub.
Since IP-to-location databases could not be relied on, the researchers used the RIPE Atlas anchor hosts as reference points for their experiments. Information about these hosts is available online and constantly updated, and their documented locations are accurate. Moreover, the anchors continuously ping one another and publish the round-trip data in a public database.
The measurement tool the researchers developed is a web application that attempts secure (HTTPS) connections to port 80, the port normally used for unencrypted HTTP. If the server is not listening on this port, the connection fails after one round trip. If the server is listening, the browser receives a SYN-ACK and then sends a TLS ClientHello. This triggers a protocol error, which the browser reports only after a second round trip.
Thus, the web application can measure the time of one or two round trips. A similar tool was also implemented as a command-line program.
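For a rough idea of what such a command-line measurement could look like, here is a minimal Python sketch that times a TCP connection attempt. It is not the researchers' tool: the function name and the outcome labels are invented for illustration. The observation it relies on is that a TCP connect finishes after about one round trip whether the port is open (SYN answered by SYN-ACK) or closed (SYN answered by RST).

```python
import socket
import time


def time_tcp_connect(host, port, timeout=5.0):
    """Time one TCP connection attempt.

    Returns (elapsed_ms, outcome), where outcome is "connected",
    "refused", or "error" (timeouts and other failures).
    """
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"   # handshake completed
    except ConnectionRefusedError:
        outcome = "refused"         # closed port, still about one round trip
    except OSError:
        outcome = "error"           # timeout, unreachable host, and so on
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, outcome


if __name__ == "__main__":
    # 192.0.2.1 is a documentation address; replace it with a real landmark.
    print(time_tcp_connect("192.0.2.1", 80, timeout=2.0))
```

Only the "connected" and "refused" outcomes give a usable round-trip estimate. A timeout says nothing about distance and should be discarded.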
None of the tested providers discloses the exact location of its proxy servers. At best a city is named, but most often only the country is given. Even when a city is named, inconsistencies occur: for example, the researchers examined a configuration file for one server named usa.new-york-city.cfg, which contained instructions for connecting to a server called chicago.vpn-provider.example. So the most that can be verified with reasonable accuracy is whether a server is in a specific country.
According to the tests with the active geolocation algorithms, the researchers were able to confirm the location of 989 of the 2269 IP addresses. For 642 addresses this was not possible, and 638 are definitely not in the country claimed by the proxy service. More than 400 of these misattributed addresses are actually on the same continent as the declared country. The addresses whose location was confirmed are in the countries most often used to host servers. Suspicious hosts were found at each of the seven tested providers. The researchers asked the companies for comment, but all of them declined to respond.
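The three outcomes in these results (confirmed, not possible to confirm, definitely elsewhere) can be thought of as a comparison between the claimed country and the set of countries that the measurements leave possible. The Python sketch below illustrates that logic under this assumption. The function name, the labels, and the rule itself are a simplification for illustration, not the study's exact criteria.

```python
def classify_claim(claimed_country, feasible_countries):
    """Compare a provider's claim with the countries the measurements allow.

    feasible_countries: countries overlapping the region where the server
    could be, according to the round-trip measurements (hypothetical input).
    """
    feasible = set(feasible_countries)
    if claimed_country not in feasible:
        return "false"        # the claimed country is ruled out entirely
    if feasible == {claimed_country}:
        return "confirmed"    # no other country fits the measurements
    return "unconfirmed"      # the claim is possible, but so are others


print(classify_claim("DE", ["DE"]))
print(classify_claim("BE", ["BE", "NL", "DE"]))
print(classify_claim("US", ["CZ", "DE"]))
```

The middle category tends to be large because feasible regions often span several neighboring countries, which makes it hard to confirm a claim for a small country.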