A little about inflating site visit counters



In this article I want to show how visitor counters on sites are inflated, and how the demographics, location and other parameters reported by monitoring services are faked.


How does the counter work?


We place JavaScript code on the page; when the page loads, it starts sending HTTP requests to the counter server.


This can be either a single request that carries the data in its headers, or periodic requests that send further statistics.


As a test subject, I took a "simple" hit counter: LiveInternet.


Parsing http


When the page loads, the counter's JS sends a GET request for an image with the statistics. Along the way it passes part of the client data in the URL.




If you decode the query string, you get something like this:


http://counter.yadro.ru/hit?t54.6;rhttp://RefererName.com/;s1920*1080*24;uhttp://SiteName.com/;hSite Header;0.5985211677780615


We see a number of parameters separated by ";", namely: the screen resolution and color depth, the referring page, the URL and title of the page the request was made from, and a random number that guarantees the uniqueness of the visit.
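To make the format concrete, here is a minimal sketch in plain JavaScript (runnable in Node) that splits such a decoded URL into fields. The field names (`referrer`, `screen` and so on) and the meaning attached to each one-letter prefix are assumptions read off the example above, not a documented LiveInternet format; the `t` field is kept under its raw name because the article does not say what it means.

```javascript
// Sketch: split a decoded counter URL into named fields.
// The prefix-to-name mapping is an assumption based on the sample URL above.
var PREFIXES = { t: 't', r: 'referrer', s: 'screen', u: 'url', h: 'title' };

function parseHit(hitUrl) {
    var query = hitUrl.slice(hitUrl.indexOf('?') + 1);
    var result = {};
    query.split(';').forEach(function (part) {
        var prefix = part.charAt(0);
        if (Object.prototype.hasOwnProperty.call(PREFIXES, prefix)) {
            result[PREFIXES[prefix]] = part.slice(1);
        } else {
            // The only field without a letter prefix is the random number
            result.random = part;
        }
    });
    if (result.screen) {
        // "1920*1080*24" -> width, height, color depth
        var dims = result.screen.split('*').map(Number);
        result.screen = { width: dims[0], height: dims[1], depth: dims[2] };
    }
    return result;
}

var hit = parseHit('http://counter.yadro.ru/hit?t54.6;rhttp://RefererName.com/;s1920*1080*24;uhttp://SiteName.com/;hSite Header;0.5985211677780615');
console.log(hit);
```

The sketch splits on every ";", so it only works on values that do not contain that character themselves.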


The Cookie and User-Agent HTTP headers are sent as well; they tell the server about the user's demographics (among other things) and the browser version, respectively.
Together, all of this data identifies the user.


From theory to practice


You could craft the requests with curl, but JS would be a problem, and you would have to write individual requests for each counter.


I opted for PhantomJS, a headless WebKit browser driven from the console.


Let's write a simple script that registers a unique visit for us.


var page = require('webpage').create();
var system = require('system');
var url = system.args[1];
page.open(url, function(status) {
    console.log("Status: " + status);
    phantom.exit();
});

Some counters will already count this as a visit, but that is not quite what we are after.


Let's set the User-Agent and the Referer (the page the visitor came from).


The first is done quite simply:


var userAgent = 'Custom UA';
page.settings.userAgent = userAgent;

The second task is a little more complicated. If you simply set the Referer in the HTTP header, the counters do not count the transition. For a "real" transition we have to actually click a link, so that the JS event is processed.


Code
var page = require('webpage').create();
var system = require('system');
var url = system.args[1];              // target page that carries the counter
var expectedLocation = system.args[2]; // URL to be reported as our referer
var userAgent = 'Simple UA';
page.settings.userAgent = userAgent;
// Create a link pointing at our "target"
var expectedContent = '<a id="link" href="' + url + '">link</a>';
page.firstLoad = true;
page.onConsoleMessage = function (msg) { // print log messages emitted inside the page
    console.log(msg);
};
page.onLoadFinished = function (status) {
    if (page.firstLoad) {
        page.firstLoad = page.evaluate(function () {
            // page.evaluate runs inside the page sandbox, so the helper is defined here
            function click(el) {
                var ev = document.createEvent("MouseEvent");
                ev.initMouseEvent( // set the click parameters
                    "click",
                    true, true,
                    window, null,
                    0, 0, 0, 0,
                    false, false, false, false,
                    0, null
                );
                el.dispatchEvent(ev);
            }
            console.log('Set Referer');
            click(document.getElementById('link')); // click the link we created
            return false;
        });
    } else {
        console.log("Status: " + status);
        phantom.exit();
    }
};
// Fill the page with our content and URL; done last so the handlers above are already registered
page.setContent(expectedContent, expectedLocation);

Amusingly, page.setContent lets us emulate both the domain and the content of a page.
In fact, you can simply take the counters' JS, put it in the page body, and carry out all the manipulations on your web server.


Change the screen resolution


Now let's change the additional parameters, such as the screen resolution and the color depth.
PhantomJS has a callback that lets you modify traffic "on the fly."


page.new_resolution = "800x600x24".split('x'); // new resolution and color depth
page.onResourceRequested = function(requestData, networkRequest){
    // 1920*1080*32 are the default parameters of my PhantomJS
    var newUrl = requestData.url.replace("1920*1080*32", page.new_resolution[0] + "*" + page.new_resolution[1] + "*" + page.new_resolution[2]);
    // Rewrite the resolution in the formats used by other counters
    // (continue from newUrl so the replacement above is not discarded)
    newUrl = newUrl.replace("1920", page.new_resolution[0]);
    newUrl = newUrl.replace("1080", page.new_resolution[1]);
    newUrl = newUrl.replace("32-bit", page.new_resolution[2] + "-bit");
    networkRequest.changeUrl(newUrl); // apply the change
};
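The substitution itself can be tried outside PhantomJS as a pure function. The sketch below is plain JavaScript for Node; `rewriteResolution` and its `from`/`to` arguments are illustrative names, and it covers only the `WIDTH*HEIGHT*DEPTH` form seen in the LiveInternet URL earlier, using the 24-bit value from that sample.

```javascript
// Sketch: rewrite the screen parameters inside a counter URL.
// `from` and `to` are "WIDTHxHEIGHTxDEPTH" strings (illustrative convention).
function rewriteResolution(url, from, to) {
    var oldValue = from.split('x').join('*');
    var newValue = to.split('x').join('*');
    // split/join replaces every literal occurrence, so "*" needs no regex escaping
    return url.split(oldValue).join(newValue);
}

var sample = 'http://counter.yadro.ru/hit?t54.6;rhttp://RefererName.com/;s1920*1080*24;uhttp://SiteName.com/;hSite Header;0.5985211677780615';
var rewritten = rewriteResolution(sample, '1920x1080x24', '800x600x24');
console.log(rewritten);
```

Matching the whole `WIDTH*HEIGHT*DEPTH` triple is a deliberate choice: replacing a bare "1920" or "1080" could also hit unrelated digits elsewhere in the URL, such as the random number.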

Unfortunately, this callback only handles GET requests, but that was enough for the experiment.



If all of this is done with empty cookies, the counters will block the views and ban us.
The cookies should also be relatively "old" (at least a day).
I wrote a grabber and "walked" through popular sites on the web, saving a bunch of cookies.
In PhantomJS, cookies are loaded with the --cookies-file option.


phantomjs --cookies-file=/path/to/cookies.txt


Demographics are quite simple: you need to log in to some popular resource (I used mail.ru mail accounts), after which our "user" has a gender and an age.
Surprisingly, while I "walked" through the sites, almost every one of them saved a cookie from doubleclick.net. It is responsible for ad recommendations (Google announced the purchase of this company in 2007 for $3.1 billion).


Change location


There is no magic in substituting the location: the IP address has to change.
PhantomJS supports proxies; you run the program with the --proxy option.


phantomjs --proxy=ip:port
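When several proxies are in play, the command lines can be generated rather than typed by hand. Below is a minimal sketch in plain JavaScript for Node; `buildArgs`, the script name `visit.js`, the proxy addresses and the URLs are all placeholders, and only the `--proxy` and `--cookies-file` options come from the text above. The argument order assumes a script that reads the target URL from `system.args[1]` and the referer from `system.args[2]`, like the one shown earlier.

```javascript
// Sketch: build one phantomjs command line per proxy.
// 'visit.js', the URLs and the proxy addresses are placeholders.
function buildArgs(script, targetUrl, referer, proxy, cookiesFile) {
    var args = [];
    if (proxy) { args.push('--proxy=' + proxy); }
    if (cookiesFile) { args.push('--cookies-file=' + cookiesFile); }
    // PhantomJS options go before the script name, script arguments after it
    return args.concat([script, targetUrl, referer]);
}

var proxies = ['10.0.0.1:3128', '10.0.0.2:3128'];
var commands = proxies.map(function (proxy) {
    return ['phantomjs'].concat(
        buildArgs('visit.js', 'http://SiteName.com/', 'http://RefererName.com/', proxy, '/path/to/cookies.txt')
    ).join(' ');
});
console.log(commands.join('\n'));
```

Each resulting line can be run as is from a shell; joining with spaces is safe here only because none of the placeholder values contain spaces.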


Results


I installed popular counters: Google Analytics, Yandex.Metrica and LiveInternet.
All of them counted the views. Yandex.Metrica lets you see robot traffic, and the fake requests do show up there.


For those interested: the finished script.

