Building a summary traffic table for several sites with Yandex.Metrica

The task was to build a system for counting visitors across a couple of dozen sites. The sites belong to gaming associations (clans) within one gaming community. In other words, we need a summary table that shows at a glance which site is more popular. The customer agreed to counting unique visitors.

Since the customer had no firm idea of what was needed or how to do it (there was no specification either), I was free to do anything. I wrote my own tracking system (PHP, MySQL). Unique hosts were identified by IP address, cookies and DOM storage entries. In fact, it was an experiment that was never finished. The new version of the tracking system was to use the counters of an existing service such as Yandex.Metrica, Google Analytics, LiveInternet and the like. I chose Yandex because it has an API and sensible documentation. I found a usage example on Habr. To work with the Metrica API you need an OAuth token. I will not describe how to obtain one; everything is in the documentation.

Now some specifics.
First, we get the list of counters. The request carries the OAuth token, which is of course unique to each developer.
Next, for each counter in that list, we request the number of visitors for the reporting period, say January 2013. Here id = 12345678 is the number of the Yandex.Metrica counter.
The resulting data is displayed in the summary table.
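
The last step, turning per-counter visitor numbers into a ranked summary table, can be sketched roughly as follows. This is a minimal illustration and not the code from the actual system: the function names build_summary and format_table, the site names and the visitor numbers are all hypothetical, and the sketch assumes the numbers have already been fetched from the API.

```python
def build_summary(visitors_by_site):
    """Return (rank, site, visitors) rows, most visited site first.

    visitors_by_site is assumed to map a site name to its unique-visitor
    count for the reporting period (hypothetical input shape).
    """
    ranked = sorted(visitors_by_site.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, site, visitors)
            for rank, (site, visitors) in enumerate(ranked, start=1)]


def format_table(rows):
    """Render the rows as a plain-text table."""
    lines = [f"{'#':>2}  {'Site':<22} {'Visitors':>8}"]
    for rank, site, visitors in rows:
        lines.append(f"{rank:>2}  {site:<22} {visitors:>8}")
    return "\n".join(lines)


# Hypothetical sample data, not real measurements.
sample = {
    "clan-alpha.example": 1520,
    "clan-beta.example": 980,
    "clan-gamma.example": 2310,
}
summary = build_summary(sample)
print(format_table(summary))
```

Ties are broken alphabetically by site name so that the order of the table stays stable between runs.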

It would seem that the task was done, but no such luck! Some bad actors turned up who wanted to distort the figures. These unscrupulous comrades go to a traffic exchange and buy clicks to any site in the group, or to several at once. Such traffic brings no benefit to the site: in more than 99% of cases, the people working on these exchanges do not stay on the site. Another point is that traffic exchange settings may include an option like "do not pass the referrer". In that case, the visits look as if the user came to the site from a browser bookmark. I came up with three options for solving the problem, but none has been put into operation yet, because I cannot choose the best one.
  1. Count visits that lasted more than 30 seconds.
  2. Count visits in which two or more pages were viewed.
  3. Create a goal for each Metrica counter (a visit to two pages) and count how many times the goal was reached for a given visit duration, for example more than 30 seconds. This is a hybrid of the first two options.

For the first two options the data is processed as follows. For page depth, we sum the visits in which more than one page was viewed (each object has a name property equal to 1, 2, 3 ... 14, 15+, according to the number of pages viewed). For duration, we sum the visits whose name matches the desired visit time (name takes the values "0 - 10 sec.", "11 - 30 sec." and so on up to "10 - 30 min." and "More than 30 min."). The third option counts goal visits; for it you need to find the goal id (goal_id) for each counter, and then, as in the first option, sum the necessary data (for example, sessions longer than 30 seconds).
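
The summation for the first two options might look something like the sketch below. It assumes each report row has already been parsed into a dictionary with a name key (as described above) and a visits key. The visits key, the function names and the sample numbers are assumptions made for illustration, and the bucket labels must match the strings the API actually returns.

```python
# Duration buckets of 30 seconds or less; labels are taken from the text
# above and are assumed to match the API output exactly.
SHORT_BUCKETS = {"0 - 10 sec.", "11 - 30 sec."}


def parse_depth(name):
    """Convert a depth label such as '3' or '15+' to an integer."""
    return int(name.rstrip("+"))


def visits_with_min_depth(rows, min_pages=2):
    """Sum the visits in which at least min_pages pages were viewed."""
    return sum(row["visits"] for row in rows
               if parse_depth(row["name"]) >= min_pages)


def visits_longer_than_30s(rows, short_buckets=SHORT_BUCKETS):
    """Sum the visits from every duration bucket except the short ones."""
    return sum(row["visits"] for row in rows
               if row["name"] not in short_buckets)


# Hypothetical sample rows, not real report data.
depth_rows = [
    {"name": "1", "visits": 700},
    {"name": "2", "visits": 200},
    {"name": "3", "visits": 80},
    {"name": "15+", "visits": 20},
]
duration_rows = [
    {"name": "0 - 10 sec.", "visits": 650},
    {"name": "11 - 30 sec.", "visits": 50},
    {"name": "10 - 30 min.", "visits": 120},
    {"name": "More than 30 min.", "visits": 30},
]

deep_visits = visits_with_min_depth(depth_rows)
long_visits = visits_longer_than_30s(duration_rows)
print(deep_visits, long_visits)
```

Excluding the two short buckets, rather than listing every long one, means the sketch only relies on labels that are quoted in the text. For the third option the same duration summation would be applied to rows requested for a specific goal_id.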
I do not like these methods. I still want to count visitors, not visits. There is another way, which requires adding a couple of lines to the JavaScript code of the Metrica counter. The change affects how visits are recorded: the visitor is counted not immediately but after a specified time. That is, if the user leaves the site before the timer fires, the visit is not recorded (the defer property and the hit method). This option is not very convenient either: the counters would have to be changed on a pile of sites and then checked regularly to make sure nobody has changed them back (each site has its own owner). The Metrica team promised to build a tool for filtering out "bots", but it is not known when it will be ready.

As a result, we got an almost working system. I am inclined toward the third, combined method of counting visits. Let's see what the customer says...

Maybe someone who has read this article has already fought this kind of traffic inflation and won? It would be very interesting to know how.

Yandex.Metrica help
Yandex.Metrica API help