Monitoring the activity of VK groups. Processing data with VKScript

I was faced with the task of monitoring user activity on a well-known social network. Specifically, I had to collect data on the number of users who are online in a particular group or community.

Tools

Since I work in web development, I used the following tools:
  • PHP 5 (Zend Framework)
  • Vk API
  • Cron

Let me explain why I chose the VK API. The number of users online can also be obtained without the API, by parsing the user search page with the community filter and the "online" flag set. However, I preferred not to bother with authorization and HTML parsing and to use the programming interface instead.

Architecture

The implementation can be roughly divided into two parts. The first is a script that determines the number of users online for a given group id and writes it to the database. The second is the admin panel, which lets you add new groups for monitoring and view statistics for groups already added.
For the statistics to be meaningful, the state of the group has to be sampled as often as possible. The script should be run from Cron, say every 5 minutes.

Vk API Overview

While everything is more or less clear with the admin panel, the statistics collection script is less straightforward. After studying the methods the API provides, I arrived at the first solution.

First solution (wrong)

Using the groups.getMembers and users.get methods, we get the list of group members and their status, online or offline. Then we count how many users are online. Simple. However, this apparent simplicity leads to a number of problems.
Everything is fine as long as the groups are small (up to 1000 people). Otherwise we run into API limits: a single call returns information about at most 1000 users. It would seem we could just call the method in a loop, but no: the API allows no more than 3 requests per second.
Let's estimate the number of requests needed. Take the habrahabr community on VK. It has more than 40,000 members, so we need about 40 requests to get the community members and another 40 to get their status: about 80 requests, or roughly 27 seconds at 3 requests per second, for a single group.
So we set off to look for a new solution.
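The estimate above can be reproduced with a few lines of PHP. This is only a sketch: the function name estimateNaiveCost and its parameters are made up for illustration, and the numbers (1000 users per call, 3 requests per second) are the limits described in the text.

```php
<?php
// Hypothetical helper: estimates the number of API calls and the minimum time
// for the naive approach (one groups.getMembers plus one users.get per page).
function estimateNaiveCost($memberCount, $pageSize = 1000, $maxRequestsPerSecond = 3)
{
    $pages = (int) ceil($memberCount / $pageSize);
    $requests = $pages * 2; // one call for the members, one for their status
    $seconds = (int) ceil($requests / $maxRequestsPerSecond);

    return array('pages' => $pages, 'requests' => $requests, 'seconds' => $seconds);
}

$cost = estimateNaiveCost(40000);
printf(
    "%d pages, %d requests, at least %d seconds\n",
    $cost['pages'],
    $cost['requests'],
    $cost['seconds']
);
```

With several large groups polled every 5 minutes, this cost adds up quickly, which is why the naive approach was abandoned.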

Second solution (correct)

In the documentation we find the execute method:
A universal method that allows you to run a sequence of other methods, storing and filtering intermediate results.

It takes as input a string of code written in so-called VKScript (similar to JavaScript). The only problem is that there is no decent documentation for this method or for the language itself. Still, this looks like the solution, so we can dig deeper into the VK API and VKScript in particular.

Work with API

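Download the API class offered by the developers. I only brought it into a more acceptable form, so that it fits the coding style used in Zend Framework.
Class Vkapi_Model_Api:
<?php
class Vkapi_Model_Api
{
	// The opening of this class was lost from the listing and is reconstructed here.
	// The base URL is the standard VK API endpoint.
	private $_accessToken;
	private $_apiUrl = 'https://api.vk.com/method/';

	public function __construct($accessToken)
	{
		$this->_accessToken = $accessToken;
	}

	// Calls an API method and returns the decoded JSON response (null on failure).
	public function api($method, $params = array())
	{
		$params['access_token'] = $this->_accessToken;
		$query = $this->_apiUrl . $method . '?' . $this->_params($params);
		$responseStr = file_get_contents($query);
		if (!is_string($responseStr)) {
			return null;
		}
		return json_decode($responseStr);
	}

	// Builds a URL-encoded query string from the parameters.
	private function _params($params)
	{
		$pieces = array();
		foreach ($params as $k => $v) {
			$pieces[] = $k . '=' . urlencode($v);
		}
		return implode('&', $pieces);
	}
}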
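As a side note, the query string assembled by hand in _params() can also be produced by PHP's built-in http_build_query(). The sketch below compares the two on sample data; the function name buildParamsManually and the token value are placeholders for illustration, not part of the original class.

```php
<?php
// Standalone copy of the _params() logic from the class above.
function buildParamsManually(array $params)
{
    $pieces = array();
    foreach ($params as $k => $v) {
        $pieces[] = $k . '=' . urlencode($v);
    }
    return implode('&', $pieces);
}

$params = array(
    'owner_id'     => '-20629724',
    'access_token' => 'TOKEN PLACEHOLDER', // hypothetical value, not a real token
);

$manual  = buildParamsManually($params);
// The separator is passed explicitly so the result does not depend on php.ini.
$builtIn = http_build_query($params, '', '&');

var_dump($manual === $builtIn);
```

For flat arrays of string values the two agree. One difference to keep in mind: http_build_query() skips parameters whose value is null, while the manual loop emits them with an empty value.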

I will not describe authentication and authorization: they are done through OAuth, and there is plenty of information about it on the Internet and on the VK API page.

Let's make a test call to the API and get the first 20 posts of the habrahabr group:
    public function wallsAction()
    {
        //.......
        $api = new Vkapi_Model_Api($accessToken);
        $response = $api->api('wall.get',array('owner_id' => '-20629724'));
        $this->view->walls = $response->response;
    }

Now let's do the same through the execute method:

    public function wallsAction()
    {
        //.......
        $api = new Vkapi_Model_Api($accessToken);
        $code = "
        	var walls = API.wall.get({ owner_id : -20629724 });
        	return walls;
        ";
        $response = $api->api('execute',array('code' => $code ));
        $this->view->walls = $response->response;
    }

We get the same result.
One bad thing is that we have mixed VKScript and PHP code, which looks ugly. Let's refactor.
It would be nice if each script were stored in a separate file and could be called with a single function. We should also provide for passing data to the script later (for now, owner_id, for example, is hard-coded).

Moving VKScript into separate files

In the root of our module, create a folder called "vkscripts" and put our scripts in it (for example, getWalls.vks). The path to the scripts goes into the application.ini config file:
vkapi.scripts.path = APPLICATION_PATH "/modules/vkapi/vkscripts"

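We need a class that makes it convenient to invoke the scripts located in this directory. We will use a PHP 5 feature, the magic __call method: the name of the called method is used to look up a script with the same name.
Class source:
<?php
class Vkapi_Model_Executor
{
	// The opening of this class was lost from the listing and is reconstructed here.
	private $_api;

	public function __construct($api)
	{
		$this->_api = $api;
	}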
	public function __call( $methodName, $arguments )
	{
		$script = $this->_getScript($methodName);
		if(count($arguments)){
			$script = $this->_prepareParams($script, $arguments[0]);
		}		
        $response = $this->_api->api('execute', array('code' => $script));
        if( $error = $this->_getError($response) ){
        	throw new Exception($error->error_msg, $error->error_code);
        }
        return $response->response;	
	}
	private function _getError($response)
	{
		if( isset($response->error) ){
			$error = $response->error;
			return $error;			
		}
		return null;	
	}
	private function _getScript( $name )
	{
		$scriptsPath = Zend_Registry::get('vkapi_config')->scripts->path;
		$filePath = $scriptsPath . '/' . $name . '.vks';
		if(is_file($filePath)){
			$script = file_get_contents($filePath);
			return $script;
		}
		return null;
	}
}
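To make the error path in __call easier to follow, here is a small standalone sketch of it. The JSON payload is invented for illustration (only the field names error_code and error_msg come from the class above), and getError is a free-function copy of _getError.

```php
<?php
// Hypothetical error payload: the field names are the ones the Executor reads,
// the values are made up.
$json = '{"error":{"error_code":100,"error_msg":"Example error message"}}';
$response = json_decode($json);

// Same check as _getError(): returns the error object or null.
function getError($response)
{
    return isset($response->error) ? $response->error : null;
}

try {
    if ($error = getError($response)) {
        throw new Exception($error->error_msg, $error->error_code);
    }
    echo "no error\n";
} catch (Exception $e) {
    printf("execute failed: [%d] %s\n", $e->getCode(), $e->getMessage());
}
```

Note that getError() also returns null when the response itself is null, which is what api() returns when the HTTP request fails, so that case may deserve a separate check in real code.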

So, let's try this class out.
In the vkscripts folder we put a getWalls.vks file with the following contents:
var walls = API.wall.get({ owner_id : -20629724 });
return walls;

In the controller:
	public function wallsAction()
	{
		//.......
		$api = new Vkapi_Model_Api($accessToken);
		$executor = new Vkapi_Model_Executor($api);
		$response = $executor->getWalls();
		$this->view->walls = $response; // __call() already returns $response->response
	}

We got the same result, but with significant advantages: the code is split into separate files, it is more readable, and the call to execute is simpler.
The next step is to add the ability to pass parameters to our script. We will use a simple placeholder convention for this. At the top of the VKScript code, where needed, we declare the inputs as follows:
var groupId = %GROUP_ID%;
var offset = %OFFSET%;
// .... our code starts here

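In our class, before calling the API with this code, we replace each %VAR_NAME% placeholder with the value of the corresponding variable.
Let's extend the Executor class as follows.
Modified class source:
<?php
class Vkapi_Model_Executor
{
	// ...... (properties and constructor as before; this opening was lost from the listing and is reconstructed)
	public function __call( $methodName, $arguments )
	{
		$script = $this->_getScript($methodName);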
		if(count($arguments)){
			$script = $this->_prepareParams($script, $arguments[0]);
		}		
        $response = $this->_api->api('execute', array('code' => $script));
        if( $error = $this->_getError($response) ){
        	throw new Exception($error->error_msg, $error->error_code);
        }
        return $response->response;	
	}
	// ......
	private function _prepareParams($script, $params)
	{
		foreach ($params as $key => $value){
			$script = str_replace('%' . strtoupper($key) . '%', $value, $script);
		}
		return $script;	
	}
}
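The substitution step can be tried in isolation. The sketch below is a standalone copy of _prepareParams() with an explicit string cast added; the template string is a shortened, made-up script used only to show the effect.

```php
<?php
// Standalone copy of the _prepareParams() logic: each key is upper-cased,
// wrapped in percent signs and replaced by its value.
function prepareParams($script, array $params)
{
    foreach ($params as $key => $value) {
        $script = str_replace('%' . strtoupper($key) . '%', (string) $value, $script);
    }
    return $script;
}

// Hypothetical template, shortened for the example.
$template = "var groupId = %GROUP_ID%;\nvar offset = %OFFSET%;\nreturn groupId;";

$code = prepareParams($template, array('group_id' => 20629724, 'offset' => 0));
echo $code, "\n";
```

Because this is plain text substitution, values are inserted into the script verbatim. It is safest to pass only trusted numeric values, or to validate them before substitution.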

In the controller, when a parameter needs to be passed, we write the following:
	public function wallsAction()
	{
		//.......
		$api = new Vkapi_Model_Api($accessToken);
		$executor = new Vkapi_Model_Executor($api);
		$response = $executor->getWalls(array(
			'group_id'	=> -20629724,
			'offset'	=> 0
		));
		$this->view->walls = $response; // __call() already returns $response->response
	}

This substitutes the passed values for %GROUP_ID% and %OFFSET% in our script.
This is what the module structure looks like:
[image: module structure]

Get the number of users online

There is a limit on the number of API method calls inside execute: 22 calls (found empirically). I did not find any information on the web about limits on other operations (for example, addition or subtraction), but they exist: when I looped over the array of users and counted those online inside the script, I got an error about exceeding the number of operations. So I decided to return the full list of statuses from execute and count them on the server side.
Because of the limit on API calls inside execute, we still need at least one execute request per 10,000 group members, since processing each 1,000 members takes 2 API calls.
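The arithmetic behind the "one execute per 10,000 members" figure can be sketched as follows. The function name executeCallsNeeded is hypothetical; the inputs are the empirically found limit of 22 calls, one call spent on the initial groups.getMembers, and 2 calls per page of 1000 members.

```php
<?php
// Hypothetical helper: how many members one execute call can cover,
// and how many execute calls a group of a given size needs.
function executeCallsNeeded($memberCount, $callLimit = 22, $pageSize = 1000)
{
    // One call is spent on reading the total count, the rest go in pairs.
    $pagesPerExecute = (int) floor(($callLimit - 1) / 2);
    $membersPerExecute = $pagesPerExecute * $pageSize;

    return array(
        'members_per_execute' => $membersPerExecute,
        'execute_calls'       => (int) ceil($memberCount / $membersPerExecute),
    );
}

$plan = executeCallsNeeded(40000);
printf(
    "%d members per execute, %d execute calls\n",
    $plan['members_per_execute'],
    $plan['execute_calls']
);
```

For the 40,000-member group from the earlier estimate this means a handful of execute requests instead of about 80 separate API calls.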
Here is the resulting script:
var groupId = %GROUP_ID%;
var offset = %OFFSET%;
// API call limit
var _acl = 22;
var members = API.groups.getMembers({ gid : groupId }); _acl = _acl - 1;
var count = members.count;
var users = [];
while( _acl > 1 && offset < count){
	var _members = API.groups.getMembers({ gid : groupId, offset : offset }); _acl = _acl - 1;
	users = users + API.users.get({ uids : _members.users, fields : "online" }); _acl = _acl - 1;
	offset = offset + 1000;
}
var result = {
	count	: count,
	offset	: offset,
	users	: users@.online
};
return result;

A few comments on the code. The _acl counter prevents errors caused by exceeding the limit on API calls. users@.online returns only the list of online/offline values, e.g. [0,1,1,0,0,0,1,0,1].
In the controller, we call this script repeatedly, increasing offset each time, until we have gone through all the members of the group.
	$count = 1;
	$offset = 0;
	$nowOnline = 0;		
	while($count > $offset){
		$users = $executor->getOnline(array(
			'group_id'	=> $groupId,
			'offset'	=> $offset
		));
		$count = $users->count;
		$offset = $users->offset;
		foreach ( $users->users as $online){
			if($online){
				$nowOnline++;
			}
		}
	}	
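The paging loop can be checked without touching the network. In the sketch below, FakeExecutor is a hypothetical stand-in for Vkapi_Model_Executor that serves slices of a fixed array of online flags; since the flags are 0 or 1, array_sum() is used here in place of the inner foreach.

```php
<?php
// Hypothetical stand-in for Vkapi_Model_Executor: returns slices of a fixed
// array of online flags instead of calling the VK API.
class FakeExecutor
{
    private $_flags;
    private $_perCall;

    public function __construct(array $flags, $perCall)
    {
        $this->_flags = $flags;
        $this->_perCall = $perCall;
    }

    public function getOnline(array $params)
    {
        $offset = $params['offset'];
        $result = new stdClass();
        $result->count = count($this->_flags);
        $result->users = array_slice($this->_flags, $offset, $this->_perCall);
        $result->offset = $offset + $this->_perCall;
        return $result;
    }
}

$executor = new FakeExecutor(array(0, 1, 1, 0, 0, 0, 1, 0, 1), 4);

$count = 1;
$offset = 0;
$nowOnline = 0;
while ($count > $offset) {
    $page = $executor->getOnline(array('group_id' => 1, 'offset' => $offset));
    $count = $page->count;
    $offset = $page->offset;
    $nowOnline += array_sum($page->users); // flags are 0 or 1, so the sum is the online count
}

echo $nowOnline, "\n";
```

The loop ends once the offset returned by the script passes the total member count, exactly as in the controller code above.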


So, we test it and see that the data received through the API almost matches the data from vk.com. The small discrepancy may be due to caching or to some other reason not visible from the outside.

Remarks

VKScript does not support functions or the increment and decrement operators.

Conclusion

We have developed tools for working with the vk.com API through the execute method. With them you can build statistics collection applications and the like, and the code stays tidy. Attaching an interface to all this is a trivial task. Finally, I will note that another social network, Facebook, provides access to code execution in a language called FQL (Facebook Query Language, similar to SQL), which clearly has more features than VKScript with all its limitations.

References

VK API
The execute method, a brief description and example of VKScript
Facebook Query Language
