Multi-camera video analytics
In this first publication for the Habr community, we want to talk about an interesting area of work at the Synesis company: multi-camera video analytics, and more specifically, a multi-camera algorithm for tracking objects.

Our team conducts applied research in video analytics and develops high-speed machine vision algorithms for the automatic classification of situations in streaming video. We plan to cover the most interesting results in this corporate blog, and we will be grateful for ideas and criticism.
Tracking in the field of view of a single camera
An integral component of almost any video analytics system is the tracking algorithm. Why does a smart video surveillance system need one? In general, object tracking is required for the automatic recognition of situations, whether by explicit rules (for example, a person entered a control zone, stopped, or left an object behind) or without rules, in self-learning systems. A tracking failure almost always leads either to a missed alarm situation or to repeated triggering of the video analytics.
Habr has already covered object tracking in articles on the work of Zdenek Kalal and Microsoft Research. "Single-camera" tracking, for example in the MagicBox device, works like this:
The result of the "single-camera" tracking algorithm is a sequence of spatio-temporal coordinates for each object. Trajectory breaks are possible when an object leaves the camera's field of view or passes behind an obstacle.
Tracking in the field of view of several cameras
The multi-camera tracking algorithm, the subject of this publication, continuously matches object position data from different cameras, taking into account their relative positions and their registration to a map of the area. The algorithm constructs a generalized trajectory of an object as it moves from camera to camera and projects this trajectory onto the map. The object may be observed by several cameras at once or be located in a blind zone. A multi-camera trajectory makes it possible to implement geo-visual search tools, automatic viewing-angle selection, and other security features often shown in science fiction films.
Spatial Camera Calibration
Before the system is put into operation, the surveillance zone of each camera is registered to a map of the controlled area. Our calibration algorithm uses four points whose coordinates the user must specify both on the camera frame and on the map:

It is recommended to use landmark points on the ground surface that are easy to identify visually from different angles, for example trees, building corners, and fences. The algorithm calculates the coordinate transformation matrix by the least squares method:

where r is the coordinate vector on the camera frame, R is the corresponding global coordinate vector on the map, and A is the desired transformation matrix.
Thus, the transformation matrix A maps the two-dimensional coordinates of an object in the camera frame to its global coordinates on the map.
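The calibration step above can be sketched in code. This is a minimal illustration, not the production implementation: it assumes the frame-to-map transform is a projective homography (a natural choice for a ground plane seen by a camera, for which four point pairs are exactly enough) and estimates it by least squares via the standard direct linear transform; all function names are ours.

```python
import numpy as np

def fit_projective_map(frame_pts, map_pts):
    """Estimate a 3x3 projective matrix A mapping frame coordinates to map
    coordinates from >= 4 point pairs, as a least-squares solution of the
    direct linear transform (DLT) system."""
    rows = []
    for (x, y), (X, Y) in zip(frame_pts, map_pts):
        rows.append([x, y, 1, 0, 0, 0, -X * x, -X * y, -X])
        rows.append([0, 0, 0, x, y, 1, -Y * x, -Y * y, -Y])
    M = np.asarray(rows, dtype=float)
    # Least-squares solution: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(M)
    A = Vt[-1].reshape(3, 3)
    return A / A[2, 2]          # normalize so that A[2, 2] == 1

def frame_to_map(A, x, y):
    """Apply A to a frame point (x, y) using homogeneous coordinates."""
    X, Y, w = A @ np.array([x, y, 1.0])
    return X / w, Y / w
```

With four exactly corresponding pairs the system is solved exactly; with more (or noisy) pairs the SVD gives the least-squares fit the article refers to.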
Step 1: Data preprocessing
The input of the multi-camera tracking algorithm is the stream of spatio-temporal coordinates of moving objects recorded by the various cameras. Since the "single-camera" video analytics are not time-synchronized, the raw coordinates are first brought to a common time scale by linear interpolation. The coordinates are then converted into the global coordinate system using the matrix A.
This is how the object trajectories look after projection onto the map. The illustration shows the coverage areas of nine cameras, five of which recorded the movement of an object. The "single-camera" trajectories are highlighted in the same color as the corresponding cameras and their coverage areas.

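The preprocessing step described above can be sketched as follows. This is a minimal, illustrative version under our own naming: each track is resampled onto a shared time grid by linear interpolation (instants outside a track's time span become NaN), then projected to the map with the calibration matrix A in homogeneous coordinates.

```python
import numpy as np

def resample_track(times, xs, ys, grid):
    """Bring one single-camera track to a common time grid by linear
    interpolation; samples outside the track's time span become NaN."""
    xi = np.interp(grid, times, xs, left=np.nan, right=np.nan)
    yi = np.interp(grid, times, ys, left=np.nan, right=np.nan)
    return xi, yi

def to_global(A, xi, yi):
    """Project interpolated frame coordinates to the map with the
    calibration matrix A (homogeneous coordinates)."""
    X, Y, w = A @ np.stack([xi, yi, np.ones_like(xi)])
    return X / w, Y / w
```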
Step 2: Pre-selecting objects in the camera overlap area
The second step is a rough matching of the global coordinates of observed objects that can potentially be seen by several cameras but correspond to a single physical object. To do this, the distance between the observed objects on the map at the current time is calculated. If the distance is less than a chosen threshold, for example 1 meter, the objects are marked for the next processing step.
If no data is available from a camera at the considered time (the object is outside that camera's coverage area), the object's location is predicted. It is assumed that the object's velocity does not change while it is outside the camera's visibility.
As a result of step 2, a list is formed of observed objects and the corresponding "single-camera" trajectories that may correspond to one physical object.
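A minimal sketch of this gating step, with our own function names: constant-velocity extrapolation for objects outside coverage, and pairwise distance thresholding on the map plane (the 1-meter threshold is the example value from the text).

```python
import numpy as np

def predict_position(last_pos, last_vel, dt):
    """Constant-velocity extrapolation for an object that has left a
    camera's coverage area (the blind-zone assumption from the text)."""
    return np.asarray(last_pos) + np.asarray(last_vel) * dt

def candidate_pairs(positions, threshold_m=1.0):
    """Mark pairs of map-plane observations closer than `threshold_m`
    metres at the current instant as potentially one physical object."""
    pairs = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if np.linalg.norm(np.subtract(positions[i], positions[j])) < threshold_m:
                pairs.append((i, j))
    return pairs
```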
Step 3: Matching the "single-camera" trajectories
In the third step, we calculate the Pearson correlation coefficient between the coordinate pairs of two "single-camera" trajectories. If the correlation coefficient lies within the chosen significance interval, we assume that the two trajectories belong to the same object.
Step 4: Merging the trajectories
In the fourth step, we combine the "single-camera" trajectories into a "multi-camera" one. In the overlap region found in step 3, we compute the average trajectory of the object. If the camera coverage areas do not overlap, the two trajectories are "stitched": the coordinates in the blind zone are interpolated from the boundary coordinates observed by each camera.
Below is the generalized trajectory of the object on the map, produced by the multi-camera video analytics.

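The merge-and-stitch step can be sketched for one coordinate axis (apply it to x and y separately). This is a simplified illustration with our own naming: NaN marks instants when a camera did not see the object, overlapping observations are averaged, and the blind zone is linearly interpolated between the boundary samples, as described above.

```python
import numpy as np

def merge_tracks(xa, xb):
    """Merge two time-aligned coordinate series (NaN = not observed):
    average in the overlap, copy the single observation where only one
    camera sees the object, and linearly interpolate ("stitch") across
    the blind zone where neither does."""
    xa = np.asarray(xa, dtype=float)
    xb = np.asarray(xb, dtype=float)
    merged = np.where(np.isnan(xa), xb,
                      np.where(np.isnan(xb), xa, (xa + xb) / 2.0))
    idx = np.arange(len(merged))
    seen = ~np.isnan(merged)
    # Fill the blind zone from the boundary coordinates on either side.
    merged[~seen] = np.interp(idx[~seen], idx[seen], merged[seen])
    return merged
```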
Conclusion
From a practical point of view, the developed multi-camera tracking algorithm makes it possible to:
- increase target detection accuracy and reduce false positives by correlating video analytics metadata from adjacent television and thermal imaging cameras;
- compare images of a tracked target observed simultaneously by television and thermal imaging cameras;
- exclude repeated triggering of video analytics when a target moves from the surveillance zone of one camera into that of another;
- display a person's entire trajectory on the map of the protected site based on the video analysis results from all cameras at once;
- apply rules to the multi-camera trajectory on the map for more accurate recognition of human behavior and events;
- automatically select the optimal viewing angle as a person moves from camera to camera.
In the course of the research, an experimental multi-camera tracking zone of 9 cameras was created, and a generalized trajectory of a target's movement across several cameras was obtained. The task of future research is to evaluate the effectiveness and accuracy of the developed algorithm.
Additional Information
See also publications on the Synesis website:
- Algorithms for target tracking in extended-object security systems
- Algorithm for multi-camera tracking of a person using data from a video camera and a thermal imager
- Tracking objects under occlusion by moving and stationary obstacles
- The future of video surveillance systems: multi-camera tracking