Multi-camera video analytics
In this first publication for the Habr community, we want to talk about an interesting area of work at the Synesis company: multi-camera video analytics, and more specifically, a multi-camera algorithm for tracking objects.

Our team conducts applied research in video analytics and develops high-speed machine vision algorithms for the automatic classification of situations in streaming video. We plan to cover the most interesting results in this corporate blog, and we will be grateful for ideas and criticism.
Tracking in the field of view of a single camera
An integral component of almost any video analytics system is the tracking algorithm. Why does a smart video surveillance system need one? In general, object tracking is required for the automatic recognition of situations, whether by explicit rules (for example, a person entered a control zone, stopped, or left an object behind) or without rules, in self-learning systems. A tracking failure almost always leads either to a missed alarm situation or to repeated triggering of the video analytics.
Habr has already covered object tracking in articles on the work of Zdenek Kalal and Microsoft Research. "Single-camera" tracking, for example in the MagicBox device, works like this:
The result of the "single-camera" tracking algorithm is a sequence of spatio-temporal coordinates for each object. Trajectory breaks are possible when an object leaves the camera's field of view or passes behind an obstacle.
Tracking in the field of view of several cameras
The multi-camera tracking algorithm, the subject of this publication, continuously matches object position data from different cameras, taking into account their relative positions and their registration to a map of the area. The algorithm constructs a generalized trajectory of an object as it moves from camera to camera and projects this trajectory onto the map. The object may be observed by several cameras at once or be located in a blind zone. A multi-camera trajectory makes it possible to implement geo-visual search tools, automatic viewing-angle selection, and other security features often shown in science fiction films.
Spatial Camera Calibration
Before the system is put into operation, the surveillance zone of each camera is registered to a map of the controlled area. Our calibration algorithm uses four points whose coordinates the user must specify both on the camera frame and on the map:

It is recommended to use landmark points on the ground surface that are easy to identify visually from different angles, for example trees, building corners, and fences. The algorithm calculates the coordinate transformation matrix by the least squares method:

where r is the coordinate vector on the camera frame, R is the corresponding global coordinate vector on the map, and A is the desired transformation matrix.
Thus, the transformation matrix A maps the two-dimensional coordinates of an object in the camera frame to its global coordinates on the map.
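The calibration step above can be sketched in code. This is a minimal illustration, not the production implementation: it assumes the frame-to-map transform is a projective homography (a natural choice for a ground plane seen by a camera, for which four point pairs are exactly enough) and estimates it by least squares via the standard direct linear transform; all function names are ours.

```python
import numpy as np

def fit_projective_map(frame_pts, map_pts):
    """Estimate a 3x3 projective matrix A mapping frame coordinates to map
    coordinates from >= 4 point pairs, as a least-squares solution of the
    direct linear transform (DLT) system."""
    rows = []
    for (x, y), (X, Y) in zip(frame_pts, map_pts):
        rows.append([x, y, 1, 0, 0, 0, -X * x, -X * y, -X])
        rows.append([0, 0, 0, x, y, 1, -Y * x, -Y * y, -Y])
    M = np.asarray(rows, dtype=float)
    # Least-squares solution: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(M)
    A = Vt[-1].reshape(3, 3)
    return A / A[2, 2]          # normalize so that A[2, 2] == 1

def frame_to_map(A, x, y):
    """Apply A to a frame point (x, y) using homogeneous coordinates."""
    X, Y, w = A @ np.array([x, y, 1.0])
    return X / w, Y / w
```

With four exactly corresponding pairs the system is solved exactly; with more (or noisy) pairs the SVD gives the least-squares fit the article refers to.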
Step 1: Data preprocessing
The input of the multi-camera tracking algorithm is the stream of spatio-temporal coordinates of moving objects recorded by the various cameras. Since the "single-camera" video analytics are not time-synchronized, the raw coordinates are first brought to a common time scale by linear interpolation. The coordinates are then converted into the global coordinate system using the matrix A.
This is how the object trajectories look after projection onto the map. The illustration shows the coverage areas of nine cameras, five of which recorded the movement of an object. The "single-camera" trajectories are highlighted in the same color as the corresponding cameras and their coverage areas.

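The preprocessing step described above can be sketched as follows. This is a minimal, illustrative version under our own naming: each track is resampled onto a shared time grid by linear interpolation (instants outside a track's time span become NaN), then projected to the map with the calibration matrix A in homogeneous coordinates.

```python
import numpy as np

def resample_track(times, xs, ys, grid):
    """Bring one single-camera track to a common time grid by linear
    interpolation; samples outside the track's time span become NaN."""
    xi = np.interp(grid, times, xs, left=np.nan, right=np.nan)
    yi = np.interp(grid, times, ys, left=np.nan, right=np.nan)
    return xi, yi

def to_global(A, xi, yi):
    """Project interpolated frame coordinates to the map with the
    calibration matrix A (homogeneous coordinates)."""
    X, Y, w = A @ np.stack([xi, yi, np.ones_like(xi)])
    return X / w, Y / w
```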
Step 2: Pre-selecting objects in the camera overlap area
The second step is a rough matching of the global coordinates of observed objects that can potentially be seen by several cameras but correspond to a single physical object. To do this, the distance between the observed objects on the map at the current time is calculated. If the distance is less than a chosen threshold, for example 1 meter, the objects are marked for the next processing step.
If no data is available from a camera at the considered time (the object is outside that camera's coverage area), the object's location is predicted. It is assumed that the object's velocity does not change while it is outside the camera's visibility.
As a result of step 2, a list is formed of observed objects and the corresponding "single-camera" trajectories that may correspond to one physical object.
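A minimal sketch of this gating step, with our own function names: constant-velocity extrapolation for objects outside coverage, and pairwise distance thresholding on the map plane (the 1-meter threshold is the example value from the text).

```python
import numpy as np

def predict_position(last_pos, last_vel, dt):
    """Constant-velocity extrapolation for an object that has left a
    camera's coverage area (the blind-zone assumption from the text)."""
    return np.asarray(last_pos) + np.asarray(last_vel) * dt

def candidate_pairs(positions, threshold_m=1.0):
    """Mark pairs of map-plane observations closer than `threshold_m`
    metres at the current instant as potentially one physical object."""
    pairs = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if np.linalg.norm(np.subtract(positions[i], positions[j])) < threshold_m:
                pairs.append((i, j))
    return pairs
```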
Step 3: Matching the "single-camera" trajectories
In the third step, we calculate the Pearson correlation coefficient between the coordinate pairs of two "single-camera" trajectories. If the correlation coefficient lies within the chosen significance interval, we assume that the two trajectories belong to the same object.
Step 4: Merging the trajectories
In the fourth step, we combine the "single-camera" trajectories into a "multi-camera" one. In the overlap region found in step 3, we compute the average trajectory of the object. If the camera coverage areas do not overlap, the two trajectories are "stitched": the coordinates in the blind zone are interpolated from the boundary coordinates observed by each camera.
Below is the generalized trajectory of the object on the map, produced by the multi-camera video analytics.

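The merge-and-stitch step can be sketched for one coordinate axis (apply it to x and y separately). This is a simplified illustration with our own naming: NaN marks instants when a camera did not see the object, overlapping observations are averaged, and the blind zone is linearly interpolated between the boundary samples, as described above.

```python
import numpy as np

def merge_tracks(xa, xb):
    """Merge two time-aligned coordinate series (NaN = not observed):
    average in the overlap, copy the single observation where only one
    camera sees the object, and linearly interpolate ("stitch") across
    the blind zone where neither does."""
    xa = np.asarray(xa, dtype=float)
    xb = np.asarray(xb, dtype=float)
    merged = np.where(np.isnan(xa), xb,
                      np.where(np.isnan(xb), xa, (xa + xb) / 2.0))
    idx = np.arange(len(merged))
    seen = ~np.isnan(merged)
    # Fill the blind zone from the boundary coordinates on either side.
    merged[~seen] = np.interp(idx[~seen], idx[seen], merged[seen])
    return merged
```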
Conclusion
From a practical point of view, the developed multi-camera tracking algorithm makes it possible to:
- increase target detection accuracy and reduce false positives by correlating video analytics metadata from adjacent television and thermal imaging cameras;
- compare images of a tracked target observed simultaneously by television and thermal imaging cameras;
- exclude repeated triggering of video analytics when a target moves from the surveillance zone of one camera into that of another;
- display a person's entire trajectory on the map of the protected site based on the video analysis results from all cameras at once;
- apply rules to the multi-camera trajectory on the map for more accurate recognition of human behavior and events;
- automatically select the optimal viewing angle as a person moves from camera to camera.
In the course of the research, an experimental multi-camera tracking zone of 9 cameras was created, and a generalized trajectory of a target's movement across several cameras was obtained. The task of future research is to evaluate the effectiveness and accuracy of the developed algorithm.
Additional Information
See also publications on the Synesis website:
- Algorithms for target tracking in extended-object security systems
- Algorithm for multi-camera tracking of a person using data from a video camera and a thermal imager
- Tracking objects under occlusion by moving and stationary obstacles
- The future of video surveillance systems: multi-camera tracking