130 thousand surveillance cameras - how to make them work?

    Hello, Habr! We want to thank you again for the excellent feedback. In this post we give detailed answers to a number of your questions, including a detailed look at the infrastructure of our system. We had been planning to write such a post in the near future anyway, but since you expressed interest in the topic, we have sped the process up a bit.


    Under the cut we get straight to the point.

    The system architecture is built on a modular basis, which is standard for systems of this scale. There is one notable difference, though: the system does not just have many modules, it has a great many of them. Each module performs a separate, highly specialized task related to receiving video and information from city and departmental information systems (IS), controlling access to the processed data, and providing photo and video footage to users of the ECSD and various external consumers, including city residents.

    The modules interact closely with each other over various technologies (mainly JSON, REST APIs and SOAP). The modules themselves are implemented in different languages (Java, C#, JavaScript and others) and use a range of frameworks and libraries (ASP.NET Web API, WCF, NLog, Entity Framework, MySQL ADO.NET managed drivers, Unity 3, EPPlus, Json.NET, EmitMapper, DotNetZip, jQuery, knockoutjs, Moment.js, underscore.js and so on).

    All this software diversity runs on various operating systems (Windows Server, SUSE Linux Enterprise Server, CentOS and others) in a VMware vSphere 5 virtualization environment.

    The system architecture is shown in the diagram:



    All modules have their own fault-tolerance and load-balancing mechanisms. These rely on cluster technologies, various options for balancing HTTP requests with nginx, and queue and priority management at the application level. In addition, all modules are loosely coupled, which significantly simplifies developing and modifying the system as a whole.
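
    The post does not go into how the application-level queues and priorities are implemented, so here is a minimal sketch of the general idea only (our illustration, not the actual module code), assuming requests from different user groups are tagged with a numeric priority:

```java
import java.util.concurrent.PriorityBlockingQueue;

// Illustrative only: requests from different user groups carry a priority,
// and a worker always takes the most urgent request first.
public class PriorityQueueSketch {

    record Request(String user, int priority, String action) { }

    public static void main(String[] args) throws InterruptedException {
        // Lower number = higher priority in this sketch.
        PriorityBlockingQueue<Request> queue = new PriorityBlockingQueue<>(
                16, (a, b) -> Integer.compare(a.priority(), b.priority()));

        queue.put(new Request("resident-portal", 5, "view live stream"));
        queue.put(new Request("operator",        2, "export archive"));
        queue.put(new Request("emergency-svc",   1, "take PTZ control"));

        // A worker would normally loop forever; here we just drain the queue.
        while (!queue.isEmpty()) {
            System.out.println("processing: " + queue.take());
        }
    }
}
```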

    This approach makes the system not only scalable but also very flexible when it comes to implementing new technological processes or changing existing ones, since each module can be developed independently, and backward compatibility of each module is a mandatory requirement.

    Description of the main components of the system


    The video core, which receives and processes the video streams, consists of several hundred video servers running on dedicated Linux virtual machines. They receive video traffic from more than 145 thousand cameras, encoders and other sources of video streams. Traffic is received over RTSP (video in H.264 format), and the reception scheme can differ: either on an ongoing basis with an adjustable recording depth (storage period), or on request when the video only needs to be watched. This flexible approach saves server resources and reduces the load on communication channels.
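
    The article does not name the specific software that pulls and records the RTSP streams, so purely for illustration here is a sketch of what recording a single camera into fixed-length segments could look like, assuming an ffmpeg binary is available on the video server (the camera URL and paths are invented):

```java
import java.io.IOException;

// Illustration only: record an RTSP/H.264 stream into 10-minute MP4
// segments without re-encoding. Camera URL and output path are made up.
public class RtspIngestSketch {
    public static void main(String[] args) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(
                "ffmpeg",
                "-rtsp_transport", "tcp",          // TCP transport, as in the article
                "-i", "rtsp://camera-0001.example/stream1",
                "-c", "copy",                      // no transcoding, just re-packaging
                "-f", "segment",
                "-segment_time", "600",            // 10-minute segments
                "-reset_timestamps", "1",
                "/var/video/camera-0001/seg-%05d.mp4");
        pb.inheritIO();                            // show ffmpeg output in our console
        Process ffmpeg = pb.start();
        System.exit(ffmpeg.waitFor());
    }
}
```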

    TCP is mainly used as a transport layer protocol, but UDP can also be used in some cases. Video servers perform several tasks:

    - Basic: receiving the video stream, recording it, and then serving live or archived video to the relay servers, or moving part of the archived video to long-term storage in a separate dedicated storage system;
    - Additional: monitoring the availability of the video source and checking that the incoming stream matches the specified parameters (fps, bitrate, packet loss); this information is passed from the video servers over specialized data exchange buses to the service delivery control system for subsequent accounting and handover to the operations teams. A sketch of such a check follows this list.
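
    How exactly the stream parameters are checked is not described above; the sketch below only illustrates the idea of comparing measured values against the thresholds configured for a camera (all names and numbers are invented):

```java
// Illustration of the idea only: compare measured stream metrics against
// the parameters configured for a camera and report any deviation.
public class StreamHealthSketch {

    record StreamMetrics(double fps, int bitrateKbps, double packetLossPct) { }
    record StreamProfile(double minFps, int minBitrateKbps, double maxPacketLossPct) { }

    static boolean healthy(StreamMetrics m, StreamProfile p) {
        return m.fps() >= p.minFps()
                && m.bitrateKbps() >= p.minBitrateKbps()
                && m.packetLossPct() <= p.maxPacketLossPct();
    }

    public static void main(String[] args) {
        StreamProfile profile = new StreamProfile(24.0, 1500, 1.0);
        StreamMetrics measured = new StreamMetrics(12.5, 900, 3.2);
        System.out.println(healthy(measured, profile)
                ? "stream OK"
                : "deviation detected, notify the service delivery control system");
    }
}
```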

    The video servers are managed through a specialized HTTP API that allows their operation to be fully automated via the control systems. The video servers are monitored through Zabbix, supplemented by the video servers' own internal mechanisms.
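
    The video servers' HTTP API is internal and not published, so the endpoint and request body below are hypothetical; the sketch merely shows what an automated control call from a control system might look like:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical example: ask a video server to start continuous recording
// for one camera with a 5-day archive depth. Endpoint and JSON body are
// invented for illustration.
public class VideoServerApiSketch {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://videoserver-17.internal/api/cameras/0001/recording"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "{\"mode\": \"continuous\", \"archiveDepthDays\": 5}"))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```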

    Restreaming. The relay server is the link between the video core (which handles the primary reception and recording of video) and the users, the main consumers of video content. Built-in mechanisms for re-encapsulating and distributing streaming video allow live and archived content to be relayed to PCs and portable devices (tablets, phones). Restreaming can also serve streaming video over RTSP to additional systems, for example video analytics or the transcoding control system: users sometimes need a “compressed” video stream when their channel suddenly degrades but they still really want to watch the video :) Users can also take advantage of a special slideshow viewing mode. The relay servers work in a cluster and support a dynamic backup mode.

    User interface. The web interface (front end) provides authorized access to the video surveillance system for several tens of thousands of registered users and offers a wide range of functionality, available from anywhere on the intracity network and, in some cases, through closed dedicated Internet resources.

    It works on various operating systems (Windows, macOS, Linux) and in all major browsers, which greatly simplifies setting up workplaces. A number of mobile devices are also supported, for example those based on iOS.

    The functionality is quite diverse: from the usual viewing of live and archived video (via a Flash player on PCs and native tools on iOS), PTZ control of cameras, search tools linked to various map layers, and integration with data from city systems, to a flexible role model, a personalized presentation of the user interface, and built-in user training mechanisms.

    The front end uses technologies such as RequireJS, knockout, lodash, leaflet and others. The back end is built on a modular principle, which provides the required level of redundancy and scaling of components; a configuration management system is used. Software and technologies: Java / Tomcat, MariaDB, RabbitMQ and several others.
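
    Since RabbitMQ is mentioned as part of the back end, here is a minimal sketch of how one module could notify others of an event through a queue; the host, queue name and payload are invented, and we are not claiming this is how the real modules exchange messages:

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

// Minimal illustration: publish an event from one back-end module to a queue
// that other modules consume. Host, queue name and payload are made up.
public class EventPublishSketch {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("mq.internal");
        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {
            channel.queueDeclare("camera.status", true, false, false, null);
            String event = "{\"cameraId\": \"0001\", \"status\": \"offline\"}";
            channel.basicPublish("", "camera.status", null, event.getBytes("UTF-8"));
        }
    }
}
```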

    Integration gateway. A set of subsystem modules that isolates external information consumers from the core of the system. After authorization, it provides external systems with various information in accordance with the permissions they have been granted (i.e. the available data volume and functionality). There are three main logical components:

    • The service interaction component: an HTTP API through which a specified set of information is provided (camera names, addresses and location coordinates, screenshots and other accompanying information); a sketch of a client call follows this list.

    • The relay component: designed to minimize the load on the video core and provide streaming video to external systems (broadcasting to Flash is supported, as well as HLS for iOS devices). The video core and the restreaming server have CDN functionality, and this component acts as an additional CDN link for distributing video content.

    • The interface rendering component: designed to be embedded into external systems as a ready-made interface for playing live and archived video, for example via an iframe, with simple customization (visual settings such as colors and the required function buttons) through simple parameterized calls.
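
    The gateway API itself is not public, so the URL, parameters and token in the sketch below are hypothetical; it only illustrates what a metadata request from an external system might look like:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical example of an external system requesting camera metadata
// (names, addresses, coordinates) from the integration gateway.
// URL, query parameters and token are invented.
public class GatewayClientSketch {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://gateway.example/api/v1/cameras?district=CAO&limit=10"))
                .header("Authorization", "Bearer <token issued to the external system>")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        // The JSON body would contain camera names, addresses and coordinates.
        System.out.println(response.body());
    }
}
```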

    These components allow external systems not only to access the data and build their own interface for visualizing it, but also to embed the ready-made components into their own interfaces with minimal effort.



    To give users access to the cameras of the video surveillance system, there is a whole set of interconnected components that handle user authentication and authorization, prioritization of access to PTZ control functions, logging of user actions, and the interaction of the individual modules.

    The system supports two types of authentication: with ECSD user accounts and through the unified city system for managing access to resources (essentially SSO for citywide information systems). A single physical user may combine both types of authentication. The key authentication information of ECSD users is stored in Active Directory, and all additional information is stored in a Microsoft SQL Server database.
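
    We obviously cannot show the real authentication code, but since the key credentials live in Active Directory, the general idea can be illustrated with a simple LDAP bind; the domain controller address and account below are invented:

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.InitialDirContext;

// Illustration only: verify a user's credentials by attempting an LDAP bind
// against Active Directory. The domain controller and account are made up.
public class AdAuthSketch {
    static boolean credentialsValid(String userPrincipal, String password) {
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://dc01.city.local:389");
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, userPrincipal);   // e.g. "operator42@city.local"
        env.put(Context.SECURITY_CREDENTIALS, password);
        try {
            new InitialDirContext(env).close();               // bind succeeded
            return true;
        } catch (NamingException e) {                         // bind rejected or server unreachable
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(credentialsValid("operator42@city.local", "secret"));
    }
}
```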

    All passwords are, of course, transmitted between the modules only in encrypted form, or not transmitted at all where we can avoid it :)

    Access privileges and priorities are granted on the basis of a single role model shared by all modules, which makes it possible to control access not only to surveillance cameras but also to individual system functions. The system currently has more than 100 distinct permissions: access to live broadcasts, viewing and exporting the archive of photo and video footage, PTZ control, making changes to individual system registries and directories, creating a schedule for capturing images, creating patrol routes, mobile access, and more. The system is constantly evolving, new user groups with their own tasks keep joining, and new permissions are added, in effect, on the fly.
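
    As a minimal sketch of how a check against such a role model might look (the permission names and roles here are invented, not the system's real registry):

```java
import java.util.EnumSet;
import java.util.Set;

// Invented permission names and roles, purely to illustrate a role model
// where a user's effective permissions are the union of their roles' sets.
public class RoleModelSketch {

    enum Permission { LIVE_VIEW, ARCHIVE_EXPORT, PTZ_CONTROL, EDIT_REGISTRY, MOBILE_ACCESS }

    record Role(String name, Set<Permission> permissions) { }

    static boolean allowed(Set<Role> userRoles, Permission required) {
        return userRoles.stream().anyMatch(r -> r.permissions().contains(required));
    }

    public static void main(String[] args) {
        Role operator = new Role("district-operator",
                EnumSet.of(Permission.LIVE_VIEW, Permission.PTZ_CONTROL));
        Role analyst = new Role("archive-analyst",
                EnumSet.of(Permission.LIVE_VIEW, Permission.ARCHIVE_EXPORT));

        Set<Role> user = Set.of(operator, analyst);
        System.out.println(allowed(user, Permission.PTZ_CONTROL));   // true
        System.out.println(allowed(user, Permission.EDIT_REGISTRY)); // false
    }
}
```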

    In addition, there are several priority levels for different user groups and system services, which helps avoid conflicts between user groups and allows PTZ camera control to be "transparently" preempted, providing control in patrol modes or moving the viewing area according to a schedule.
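
    The exact arbitration rules are not described here, so the sketch below shows only the basic idea: a new PTZ request takes over control only if its priority is strictly higher than that of the current holder (group names and numbers are invented):

```java
// Invented example of priority-based PTZ arbitration: a session may take
// over camera control only from a session with a lower priority.
public class PtzArbitrationSketch {

    record PtzSession(String owner, int priority) { }   // higher number = higher priority here

    static boolean mayPreempt(PtzSession current, PtzSession candidate) {
        return current == null || candidate.priority() > current.priority();
    }

    public static void main(String[] args) {
        PtzSession operator  = new PtzSession("district-operator", 10);
        PtzSession patrol    = new PtzSession("scheduled-patrol-route", 5);
        PtzSession emergency = new PtzSession("emergency-service", 100);

        System.out.println(mayPreempt(operator, patrol));     // false: the patrol waits
        System.out.println(mayPreempt(operator, emergency));  // true: control is taken over
    }
}
```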

    To provide and manage permissions for tens of thousands of users quickly and easily, the system uses several mechanisms: assigning permissions to user groups, templates with predefined standard permission sets, and smart groups (where new users are automatically distributed among groups based on predefined rules). Similar mechanisms are used to manage the camera registries, but there the grouping is mostly by geography, type of service, or membership in a departmental video surveillance system.
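
    A sketch of the "smart group" idea: a new user is matched against predefined rules and automatically lands in every matching group (the rules and user attributes are made up):

```java
import java.util.List;
import java.util.function.Predicate;

// Made-up example of "smart groups": each group has a membership rule,
// and a new user is automatically placed into every group whose rule matches.
public class SmartGroupsSketch {

    record User(String login, String department, String district) { }
    record SmartGroup(String name, Predicate<User> rule) { }

    public static void main(String[] args) {
        List<SmartGroup> groups = List.of(
                new SmartGroup("housing-services", u -> u.department().equals("housing")),
                new SmartGroup("central-district", u -> u.district().equals("CAO")));

        User newcomer = new User("ivanov", "housing", "CAO");
        groups.stream()
              .filter(g -> g.rule().test(newcomer))
              .forEach(g -> System.out.println(newcomer.login() + " -> " + g.name()));
    }
}
```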

    Access control happens almost in real time, and any change in a user's permissions takes effect immediately. In addition, token-based session validation mechanisms control the lifetime of links to live video streams. Incidentally, it was the absence of this mechanism in the test service for city residents that made it possible to access "expired" links to video streams.
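
    The token format is not disclosed, so the sketch below shows one common way to limit the lifetime of a stream link: an HMAC-signed token with an embedded expiry time (the key and fields are invented):

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Generic illustration (not the system's actual token format): a stream URL
// carries "cameraId|expiresAt|signature", and the relay rejects the link
// once expiresAt has passed or the HMAC does not match.
public class StreamTokenSketch {
    private static final byte[] KEY = "invented-demo-key".getBytes(StandardCharsets.UTF_8);

    static String sign(String payload) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(KEY, "HmacSHA256"));
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(mac.doFinal(payload.getBytes(StandardCharsets.UTF_8)));
    }

    static String issueToken(String cameraId, long ttlSeconds) throws Exception {
        long expiresAt = Instant.now().getEpochSecond() + ttlSeconds;
        String payload = cameraId + "|" + expiresAt;
        return payload + "|" + sign(payload);
    }

    static boolean valid(String token) throws Exception {
        String[] parts = token.split("\\|");
        if (parts.length != 3) return false;
        String payload = parts[0] + "|" + parts[1];
        boolean fresh = Long.parseLong(parts[1]) >= Instant.now().getEpochSecond();
        return fresh && sign(payload).equals(parts[2]);
    }

    public static void main(String[] args) throws Exception {
        String token = issueToken("0001", 300);   // link valid for 5 minutes
        System.out.println(token + " -> " + valid(token));
    }
}
```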

    The system records all user actions and all interactions between modules at every level, which makes it possible to analyze any situation. The logging system records the date and time of an event, what actions the user performed (down to individual interface button presses), what tasks the scheduler executed, how an information exchange ended, and much more. To date, the system has accumulated more than 10 billion records of "technological" processes, each consisting of several separately logged actions. DIT employees receive daily statistics that show user activity in various dimensions: which portals were used, the number of live-broadcast views, which groups of cameras received the most requests, and so on.

    As a "cherry on the cake", the video surveillance system in numbers:

    Number of cameras: more than 145 thousand;
    Number of users: more than 10 thousand;
    Network video traffic: about 120 Gb/s;
    Storage system capacity: 20 PB.
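
    A quick back-of-the-envelope calculation on our part (bearing in mind that far from every stream is recorded continuously): 120 Gb/s spread over more than 145 thousand cameras averages out to roughly 0.8 Mb/s per camera, and if all of that traffic were written to disk at once, 20 PB would hold roughly two weeks of continuously recorded video.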

    The official page of the system is here. There you can see the technical characteristics of the cameras, learn how to act if an incident is caught on camera, and watch several cameras in public places through the "Window to the City" service.

    Read also in our blog on Habré:
    " Habraeffect for 130 000 cameras of Moscow
    " Information technologies feed more than 750 thousand people in Moscow
    » Blog of the Department of Information Technology of the city of Moscow at Habrahabr

    Thank you for your attention!
