Why better quality breeds worse quality, or should the main function actually work?

    Analog video

    There is no point arguing that analog video surveillance is a thing of the past: cheap IP cameras deliver a picture comparable to expensive analog ones. Moreover, IP cameras are limited from above by nothing except recorder performance, while analog cameras require careful matching of the capture card, matched signal levels across transmitters / amplifiers / receivers, and other black magic.
    When building a system on IP cameras, you can pull out a camera at any time and replace it with a better one; if you keep the IP address, username and password, you most likely will not even have to touch the recorder settings - a better picture will simply start flowing into the archive.
    On the other hand, this imposes requirements on the recorder: it must be ready to work with any resolution, any bitrate, any codec and any protocol... or at the very least, work correctly with the ones it claims to support.

    Shiva

    In the software world there are two ways. There is the linux-way: a set of small programs, each of which does one thing, but does it very well. And there is the windows-way: huge food processors that can do everything, and then some. The main problem of the linux-way is the lack of an interface: to reap all the benefits you will have to read the man pages (or at least --help) and experiment, and also figure out what can be combined with what, and how. The main problem of the windows-way is the loss of the main function: as extra features accrete, testing of the key function falls by the wayside, and over time even it starts having problems. And inertia of thinking kicks in: "this is the main function, it is the most tested, there can't be a bug here, the user must be doing something wrong."

    Now, back to the matter at hand: the quality of IP cameras keeps growing. Anyone who has seen the difference a FullHD camera makes when installed in the very spot where even an ultra-cool 700TVL analog camera used to stand will never want to go back (especially since the prices are now about the same). Development continues, and 3MP (2048x1536) and 5MP (2592x1944) cameras are no longer a rarity. The only price for the higher quality is the growing cost of storage and transfer. But the price of a gigabyte of hard drive space has been falling for a long time (and has fully recovered from the flood at the factories), so that is not a problem.

    Just today I had a small argument with maxlapshin on whether a software vendor owes the user anything after the sale. Yes, any software is sold "as is", without any promises, so even having paid whatever the price, you are not guaranteed to get working software at all. The only check on this is that if the software does not work, and that becomes known, the flow of customers will eventually dry up. But as long as people keep buying, whether bugs will actually get fixed (let alone features implemented) is a big question.

    That wraps up the introduction; now let's watch a short but very telling video (you don't even have to watch it - everything is visible in the freeze frame):

    This is a textbook glitch that I observe in almost every piece of video recording software. I saw it in VideoNet, I see it regularly in Axxon Next, and, as you can see in the video, Trassir greeted me with it too. The same story even in the camera's own native viewer. You could blame it all on the camera. Or on the network. Or on high CPU load. Or on electromagnetic interference. You might be advised to test the RAM. Or to reinstall the OS. In short, instead of fixing anything, there is no end of ways to send the user on a wild goose chase...
    But connect to the very same RTSP stream with VLC and there are no artifacts or dropouts. On the same computer I run a quickly written test script that reads the stream from the camera and writes it to disk - no losses, no problems. So only one "fix" remains: reduce the camera resolution and lower the bitrate.

    That is, despite all the flexibility, the declared support for a pile of cameras, for ONVIF and RTSP... you still cannot reap any of the advantages of IP surveillance, because the receiving software was not ready for it.

    The main reason for this behavior, oddly enough, is the IP network, the codecs, and... plumbing.

    So, to start, a bit of basic theory on IP networks. Everything on the network travels in packets. Each packet has a maximum size (the MTU, 1500 bytes on ordinary links), a sender and a receiver. A packet is somehow routed along its way and should eventually reach the recipient. It may not get there. It may arrive corrupted. Only a fragment of it may arrive... In short, anything can happen. On top of these packets the transport protocols are layered: UDP and TCP (the ones that interest us here). UDP changes almost nothing: it merely adds a sender port and a receiver port so that streams of packets can be told apart. TCP, on the other hand, wraps up a heap of logic that "guarantees delivery". More precisely, it guarantees either delivery or an error if something cannot be delivered. Well, as far as guarantees go... it promises (and a promise is not a wedding ;). Any admin has seen "hung" connections more than once.

    How does TCP guarantee delivery? Simple: every packet must be acknowledged. No acknowledgment within some time means the packet was lost, so we resend it. But if we waited for an acknowledgment after every single packet, throughput would drop monstrously, and the higher the delay between the communicating endpoints, the worse. So the concept of a "window" is introduced: we may send at most N packets without acknowledgment, and only then stop and wait. Waiting for N separate acknowledgments is also too much, so the receiver does not acknowledge every packet either, but simply reports the "maximum seen", which cuts the acknowledgment traffic down. An acknowledgment can also ride along with a data packet going the other way, to save a round trip. In short, there is a whole sea of logic, all aimed at keeping the delivery promise while utilizing the channel as fully as possible. The window size is variable, and is chosen by the system based on voodoo magic, settings, and the weather on Mars. It also changes over the lifetime of the connection. We will touch on this a bit later.
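    To get a feel for why the window size matters, here is a quick back-of-the-envelope sketch in Python. The link speed and round-trip time are made-up illustrative numbers; the point is that the bandwidth-delay product is the amount of unacknowledged data that must be "in flight" to keep a link busy, so a window smaller than that caps throughput:

```python
# Bandwidth-delay product: how much data must be "in flight"
# (sent but not yet acknowledged) to keep a link fully utilized.
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    # bits per second -> bytes per second, times the round-trip time
    return bandwidth_bps / 8 * rtt_s

# Illustrative numbers: a 100 Mbit/s link with a 10 ms round trip.
window = bdp_bytes(100_000_000, 0.010)
print(f"minimum useful window: {window:.0f} bytes")  # 125000 bytes
```

    If the advertised window ever drops below this value (for example because the receiver's buffer is nearly full), the sender has to sit idle waiting for acknowledgments, and throughput falls accordingly.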

    So, now back to our subject: H.264 over RTSP. (Actually, the specific codec and transport protocol hardly matter. Do not think that if you use some ingenious protocol of your own, many times simpler than RTSP, none of this concerns you.) The stream consists of a periodically repeated key frame plus a stream of changes relative to the current state. When connecting to the video we must wait for a key frame, take it as the current state, and then accept the diffs, which we apply and display. What does that mean? It means that once every X seconds a LOT of data arrives at once: a full frame. And the higher the camera resolution, the more of it (the bitrate, to be honest, has little effect on the key frame size). So, at time 0 - the start of a key frame - the full frame arrives in one go (say we have a camera with all of 3 megapixels: 2048x1536 = 3,145,728 pixels, a measly ~360 kilobytes after compression). Suppose we have a stream of 8 megabits per second = 1 megabyte per second, a key frame every 5 seconds, and 18 FPS. Then we get something like 360 KB, then 52 KB every 1/18 of a second, after 5 seconds another 360 KB, then 52 KB again, and so on.
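    The burstiness of such a stream is easy to check with a few lines of arithmetic (Python here; all input figures are the illustrative numbers from the paragraph above):

```python
# Average per-frame sizes for an H.264 stream with the parameters above.
BITRATE_BPS = 8_000_000      # 8 Mbit/s = 1 MB/s average
FPS = 18
KEYFRAME_PERIOD_S = 5        # one key frame every 5 seconds
KEYFRAME_BYTES = 360_000     # ~360 KB after compression

frames_per_period = FPS * KEYFRAME_PERIOD_S                # 90 frames
bytes_per_period = BITRATE_BPS // 8 * KEYFRAME_PERIOD_S    # 5,000,000 bytes
delta_frames = frames_per_period - 1                       # 89 delta frames
delta_bytes = (bytes_per_period - KEYFRAME_BYTES) // delta_frames

print(f"key frame: {KEYFRAME_BYTES} bytes, delta frame: ~{delta_bytes} bytes")
```

    So although the average rate is a calm 1 MB/s, once every five seconds the receiver is hit with a burst roughly seven times the size of an ordinary frame, and that burst is exactly what overflows small buffers.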

    Now, back to UDP and TCP. A packet arriving at the network card is placed into the card's buffer, and a flag is set (or an interrupt raised) saying there is data. The CPU suspends whatever useful work it was doing, takes the packet from the card, and starts carrying it up the TCP/IP stack, unwrapping it layer by layer. This runs at the highest priority (it is hardware work, after all). But we are still on Windows or Linux, neither of which is an RTOS, so there is no guarantee when exactly the application will get around to this packet. Therefore, as soon as the system has figured out what kind of packet it is and which connection it belongs to, the packet is placed into that socket's buffer.
    With UDP: if there is no room in the buffer, the packet is thrown away.
    With TCP: if there is no room in the buffer, flow control kicks in - an overflow signal goes to the sender, saying "shut up until I free some space". As soon as the application takes some data out of the buffer, the system sends the sender an "OK, carry on", and communication resumes.

    Now let's put it all together and trace how data is received from the camera. First the simple case: UDP. The camera reads the next frame, feeds it to the encoder, pulls out the compressed data, cuts it into packets, adds headers and sends it off to the recipient. The receiver first gets 260 UDP packets, then a pause, then another 40 packets, a pause, another 40, and so on. The first 260 UDP packets arrive almost instantly, in about 30 milliseconds; around the 55th millisecond the next 40 arrive (over another 4 milliseconds). Say our receive buffer is 128 kilobytes. Then it fills up within 10 milliseconds. And if during that time the application does not drain the buffer in one sweep that reads everything (in reality they read one packet at a time...), we lose the rest of the key frame. Given that we have no RTOS, and the application can be forcibly put to sleep for any reason (for example, while the OS flushes buffers to disk) for as long as a second, the only way to lose nothing is to have a network buffer bigger than the longest nap. That is, ideally the OS buffer should hold ~2 seconds of the stream, in this case 2 megabytes. Otherwise losses are guaranteed.
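    As a sketch of what "setting the OS buffer" means in practice, here is how a receiving application could request a 2 MB kernel buffer on its UDP socket (a minimal Python example; the kernel may clamp or, on Linux, double the requested value, limited by net.core.rmem_max, so it is worth reading the value back):

```python
import socket

TWO_SECONDS_OF_STREAM = 2 * 1024 * 1024  # ~2 s of an 8 Mbit/s stream

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Ask the kernel for a bigger receive buffer so a key-frame burst
# survives even if the application sleeps for a while.
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, TWO_SECONDS_OF_STREAM)
# The kernel may have adjusted the request; check what we actually got.
actual = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(f"effective receive buffer: {actual} bytes")
sock.close()
```

    If the read-back value is far below the request, the system-wide maximum has to be raised first; otherwise the application will keep losing key-frame tails no matter how fast its read loop is.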

    Fine, but we have TCP! Which guarantees delivery, and will ask the sender to wait if need be! Let's switch to TCP and look at the same picture. The extra overhead can be neglected; let's just see what happens. So, 360 kilobytes of data fly out. On a 100 Mbit channel they take about 30 milliseconds to send. Somewhere around the 10th millisecond the receiver's buffer is full, and the camera is asked to shut up. Suppose that after another 20 ms the application has read the whole available buffer (in reality it reads 4 bytes, then 1400, another 4, another 1400...), and the OS asks the camera to continue. The camera sends another third of the key frame and shuts up again. Another 20 ms and we move on - but meanwhile the camera has produced more data, which went into the camera's own buffer. And here we come to a slippery point: just how big is the TCP buffer inside the camera? Meanwhile, on the receiving side, in Windows Server the default scheduler quantum is a fixed 120 ms. If the reading application gets to run only once per quantum, then with a 120 ms quantum our maximum speed is about 8.5 Mbit/s. That is, on a server OS with a 128-kilobyte buffer, receiving an 8 Mbit stream is not just hard but nigh impossible. On desktop OSes it is easier, but any sneeze still causes problems. With a bigger buffer things improve, but still: any unevenness in reading leads, in the simplest case, to jerky playback, and in worse cases to a stream breakdown or a glitch like the one above, if the TCP buffer inside the camera overflows.
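    The 8.5 Mbit figure follows directly from buffer size divided by wake-up interval; a quick sketch (Python, using the decimal 128-kilobyte buffer from the paragraph above):

```python
# If the reading application drains the socket only once per scheduler
# quantum, throughput can never exceed buffer_size / quantum: between
# wake-ups the buffer fills, flow control stops the sender, and the
# link sits idle until the next wake-up.
def max_throughput_mbps(buffer_bytes: int, quantum_s: float) -> float:
    return buffer_bytes * 8 / quantum_s / 1_000_000

# 128 KB buffer, 120 ms quantum -> ~8.5 Mbit/s ceiling
print(f"{max_throughput_mbps(128_000, 0.120):.1f} Mbit/s")
# A 2 MB buffer lifts the ceiling far above the camera's 8 Mbit/s
print(f"{max_throughput_mbps(2_000_000, 0.120):.1f} Mbit/s")
```

    This is the same bottleneck as the UDP case, just surfacing differently: instead of dropped packets, the backpressure propagates all the way to the camera's own small buffer.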

    From which there is only one conclusion: here too the buffer should ideally hold a margin of about 2 seconds of the stream, in this case 2 megabytes. Otherwise problems are likely.

    Maybe I'm wrong, but if an application whose whole job is to receive and save streams from cameras cannot do it, that is a bug. And the bug should be fixed, not worked around by reducing the problem to one already solved - by degrading the quality back down to that of an analog camera. Dixi.
