
An algorithm has been developed that effectively removes the "boring" fragments from a video
Have you ever followed a link to an interesting video on YouTube only to discover that, for the sake of the few seconds where something genuinely interesting happens, you spent several minutes watching useless filler, simply because the author uploaded the entire file straight from a dashcam or smartphone? The number of video cameras is growing rapidly, while the number of people willing to trim even a couple of extra fragments seems to remain constant. And the problem is not just a few minutes of wasted time online: there are more serious cases, such as the tens or hundreds of hours of surveillance footage that sometimes must be reviewed to solve a crime.
Scientists from Carnegie Mellon University have developed an efficient machine-learning algorithm for extracting the most interesting fragments of a video. The new algorithm, which they call "LiveLight", significantly outperforms comparable approaches in both speed and quality. LiveLight extracts characteristic fragments of the video, compiles them into a "dictionary", and then tries to predict each subsequent frame from that dictionary. If the prediction succeeds with sufficient accuracy, the frame adds almost no new information and can be discarded. Unlike "mechanical" approaches that react to any motion in the frame or to abrupt changes in brightness, color, or contrast, LiveLight is quite versatile: it works well both on footage from a fixed camera and on amateur video shot with a shaky smartphone.
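The core idea described above, keep a dictionary of characteristic fragments and drop any frame the dictionary already predicts well, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the per-frame feature vectors are hypothetical inputs, and plain least-squares reconstruction stands in for the sparse coding used in the actual work.

```python
import numpy as np

def summarize(frames, err_threshold=0.1, seed_size=5):
    """Toy sketch of dictionary-based video summarization.

    frames: (n, d) array of per-frame feature vectors (hypothetical).
    A frame that the current dictionary reconstructs accurately is
    treated as "boring" and dropped; a poorly reconstructed frame is
    kept in the summary and added to the dictionary.
    Returns the indices of the kept frames.
    """
    dictionary = [f for f in frames[:seed_size]]  # seed with opening frames
    kept = list(range(seed_size))
    for i in range(seed_size, len(frames)):
        D = np.stack(dictionary, axis=1)          # d x k dictionary matrix
        x = frames[i]
        # Least-squares coefficients: how well the dictionary explains x.
        coeffs, *_ = np.linalg.lstsq(D, x, rcond=None)
        err = np.linalg.norm(D @ coeffs - x) / (np.linalg.norm(x) + 1e-12)
        if err > err_threshold:                   # poorly predicted => novel
            kept.append(i)
            dictionary.append(x)
    return kept
```

In this toy version, frames whose features lie in the span of what the dictionary has already seen are discarded, while frames introducing a new direction in feature space survive into the summary, mirroring the "predict the next frame, drop it if the prediction succeeds" loop described in the article.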
To test the algorithm, 20 videos from YouTube and surveillance cameras were selected, ranging from 12 minutes to an hour and a half in length. For each video, three people manually compiled a "summary" of the most interesting fragments. On average, the algorithm's choices matched the human selections in 72.3% of cases; for some videos the match exceeded 90%. LiveLight's results were 8% better than those of its closest competitor based on a similar principle, which was also roughly ten times slower.
LiveLight can process video in real time on ordinary hardware. The researchers tested a MATLAB 7.12 implementation of the algorithm on a computer with a 3.4 GHz Intel Core i7 processor and 16 GB of RAM; some videos were processed at twice real-time speed.
A PDF with a detailed description of the algorithm can be downloaded via the link on the project page.