How to read sound from a bag of chips, or what a "visual microphone" is

    A "visual microphone" is a technique for recovering audio from silent video. Today we will cover not only this technique but also other methods and technologies that make it possible to pick up and reconstruct music or speech remotely.


    Photo: m01229 / CC

    Technology predecessors


    One way to record sound at a distance is with lasers. So-called laser microphones sense the vibrations that sound waves cause in a surface. For example, you can pick up sound this way from a window pane if people are talking or music is playing in the room. An interferometer detects the movement of the surface through the change in the optical path length of the reflected beam. These deviations are then converted into an audio signal by dedicated algorithms.
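    The link between optical path length and surface movement can be illustrated with a little arithmetic. The sketch below assumes a beam reflected at normal incidence and a 633 nm helium-neon laser; both are assumptions made for illustration, not details from the source.

```python
import math

def displacement_from_phase(phase_rad, wavelength_m):
    """Surface displacement implied by a phase change of the reflected beam.

    The beam travels to the surface and back, so a displacement d changes
    the optical path length by 2*d and the phase by 4*pi*d / wavelength.
    """
    return phase_rad * wavelength_m / (4.0 * math.pi)

WAVELENGTH_M = 633e-9  # assumed: red helium-neon laser

# One full interference fringe (2*pi of phase) is half a wavelength of travel.
d = displacement_from_phase(2.0 * math.pi, WAVELENGTH_M)
print(f"{d * 1e9:.1f} nm")  # 316.5 nm
```

    Sound-induced vibrations of a window pane are typically far smaller than one fringe, which is why the phase has to be measured rather than fringes counted.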

    Audio recordings available online show that laser microphones can recover sound with fairly good quality. The approach has a drawback, however: the device is complicated to set up.

    Sound can also be recorded at a distance using the low-intensity microwave radiation employed in communications. Similar technologies were used at NASA to capture and recognize weak radio signals in space.

    A horn antenna sends microwaves with a frequency of 30-100 GHz into the room through the wall of the building. If people are talking or music is playing inside, the sound can be read from the microvibrations of light objects and materials: the microwaves captured after reflecting off them carry amplitude modulation. This information is then used to reconstruct the sound acting on the object. The object can even be a piece of clothing, so the method makes it possible to "intercept" even the sound of a heartbeat.
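    How sound can be pulled back out of an amplitude-modulated carrier can be shown with a simple envelope detector. This is a scaled-down sketch, not the processing used in the system described: the 5 kHz carrier stands in for the 30-100 GHz microwaves, and the sample rate, tone and modulation depth are all made-up demo values.

```python
import math

FS = 100_000        # samples per second (assumed demo value)
CARRIER_HZ = 5_000  # stand-in for the real 30-100 GHz carrier
TONE_HZ = 200       # the "sound" vibrating the object
DEPTH = 0.3         # modulation depth (assumed)
N = 5_000

tone = [math.sin(2 * math.pi * TONE_HZ * i / FS) for i in range(N)]
received = [(1 + DEPTH * tone[i]) * math.cos(2 * math.pi * CARRIER_HZ * i / FS)
            for i in range(N)]

def envelope(signal, window):
    """Rectify the signal, then smooth it with a centred moving average."""
    rectified = [abs(v) for v in signal]
    half = window // 2
    out = []
    for i in range(len(rectified)):
        lo, hi = max(0, i - half), min(len(rectified), i + half)
        out.append(sum(rectified[lo:hi]) / (hi - lo))
    return out

def correlation(a, b):
    """Pearson correlation of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Averaging over exactly one carrier period removes the carrier ripple.
env = envelope(received, window=FS // CARRIER_HZ)
mean_env = sum(env) / len(env)
recovered = [v - mean_env for v in env]

print(round(correlation(recovered, tone), 3))
```

    A correlation close to 1 means the recovered envelope follows the original tone.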

    The visual microphone: a solution from MIT scientists


    Scientists from MIT have proposed another way to read sound from a distance. They showed that sound can be recovered from video alone. To do this, you record the object with a high-speed camera and analyze the microscopic vibrations that sound waves cause in it.

    From the video, a steerable pyramid is built: a set of filters that break each video frame into complex-valued sub-bands corresponding to different points on the object under study.

    The scientists developed a special algorithm (and made it openly available) that calculates the intensity of the sound vibrations at each of the selected points. The local signals are averaged into a single global signal that describes how the sound waves act on the object. This signal is passed through a Butterworth high-pass filter with a cutoff frequency of 20–100 Hz. After that, the audio recording can be reconstructed.
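    The last two steps, averaging the local signals and high-pass filtering the result, can be sketched in a few lines. This is a minimal illustration rather than the MIT code: the filter order (second), the 2000 fps frame rate, the 50 Hz cutoff and the synthetic per-point signals are all assumptions, and the function names are hypothetical.

```python
import math

def average_signals(local_signals):
    """Average equally long per-point motion signals into one signal."""
    count = len(local_signals)
    return [sum(samples) / count for samples in zip(*local_signals)]

def butterworth_highpass(signal, cutoff_hz, fs_hz):
    """Second-order Butterworth high-pass (biquad via bilinear transform)."""
    w0 = 2.0 * math.pi * cutoff_hz / fs_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / math.sqrt(2.0)  # Q = 1/sqrt(2): Butterworth response
    a0 = 1.0 + alpha
    b0 = (1.0 + cos_w0) / 2.0 / a0
    b1 = -(1.0 + cos_w0) / a0
    b2 = b0
    a1 = -2.0 * cos_w0 / a0
    a2 = (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

FS = 2000.0  # assumed frame rate, frames per second
N = 2000     # one second of video

def make_local(offset):
    """Synthetic local signal: constant offset + slow 5 Hz drift + 440 Hz tone."""
    return [offset
            + 3.0 * math.sin(2 * math.pi * 5 * i / FS)
            + math.sin(2 * math.pi * 440 * i / FS)
            for i in range(N)]

local = [make_local(offset) for offset in (0.5, 1.0, 1.5)]
combined = average_signals(local)
audio = butterworth_highpass(combined, cutoff_hz=50.0, fs_hz=FS)

tail = audio[N // 2:]  # skip the filter's start-up transient
print(round(max(tail), 2))
```

    The filter removes the constant offset and most of the slow drift, while the 440 Hz tone passes through almost unchanged.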

    According to the head of the research, Abe Davis, the visual microphone yields lower-quality audio than active techniques (those using lasers, for example), but it has its advantages. The system needs no additional equipment or detectors, only a high-speed video camera. In addition, the surface from which the sound is read does not have to be reflective or smooth, as laser microphones often require.

    Davis's team tried reading sound from a paper bag, a bag of chips, and aluminum foil. These objects are light, so the sound vibrations on them were the most noticeable and the resulting signal was less noisy. The test objects also included a houseplant and a brick, which, according to the scientists, performed better than they had expected.

    The team made a video showing how various objects "sound".


    The scientists note that they plan to continue this work and to investigate whether audio can be recovered from any video recording, not only from footage specially shot with a high-speed camera.

    Technology development


    Other scientists are working to improve the technology proposed by the MIT team. For example, last year Iranian researchers presented an algorithm that speeds up sound extraction from high-speed video and improves its quality.

    Sound affects different areas of an object differently. The intensity of vibration depends on the material the object is made of, its shape, the frequency of the sound acting on it, and the distance to the source. For example, when video is shot at 20 kHz (20,000 frames per second), sound waves travel about 17 mm between two consecutive frames. Parts of the object that are farther from the sound source therefore react with a delay.
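    The 17 mm figure follows directly from the speed of sound. The check below assumes 343 m/s (air at roughly 20 °C), and `delay_in_frames` is a hypothetical helper added for illustration.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed: air at about 20 degrees Celsius
FRAME_RATE = 20_000.0   # frames per second, as in the text

# Distance a sound wave covers between two consecutive frames.
travel_per_frame_mm = SPEED_OF_SOUND / FRAME_RATE * 1000.0
print(f"{travel_per_frame_mm:.2f} mm")  # 17.15 mm

def delay_in_frames(extra_distance_m):
    """Frames by which a point lags if it is this much farther from the source."""
    return extra_distance_m / SPEED_OF_SOUND * FRAME_RATE

# A point 10 cm farther from the source lags by several frames.
print(round(delay_in_frames(0.10), 2))
```

    At such frame rates even a small object spans several frames of delay, which is why the delay cannot simply be ignored.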

    All these factors cause different areas of the object to vibrate with different strength. So when analyzing the camera images, the scientists take into account only the zones that contribute most to the resulting signal: the least "noisy" blocks. The frequency components that make up these blocks are given different phase shifts to rule out destructive interference.
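    The idea of keeping only the best blocks and compensating their delays before combining them can be sketched as follows. This is not the researchers' published algorithm: it swaps in a deliberately simple stand-in that ranks blocks by signal variance (a reasonable proxy only if the noise level is similar in every block) and aligns them with an integer-lag cross-correlation. All names and the synthetic data are hypothetical.

```python
import math
import random

def energy(x):
    """Variance of a block signal, used here as a crude quality score."""
    mean = sum(x) / len(x)
    return sum((v - mean) ** 2 for v in x) / len(x)

def best_lag(ref, x, max_lag):
    """Integer lag (in frames) at which x[i + lag] best matches ref[i]."""
    def score(lag):
        lo = max(0, -lag)
        hi = min(len(ref), len(x) - lag)
        return sum(ref[i] * x[i + lag] for i in range(lo, hi))
    return max(range(-max_lag, max_lag + 1), key=score)

def combine_blocks(blocks, keep, max_lag):
    """Keep the strongest blocks, undo their delays, and average them."""
    ranked = sorted(blocks, key=energy, reverse=True)[:keep]
    ref = ranked[0]
    n = len(ref)
    out = [0.0] * n
    for x in ranked:
        lag = best_lag(ref, x, max_lag)
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                out[i] += x[j] / len(ranked)
    return out

def correlation(a, b):
    """Pearson correlation of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

random.seed(0)
N = 2000

def source(t):
    """Synthetic 'sound': two sinusoids, time measured in frames."""
    return math.sin(2 * math.pi * 0.05 * t) + 0.5 * math.sin(2 * math.pi * 0.013 * t)

# (gain, delay in frames) for each block; the weak blocks are mostly noise.
SETUP = [(1.0, 0), (0.8, 3), (0.6, 5), (0.05, 2), (0.02, 4)]
blocks = [[gain * source(i - delay) + random.gauss(0.0, 0.1) for i in range(N)]
          for gain, delay in SETUP]
clean = [source(i) for i in range(N)]

combined = combine_blocks(blocks, keep=3, max_lag=8)
print(round(correlation(combined, clean), 3))
```

    Without the alignment step, delayed copies of the same signal would partly cancel when summed, which is the destructive interference the text refers to.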

    The Iranian researchers note that this allowed them both to improve the quality of the recovered sound and to speed up image processing compared with the original MIT algorithm. They say their system can process the images and recover the sound in real time.

    The potential of visual microphones


    For now the technology is still at the pilot stage, and full commercial deployment remains an open question. But it is already expected to find use in law enforcement: the police would be able to get more information from surveillance cameras.

    There are other options too: similar systems would make it possible to analyze how sound behaves in recording studios and concert halls in order to determine their acoustic properties. Another application is in the space industry, for studying sounds in space. Incidentally, Hacker News users have already suggested that in the future "visual microphones" will solve the mystery of the Moon landing once and for all.


