Image reconstruction: 1 km of optical fiber, artificial neural network and deep learning

    Nowadays, optical fibers have become an integral part of various spheres of human life: from home Internet to endoscopy. The use of optical fibers is due to a number of advantages: transmission speed, physical strength, throughput, information security, etc.

    To increase bandwidth, the multimode optical fiber (MMF) was created, in which information is transmitted over several parallel channels (modes). Despite its advantages, MMF has a number of drawbacks, one of which researchers set out to eliminate in order to improve image transmission. The point is this: when a sample is projected onto the proximal end of the MMF, the image received at the distal end is a speckle pattern, because the incoming light is distributed over many modes that propagate differently along the length of the fiber. The scientists propose combining multimode fiber with a deep-learning artificial neural network to recover accurate images, including in endoscopy. Let's dig into the researchers' report and see how this works and what results it gives.

    Basis of the study

    Techniques for using artificial neural networks to decode images transmitted through MMF have been under development for a long time. Early works described a two-layer network capable of recognizing about 10 images that had passed through 10 meters of step-index fiber.

    In this study the system is much more complex but, according to the scientists, also much more efficient. The first stage was collecting a large number of speckle samples obtained by passing images through the MMF. These became the training data for the DNN (a deep neural network, i.e. an artificial neural network based on deep learning * ).

    Sample speckle image
    Deep learning * - a family of machine learning methods based on learning representations of the data, rather than on specialized algorithms for a specific task.
    The DNN architecture is quite complex and has about 14 hidden layers * .
    Hidden layer * - an artificial neural network consists of computing units (neurons) that fall into 3 categories: input, hidden and output. Input neurons accept information, hidden neurons perform the computations, and output neurons pass the result on.
    For the experiments with the DNN, a database of 20,000 handwritten digits was assembled. The database was then randomly divided into three groups:

    • 16,000 digits - training;
    • 2,000 digits - validation;
    • 2,000 digits - test.
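
    The random split described above can be sketched as follows. This is only an illustration with NumPy: the function name `split_indices` and the fixed seed are assumptions, and the article does not say how the researchers actually shuffled the data.

```python
import numpy as np

def split_indices(n_total=20_000, n_train=16_000, n_val=2_000, seed=0):
    """Randomly split range(n_total) into train / validation / test index arrays."""
    rng = np.random.default_rng(seed)          # seed is an arbitrary choice
    order = rng.permutation(n_total)           # random order of all sample indices
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]             # whatever remains (2,000 here)
    return train, val, test

train_idx, val_idx, test_idx = split_indices()
print(len(train_idx), len(val_idx), len(test_idx))
```

    Working with index arrays rather than copies of the images keeps the three groups disjoint by construction.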

    Preparing for the experiment

    The image below shows the layout of the optical system that was used to collect data.

    Image number 1: installation diagram:

    Laser source - source of the laser beam;
    HWP - half-wave plate;
    M1 - mirror;
    SLM - spatial light modulator;
    P - linear polarizer;
    L - lens;
    BS - beam splitter;
    OBJ - microscope objective;
    OF - optical fiber;
    CCD - CCD camera.

    And now in order. A laser beam with a wavelength of 560 nm is sent through a graded-index optical fiber * with a core diameter of 62.5 μm and a numerical aperture * of 0.275.
    Graded-index MMF * is a fiber with a non-uniform refractive index profile: the refractive index decreases smoothly from the fiber axis toward the edge of the core.

    Comparison of fiber types: step-index multimode, graded-index multimode and single-mode (top to bottom).
    Numerical aperture * - the sine of the maximum angle between an incoming ray and the fiber axis at which the light is still guided along the fiber by total internal reflection.
    At this wavelength the fiber supports about 4500 spatial modes. The input samples (images) are displayed on the spatial light modulator and then relayed by a 4f system onto the proximal (input) end of the MMF. At the far end of the fiber, another 4f system images the speckle emerging from the distal (output) end onto a CCD camera.
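
    As a rough order-of-magnitude check of the mode count, the normalized frequency (V-number) of the fiber can be computed from the parameters given above. The two mode-count formulas below are common textbook rules of thumb, not something taken from the report, and conventions differ between sources; which one the researchers used is not stated here.

```python
import math

wavelength = 560e-9       # m, laser wavelength from the article
core_diameter = 62.5e-6   # m, core diameter from the article
na = 0.275                # numerical aperture from the article

# V = 2*pi*a*NA / lambda, with a the core radius (so 2*a = core diameter)
v_number = math.pi * core_diameter * na / wavelength

# Textbook rules of thumb (assumptions, conventions vary between sources):
modes_step_index = v_number ** 2 / 2      # often quoted for step-index fiber
modes_graded_index = v_number ** 2 / 4    # often quoted for a parabolic graded-index profile

print(f"V = {v_number:.1f}")
print(f"V^2/2 = {modes_step_index:.0f}, V^2/4 = {modes_graded_index:.0f}")
```

    With these numbers V comes out a little above 96, so the V²/2 estimate is of the same order as the roughly 4500 modes quoted in the article, while the V²/4 estimate is about half that.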
    CCD * is a charge-coupled device that implements the technology of controlled charge transfer in the volume of a semiconductor.
    To test both amplitude and phase modulation of the input to the graded-index MMF, a half-wave plate was installed before the SLM and a linear polarizer after it.

    As mentioned earlier, handwritten digits were used as samples. They were taken from the MNIST database.

    Before being processed by the DNN, each image recorded on CCD1 or CCD2 was cropped to 1024 × 1024 pixels. The resulting speckle images were then downsampled to 32 × 32 pixels and used as input for the DNN.
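
    The preprocessing step above might look roughly like this. The article gives only the sizes, so the centre crop, the block-averaging method, the helper names `center_crop` and `block_average`, and the 1200 × 1600 test frame are all assumptions made for the sake of a runnable sketch.

```python
import numpy as np

def center_crop(frame, size=1024):
    """Cut a size x size window out of the middle of a 2-D camera frame."""
    h, w = frame.shape
    top = (h - size) // 2
    left = (w - size) // 2
    return frame[top:top + size, left:left + size]

def block_average(image, out_size=32):
    """Downsample a square image by averaging non-overlapping blocks."""
    factor = image.shape[0] // out_size          # 1024 // 32 = 32 pixels per block side
    blocks = image.reshape(out_size, factor, out_size, factor)
    return blocks.mean(axis=(1, 3))

# Hypothetical camera frame filled with random intensities
frame = np.random.default_rng(0).random((1200, 1600))
cropped = center_crop(frame)
small = block_average(cropped)
print(cropped.shape, small.shape)
```

    Block averaging preserves the mean intensity of the frame, which is one reason it is a convenient stand-in here; an interpolating resize would work just as well for the illustration.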

    Image number 2

    Images 2a and 2b show sample digits (0 and 4). Images 2c and 2d are the same digits after amplitude modulation, where the amplitude of the transmitted signal is varied. Images 2e and 2f are the samples after phase modulation, where the phase of the carrier changes in proportion to the signal. We also see the speckles themselves, recorded at the distal end of the fiber after a distance of 2 cm. The speckles ( 2g and 2h ) are quite difficult to tell apart. However, if we compare images 2d and 2h (taking the sample "4" as an example), we can isolate a difference that the DNN is able to detect ( 2i ). These distinctive features are what allow the system to distinguish "0" from "4", "2" from "9", and so on.

    Data Processing

    A convolutional neural network * of the Visual Geometry Group (VGG) type ( 3a ) became the basis of the system for classifying both the speckles and the reconstructed input images.
    A convolutional neural network * is an artificial neural network architecture built around the convolution operation: each image fragment is multiplied element by element by a convolution matrix (kernel), the products are summed, and the result is written to the corresponding position of the output image.
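
    The convolution operation from the definition above can be written out directly. This is a deliberately naive sketch for a single-channel image without padding or stride; the function name `conv2d_valid` and the tiny example are illustrative, and real frameworks use much faster implementations.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image; at each position multiply element-wise and sum."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]     # image fragment under the kernel
            out[i, j] = np.sum(patch * kernel)    # element-wise product, then sum
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.ones((2, 2)) / 4.0                    # 2x2 averaging kernel
result = conv2d_valid(image, kernel)
print(result)
# [[ 2.5  3.5  4.5]
#  [ 6.5  7.5  8.5]
#  [10.5 11.5 12.5]]
```

    In a trained network the kernel values are not fixed in advance like this averaging kernel but are learned from the data.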

    An example of a convolutional neural network architecture.
    Introducing such a system made it possible to decode images with greater accuracy. To reconstruct the images, a “U-net” type convolutional neural network with 14 hidden layers was used ( 3b ).

    Image No. 3

    Recall that the database of 20,000 digits was divided into three groups (16,000 for training, 2,000 for validation and 2,000 for testing).

    The training group was processed in batches of 50 samples for the reconstruction network and 500 samples for the classification network. The batches were shuffled to avoid overfitting * .
    Overfitting * is the case when a system handles examples from the training set well but performs poorly on examples from the test set.
    To minimize the RMS error, an optimization algorithm with a learning rate of 1 × 10⁻⁴ was used.

    The networks were trained for no more than 50 epochs (full passes over the training set). For each case the training was repeated 10 times in order to collect statistics on the accuracy of the system.
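
    The training procedure above (shuffled mini-batches, a squared-error loss, a learning rate of 1 × 10⁻⁴, a cap of 50 epochs) can be illustrated on a toy problem. Everything beyond those numbers is an assumption: the article does not name the optimizer, so plain gradient descent stands in for it, a linear model stands in for the U-net, and the random data stands in for speckle/image pairs.

```python
import numpy as np

BATCH_SIZE = 50        # reconstruction-network batch size from the article
LEARNING_RATE = 1e-4   # learning rate from the article
MAX_EPOCHS = 50        # upper bound on the number of epochs from the article

def mse(pred, target):
    """Mean squared error over all elements."""
    return float(np.mean((pred - target) ** 2))

def train_linear(x, y, epochs=MAX_EPOCHS, seed=0):
    """Fit pred = x @ w by mini-batch gradient descent on the MSE loss (toy stand-in)."""
    rng = np.random.default_rng(seed)
    w = np.zeros((x.shape[1], y.shape[1]))
    history = []
    for _ in range(epochs):
        order = rng.permutation(len(x))               # reshuffle the samples every epoch
        for start in range(0, len(x), BATCH_SIZE):
            idx = order[start:start + BATCH_SIZE]
            err = x[idx] @ w - y[idx]
            grad = 2.0 * x[idx].T @ err / err.size    # gradient of the batch MSE w.r.t. w
            w -= LEARNING_RATE * grad
        history.append(mse(x @ w, y))                 # loss on the full set after each epoch
    return w, history

# Hypothetical toy data, not the real dataset
rng = np.random.default_rng(1)
x = rng.normal(size=(500, 64))
y = x @ rng.normal(size=(64, 16))
w, history = train_linear(x, y, epochs=5)
print(history[0], history[-1])
```

    With plain gradient descent such a small learning rate makes the loss fall only slowly; adaptive optimizers are commonly paired with rates of this size, but which one was used here is not stated.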

    All DNNs were implemented in Python with the TensorFlow 1.5 library and run on a single NVIDIA GeForce GTX 1080Ti graphics processor.

    Research results


    The first parameter that scientists decided to consider in more detail was the ability of the system to reconstruct the input data.

    The image above shows the results of the reconstruction of the numbers (0 ... 9), after passing the data through the fiber with a length of 0.1 m, 10 m and 1000 m.

    As we can see, the reconstruction is very accurate, which confirms the ability of the U-net to extract the defining features of the original image.

    The accuracy of the reconstruction was also measured. It decreases as the fiber gets longer, from 96.9% (0.1 m) to 90.0% (1000 m).

    The decrease in accuracy is explained by the fact that along 1 km of fiber, temperature variations (thermal expansion of the material and/or changes in the refractive index) alter the optical path of the signal. As a result, the speckle pattern at the distal end becomes unstable, which makes it harder to reconstruct the desired image from it.

    The researchers note that external disturbances of the fiber also reduce the accuracy of image reconstruction. Therefore, as the system is developed further, the fiber should be thermally insulated and kept in an isothermal environment to achieve the highest reconstruction accuracy.

    The reconstruction procedure also does a very good job of removing artifacts from the processed image.

    For example, the system recovers the image ( 2a ) from the distal speckle ( 2g ) while removing the defects projected onto the proximal fiber face ( 2c and 2e ). In addition, the system tries to eliminate artifacts caused by contamination, sample defects or structural imperfections of the fiber itself.

    Classification of the digit samples

    The system can recreate an image, and the accuracy of this process is impressive. We now turn to how accurately the system can determine which digit an image shows, that is, classify the data after reconstruction.

    The graph and the table above show that the classification accuracy decreases as the length of the fiber increases. The reconstruction accuracy followed a similar trend. Accuracy falls regardless of whether amplitude or phase modulation is used. With 2 cm of fiber the accuracy is 90%. This is a good figure, but the fiber is very short. With a length of 1 km, the accuracy drops to 30%. The researchers attribute this to increased scattering losses, mode coupling and drift of the distal speckle. All of these effects grow with fiber length.

    Changes in the distal speckle

    The recording was made at a frame rate of 83 fps. For this experiment, a blank image was transmitted through 1 km of fiber.

    (a) and (b) - 2 frames taken from the record above, (c) - their comparison.

    These frames were recorded 2 seconds apart, and as we see in image (c), the difference between them is quite significant. Such abrupt changes in the speckle can be caused by fluctuations in ambient temperature or by air flows over the setup (image number 1), which produce small disturbances in the fiber. As the fiber length increases, the effect of such disturbances becomes noticeable.
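
    One simple way to put a number on how much two speckle frames differ is their correlation coefficient. The sketch below is not the researchers' method, only an illustration: the function names and the random test frames are assumptions, while the 83 fps frame rate and the 2-second gap come from the text.

```python
import numpy as np

FRAME_RATE = 83  # frames per second, from the article

def frames_apart(seconds, fps=FRAME_RATE):
    """Number of recorded frames separating two moments in time."""
    return int(round(seconds * fps))

def speckle_correlation(frame_a, frame_b):
    """Pearson correlation coefficient between two intensity images (1.0 = identical pattern)."""
    a = frame_a.astype(float).ravel()   # astype makes a copy, so the inputs stay untouched
    b = frame_b.astype(float).ravel()
    a -= a.mean()
    b -= b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

# Hypothetical frames: random intensities stand in for recorded speckles
rng = np.random.default_rng(0)
frame_1 = rng.random((64, 64))
frame_2 = rng.random((64, 64))
print(frames_apart(2))                          # 166
print(speckle_correlation(frame_1, frame_1))    # identical frames
print(speckle_correlation(frame_1, frame_2))    # unrelated frames
```

    A stable speckle would keep the correlation close to 1 over time, while drift shows up as a gradual fall toward 0.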

    It might seem that all the work of the system is in vain because of this "interference". However, such difficulties do not stop the scientists; rather, they prompt further thought.

    It was decided to study the drift of the speckles and how it affects the accuracy of image classification. For this, the VGG network was trained on 10,000 samples (half of those available) and then tested on the other half. The process was then repeated with the two groups of samples swapped. The results showed no significant change in classification accuracy: the speckle drift is not random, which means that the network is able to learn it and account for it.
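
    The "train on one half, test on the other, then swap" protocol can be sketched as below. Several things here are assumptions: a nearest-centroid classifier stands in for the VGG network, the halves are taken in recording order (first half versus second half), and the synthetic data is purely illustrative.

```python
import numpy as np

def fit_centroids(x, labels, n_classes):
    """Stand-in 'training': one mean vector per class."""
    return np.stack([x[labels == c].mean(axis=0) for c in range(n_classes)])

def predict(centroids, x):
    """Assign each sample to the class with the nearest centroid."""
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def swapped_halves_accuracy(x, labels, n_classes=10):
    """Train on one half and test on the other, then swap the roles of the halves."""
    half = len(x) // 2
    first, second = np.arange(half), np.arange(half, len(x))
    scores = []
    for train, test in ((first, second), (second, first)):
        centroids = fit_centroids(x[train], labels[train], n_classes)
        scores.append(float((predict(centroids, x[test]) == labels[test]).mean()))
    return scores

# Hypothetical data: 10 well-separated classes in 32 dimensions
rng = np.random.default_rng(0)
centers = 3.0 * rng.normal(size=(10, 32))
labels = rng.integers(0, 10, size=2000)
x = centers[labels] + rng.normal(size=(2000, 32))
scores = swapped_halves_accuracy(x, labels)
print(scores)
```

    If the two accuracies are close to each other, the result does not depend on which half was used for training, which is the kind of comparison described above.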

    The difference between amplitude and phase modulation was insignificant. With a fiber length of 10 m, classification with phase modulation was slightly better than with amplitude modulation. This is due to a more uniform distribution of light over the modes of the fiber. With amplitude modulation, the number of modes involved in the transmission is limited because of the selective spatial excitation of the fiber.

    For the 1 km fiber, amplitude modulation already outperforms phase modulation. When light travels through a long optical fiber, all the modes become involved in carrying the information.

    Error matrices (confusion matrices)

    In order to improve the classification accuracy, the network was also trained on already reconstructed samples, which significantly improved the results. Error (confusion) matrices were used to show which digits the network mixes up.
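
    A confusion matrix simply counts, for every true digit, how often each digit was predicted. A minimal sketch with NumPy follows; the function name and the six example labels are made up for illustration and are not data from the study.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, n_classes=10):
    """Rows are true digits, columns are predicted digits, entries are counts."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(matrix, (true_labels, predicted_labels), 1)   # add 1 per (true, predicted) pair
    return matrix

# Hypothetical predictions
true_labels = np.array([4, 4, 9, 9, 3, 5])
predicted = np.array([4, 9, 9, 4, 3, 3])
cm = confusion_matrix(true_labels, predicted)
print(cm[4, 9], cm[9, 4], cm[5, 3])   # 1 1 1
print(np.trace(cm) / cm.sum())        # 0.5
```

    Off-diagonal entries are the mistakes, and the diagonal divided by the total gives the overall accuracy.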

    For example, with the 1 km fiber the digits 4 and 9 are confused with each other, as are 3, 5, 6 and 8.

    To confirm, you just have to look at the results of the reconstruction.

    Digits 4 and 9

    Digits 3, 5, 6 and 8

    The graphs above show how the accuracy of image classification changes over time:

    a - 10 m of fiber and distal speckles;
    b - 10 m of fiber and reconstructed images;
    c - 1 km of fiber and distal speckles;
    d - 1 km of fiber and reconstructed images.

    For a detailed look at the nuances of the study, I strongly recommend reading the researchers' report. A PDF version is also available on this page (“Get PDF” button).


    This study showed excellent results, which bodes well for its further development and practical implementation. The techniques described above can be applied in telecommunications (decoding in multiplexing) and even in medicine (endoscopy).

    Having measured the time costs, the scientists found that most of the time is spent on preparing the system, or more precisely on training it. This suggests that an already trained system can do its job extremely quickly, in a matter of milliseconds. The only limit is the power of the hardware.

    Of course, there is still much to study in the field of deep-learning neural networks, but their usefulness is already visible. Improving existing systems, whatever their application, is just as important as creating new ones. After all, it is not always necessary to reinvent the wheel if you can simply improve it. The main thing, as practice has shown, is to think outside the box, to learn from our own and others' mistakes, to set ourselves seemingly impossible tasks and to believe in our own strength. If an idea can benefit humanity, it must be implemented.
