Fast registration of special points of images using voting bigraph

Detection and registration of image features has many applications in robotics, video compression, etc. Fast and accurate registration is an unattainable dream of many programmers and users. It is either quick or neat ...

I have been working on image processing for quite some time (about 17 years), including reconstructing 3D mesh from video, and I even have my own company selling such a product. However, I decided to part of the development and lay out the key idea in open access without patent blocking.

The general idea of ​​existing relatively fast algorithms is as follows:

  1. (Feature detection) Find any special points in each picture, preferably with subpixel accuracy.
  2. (Feature description) Build a certain array of features for each point that fully or partially meets the following requirements:
    • Invariance to:
      • (Physics) noise, exposure changes (brightness and contrast), compression artifacts
      • (2D Geometry) rotations, shifts, scaling
      • (3D geometry) projection distortion
    • Compactness (less memory, faster comparison)
    • Varieties (Near the selected point in a certain predefined pattern):
    • Bar graph of gradients, brightness, colors. (SIFT, SURF ...)
  3. Reading values ​​and level normalization. (ORB, BRIFF ...)
  4. For a pair of pictures, find the correspondence of points using the minimum distance (the sum of the absolute differences (L1) or the sum of the squares of the differences (L2)) between the feature arrays, the asymptotic complexity of this step is O (N ^ 2), where N is the number of singular points.
  5. (Optional): Check the geometric compatibility of the pairs using for example RANSAC and repeat step 3


The proposed registration scheme is as follows.
For each picture (detect):


  1. Find feature points
  2. Divide the singular points into 2 groups according to the sign of the difference ( DoG ) between the value at the point and the average in a small neighborhood.
  3. For each point from the first group, find about a dozen neighbors.
    At this stage, we have a bigraph of ~ N / 2 * 10 oriented edges
  4. For each edge, we sample at the points of the pattern scaled and rotated along with the edge
  5. Build a bit (~ 26 bit) hash by comparing samples



To register (bind):

  1. Build a LUT from the edges of the right picture by hash value
  2. For each edge from the left picture we look for O (1) edge (s) with the same hash in LUT
  3. Add 2 index of points from the left edge to 2 index of points from the right
  4. We go through all the points of the right picture and count the number of votes


Result:

for FullHD on the i7-6900K using single core

For approximately 10,000 points per image
detect 29.0556 ms / per image
bind 10.46563 ms / per pair

Advantages: fast, reliable with small perspective distortions (a small number of incorrectly connected points), simple code, is not covered by patents (as far as I know).


Actually the source code
Based on this scheme, I am now writing a blank for the Raspberry Pi SLAM in my spare time.

Also popular now: