3D modeling of the environment using a video camera

Original author: Jennifer Chu, MIT News Office

New technology from MIT makes it possible to build a 3D map of an environment or premises using an ordinary video camera.



The idea is that an ordinary camera films the environment while a special algorithm memorizes what it sees. As soon as the camera, having completed a round trip, returns to the starting point of its route, the algorithm recognizes that this is the same place and quickly "stitches" the ends of the space together, forming a connected 3D map.


"I have a dream of making a complete model of all of MIT," says John Leonard of MIT's Computer Science and Artificial Intelligence Laboratory. "With such a 3D map, prospective students could 'swim' through MIT like fish in a large aquarium. There is still much to be done, but I think it's doable."

Leonard, Thomas Whelan, and the other team members — Michael Kaess of MIT and John McDonald of the National University of Ireland — will present their work at the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) in Tokyo.

The device and how it works



The problem of millions of points

A Kinect camera captures a color image, while its depth sensor measures the distance to each pixel in that image. An application can process this data to build a 3D map.
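The depth image described above can be turned into a 3D point cloud with the standard pinhole back-projection formula. The sketch below is illustrative, not the team's code; the intrinsics (`FX`, `FY`, `CX`, `CY`) are hypothetical Kinect-like values, and real calibration differs per sensor.

```python
import numpy as np

# Hypothetical Kinect-style intrinsics; real values come from calibration.
FX, FY = 525.0, 525.0   # focal lengths in pixels
CX, CY = 319.5, 239.5   # principal point

def depth_to_point_cloud(depth):
    """Back-project a depth image (meters, shape HxW) into an Nx3 point cloud.

    Each pixel (u, v) with depth z maps to a 3D point via the pinhole model:
        x = (u - cx) * z / fx,   y = (v - cy) * z / fy.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels with no depth reading

# Toy frame: a 480x640 depth image where every pixel reads 2 meters.
cloud = depth_to_point_cloud(np.full((480, 640), 2.0))
```

Every valid pixel becomes one 3D point, which is why a few hundred meters of corridor quickly adds up to the "millions of points" the article mentions.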

In 2011, a team from Imperial College London and Microsoft Research developed an application for similar 3D modeling called KinectFusion, which successfully builds 3D models in real time. The resulting models of the space are very detailed, with sub-centimeter resolution, but only for a limited, fixed region of space.

Whelan, Leonard and their team set out to develop a method for creating 3D maps at the same high resolution over hundreds of meters, in varied conditions and in real time. The goal, they say, was ambitious in terms of data: the mapped environment would consist of millions of 3D points. To build an accurate map, the system has to determine which overlapping sections can be aligned without degrading the quality of millions of other sections. Previous research teams tackled this problem by going back and re-processing the data — an impractical approach if the map must be built in real time.

Instead, Whelan and his colleagues came up with a much faster two-stage approach: a "front end" that tracks the camera and builds the map as it goes, and a "back end" that corrects the map afterward.

On the front end, the researchers developed an algorithm that tracks the camera's position at every moment along its route. Since the Kinect captures 30 frames per second, the algorithm estimates how far and in which direction the camera has moved between frames. At the same time, it builds a 3D model out of small "cloud slices" — cross-sections of thousands of 3D points in the immediate surroundings. Each cloud slice is associated with a particular camera pose.

As the camera moves down a corridor, the cloud slices are added to the global 3D map.
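The front-end bookkeeping above can be sketched as follows. This is a minimal illustration of the idea, not the team's implementation: each slice stores its points in the camera's local frame together with the pose it was seen from, and the running pose is the chain of estimated inter-frame motions.

```python
import numpy as np

class CloudSlice:
    """A chunk of the map: local points tied to one camera pose."""
    def __init__(self, pose, local_points):
        self.pose = pose                  # 4x4 camera-to-world transform
        self.local_points = local_points  # Nx3 points in the camera frame

    def world_points(self):
        # Place the slice on the global map by applying its pose.
        homo = np.hstack([self.local_points,
                          np.ones((len(self.local_points), 1))])
        return (self.pose @ homo.T).T[:, :3]

def track(global_pose, frame_to_frame_motion):
    """Chain the estimated inter-frame motion onto the running camera pose."""
    return global_pose @ frame_to_frame_motion

# Toy run: the camera moves 0.1 m forward (along z) per frame for 10 frames,
# dropping one single-point slice at its own origin each frame.
step = np.eye(4); step[2, 3] = 0.1
pose = np.eye(4)
slices = []
for _ in range(10):
    pose = track(pose, step)
    slices.append(CloudSlice(pose.copy(), np.zeros((1, 3))))
```

After ten frames the last slice's point sits one meter down the corridor — purely because of the pose it carries, not because any point coordinates were rewritten.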

On the back end, the algorithm recognizes places the camera has already seen, matches the corresponding cloud slices, and corrects the camera poses; the slices then move along with their poses. In this way the device updates the location of an entire cloud slice through its camera pose instead of re-computing the location of each point individually.
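Why correcting poses instead of points is cheap can be shown in a few lines. This is a self-contained sketch of the idea, assuming a slice is just a (pose, local points) pair and that `correction` is a hypothetical rigid transform produced by the loop-closure match.

```python
import numpy as np

def apply_loop_closure(slices, correction):
    """Re-anchor every slice by correcting its pose only.

    The slice's points stay untouched in the camera frame, so millions of
    points never need individual updates.
    """
    return [(correction @ pose, pts) for pose, pts in slices]

def slice_world_points(pose, pts):
    homo = np.hstack([pts, np.ones((len(pts), 1))])
    return (pose @ homo.T).T[:, :3]

# Toy example: drift has shifted the camera 0.05 m along x by the time it
# returns to the start; the loop-closure correction undoes that shift.
drifted_pose = np.eye(4); drifted_pose[0, 3] = 0.05
correction = np.eye(4); correction[0, 3] = -0.05
slices = [(drifted_pose, np.zeros((1, 3)))]
fixed = apply_loop_closure(slices, correction)
```

One matrix multiply per slice, rather than one per point, is what lets the map "stitch" shut in real time.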

The team used its technique to create 3D maps of parts of the MIT campus, as well as indoor and outdoor locations in London, Sydney, Germany and Ireland. Going forward, the group suggests the method could give robots much richer information about their environment. For example, a 3D map would not only help a robot decide whether to turn left or right, but also supply far more detailed information about its surroundings.

"Imagine a robot that could look at one of these maps and, during a fire, tell where a fire extinguisher is located, or take other reasonable actions based on the situation," Whelan says. "These are 'see and act' systems, and we feel there is great potential in this type of technology."

Kostas Daniilidis, a professor of computer and information science at the University of Pennsylvania, sees the team's design as a useful method for employing robots in everyday tasks as well as in construction inspection.

"A system that minimizes errors in real time allows a lawnmower or a vacuum cleaner to return to the same position without the use of special markers. The same idea applies to rover navigation," says Daniilidis, who was not involved in the study. "The device also makes it possible to use accurate 3D models of the environment for inspection in architecture or infrastructure. It would be interesting to see how it behaves in harsh climatic conditions."

This study received support from Science Foundation Ireland, the Irish Research Council, and the U.S. Office of Naval Research.
