Shape Reconstruction from Pictures

A Generalized Voxel Coloring Implementation


Photography was invented in about 1850. Only ten years later, Oliver Wendell Holmes invented the stereoscope viewer: a way of making two-dimensional photographs a little more three-dimensional. "In the future," he surely thought, "all pictures will be three-dimensional." Yet here we are, 140 years later. We have been to the moon, and invented computers -- but our pictures are still paper-thin.

The question at hand is this: how can we derive shapes from a scene? Nature's most prominent solution to three-dimensional perception is based on stereo vision, but it is also based on the ability to maneuver around an object to get multiple perspectives. Holmes' stereoscope viewer fails us because we can't "move around" in one of his pictures. Two pictures, in other words, are not enough.

Our goal is to reconstruct the geometry and color of an object, from a set of pictures. To make the problem easier, we require some information about each picture (camera position, direction, and either the field of view or the focal length).

Shapes from Silhouettes

The earliest work on this problem used silhouette images of an object.

Given some silhouettes, it's possible to reconstruct the "visual hull" of an object. Making these silhouette images is pretty easy, and fairly precise.
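The carving step itself is simple enough to sketch. The Python below is my own illustration, not code from Archimedes; in particular, `View.project` is a deliberately toy orthographic projection standing in for a real calibrated camera.

```python
from dataclasses import dataclass

@dataclass
class View:
    """One calibrated picture. The silhouette is a 2D grid of booleans:
    True where the object appears, False for background."""
    silhouette: list

    def project(self, point):
        # Hypothetical stand-in for a real camera projection: here we
        # simply drop the z coordinate (an orthographic camera along z).
        x, y, _ = point
        return int(x), int(y)

def carve_visual_hull(voxel_centers, views):
    """Keep only the voxels that project inside the silhouette in EVERY
    view; what survives approximates the visual hull."""
    hull = []
    for c in voxel_centers:
        if all(v.silhouette[v.project(c)[1]][v.project(c)[0]] for v in views):
            hull.append(c)
    return hull
```

A real implementation would test all eight corners of each voxel (or its full projected footprint), not just its center.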

However, this technique does have a notable downside: the visual hull carries no color information, and it cannot capture concavities.

Color information can be "pasted on" to the final model via texture mapping, which will look fine on objects without concavities. But if the actual object has bowl-shaped concavities, this solution becomes messy very quickly.

Shapes from Color-Consistency

Another approach is to let the computer match up features between the photographs on its own, using the color images themselves as evidence for the three-dimensional form of the object. The computer can rely on the following guideline: if two pixels in two pictures have different colors, then they must represent different places in space.
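That guideline can be turned into a concrete test. The sketch below is a hypothetical consistency check of my own, using an arbitrary per-channel spread threshold; implementations commonly threshold the standard deviation of the color samples instead.

```python
def consistent(colors, threshold=30.0):
    """A voxel is color-consistent if every view that sees it reports
    roughly the same color. `colors` is a list of (r, g, b) tuples, one
    per view that sees the voxel; `threshold` is an arbitrary choice."""
    if len(colors) < 2:
        return True  # seen by at most one camera: nothing to contradict
    for channel in range(3):
        values = [c[channel] for c in colors]
        if max(values) - min(values) > threshold:
            return False
    return True
```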

The basic idea behind so-called "space carving" algorithms is as follows:

The original GVC algorithm
initialize SVL
for every voxel V
    carved(V) = false
loop {
    visibilityChanged = false
    compute item buffers by rendering voxels on SVL
    for every voxel V in SVL {
        compute vis(V)
        if (consist(vis(V)) == false) {
            visibilityChanged = true
            carved(V) = true
            remove V from SVL
            for all voxels N that are adjacent to V
                if (carved(N) == false and N is not in SVL)
                    add N to SVL
        }
    }
    if (visibilityChanged == false) {
        save voxel space
        quit
    }
}
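The loop above translates fairly directly into Python. In this sketch, `compute_vis` and `consistent` are hypothetical callables standing in for the item-buffer visibility computation and the color-consistency test; neither is part of any real API.

```python
def gvc(all_voxels, adjacency, compute_vis, consistent):
    """Sketch of the GVC loop. `compute_vis(v, svl)` returns whatever
    `consistent` needs to judge voxel v (e.g. the colors of the pixels
    that currently see it); `adjacency` maps a voxel to its neighbors."""
    svl = set(all_voxels)  # surface voxel list (here: everything, for simplicity)
    carved = {v: False for v in all_voxels}
    while True:
        visibility_changed = False
        for v in list(svl):
            if not consistent(compute_vis(v, svl)):
                visibility_changed = True
                carved[v] = True
                svl.discard(v)
                # newly exposed neighbors join the surface voxel list
                for n in adjacency.get(v, ()):
                    if not carved[n] and n not in svl:
                        svl.add(n)
        if not visibility_changed:
            return [v for v in all_voxels if not carved[v]]
```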

(One way of thinking about this is as follows. If you and I look at a hypothetical piece of space in the air, we will probably see different colors; you might see through it to the wall, and I might see through it to a stereo. But if we both look at a piece of space that is precisely on the surface of an object, we'll see the same color.)

Color-consistency algorithms differ mainly in the order in which they do the carving. I have chosen "Generalized Voxel Coloring" (GVC) because it allows arbitrary camera placement and generally provides better results than some alternatives. The technique was developed by Bruce Culbertson and Thomas Malzbender. Specifics of the algorithm are given in the pseudocode above.

Setting up the camera

To get a set of pictures of an object, putting the object on a turntable is probably the easiest approach. I used a plate that spun on a Lego Mindstorms wheel (a record turntable would be better). Whatever method you use, these techniques require you to know (or at least guess) some things about each picture you take: the camera's position, its viewing direction, and its field of view (or focal length).

For this project, I just eyeballed most of these parameters. You can recover them more precisely with OpenCV's camera-calibration routines, but I warn you, it's a big package, and it will take at least an evening to get the parameters you need. It does have a lot of useful-looking code, though, so you might check it out anyway.
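For the curious, the parameters above are exactly what a pinhole projection needs. The following is a minimal sketch of my own (not code from Archimedes or OpenCV), assuming `cam_dir` and `cam_up` are perpendicular unit vectors and ignoring lens distortion.

```python
import math

def project(point, cam_pos, cam_dir, cam_up, fov_deg, width, height):
    """Map a world point to pixel coordinates in a width-by-height image,
    given the camera's position, viewing direction, up vector, and
    horizontal field of view in degrees."""
    def sub(a, b):  return tuple(x - y for x, y in zip(a, b))
    def dot(a, b):  return sum(x * y for x, y in zip(a, b))
    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    right = cross(cam_up, cam_dir)          # +x points right in the image
    rel = sub(point, cam_pos)
    x, y, z = dot(rel, right), dot(rel, cam_up), dot(rel, cam_dir)
    # The field of view fixes the focal length in pixels, and vice versa:
    f = (width / 2) / math.tan(math.radians(fov_deg) / 2)
    return (width / 2 + f * x / z, height / 2 - f * y / z)
```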

The pictures should be saved as numbered files, so they can be referenced easily.

Generating Models with Archimedes

Archimedes uses color-consistency and (optionally) silhouettes to generate rotatable models. These models are made of cubes (voxels), which look blocky, so there is also an option to polygonize the voxels into something smoother. Below, you can see the original image on the left, the voxel version in the middle, and the polygonized version on the right.

To use Archimedes, you need the following: a set of numbered pictures of your object, and the camera parameters for each picture.

Once you get your camera parameters, you have to write a scan file; the file format is fairly simple XML. An example file is included with the distribution.
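I won't reproduce the example file here; the sketch below is purely hypothetical and only shows the kind of information a scan file carries. The actual element names are defined by the example shipped with the distribution.

```xml
<!-- Hypothetical sketch only: consult the example file in the
     distribution for the real element names. A scan file ties each
     numbered image to the camera parameters it was taken with. -->
<scan>
  <image file="turntable01.jpg">
    <camera position="0 0 -50" direction="0 0 1" fov="45"/>
  </image>
  <!-- one <image> entry per picture -->
</scan>
```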

Once you've done that, just run Archimedes and load your file from the "File" menu. It will ask you to verify your voxel parameters, and then it will render a rotatable model for you. Note that it takes some time to do the carving (anywhere from a minute to a few hours); progress will be indicated on screen. Also: once it has begun carving, it won't respond to user input until it's done.

Further Reading

Anyone interested in these topics should definitely seek out the recent papers on the state of the art in volumetric reconstruction, particularly those on voxel coloring and space carving.


I'd like to thank my teachers, Hanspeter Pfister and Jeroen van Baar, who have taught the great class for which this work was done.

I'd also like to give credit to those whose code I adapted into my own:

- Matt Loper

Speak clearly, if you speak at all; carve every word before you let it fall. -- Oliver Wendell Holmes, Jr.

Generated on Tue May 21 03:34:16 2002 for Archimedes by doxygen 1.2.15