In this last post in my series on using a homemade stereo camera to produce 3d point clouds, I’ll show you how to improve your 3d point clouds in order to get optimal results. I’ll also show you where your passive stereo camera will have the best chances of producing good point clouds. With these steps, you should be able to produce and interactively view colored 3d point clouds. You’ll also be able to generate them in your own Python programs so that you can further manipulate them.
If you haven’t done so yet, get set up first. Here are the previous steps:
Install my StereoVision package or fork it on GitHub
Put together your stereo camera
Calibrate the cameras
Once this is finished, you can start producing 3d point clouds immediately, but you might be unsatisfied with the results. You can get better results by tuning your block matching algorithm to produce better disparity maps, which are the prerequisite for the point cloud you want at the end.
Tuning the block matcher to optimize results
Passive stereo vision works by looking at two pictures of the same objects and searching for matching points that can be found in both images. I say passive rather than active stereo vision because active stereo vision projects a known pattern onto the scene and then computes how that pattern is deformed when it hits the real objects in order to create a 3d model. That method is a bit more robust, but it’s not what we’re working with, so we’ll ignore it for now.
Passive stereo vision works best if there are lots of features that you can easily tell apart. If the features in your images are too homogeneous, so that it’s hard to say where they match and where they don’t, the algorithm won’t work well. You can notice the same phenomenon with your own eyes: if you hold a blank sheet of paper in front of your face and try to judge its distance while moving it back and forth, it’s difficult unless you use the size of the page or the position of your body as a reference. Unless you teach your computer to do so, it won’t be able to estimate distance to objects by their size, because it doesn’t know how big they are in 3d space. The algorithm will also have problems if features are only visible to one camera. You can see this problem yourself when you try to judge the distance to something that’s only visible to one eye, or to a curved, reflective surface that reflects differently into each of your eyes. Finally, although you want overlap between your pictures, you also need disparity in order to detect 3d structure. Stereo vision loses depth perception with distance: you’ll notice with your own eyes that it becomes increasingly difficult to gauge the distance to objects accurately as they move farther away. Of course, your brain can use a whole lot of extra information to make its distance estimates more robust, but your computer can’t, at least not yet.
That means that good stereo image pairs have rich textures with lots of detectable features that are visible in both pictures, and that each physical feature looks roughly the same in both. Features also shouldn’t be too far away from the camera rig, or their disparity becomes undetectable. This may sound a bit abstract, but I’ve got some examples coming up.
The other important factor in producing good point clouds from stereo images is making sure that the block matching algorithm is well adapted to your camera setup. We’ll be working with a semi-global block matching algorithm, which has a number of settings that affect how it performs on different image pairs and with different setups. The goal is to find settings that work well with your stereo rig over a wide variety of images.
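To make those settings concrete, here is a minimal sketch of a semi-global block matcher using OpenCV directly rather than through the StereoVision package. The parameter names are from OpenCV 3+ (cv2.StereoSGBM_create); the values are illustrative starting points, not tuned settings, and left.png/right.png stand in for a rectified image pair:

import cv2

# Load a rectified stereo pair (hypothetical filenames).
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

block_size = 5
sgbm = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=64,       # search range; must be divisible by 16
    blockSize=block_size,
    P1=8 * block_size ** 2,  # penalty for small disparity changes
    P2=32 * block_size ** 2, # penalty for large disparity changes
    uniquenessRatio=10,
    speckleWindowSize=100,
    speckleRange=32,
)

# StereoSGBM returns fixed-point disparities scaled by 16.
disparity = sgbm.compute(left, right).astype("float32") / 16.0

Every keyword argument above is one of the knobs you can tune; which combination works best depends heavily on your rig and your scenes.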
The BMTuner class
I’ve implemented a graphical way of tuning block matching algorithms in the BMTuner class in StereoVision’s UI utilities. It takes an instance of the BlockMatcher class, inspects which parameters can be set, and instantiates trackbars so that the user can set them. The callback functions for the trackbars are generated dynamically using decorators. The BMTuner also requires a rectified image pair, which is passed to the BlockMatcher to obtain a new disparity map every time the user changes a parameter.
Of course, you can use this class in your own programs, but I’ve also written a program that lets you adjust the parameters for your block matchers with it and save the last used settings to a file so that you can reuse them for producing point clouds.
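If you’d rather drive the tuner from your own code, a sketch along these lines should work. I’m writing the module paths and calls from memory of the package’s layout (stereovision.blockmatchers, stereovision.calibration, stereovision.ui_tools), so treat the exact names as assumptions and check the source if anything differs:

import cv2

# Assumed StereoVision package imports; verify against the source.
from stereovision.blockmatchers import StereoSGBM
from stereovision.calibration import StereoCalibration
from stereovision.ui_tools import BMTuner

# Load the calibration produced in the previous post (hypothetical folder).
calibration = StereoCalibration(input_folder="calibration")
block_matcher = StereoSGBM()

# The tuner needs a rectified pair so it can recompute the disparity map
# whenever a trackbar changes.
pair = [cv2.imread("left.png"), cv2.imread("right.png")]
rectified_pair = calibration.rectify(pair)

tuner = BMTuner(block_matcher, calibration, rectified_pair)
tuner.tune_pair(rectified_pair)  # assumed method; blocks while you adjust

# Persist the tuned settings for later reuse (hypothetical filename).
block_matcher.save_settings("bm_settings.json")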
Tuning your algorithm
Tuning your block matcher is pretty easy:
me@localhost:~> tune_blockmatcher --help
usage: tune_blockmatcher [-h] [--use_stereobm] [--bm_settings BM_SETTINGS]
                         calibration_folder image_folder

Read images taken from a calibrated stereo pair, compute disparity maps from
them and show them interactively to the user, allowing the user to tune the
stereo block matcher settings in the GUI.

positional arguments:
  calibration_folder    Directory where calibration files for the stereo
                        pair are stored.
  image_folder          Directory where input images are stored.

optional arguments:
  -h, --help            show this help message and exit
  --use_stereobm        Use StereoBM rather than StereoSGBM block matcher.
  --bm_settings BM_SETTINGS
                        File to save last block matcher settings to.
Note that StereoSGBM is the default algorithm, as I have yet to be impressed with the results using OpenCV’s StereoBM. Also remember to use the --bm_settings flag if you want to save the last used settings to a file.
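Reloading a saved settings file in your own code is then a one-liner, assuming the load_settings/save_settings methods I remember the block matcher classes exposing:

from stereovision.blockmatchers import StereoSGBM

# The filename is whatever you passed to --bm_settings (hypothetical here).
block_matcher = StereoSGBM()
block_matcher.load_settings("bm_settings.json")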
I use the program any time I’ve rebuilt my camera and therefore have to recalibrate it. First, I take a few stereo image pairs of scenes that are fairly easy to reconstruct, so that I can judge the resulting disparity maps well. I adjust the block matcher settings in the GUI until I’m satisfied with the disparity map, then hold down a key with the focus on the disparity map window until the program jumps to the next picture. After all the images have been analyzed, the most common settings are reported. Then I run the program again, adjust the BlockMatcher to use the most common settings from the last run, and save the configuration to disk so I can reuse it later.
Producing 3d point clouds
After you’ve tuned your block matcher, you can take the settings and use them to produce better point clouds. This is done with images_to_pointcloud. Run it like this:
me@localhost:~> images_to_pointcloud --help
usage: images_to_pointcloud [-h] [--use_stereobm] [--bm_settings BM_SETTINGS]
                            calibration left right output

Read images taken with stereo pair and use them to produce 3D point clouds
that can be viewed with MeshLab.

positional arguments:
  calibration           Path to calibration folder.
  left                  Path to left image
  right                 Path to right image
  output                Path to output file.

optional arguments:
  -h, --help            show this help message and exit
  --use_stereobm        Use StereoBM rather than StereoSGBM block matcher.
  --bm_settings BM_SETTINGS
                        Path to block matcher's settings.
The resulting point cloud can be viewed with MeshLab.
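If you’re curious what the script is doing, going from a disparity map to a point cloud is essentially one reprojection through the Q (disparity-to-depth) matrix computed during calibration. Here’s a rough OpenCV-only sketch of that step; disparity is the float32 map from the SGBM example above, and Q is assumed to come from your calibration:

import cv2

# Colors are sampled from the rectified left image (hypothetical filename).
left_color = cv2.imread("left.png")

# Reproject each pixel's disparity into 3d camera coordinates.
points = cv2.reprojectImageTo3D(disparity, Q)
colors = cv2.cvtColor(left_color, cv2.COLOR_BGR2RGB)

# Drop pixels with no valid match.
mask = disparity > disparity.min()
points = points[mask]
colors = colors[mask]

# Write an ASCII PLY file that MeshLab can open.
with open("pointcloud.ply", "w") as f:
    f.write("ply\nformat ascii 1.0\n")
    f.write("element vertex {}\n".format(len(points)))
    f.write("property float x\nproperty float y\nproperty float z\n")
    f.write("property uchar red\nproperty uchar green\nproperty uchar blue\n")
    f.write("end_header\n")
    for (x, y, z), (r, g, b) in zip(points, colors):
        f.write("{} {} {} {} {} {}\n".format(x, y, z, r, g, b))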
A discussion on results
I used this program to tune my block matcher with five test images and then used the resulting block matcher settings to produce point clouds for the training images as well as for a set of independent images. I find the results quite interesting. I’ve explained them in a video, since it’s easier to discuss the results visually: